dify/docs/eu-ai-act-compliance.md

13 KiB

EU AI Act Compliance Guide for Dify Deployers

Dify is an LLMOps platform for building RAG pipelines, agents, and AI workflows. If you deploy Dify in the EU — whether self-hosted or using a cloud provider — the EU AI Act applies to your deployment. This guide covers what the regulation requires and how Dify's architecture maps to those requirements.

Is your system in scope?

The detailed obligations in Articles 12, 13, and 14 only apply to high-risk AI systems as defined in Annex III of the EU AI Act. A Dify application is high-risk if it is used for:

  • Recruitment and HR — screening candidates, evaluating employee performance, allocating tasks
  • Credit scoring and insurance — assessing creditworthiness or setting premiums
  • Law enforcement — profiling, criminal risk assessment, border control
  • Critical infrastructure — managing energy, water, transport, or telecommunications systems
  • Education assessment — grading students, determining admissions
  • Essential public services — evaluating eligibility for benefits, housing, or emergency services

Most Dify deployments (customer-facing chatbots, internal knowledge bases, content generation workflows) are not high-risk. If your Dify application does not fall into one of the categories above:

  • Article 50 (end-user transparency) still applies if users interact with your application directly. See the Article 50 section below.
  • GDPR still applies if you process personal data. See the GDPR section below.
  • The high-risk obligations (Articles 9-15) are less likely to apply, but risk classification is context-dependent. Do not self-classify without legal review. Focus on Article 50 (transparency) and GDPR (data protection) as your baseline obligations.

If you are unsure whether your use case qualifies as high-risk, consult a qualified legal professional before proceeding.

Self-hosted vs cloud: different compliance profiles

Deployment Your role Dify's role Who handles compliance?
Self-hosted Provider and deployer Framework provider — obligations under Article 25 apply only if Dify is placed on the market or put into service as part of a complete AI system bearing its name or trademark You
Dify Cloud Deployer Provider and processor Shared — Dify handles SOC 2 and GDPR for the platform; you handle AI Act obligations for your specific use case

Dify Cloud already has SOC 2 Type II and GDPR compliance for the platform itself. But the EU AI Act adds obligations specific to AI systems that SOC 2 does not cover: risk classification, technical documentation, transparency, and human oversight.

Supported providers and services

Dify integrates with a broad range of AI providers and data stores. The following are the key ones relevant to compliance:

  • AI providers: HuggingFace (core), plus integrations with OpenAI, Anthropic, Google, and 100+ models via provider plugins
  • Model identifiers include: gpt-4o, gpt-3.5-turbo, claude-3-opus, gemini-2.5-flash, whisper-1, and others
  • Vector database connections: Extensive RAG infrastructure supporting numerous vector stores

Dify's plugin architecture means actual provider usage depends on your configuration. Document which providers and models are active in your deployment.

Data flow diagram

A typical Dify RAG deployment:

graph LR
    USER((User)) -->|query| DIFY[Dify Platform]
    DIFY -->|prompts| LLM([LLM Provider])
    LLM -->|responses| DIFY
    DIFY -->|documents| EMBED([Embedding Model])
    EMBED -->|vectors| DIFY
    DIFY -->|store/retrieve| VS[(Vector Store)]
    DIFY -->|knowledge| KB[(Knowledge Base)]
    DIFY -->|response| USER

    classDef processor fill:#60a5fa,stroke:#1e40af,color:#000
    classDef controller fill:#4ade80,stroke:#166534,color:#000
    classDef app fill:#a78bfa,stroke:#5b21b6,color:#000
    classDef user fill:#f472b6,stroke:#be185d,color:#000

    class USER user
    class DIFY app
    class LLM processor
    class EMBED processor
    class VS controller
    class KB controller

GDPR roles (providers are typically processors for customer-submitted data, but the exact role depends on each provider's terms of service and processing purpose; deployers should review each provider's DPA):

  • Cloud LLM providers (OpenAI, Anthropic, Google) typically act as processors — requires DPA.
  • Cloud embedding services typically act as processors — requires DPA.
  • Self-hosted vector stores (Weaviate, Qdrant, pgvector): Your organization remains the controller — no third-party transfer.
  • Cloud vector stores (Pinecone, Zilliz Cloud) typically act as processors — requires DPA.
  • Knowledge base documents: Your organization is the controller — stored in your infrastructure.

Article 11: Technical documentation

High-risk systems need Annex IV documentation. For Dify deployments, key sections include:

Section What Dify provides What you must document
General description Platform capabilities, supported models Your specific use case, intended users, deployment context
Development process Dify's architecture, plugin system Your RAG pipeline design, prompt engineering, knowledge base curation
Monitoring Dify's built-in logging and analytics Your monitoring plan, alert thresholds, incident response
Performance metrics Dify's evaluation features Your accuracy benchmarks, quality thresholds, bias testing
Risk management Risk assessment for your specific use case

Some sections can be derived from Dify's architecture and your deployment configuration, as shown in the table above. The remaining sections require your input.

Article 12: Record-keeping

Dify's built-in logging covers several Article 12 requirements:

Requirement Dify Feature Status
Conversation logs Full conversation history with timestamps Covered
Model tracking Model name recorded per interaction Covered
Token usage Token counts per message Covered
Cost tracking Cost per conversation (if provider reports it) Partial
Document retrieval RAG source documents logged Covered
User identification User session tracking Covered
Error logging Failed generation logs Covered
Data retention Configurable Your responsibility

Retention periods: The required retention period depends on your role under the Act. Article 18 requires providers of high-risk systems to retain logs and technical documentation for 10 years after market placement. Article 26(6) requires deployers to retain logs for at least 6 months. If you self-host Dify and have substantially modified the system, you may be classified as a provider rather than a deployer. Confirm the applicable retention period with legal counsel.

Article 13: Transparency to deployers

Article 13 requires providers of high-risk AI systems to supply deployers with the information needed to understand and operate the system correctly. This is a documentation obligation, not a logging obligation. For Dify deployments, this means the upstream LLM and embedding providers must give you:

  • Instructions for use, including intended purpose and known limitations
  • Accuracy metrics and performance benchmarks
  • Known or foreseeable risks and residual risks after mitigation
  • Technical specifications: input/output formats, training data characteristics, model architecture details

As a deployer, collect model cards, system documentation, and accuracy reports from each AI provider your Dify application uses. Maintain these as part of your Annex IV technical documentation.

Dify's platform features provide supporting evidence that can inform Article 13 documentation, but they do not satisfy Article 13 on their own:

  • Source attribution — Dify's RAG citation feature shows which documents informed the response, supporting deployer-side auditing
  • Model identification — Dify logs which LLM model generates responses, providing evidence for system documentation
  • Conversation logs — execution history helps compile performance and behavior evidence

You must independently produce system documentation covering how your specific Dify deployment uses AI, its intended purpose, performance characteristics, and residual risks.

Article 50: End-user transparency

Article 50 requires deployers to inform end users that they are interacting with an AI system. This is a separate obligation from Article 13 and applies even to limited-risk systems.

For Dify applications serving end users:

  1. Disclose AI involvement — tell users they are interacting with an AI system
  2. AI-generated content labeling — identify AI-generated content as such (e.g., clear labeling in the UI)

Dify's "citation" feature also supports end-user transparency by showing users which knowledge base documents informed the answer.

Note: Article 50 applies to chatbots and systems interacting directly with natural persons. It has a separate scope from the high-risk designation under Annex III — it applies even to limited-risk systems.

Article 14: Human oversight

Article 14 requires that high-risk AI systems be designed so that natural persons can effectively oversee them. Dify provides automated technical safeguards that support human oversight, but they are not a substitute for it:

Dify Feature What It Does Oversight Role
Annotation/feedback system Human review of AI outputs Direct oversight — humans evaluate and correct AI responses
Content moderation Built-in filtering before responses reach users Automated safeguard — reduces harmful outputs but does not replace human judgment on edge cases
Rate limiting Controls on API usage Automated safeguard — bounds system behavior, supports overseer's ability to maintain control
Workflow control Insert human review steps between AI generation and output Oversight enabler — allows building approval gates into the pipeline

These automated controls are necessary building blocks, but Article 14 compliance requires human oversight procedures on top of them:

  • Escalation procedures — define what happens when moderation triggers or edge cases arise (who is notified, what action is taken)
  • Human review pipeline — for high-stakes decisions, route AI outputs to a qualified person before they take effect
  • Override mechanism — a human must be able to halt AI responses or override the system's output
  • Competence requirements — the human overseer must understand the system's capabilities, limitations, and the context of its outputs

For high-risk use cases (HR, legal, medical), configure your Dify workflow to require human approval before the AI response is delivered to the end user or acted upon.

Knowledge base compliance

Dify's knowledge base feature has specific compliance implications:

  1. Data provenance: Document where your knowledge base documents come from. Article 10 requires data governance for training data; knowledge bases are analogous.
  2. Update tracking: When you add, remove, or update documents in the knowledge base, log the change. The AI system's behavior changes with its knowledge base.
  3. PII in documents: If knowledge base documents contain personal data, GDPR applies to the entire RAG pipeline. Implement access controls and consider PII redaction before indexing.
  4. Copyright: Ensure you have the right to use the documents in your knowledge base for AI-assisted generation.

GDPR considerations

  1. Legal basis (Article 6): Document why AI processing of user queries is necessary
  2. Data Processing Agreements (Article 28): Required for each cloud LLM and embedding provider
  3. Data minimization: Only include necessary context in prompts; avoid sending entire documents when a relevant excerpt suffices
  4. Right to erasure: If a user requests deletion, ensure their conversations are removed from Dify's logs AND any vector store entries derived from their data
  5. Cross-border transfers: Providers based outside the EEA — including US-based providers (OpenAI, Anthropic), and any other non-EEA providers you route to — require Standard Contractual Clauses (SCCs) or equivalent safeguards under Chapter V of the GDPR. Review each provider's transfer mechanism individually.

Resources


This is not legal advice. Consult a qualified professional for compliance decisions.