Description
Why are we doing this?
To ensure that GPT-RAG enforces document-level access control at query time, so that users retrieve only the content they are explicitly authorized to see. This is critical for enterprise, regulated, and compliance-driven deployments where data exposure must follow identity, group membership, and resource-level permissions.
By integrating Azure AI Search’s native ACL and RBAC-based security filtering, GPT-RAG can provide secure multi-tenant and enterprise-grade search experiences without duplicating authorization logic in the application layer. This reduces the risk of data leakage, simplifies compliance, and aligns GPT-RAG with Microsoft-recommended security patterns.
What does it do?
- Native Azure AI Search ACL enforcement – Uses built-in query-time access control based on `userIds`, `groupIds`, and/or `rbacScope` metadata.
- Permission-aware indexing – Ensures permission metadata is ingested alongside content (via indexers or push APIs).
- Automatic security filters – Relies on Azure AI Search to dynamically append internal security filters at query time.
- Identity propagation – Passes the user token through GPT-RAG to Azure AI Search using `x-ms-query-source-authorization`.
- Multi-source compatibility – Supports ADLS Gen2, Blob Storage, and SharePoint permission models.
- Safe-by-default behavior – Prevents unauthorized results from being returned even when service keys are used.
- Debug and troubleshooting support – Enables elevated-read mode for administrators to diagnose permission-related issues.
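The identity-propagation item above could look roughly like the sketch below. It is an illustration under stated assumptions, not GPT-RAG's actual code: the function name `build_search_request`, the API version string, the use of a bearer token for the application's own authentication, and passing the user token as a raw header value are all assumptions. Only the `x-ms-query-source-authorization` header name comes from this issue. The function builds the request and sends nothing.

```python
from typing import Any

# Assumed API version; check which version your search service supports.
API_VERSION = "2025-05-01-preview"


def build_search_request(
    service: str, index: str, query: str, app_token: str, user_token: str
) -> dict[str, Any]:
    """Build (but do not send) a search request that forwards the user's token."""
    if not user_token:
        # Fail closed: never issue an ACL-protected query without a user identity.
        raise ValueError("a user token is required for ACL-protected queries")
    url = (
        f"https://{service}.search.windows.net/indexes/{index}"
        f"/docs/search?api-version={API_VERSION}"
    )
    return {
        "method": "POST",
        "url": url,
        "headers": {
            "Content-Type": "application/json",
            # Assumption: the orchestrator authenticates with its own token.
            "Authorization": f"Bearer {app_token}",
            # End-user token, forwarded so the service can apply its filters.
            "x-ms-query-source-authorization": user_token,
        },
        # No client-side ACL filter is added; enforcement is left to the service.
        "json": {"search": query},
    }
```

The returned dictionary can be handed to any HTTP client. Keeping the body free of a `filter` clause reflects the intent that authorization lives in Azure AI Search rather than in the Orchestrator.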
Technical Guidelines
- Permission metadata must be stored in filterable string fields in the index.
- GPT-RAG must propagate the end-user identity token to Azure AI Search using `x-ms-query-source-authorization`.
- The Orchestrator must not implement custom ACL filtering logic; authorization must be enforced by Azure AI Search.
- Public content must be explicitly modeled (for example, “Everyone” or equivalent).
- Queries without a valid user token must not return ACL-protected content.
- Elevated-read mode (`x-ms-enable-elevated-read: true`) must only be used for debugging and require a dedicated custom role.
- Indexers or ingestion pipelines must normalize and validate ACL metadata format.
- The solution must support both:
  - POSIX-style ACLs (user/group permissions)
  - RBAC scopes (container-level access)
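The normalization and validation guideline for ingestion pipelines could be sketched as below. Everything here is hypothetical and for illustration only: the names `AclMetadata` and `normalize_acl`, the `PUBLIC_MARKER` value, and the assumption that user and group identifiers are GUID-formatted object IDs are not taken from this issue or from Azure AI Search.

```python
import uuid
from dataclasses import dataclass, field
from typing import Iterable, List, Optional

# Hypothetical marker for explicitly public documents; the real value
# depends on how the index models "Everyone".
PUBLIC_MARKER = "everyone"


@dataclass
class AclMetadata:
    user_ids: List[str] = field(default_factory=list)
    group_ids: List[str] = field(default_factory=list)
    rbac_scope: Optional[str] = None


def _normalize_ids(raw_ids: Iterable[str]) -> List[str]:
    """Trim, lowercase, validate, and de-duplicate identifiers, keeping order."""
    result: List[str] = []
    for raw in raw_ids:
        value = raw.strip().lower()
        if value == PUBLIC_MARKER:
            # Public access supersedes any individual identifiers.
            return [PUBLIC_MARKER]
        # Assumption: IDs are GUIDs; uuid.UUID raises ValueError otherwise.
        value = str(uuid.UUID(value))
        if value not in result:
            result.append(value)
    return result


def normalize_acl(
    user_ids: Iterable[str],
    group_ids: Iterable[str],
    rbac_scope: Optional[str] = None,
) -> AclMetadata:
    """Return validated ACL metadata, or raise ValueError if it is unusable."""
    users = _normalize_ids(user_ids)
    groups = _normalize_ids(group_ids)
    scope = rbac_scope.strip() if rbac_scope else None
    if not users and not groups and not scope:
        # Public content must be modeled explicitly, never by omission.
        raise ValueError("document has no ACL metadata")
    return AclMetadata(user_ids=users, group_ids=groups, rbac_scope=scope or None)
```

Rejecting documents that carry no permission metadata at all keeps "public" an explicit decision, in line with the guideline on modeling public content.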
References