-
Notifications
You must be signed in to change notification settings - Fork 836
feat(query): prevent dropping in-use security policies #18918
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- raise constraint errors when masking/row access policies remain bound to tables - introduce dedicated masking/row-access policy error types and surface them via APIs - extend SQL logic tests to cover drop failures for active policies
a94a180 to
920f807
Compare
- Add a KeysWithPrefix transaction condition to masking/row policy drops so freshly added bindings abort the txn instead of leaving dangling references. - Replace the duplicated usage structs/functions with a shared collect_policy_usage, reusing the gathered table updates and avoiding the extra clones. - Introduce txn_cond_eq_keys_with_prefix helper for prefix cardinality checks we can reuse elsewhere.
drmingdrmer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AI has no idea what it's actually building. You're still the one who needs to understand the problem and verify the solution makes sense.
@drmingdrmer reviewed 3 of 13 files at r1, all commit messages.
Reviewable status: 3 of 13 files reviewed, all discussions resolved (waiting on @dantengsky and @KKould)
a0a7c91 to
7ab5725
Compare
645e77c to
5de5f1f
Compare
drmingdrmer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@drmingdrmer reviewed 1 of 13 files at r1, 4 of 11 files at r2, all commit messages.
Reviewable status: 7 of 15 files reviewed, 3 unresolved discussions (waiting on @dantengsky and @KKould)
src/meta/api/src/data_mask_api_impl.rs line 354 at r2 (raw file):
) -> Result<MaskPolicyUsage, MetaTxnError> { collect_policy_usage(kv_api, tenant, policy_id).await }
This wrapper function is weird. Why do we need that?
Code quote:
async fn collect_mask_policy_usage(
kv_api: &(impl kvapi::KVApi<Error = MetaError> + ?Sized),
tenant: &Tenant,
policy_id: u64,
) -> Result<MaskPolicyUsage, MetaTxnError> {
collect_policy_usage(kv_api, tenant, policy_id).await
}src/meta/api/src/security_policy_usage.rs line 177 at r2 (raw file):
K: PolicyBinding + Clone + Send + Sync + 'static, K::ValueType: FromToProto + Send, E: From<MetaError>,
This function is used only once. Why is the generic type parameter is necessary?
Code quote:
pub(crate) async fn collect_policy_usage<K, E>(
kv_api: &(impl kvapi::KVApi<Error = MetaError> + ?Sized),
tenant: &Tenant,
policy_id: u64,
) -> Result<PolicyUsage<K>, E>
where
K: PolicyBinding + Clone + Send + Sync + 'static,
K::ValueType: FromToProto + Send,
E: From<MetaError>,src/query/ee/src/row_access_policy/row_access_policy_handler.rs line 56 at r2 (raw file):
req: DropRowAccessPolicyReq, ) -> Result<()> { let dropped = (meta_api.drop_row_access(&req.name).await?)?;
The braces is weird. Why do we need that?
TCeason
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 15 files reviewed, 3 unresolved discussions (waiting on @dantengsky, @drmingdrmer, and @KKould)
src/meta/api/src/data_mask_api_impl.rs line 354 at r2 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
This wrapper function is weird. Why do we need that?
If not , call collect_policy_usage in mask will write as collect_policy_usage::<MaskPolicyTableIdIdent, MetaTxnError>(self, tenant, policy_id).await?. This enables automatic type deduction. It can be changed to manually specify the type.
src/meta/api/src/security_policy_usage.rs line 177 at r2 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
This function is used only once. Why is the generic type parameter is necessary?
This function is used by:
- Mask policies via MaskPolicyTableIdIdent binding type
- Row access policies via RowAccessPolicyTableIdIdent binding type
src/query/ee/src/row_access_policy/row_access_policy_handler.rs line 56 at r2 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
The braces is weird. Why do we need that?
will modify as wait??
drmingdrmer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to be critical of AI-generated code. This PR is a perfect example—it's full of unclear concepts that add no value and should be removed.
@drmingdrmer reviewed all commit messages.
Reviewable status: 5 of 15 files reviewed, 5 unresolved discussions (waiting on @dantengsky and @KKould)
src/meta/api/src/security_policy_usage.rs line 33 at r3 (raw file):
/// Represents a binding record that links a security policy to a table. #[derive(Clone)] pub(crate) struct PolicyBindingEntry<I> {
and is PolicyBindingEntry really needed? it seems like just a duplication of SeqV.
Code quote:
PolicyBindingEntrysrc/meta/api/src/security_policy_usage.rs line 175 at r3 (raw file):
) -> Result<PolicyUsage<K>, MetaError> where K: PolicyBinding + Clone + Send + Sync + 'static,
And why is PolicyBinding required? it is used only once.
TCeason
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 5 of 15 files reviewed, 5 unresolved discussions (waiting on @dantengsky, @drmingdrmer, and @KKould)
src/meta/api/src/security_policy_usage.rs line 33 at r3 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
and is PolicyBindingEntry really needed? it seems like just a duplication of SeqV.
Yes we can directly use for (key, seq) in bindings replace it. No need to create a new struct
src/meta/api/src/security_policy_usage.rs line 175 at r3 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
And why is PolicyBinding required? it is used only once.
twice in drop_data_mask and collect_policy_usage
Move the table-policy reference lifecycle management from drop policy operations to drop/undrop table operations. This simplifies the drop policy logic and improves performance. Key changes: - drop_table: delete all table-policy references when marking table as dropped - undrop_table: verify policies still exist, cleanup stale references, restore valid ones - drop_policy: simplified from incremental cleanup (N/100+1 txns) to single existence check (1 txn) All operations are protected by seq conditions for concurrent safety using optimistic locking pattern.
drmingdrmer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@drmingdrmer reviewed 1 of 11 files at r2, 2 of 6 files at r3, 1 of 4 files at r4, 4 of 6 files at r5, 7 of 13 files at r6.
Reviewable status: 14 of 22 files reviewed, 6 unresolved discussions (waiting on @dantengsky and @KKould)
src/meta/api/src/data_mask_api_impl.rs line 187 at r5 (raw file):
// Ensure no new references were created txn.condition .push(txn_cond_eq_keys_with_prefix(&table_policy_ref_prefix, 0));
much cleaner! great move!
Code quote:
let tenant = name_ident.tenant();
let table_policy_ref_prefix = DirName::new(MaskPolicyTableIdIdent::new_generic(
tenant.clone(),
MaskPolicyIdTableId {
policy_id,
table_id: 0,
},
));
// List all table-policy references
let table_policy_refs = self.list_pb_vec(&table_policy_ref_prefix).await?;
// Policy is in use - cannot drop
if !table_policy_refs.is_empty() {
return Ok(Err(MaskingPolicyError::policy_in_use(
tenant.tenant_name().to_string(),
name_ident.data_mask_name().to_string(),
)));
}
// No references - drop the policy
let id_ident = seq_id.data.into_t_ident(tenant);
let mut txn = TxnRequest::default();
// Ensure no new references were created
txn.condition
.push(txn_cond_eq_keys_with_prefix(&table_policy_ref_prefix, 0));src/meta/api/src/schema_api.rs line 256 at r5 (raw file):
table_id, }, )));
There may be duplicated entries, need to de-dup the entries to delete
Code quote:
for policy_map in tb_meta.column_mask_policy_columns_ids.values() {
txn.if_then
.push(txn_op_del(&MaskPolicyTableIdIdent::new_generic(
tenant.clone(),
MaskPolicyIdTableId {
policy_id: policy_map.policy_id,
table_id,
},
)));src/meta/api/src/schema_api.rs line 597 at r5 (raw file):
} async fn cleanup_missing_policies_on_undrop(
this function should be removed right?
TCeason
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 14 of 22 files reviewed, 6 unresolved discussions (waiting on @dantengsky, @drmingdrmer, and @KKould)
src/meta/api/src/schema_api.rs line 256 at r5 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
There may be duplicated entries, need to de-dup the entries to delete
Good optimize. I will fix it.
src/meta/api/src/schema_api.rs line 597 at r5 (raw file):
Previously, drmingdrmer (张炎泼) wrote…
this function should be removed right?
cleanup_missing_policies_on_undrop is still required. When a table is dropped we delete the policy→table reference keys (MaskPolicyTableIdIdent / RowAccessPolicyTableIdIdent). During undrop the table meta we revive still lists those policy IDs, so we must recreate any surviving
references (and prune ones whose policies were removed meanwhile). That reconciliation happens exclusively inside cleanup_missing_policies_on_undrop (src/meta/api/src/schema_api. rs:597), and it also wires the necessary txn conditions/ops for the undrop transaction (src/meta/api/src/
schema_api.rs:520). Removing it would leave undropped tables with stale policy metadata and missing reverse-reference keys.
drmingdrmer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@drmingdrmer reviewed 1 of 13 files at r1, 1 of 4 files at r4, 4 of 13 files at r6, 2 of 3 files at r7, 1 of 1 files at r10, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @dantengsky and @KKould)
I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/
Summary
Prevents dropping security policies (masking and row access policies) that are still in use by tables, with robust concurrency protection.
Changes
Prevent dropping in-use policies: Add constraint checks and dedicated error types (
MaskingPolicyError,RowAccessPolicyError) to block dropping policies still bound to tablesMove reference management to table operations:
drop_table: delete table-policy references when droppingundrop_table: verify policies exist, cleanup stale references, restore valid onesdrop_policy: simplified from incremental cleanup (N/100+1 txns) to single check (1 txn)txn_cond_eq_keys_with_prefixconditions to guard against race conditions during policy dropTests
Type of change
This change is