-
Notifications
You must be signed in to change notification settings - Fork 1.7k
fix: SortPreservingMerge sanity check rejects valid ORDER BY with CASE expression #18342
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: SortPreservingMerge sanity check rejects valid ORDER BY with CASE expression #18342
Conversation
|
I've also confirmed this fixes my issue with the following query issued by npgsql (nb: there is a correlated subquery which I fixup to use SELECT ns.nspname, t.oid, t.typname, t.typtype, t.typnotnull, t.elemtypoid
FROM (
-- Arrays have typtype=b - this subquery identifies them by their typreceive and converts their typtype to a
-- We first do this for the type (innerest-most subquery), and then for its element type
-- This also returns the array element, range subtype and domain base type as elemtypoid
SELECT
typ.oid, typ.typnamespace, typ.typname, typ.typtype, typ.typrelid, typ.typnotnull, typ.relkind,
elemtyp.oid AS elemtypoid, elemtyp.typname AS elemtypname, elemcls.relkind AS elemrelkind,
CASE WHEN elemproc.proname='array_recv' THEN 'a' ELSE elemtyp.typtype END AS elemtyptype
FROM (
SELECT typ.oid, typnamespace, typname, typrelid, typnotnull, relkind, typelem AS elemoid,
CASE WHEN proc.proname='array_recv' THEN 'a' ELSE typ.typtype END AS typtype,
CASE
WHEN proc.proname='array_recv' THEN typ.typelem
WHEN typ.typtype='r' THEN rngsubtype
WHEN typ.typtype='m' THEN (SELECT rngtypid FROM pg_range WHERE rngmultitypid = typ.oid)
WHEN typ.typtype='d' THEN typ.typbasetype
END AS elemtypoid
FROM pg_type AS typ
LEFT JOIN pg_class AS cls ON (cls.oid = typ.typrelid)
LEFT JOIN pg_proc AS proc ON proc.oid = typ.typreceive
LEFT JOIN pg_range ON (pg_range.rngtypid = typ.oid)
) AS typ
LEFT JOIN pg_type AS elemtyp ON elemtyp.oid = elemtypoid
LEFT JOIN pg_class AS elemcls ON (elemcls.oid = elemtyp.typrelid)
LEFT JOIN pg_proc AS elemproc ON elemproc.oid = elemtyp.typreceive
) AS t
JOIN pg_namespace AS ns ON (ns.oid = typnamespace)
WHERE
typtype IN ('b', 'r', 'm', 'e', 'd') OR -- Base, range, multirange, enum, domain
(typtype = 'c' AND relkind='c') OR -- User-defined free-standing composites (not table composites) by default
(typtype = 'p' AND typname IN ('record', 'void')) OR -- Some special supported pseudo-types
(typtype = 'a' AND ( -- Array of...
elemtyptype IN ('b', 'r', 'm', 'e', 'd') OR -- Array of base, range, multirange, enum, domain
(elemtyptype = 'p' AND elemtypname IN ('record', 'void')) OR -- Arrays of special supported pseudo-types
(elemtyptype = 'c' AND elemrelkind='c') -- Array of user-defined free-standing composites (not table composites) by default
))
ORDER BY CASE
WHEN typtype IN ('b', 'e', 'p') THEN 0 -- First base types, enums, pseudo-types
WHEN typtype = 'r' THEN 1 -- Ranges after
WHEN typtype = 'm' THEN 2 -- Multiranges after
WHEN typtype = 'c' THEN 3 -- Composites after
WHEN typtype = 'd' AND elemtyptype <> 'a' THEN 4 -- Domains over non-arrays after
WHEN typtype = 'a' THEN 5 -- Arrays after
WHEN typtype = 'd' AND elemtyptype = 'a' THEN 6 -- Domains over arrays last
END |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I have left a few comments, but consider that I am new to Datafusion so they might be off-mark due to missing context
| WITH typ(oid, typnamespace, typname, typtype) AS ( | ||
| SELECT * FROM (VALUES (1, 10, 't1', 'b')) | ||
| UNION ALL SELECT * FROM (VALUES (2, NULL, 't2', 'b')) | ||
| UNION ALL SELECT * FROM (VALUES (3, 12, 't3', NULL)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: maybe adding a non-NULL value which is retained by the filter but not matching any WHEN branch like m? We are exercising the implicit ELSE null with rows having typtype as NULL value, but it might be worth checking with explicit values too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added in dffb0c2
| // TODO: bounds MAY be certain if either LHS or RHS is certainly false | ||
|
|
||
| // Return UNCERTAIN when intervals don't have concrete boolean bounds | ||
| _ => Ok(Self::UNCERTAIN), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This make sense to me, under the classic 3-valued logic, in such cases it's fine to return unknown.
This method should be used with care as it would probably be unsafe for simplifications, as the TODO suggests we are not handling cases like FALSE AND UNKNOWN which is FALSE, or TRUE OR UNKNOWN which is TRUE.
Depending on where we invoke this method, we need to be aware of the unknown-as-false semantics (filters) and unknown-as-null semantics (order by, projections, etc.), and decide if it is safe to use this method or not.
Since we were previously erroring out, I think it's in any case an improvement but just wanted to point that out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to find places that would be confused by this change, but there is quite a bit of logic around Interval and tracing the implications are a bit beyond me as a first time datafusion contributor. In a good way it seems Interval::UNCERTAIN is handled uniformly from what I found. There also seems to be test coverage in the optimizer if those TODOs were executed on (test_simplify_with_guarantee, test_inequalities_maybe_null, etc).
@alamb @asolimando I'm happy to explore adding those bounds here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very interesting...I started by adding tests for and/or:
#[test]
fn test_and_null_boolean_intervals() -> Result<()> {
let null_interval =
Interval::try_new(ScalarValue::Boolean(None), ScalarValue::Boolean(None))?;
let and_result = null_interval.and(&Interval::CERTAINLY_FALSE)?;
assert_eq!(and_result, Interval::CERTAINLY_FALSE);
let and_result = null_interval.and(&Interval::CERTAINLY_TRUE)?;
assert_eq!(and_result, Interval::UNCERTAIN);
let and_result = null_interval.and(&null_interval)?;
assert_eq!(and_result, Interval::UNCERTAIN);
Ok(())
}
#[test]
fn test_or_null_boolean_intervals() -> Result<()> {
let null_interval =
Interval::try_new(ScalarValue::Boolean(None), ScalarValue::Boolean(None))?;
let or_result = null_interval.or(&Interval::CERTAINLY_FALSE)?;
assert_eq!(or_result, Interval::UNCERTAIN);
let or_result = null_interval.or(&Interval::CERTAINLY_TRUE)?;
assert_eq!(or_result, Interval::CERTAINLY_TRUE);
let or_result = null_interval.or(&null_interval)?;
assert_eq!(or_result, Interval::UNCERTAIN);
Ok(())
}And they already worked without any code changes, so I think my TODO statements were premature and unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@watford-ep sounds good! Can you also add the reverse (true or null where you now test null or true) just to make sure that we short-circuit the right way independently from the operands' order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
…e-sanity-check-failure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @watford-ep and @asolimando -- I agree this looks like a good fix.
I had one question about one of the tests which seems unrelated to the rest of this PR, but otherwise 👍
I also took the liberty of pushing a commit that moves the test to sqllogictests as I think that is easier to understand in the context of this project
| let df = ctx.sql(sql).await?; | ||
| let plan = df.create_physical_plan().await; | ||
|
|
||
| assert!(plan.is_ok(), "planning failed: {:?}", plan.err()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I verified this test fails with an assert without the code in this PR
---- physical_optimizer::sanity_checker::test_order_by_case_requirement stdout ----
thread 'physical_optimizer::sanity_checker::test_order_by_case_requirement' panicked at datafusion/core/tests/physical_optimizer/sanity_checker.rs:404:5:
planning failed: Some(Context("SanityCheckPlan", Plan("Plan: [\"SortPreservingMergeExec: [CASE WHEN typtype@3 = b OR typtype@3 = e OR typtype@3 = p THEN 0 WHEN typtype@3 = r THEN 1 END ASC NULLS LAST]\", \" SortExec: expr=[CASE WHEN typtype@3 = b OR typtype@3 = e OR typtype@3 = p THEN 0 WHEN typtype@3 = r THEN 1 END ASC NULLS LAST], preserve_partitioning=[true]\", \" CoalesceBatchesExec: target_batch_size=8192\", \" HashJoinExec: mode=CollectLeft, join_type=Inner, on=[(oid@0, typnamespace@1)], projection=[nspname@1, oid@2, typname@4, typtype@5]\", \" ProjectionExec: expr=[column1@0 as oid, column2@1 as nspname]\", \" DataSourceExec: partitions=1, partition_sizes=[1]\", \" ProjectionExec: expr=[column1@0 as oid, column2@1 as typnamespace, column3@2 as typname, column4@3 as typtype]\", \" UnionExec\", \" CoalesceBatchesExec: target_batch_size=8192\", \" FilterExec: column4@3 IN (SET) ([b, r, m, e, d])\", \" DataSourceExec: partitions=1, partition_sizes=[1]\", \" ProjectionExec: expr=[column1@0 as column1, CAST(column2@1 AS Int64) as column2, column3@2 as column3, column4@3 as column4]\", \" RepartitionExec: partitioning=RoundRobinBatch(16), input_partitions=1\", \" CoalesceBatchesExec: target_batch_size=8192\", \" FilterExec: column4@3 IN (SET) ([b, r, m, e, d])\", \" DataSourceExec: partitions=1, partition_sizes=[1]\", \" ProjectionExec: expr=[column1@0 as column1, column2@1 as column2, column3@2 as column3, CAST(column4@3 AS Utf8) as column4]\", \" RepartitionExec: partitioning=RoundRobinBatch(16), input_partitions=1\", \" CoalesceBatchesExec: target_batch_size=8192\", \" FilterExec: CAST(column4@3 AS Utf8) IN (SET) ([b, r, m, e, d])\", \" DataSourceExec: partitions=1, partition_sizes=[1]\"] does not satisfy order requirements: [CASE WHEN typtype@3 = b OR typtype@3 = e OR typtype@3 = p THEN 0 WHEN typtype@3 = r THEN 1 END ASC NULLS LAST]. Child-0 order: [[CASE WHEN typtype@3 = b OR typtype@3 = e OR typtype@3 = p THEN 0 WHEN typtype@3 = r THEN 1 END ASC NULLS LAST]]")))
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
| WITH typ(oid, typnamespace, typname, typtype) AS ( | ||
| SELECT * FROM (VALUES (1, 10, 't1', 'b')) | ||
| UNION ALL SELECT * FROM (VALUES (2, NULL, 't2', 'b')) | ||
| UNION ALL SELECT * FROM (VALUES (3, 12, 't3', NULL)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added in dffb0c2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
|
Thanks again @watford-ep and for reviews @asolimando ! |
Which issue does this PR close?
Rationale for this change
ORDER BY with a CASE statement didn't always work, raising a sanity check error in SortPreservingMergeExec. The plan showed that the partitions all had the same ordering, but for whatever reason they were not detected as being equal. Using a single partition succeeded always.
What changes are included in this PR?
The changes are non-obvious and I spent a lot of time bisecting/debug printing and landed on a failure in bounds checking with boolean interval arithmetic. Returning UNCERTAIN if either leg of the interval is NULL resolves the upstream issue where CASE statements end up being deemed Unordered.
My rust-fu is hobbyist at best, so while this appears to resolve my issue I cannot for-certain exclaim that I've solved it all (Claude 4.5 agrees with my fix, but that's not an indication its any good). I'm also reasonably certain my unit tests are more ham fisted than necessary.
Are these changes tested?
Are there any user-facing changes?
This does not change any behavior beyond resolving a bug with a valid SQL statement.