[ty] add `SyntheticTypedDictType` and implement `normalized` and `is_equivalent_to` #21784

oconnor663 · 2025-12-04T04:04:12Z

This PR cribs a lot of @ibraheemdev's work on #20732.

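This depends on a couple of upstream changes, and the first commit is a temporary/dummy commit that pins git hashes for these:

implement Update for OrderMap and OrderSet salsa-rs/salsa#1033
feat: Optionally implement GetSize for ordermap bircni/get-size2#40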

Those pins (at least the second one, which overwrites a published version number) aren't intended to land, and ~~I'll need to fix up this PR once / assuming I can get the upstream changes shipped.~~ Update: This is done.

crates/ty_python_semantic/src/types/typed_dict.rs

astral-sh-bot · 2025-12-04T04:06:21Z

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

crates/ty_python_semantic/src/types/display.rs

crates/ty_python_semantic/src/types.rs

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.


crates/ty_python_semantic/src/types/typed_dict.rs

astral-sh-bot · 2025-12-04T04:08:10Z

`mypy_primer` results

Changes were detected when running on open source projects

hydra-zen (https://github.com/mit-ll-responsible-ai/hydra-zen)
+ src/hydra_zen/structured_configs/_implementations.py:2982:60: error[invalid-argument-type] Argument to function `make_dataclass` is incorrect: Expected `dict[str, Any] | None`, found `bool`
+ src/hydra_zen/structured_configs/_implementations.py:2982:60: error[invalid-argument-type] Argument to function `make_dataclass` is incorrect: Expected `bool`, found `str | None`
+ src/hydra_zen/structured_configs/_implementations.py:2982:60: error[invalid-argument-type] Argument to function `make_dataclass` is incorrect: Expected `bool`, found `dict[str, Any] | None`
+ src/hydra_zen/structured_configs/_implementations.py:3342:52: error[invalid-argument-type] Argument to function `make_dataclass` is incorrect: Expected `dict[str, Any] | None`, found `bool`
+ src/hydra_zen/structured_configs/_implementations.py:3342:52: error[invalid-argument-type] Argument to function `make_dataclass` is incorrect: Expected `bool`, found `str | None`
+ src/hydra_zen/structured_configs/_implementations.py:3342:52: error[invalid-argument-type] Argument to function `make_dataclass` is incorrect: Expected `bool`, found `dict[str, Any] | None`
- Found 540 diagnostics
+ Found 546 diagnostics

scikit-build-core (https://github.com/scikit-build/scikit-build-core)
- src/scikit_build_core/_logging.py:153:13: warning[unsupported-base] Unsupported class base with type `<class 'Mapping[str, Style]'> | <class 'Mapping[str, Divergent]'>`
- src/scikit_build_core/build/wheel.py:98:20: error[no-matching-overload] No overload of bound method `__init__` matches arguments
- Found 43 diagnostics
+ Found 41 diagnostics

pydantic (https://github.com/pydantic/pydantic)
- pydantic/_internal/_schema_gather.py:126:28: error[invalid-key] Unknown key "steps" for TypedDict `DataclassSchema` - did you mean "type"?
+ pydantic/_internal/_schema_gather.py:126:28: error[invalid-key] Unknown key "steps" for TypedDict `DataclassSchema` - did you mean "slots"?
- pydantic/fields.py:943:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
+ pydantic/fields.py:943:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
- pydantic/fields.py:983:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
+ pydantic/fields.py:983:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
- pydantic/fields.py:1026:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
+ pydantic/fields.py:1026:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
- pydantic/fields.py:1066:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
+ pydantic/fields.py:1066:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
- pydantic/fields.py:1109:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
+ pydantic/fields.py:1109:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
- pydantic/fields.py:1148:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
+ pydantic/fields.py:1148:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
- pydantic/fields.py:1188:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`
+ pydantic/fields.py:1188:5: error[invalid-parameter-default] Default value of type `PydanticUndefinedType` is not assignable to annotated parameter type `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`
- pydantic/fields.py:1567:13: error[invalid-argument-type] Argument is incorrect: Expected `dict[str, int | float | str | ... omitted 3 union elements] | ((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) | None`, found `Top[dict[Unknown, Unknown]] | (((dict[str, int | float | str | ... omitted 3 union elements], /) -> None) & ~Top[dict[Unknown, Unknown]]) | None`
+ pydantic/fields.py:1567:13: error[invalid-argument-type] Argument is incorrect: Expected `dict[str, Divergent] | ((dict[str, Divergent], /) -> None) | None`, found `Top[dict[Unknown, Unknown]] | (((dict[str, Divergent], /) -> None) & ~Top[dict[Unknown, Unknown]]) | None`

Memory usage changes were detected when running on open source projects

sphinx (https://github.com/sphinx-doc/sphinx)
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`

prefect (https://github.com/PrefectHQ/prefect)
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`
+ WARN expected `heap_size` to be provided by Salsa query `class_based_items`

crates/ty_python_semantic/src/types/typed_dict.rs

crates/ty_python_semantic/src/types/class.rs

oconnor663 · 2025-12-04T04:14:21Z

crates/ty_python_semantic/src/types/typed_dict.rs

+            TypedDictType::Class(_) => {
+                let synthesized =
+                    SynthesizedTypedDictType::new(db, self.params(db), self.items(db));
+                TypedDictType::Synthesized(synthesized.normalized_impl(db, visitor))


Would it be better to inline `SynthesizedTypedDictType::normalized_impl` here instead of creating two instances and throwing the first away?

I think this is fine

An alternative that might be worth trying here is to cache `synthesized` instead of `self.items()`, like this:

diff --git a/crates/ty_python_semantic/src/types/typed_dict.rs b/crates/ty_python_semantic/src/types/typed_dict.rs
index 89ef0016f1..0a3f2f9cdd 100644
--- a/crates/ty_python_semantic/src/types/typed_dict.rs
+++ b/crates/ty_python_semantic/src/types/typed_dict.rs
@@ -71,36 +71,10 @@ impl<'db> TypedDictType<'db> {
     }

     pub(crate) fn items(self, db: &'db dyn Db) -> &'db TypedDictSchema<'db> {
-        #[salsa::tracked(returns(ref))]
-        fn class_based_items<'db>(db: &'db dyn Db, class: ClassType<'db>) -> TypedDictSchema<'db> {
-            let (class_literal, specialization) = class.class_literal(db);
-            class_literal
-                .fields(db, specialization, CodeGeneratorKind::TypedDict)
-                .into_iter()
-                .map(|(name, field)| {
-                    let field = match field {
-                        Field {
-                            first_declaration,
-                            declared_ty,
-                            kind:
-                                FieldKind::TypedDict {
-                                    is_required,
-                                    is_read_only,
-                                },
-                        } => TypedDictFieldBuilder::new(*declared_ty)
-                            .required(*is_required)
-                            .read_only(*is_read_only)
-                            .first_declaration(*first_declaration)
-                            .build(),
-                        _ => unreachable!("TypedDict field expected"),
-                    };
-                    (name.clone(), field)
-                })
-                .collect()
-        }
-
         match self {
-            Self::Class(defining_class) => class_based_items(db, defining_class),
+            Self::Class(defining_class) => {
+                SynthesizedTypedDictType::for_class(db, defining_class).items(db)
+            }
             Self::Synthesized(synthesized) => synthesized.items(db),
         }
     }
@@ -300,8 +274,8 @@ impl<'db> TypedDictType<'db> {

     pub(crate) fn normalized_impl(self, db: &'db dyn Db, visitor: &NormalizedVisitor<'db>) -> Self {
         match self {
-            TypedDictType::Class(_) => {
-                let synthesized = SynthesizedTypedDictType::new(db, self.items(db));
+            TypedDictType::Class(class) => {
+                let synthesized = SynthesizedTypedDictType::for_class(db, class);
                 TypedDictType::Synthesized(synthesized.normalized_impl(db, visitor))
             }
             TypedDictType::Synthesized(synthesized) => {
@@ -323,11 +297,13 @@ impl<'db> TypedDictType<'db> {
         // Compare the fields without requiring them to be in sorted order. Class-based `TypedDict`
         // fields are not sorted. We do sort synthetic fields in `normalized_impl`, but there will
         // soon be other sources of `SynthesizedTypedDictType` besides normalization.
-        if self.items(db).len() != other.items(db).len() {
+        let self_items = self.items(db);
+        let other_items = other.items(db);
+        if self_items.len() != other_items.len() {
             return ConstraintSet::from(false);
         }
-        let other_items = other.items(db);
-        self.items(db).iter().when_all(db, |(name, field)| {
+
+        self_items.iter().when_all(db, |(name, field)| {
             let Some(other_field) = other_items.get(name) else {
                 return ConstraintSet::from(false);
             };
@@ -744,7 +720,37 @@ pub struct SynthesizedTypedDictType<'db> {
 // The Salsa heap is tracked separately.
 impl get_size2::GetSize for SynthesizedTypedDictType<'_> {}

+#[salsa::tracked]
 impl<'db> SynthesizedTypedDictType<'db> {
+    #[salsa::tracked]
+    fn for_class(db: &'db dyn Db, class: ClassType<'db>) -> SynthesizedTypedDictType<'db> {
+        let (class_literal, specialization) = class.class_literal(db);
+        let items: TypedDictSchema<'db> = class_literal
+            .fields(db, specialization, CodeGeneratorKind::TypedDict)
+            .into_iter()
+            .map(|(name, field)| {
+                let field = match field {
+                    Field {
+                        first_declaration,
+                        declared_ty,
+                        kind:
+                            FieldKind::TypedDict {
+                                is_required,
+                                is_read_only,
+                            },
+                    } => TypedDictFieldBuilder::new(*declared_ty)
+                        .required(*is_required)
+                        .read_only(*is_read_only)
+                        .first_declaration(*first_declaration)
+                        .build(),
+                    _ => unreachable!("TypedDict field expected"),
+                };
+                (name.clone(), field)
+            })
+            .collect();
+        SynthesizedTypedDictType::new(db, items)
+    }
+
     pub(super) fn apply_type_mapping_impl<'a>(
         self,
         db: &'db dyn Db,
@@ -768,15 +774,20 @@ impl<'db> SynthesizedTypedDictType<'db> {
     }

     pub(crate) fn normalized_impl(self, db: &'db dyn Db, visitor: &NormalizedVisitor<'db>) -> Self {
+        let mut changed = false;
         let items = self
             .items(db)
             .iter()
             .map(|(name, field)| {
-                let field = field.clone().normalized_impl(db, visitor);
-                (name.clone(), field)
+                let new_field = field.clone().normalized_impl(db, visitor);
+                if !changed && &new_field != field {
+                    changed = true;
+                }
+                (name.clone(), new_field)
             })
             .collect::<TypedDictSchema<'db>>();
-        Self::new(db, items)
+
+        if changed { Self::new(db, items) } else { self }
     }
 }

My intuition is that normal code (read: not Pydantic) is going to spend more time looking up the .items() of regular class-based TypedDicts than it spends normalizing, so caching .items() makes sense to me.
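The `changed` flag in the patch above is a general pattern: rebuild a collection while normalizing, but hand back the original value when nothing actually changed, so no new allocation (or interned value) is created. Below is a minimal standalone sketch of that idea. It uses only the standard library; `Schema`, `normalize_field`, and `normalize_schema` are hypothetical names, with `Rc` standing in for Salsa interning and whitespace trimming standing in for type normalization.

```rust
use std::collections::BTreeMap;
use std::rc::Rc;

/// Hypothetical stand-in for a TypedDict schema: field name -> type name.
type Schema = BTreeMap<String, String>;

/// Hypothetical per-field normalization (here: trim whitespace).
fn normalize_field(ty: &str) -> String {
    ty.trim().to_string()
}

/// Normalize every field, but reuse the original allocation when no
/// field was actually changed by normalization.
fn normalize_schema(schema: &Rc<Schema>) -> Rc<Schema> {
    let mut changed = false;
    let items: Schema = schema
        .iter()
        .map(|(name, ty)| {
            let new_ty = normalize_field(ty);
            if new_ty != *ty {
                changed = true;
            }
            (name.clone(), new_ty)
        })
        .collect();

    if changed {
        Rc::new(items)
    } else {
        Rc::clone(schema)
    }
}
```

With this shape, calling `normalize_schema` on an already-normalized schema returns a pointer to the same allocation, which is the analogue of returning `self` rather than calling `Self::new` again.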

astral-sh-bot · 2025-12-04T04:24:05Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

codspeed-hq · 2025-12-04T04:24:10Z

CodSpeed Performance Report

Merging #21784 will not alter performance

Comparing `synthesized_typeddict` (c127766) with `main` (a2fb2ee)

Summary

✅ 22 untouched
⏩ 30 skipped¹

¹ 30 benchmarks were skipped, so the baseline results were used instead.

AlexWaygood · 2025-12-04T14:04:08Z

This depends on a couple of upstream changes, and the first commit is a temporary/dummy commit that pins git hashes for these:

implement Update for OrderMap and OrderSet salsa-rs/salsa#1033

feat: Optionally implement GetSize for ordermap bircni/get-size2#40

While we wait for the upstream get-size2 change to land, you can use the implementations of `heap_size` we have in this repo for `OrderMap` and `OrderSet`:

ruff/crates/ruff_memory_usage/src/lib.rs

Lines 46 to 57 in 326025d

    /// An implementation of [`GetSize::get_heap_size`] for [`OrderSet`].
    pub fn order_set_heap_size<T: GetSize, S>(set: &OrderSet<T, S>) -> usize {
        (set.capacity() * T::get_stack_size()) + set.iter().map(heap_size).sum::<usize>()
    }

    /// An implementation of [`GetSize::get_heap_size`] for [`OrderMap`].
    pub fn order_map_heap_size<K: GetSize, V: GetSize, S>(map: &OrderMap<K, V, S>) -> usize {
        (map.capacity() * (K::get_stack_size() + V::get_stack_size()))
            + (map.iter())
                .map(|(k, v)| heap_size(k) + heap_size(v))
                .sum::<usize>()
    }

But see my comment at #21784 (comment): I think we may actually need to use a BTreeMap for the items of a synthesized TypedDict, since we need two equivalent TypedDicts to normalize to synthesized TypedDicts that compare equal.
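To illustrate the point about equality: a `BTreeMap` iterates, compares, and hashes in key order, so two maps built from the same fields in different declaration orders are equal. The sketch below uses only the standard library; `Field`, `hash_of`, and `schema` are hypothetical stand-ins, and a `Vec` of pairs models a container whose equality is sensitive to insertion order.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical simplified field: a declared type name plus a `required` flag.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Field {
    declared_ty: &'static str,
    required: bool,
}

/// Hash any value with the standard library's default hasher.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Build a key-sorted schema from fields given in declaration order.
fn schema(fields: &[(&'static str, Field)]) -> BTreeMap<&'static str, Field> {
    fields.iter().cloned().collect()
}
```

Because the sorted map's `Eq` and `Hash` ignore the order in which fields were inserted, an interner keyed on such a map would treat `{x, y}` and `{y, x}` as the same value, which is the property normalization needs here.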

bircni · 2025-12-04T15:54:38Z

Just published the new get-size version with the implementation!

oconnor663 · 2025-12-04T16:04:11Z

@bircni incredible thank you!

astral-sh-bot · 2025-12-04T19:40:38Z

`ecosystem-analyzer` results

Lint rule	Added	Changed
`unsupported-base`	2	0
`redundant-cast`	0	1
Total	2	1

Full report with detailed diff (timing results)

oconnor663 · 2025-12-04T19:55:49Z

Hmm, the change I made to redundant cast warnings changed one unrelated union cast warning in the ecosystem analysis. Is this better or worse than before? Before:

warning[redundant-cast]: Value is already of type `Literal["function", "class", "method", "module"]`
  --> basic_checker.py:85:21
   |
83 |     lines = ["type", "number", "old number", "difference", "%documented", "%badname"]
84 |     for node_type in ("module", "class", "method", "function"):
85 |         node_type = cast(Literal["function", "class", "method", "module"], node_type)
   |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
86 |         new = stats.get_node_count(node_type)
87 |         old = old_stats.get_node_count(node_type) if old_stats else None
   |
info: rule `redundant-cast` is enabled by default

After:

warning[redundant-cast]: Value is already of type `Literal["module", "class", "method", "function"]`, which is equivalent to `Literal["function", "class", "method", "module"]`
  --> basic_checker.py:85:21
   |
83 |     lines = ["type", "number", "old number", "difference", "%documented", "%badname"]
84 |     for node_type in ("module", "class", "method", "function"):
85 |         node_type = cast(Literal["function", "class", "method", "module"], node_type)
   |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
86 |         new = stats.get_node_count(node_type)
87 |         old = old_stats.get_node_count(node_type) if old_stats else None
   |
info: rule `redundant-cast` is enabled by default

Is this unhelpful with unions? Like it's interesting and possibly surprising that Foo and Bar typeddicts can be equivalent, but it's boring and obvious that the same union in a different order is the same union? I could check specifically for TypedDict (and Protocol) when emitting this?

oconnor663 · 2025-12-04T20:46:29Z

@AlexWaygood says the beartype diagnostic in the ecosystem report is flaky, so other than the "eye doctor: better or worse, better or worse" question above, the report is clean.

carljm · 2025-12-05T00:14:11Z

I think the redundant-cast change is fine as-is, if anything a small improvement, even in the "obvious" case. I suppose we could even make it more explicit by naming both types in any case where they are equivalent-but-not-identical: "Value is already of type X which is equivalent to type Y". But I don't know that it's worth bothering to do that unless we get evidence that someone is confused by the shorter form that assumes you can read the cast type yourself.

EDIT: oops, I failed to scroll to the right and notice that the longer form I suggested is exactly what you already did! I think that's just fine, even in a more-obvious case.

oconnor663 · 2025-12-05T15:09:53Z

~~The last unaddressed issue that I know of on this PR is the Pydantic regression.~~ I'm looking at that now. It's presumably our old friend the gigantic recursive union?

Edit: very premature 😬

AlexWaygood · 2025-12-05T15:12:47Z

I would only expect a pydantic regression on this PR if (1) we're using Type::is_equivalent_to in our codebase somewhere where we shouldn't be or (2) pydantic is making extensive use of assert_type and/or cast. It might be worth checking those things?

AlexWaygood

Thanks! A lot of this looks good, but I think there's a few issues here to iron out

crates/ty_python_semantic/resources/mdtest/typed_dict.md

crates/ty_python_semantic/src/types/display.rs

AlexWaygood · 2025-12-05T19:35:37Z

crates/ty_python_semantic/src/types/typed_dict.rs

+    }
+
+    /// Return the meta-type of this `TypedDict` type.
+    pub(super) fn to_meta_class(self, db: &'db dyn Db) -> ClassType<'db> {


I don't think to_meta_class is an accurate name for what this method is doing (and... I don't think it's doing what it's trying to do correctly either 😄). A synthesized typeddict doesn't have a class, so it doesn't have a metaclass either. A class-based TypedDict does have a class, and therefore it also has a metaclass, but this method doesn't return the metaclass of a class-based typeddict, it just returns that class-based typeddict's defining class.

It looks like this method is actually being used to return a ClassType that can then be used to construct the meta-type of a TypedDict instance-type. So on that basis, we should rename this method to_meta_type, and have it return a Type rather than a ClassType.

But I also don't think this method gives the correct answer for the meta-type of a synthesized typeddict. A typeddict's meta-type is used for looking up synthesized TypedDict methods such as __getitem__ -- if you return type[TypedDictFallback] here rather than type[<synthetic typeddict>], then when we start doing member lookups on synthetic TypedDicts, we'll return the wrong results for members like __getitem__. As you noted in https://github.com/astral-sh/ruff/pull/21784/files#r2587403973, we don't ever do any member lookups on synthesized typeddicts yet, but we will if we start adding them to intersections in order to fix the TypedDict part of astral-sh/ty#1479. Giving a good answer for the meta-type of a synthesized TypedDict may involve adding a new variant to the SubclassOfInner enum in subclass_of.rs.

For now, I would recommend applying something like this patch. Then we can revisit the question of an accurate meta-type for synthesized TypedDicts when we actually start using them in more places. It's very hard to write tests for right now, when we use them in so few places :-)

diff --git a/crates/ty_python_semantic/src/types.rs b/crates/ty_python_semantic/src/types.rs index e7459ce7fb..2ab9c3beba 100644 --- a/crates/ty_python_semantic/src/types.rs +++ b/crates/ty_python_semantic/src/types.rs @@ -7524,7 +7524,13 @@ impl<'db> Type<'db> { Type::ProtocolInstance(protocol) => protocol.to_meta_type(db), // `TypedDict` instances are instances of `dict` at runtime, but its important that we // understand a more specific meta type in order to correctly handle `__getitem__`. - Type::TypedDict(typed_dict) => typed_dict.to_meta_class(db).into(), + Type::TypedDict(typed_dict) => match typed_dict { + TypedDictType::Class(class) => SubclassOfType::from(db, class), + TypedDictType::Synthesized(_) => SubclassOfType::from( + db, + todo_type!("TypedDict synthesized meta-type").expect_dynamic(), + ), + }, Type::TypeAlias(alias) => alias.value_type(db).to_meta_type(db), Type::NewTypeInstance(newtype) => Type::from(newtype.base_class_type(db)), } diff --git a/crates/ty_python_semantic/src/types/subclass_of.rs b/crates/ty_python_semantic/src/types/subclass_of.rs index 1045817a53..c6bb9d0378 100644 --- a/crates/ty_python_semantic/src/types/subclass_of.rs +++ b/crates/ty_python_semantic/src/types/subclass_of.rs @@ -8,7 +8,7 @@ use crate::types::{ ApplyTypeMappingVisitor, BoundTypeVarInstance, ClassType, DynamicType, FindLegacyTypeVarsVisitor, HasRelationToVisitor, IsDisjointVisitor, KnownClass, MaterializationKind, MemberLookupPolicy, NormalizedVisitor, SpecialFormType, Type, TypeContext, - TypeMapping, TypeRelation, TypeVarBoundOrConstraints, todo_type, + TypeMapping, TypeRelation, TypeVarBoundOrConstraints, TypedDictType, todo_type, }; use crate::{Db, FxOrderSet}; @@ -381,7 +381,12 @@ impl<'db> SubclassOfInner<'db> { pub(crate) fn try_from_instance(db: &'db dyn Db, ty: Type<'db>) -> Option<Self> { Some(match ty { Type::NominalInstance(instance) => SubclassOfInner::Class(instance.class(db)), - Type::TypedDict(typed_dict) => 
SubclassOfInner::Class(typed_dict.to_meta_class(db)), + Type::TypedDict(typed_dict) => match typed_dict { + TypedDictType::Class(class) => SubclassOfInner::Class(class), + TypedDictType::Synthesized(_) => SubclassOfInner::Dynamic( + todo_type!("type[T] for synthesized TypedDicts").expect_dynamic(), + ), + }, Type::TypeVar(bound_typevar) => SubclassOfInner::TypeVar(bound_typevar), Type::Dynamic(DynamicType::Any) => SubclassOfInner::Dynamic(DynamicType::Any), Type::Dynamic(DynamicType::Unknown) => SubclassOfInner::Dynamic(DynamicType::Unknown), diff --git a/crates/ty_python_semantic/src/types/typed_dict.rs b/crates/ty_python_semantic/src/types/typed_dict.rs index 6e4d2d5726..a7e5360a06 100644 --- a/crates/ty_python_semantic/src/types/typed_dict.rs +++ b/crates/ty_python_semantic/src/types/typed_dict.rs @@ -282,24 +282,6 @@ impl<'db> TypedDictType<'db> { } } - /// Return the meta-type of this `TypedDict` type. - pub(super) fn to_meta_class(self, db: &'db dyn Db) -> ClassType<'db> { - // `TypedDict` instances are instances of `dict` at runtime, but its important that we - // understand a more specific meta type in order to correctly handle `__getitem__`. - match self { - TypedDictType::Class(defining_class) => defining_class, - TypedDictType::Synthesized(_) => KnownClass::TypedDictFallback - .try_to_class_literal(db) - .map(|class| class.default_specialization(db)) - .unwrap_or_else(|| { - KnownClass::Object - .try_to_class_literal(db) - .map(|class| class.default_specialization(db)) - .expect("object class must exist") - }), - } - } - pub(crate) fn normalized_impl(self, db: &'db dyn Db, visitor: &NormalizedVisitor<'db>) -> Self { match self { TypedDictType::Class(_) => {

Oof, thanks for walking me through this.

Giving a good answer for the meta-type of a synthesized TypedDict may involve adding a new variant to the SubclassOfInner enum in subclass_of.rs.

Yes now that you mention it, that's what #20732 did.

crates/ty_python_semantic/src/types.rs

crates/ty_python_semantic/src/types/function.rs

crates/ty_python_semantic/src/types/typed_dict.rs

AlexWaygood

Thank you!! LGTM, though I'd still love it if we could figure out why the big pydantic regression is occurring

crates/ty_python_semantic/src/types/function.rs

AlexWaygood · 2025-12-07T15:07:40Z

crates/ty_python_semantic/src/types/typed_dict.rs

+            TypedDictType::Class(_) => {
+                let synthesized =
+                    SynthesizedTypedDictType::new(db, self.params(db), self.items(db));
+                TypedDictType::Synthesized(synthesized.normalized_impl(db, visitor))


I think this is fine

crates/ty_python_semantic/src/types/typed_dict.rs

crates/ty_python_semantic/resources/mdtest/typed_dict.md

AlexWaygood · 2025-12-08T21:49:26Z

I think the updates to the Cargo.toml/Cargo.lock files are probably not necessary with the latest version of this PR? They're obviously harmless, but they could probably be a standalone change at this point

oconnor663 · 2025-12-08T22:47:03Z

I think the updates to the Cargo.toml/Cargo.lock files are probably not necessary with the latest version of this PR? They're obviously harmless, but they could probably be a standalone change at this point

There are still a few places in class.rs where this PR changes IndexMap -> OrderMap, which depends on the Cargo.toml bumps. This PR no longer requires those changes after your patch, but we've also said (or at least Ibraheem said?) that we would generally prefer to switch to OrderMap, so I've split all that out into a separate PR: #21854

oconnor663 · 2025-12-09T01:59:08Z

Some notes from digging into the performance regression today:

It's definitely our old friend CoreSchema, the gigantic TypedDict union. Not very surprising. One of the files affected in this particular PR is functional_serializers.py, which typechecks in ~0.05s on main and ~1s on this branch. I can shrink that entire file down to this while still preserving most of that gap:

from pydantic_core.core_schema import CoreSchema

def foo(core_schema: CoreSchema):
    core_schema.copy()

I'm not sure why .copy() in particular would be affected by these changes?

carljm · 2025-12-09T02:46:41Z

Can you profile main vs this branch checking that minimized example (e.g. using samply record)? With that big a difference in runtime, I would think the more-precise location of the extra time spent should jump out (and the stacks should help clarify from where we are calling the newly-expensive function).

MichaReiser · 2025-12-09T19:01:42Z

I already shared this with @oconnor663 but for broader visibility:

Here's my profile https://share.firefox.dev/4pzdJ0u
What stands out is that we spend a significant amount of time within Eq and Hash that we didn't before. We call Eq a lot within UnionBuilder and Hash in normalized_impl (because we intern the value).

I'm not aware of any tricks to make either of those magically go away, but some of our typing wizards may do.

oconnor663 · 2025-12-09T19:35:06Z

@MichaReiser + @AlexWaygood, thanks for chatting up a storm with me this morning. Here's something more I've noticed staring at all our profiles. The reduced copy() example above spends ~all of its time inside of BoundMethodType::has_relation_to_impl. Two things jump out at me about that:

That method recurses on both the function/method type being called and the receiver object. Given that CoreSchema shows up on both sides of that, that could be part of why we're blowing up here?
The function/method side of that immediately calls normalize in the Redundancy case.

MichaReiser · 2025-12-09T19:40:08Z

crates/ty_python_semantic/src/types/typed_dict.rs

+        let other_items = other.items(db);
+        self.items(db).iter().when_all(db, |(name, field)| {
+            let Some(other_field) = other_items.get(name) else {
+                return ConstraintSet::from(false);
+            };
+            if field.flags != other_field.flags {
+                return ConstraintSet::from(false);
+            }
+            field
+                .declared_ty
+                .is_equivalent_to_impl(db, other_field.declared_ty, inferable, visitor)
+        })


I don't think this will change performance much, but you could make use of the fact that you have two BTreeMaps, which should both return their keys in the same order if they have the same fields:

Suggested change

-        let other_items = other.items(db);
-        self.items(db).iter().when_all(db, |(name, field)| {
-            let Some(other_field) = other_items.get(name) else {
-                return ConstraintSet::from(false);
-            };
-            if field.flags != other_field.flags {
-                return ConstraintSet::from(false);
-            }
-            field
-                .declared_ty
-                .is_equivalent_to_impl(db, other_field.declared_ty, inferable, visitor)
-        })
+        let mut other_items_iter = other_items.iter();
+        self_items.iter().when_all(db, |(name, field)| {
+            let Some((other_name, other_field)) = other_items_iter.next() else {
+                return ConstraintSet::from(false);
+            };
+            if name != other_name || field.flags != other_field.flags {
+                return ConstraintSet::from(false);
+            }
+            field
+                .declared_ty
+                .is_equivalent_to_impl(db, other_field.declared_ty, inferable, visitor)
+        })

Doing the same in has_relation_to seems a bit trickier and would require peek_if

This is how you could do the same in has_relation_to but it's a bit trickier:

    let mut self_items_iter = self_items.iter().peekable();
    for (target_item_name, target_item_field) in target_items {
        // Skip over preceding fields.
        let _ = {
            self_items_iter
                .peeking_take_while(|(name, _)| *name < target_item_name)
                .last()
        };
        let self_item_field = self_items_iter
            .peeking_next(|(name, _)| *name == target_item_name)
            .map(|(_, field)| field);

or use Itertools::merge_join_by

    for pair in a.iter().merge_join_by(b.iter(), |(k1, _), (k2, _)| k1.cmp(k2)) {
        match pair {
            EitherOrBoth::Both((k, v1), (_, v2)) => {
                // Key exists in both maps
            }
            EitherOrBoth::Left((k, v1)) => {
                // Only in `a`
            }
            EitherOrBoth::Right((k, v2)) => {
                // Only in `b`
            }
        }
    }
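For readers without itertools at hand, here is a minimal std-only sketch of the same merge-join idea over two sorted maps. The `Joined` enum and `merge_join` function are hypothetical names invented for this illustration (they are not part of ty or itertools), and plain `i32` values stand in for the real field types.

```rust
use std::cmp::Ordering;
use std::collections::BTreeMap;

/// Hypothetical result type for this sketch, loosely mirroring `EitherOrBoth`.
#[derive(Debug, PartialEq, Eq)]
enum Joined<'a> {
    Both(&'a str, i32, i32),
    Left(&'a str, i32),
    Right(&'a str, i32),
}

/// Walk two `BTreeMap`s in lockstep. Both iterate in sorted key order, so a
/// single pass classifies every key without any per-key `get` lookups.
fn merge_join<'a>(
    a: &'a BTreeMap<String, i32>,
    b: &'a BTreeMap<String, i32>,
) -> Vec<Joined<'a>> {
    let mut left = a.iter().peekable();
    let mut right = b.iter().peekable();
    let mut out = Vec::new();
    loop {
        // Decide which side to advance by comparing the next keys.
        let order = match (left.peek(), right.peek()) {
            (None, None) => break,
            (Some(_), None) => Ordering::Less,
            (None, Some(_)) => Ordering::Greater,
            (Some((ka, _)), Some((kb, _))) => ka.cmp(kb),
        };
        match order {
            Ordering::Less => {
                // Key only in `a`.
                let (k, v) = left.next().unwrap();
                out.push(Joined::Left(k.as_str(), *v));
            }
            Ordering::Greater => {
                // Key only in `b`.
                let (k, v) = right.next().unwrap();
                out.push(Joined::Right(k.as_str(), *v));
            }
            Ordering::Equal => {
                // Key in both maps: advance both sides together.
                let (k, va) = left.next().unwrap();
                let (_, vb) = right.next().unwrap();
                out.push(Joined::Both(k.as_str(), *va, *vb));
            }
        }
    }
    out
}
```

In a has_relation_to-style check, the `Right` case (a target field missing from self) would presumably be the failure case, while `Left` entries are extra fields on self.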

Yes (to the first diff above), and the comment here about not being in sorted order is entirely wrong now :) I think since we're doing a length check up front, we can go ahead and zip the iterators, which is a little shorter: 3e5b534
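The length-check-then-zip idea can be sketched in isolation as follows. This is only an illustration under stated assumptions: `Field` and `fields_equivalent` are hypothetical stand-ins, the result is a plain `bool` rather than a `ConstraintSet`, and string equality on `declared_ty` stands in for `is_equivalent_to_impl`.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a TypedDict field (not ty's real type).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Field {
    required: bool,
    read_only: bool,
    declared_ty: String,
}

/// Compare two field maps pairwise. `BTreeMap` iterates in sorted key order,
/// so once the lengths match, zipping the iterators lines up equal keys and
/// any mismatch in names shows up as a failed pairwise comparison.
fn fields_equivalent(a: &BTreeMap<String, Field>, b: &BTreeMap<String, Field>) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b.iter()).all(|((name_a, field_a), (name_b, field_b))| {
        name_a == name_b
            && field_a.required == field_b.required
            && field_a.read_only == field_b.read_only
            // Assumption: string equality replaces the real type-equivalence check.
            && field_a.declared_ty == field_b.declared_ty
    })
}
```

Without the up-front length check, `zip` would silently stop at the shorter map and could report two maps as equivalent when one has extra trailing fields.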

oconnor663 · 2025-12-09T20:58:48Z

Moving the TypedDictType special case as early as possible in UnionBuilder::push_type doesn't make a difference here.
Caching the conversion from class-based to synthetic typeddict saves 10% here in this most pathological case, but it doesn't close the bulk of the gap. As far as I know, I can't cache normalized_impl directly (which is what I'd really prefer to do), because of the visitor it takes.

oconnor663 · 2025-12-10T18:11:11Z

I've pushed an EXPERIMENTAL COMMIT which I don't intend to actually land, which deletes the call to normalized that I think is at the heart of this PR's regression:

diff --git a/crates/ty_python_semantic/src/types/function.rs b/crates/ty_python_semantic/src/types/function.rs
index 421504e09b..9bf73f116d 100644
--- a/crates/ty_python_semantic/src/types/function.rs
+++ b/crates/ty_python_semantic/src/types/function.rs
@@ -1022,30 +1022,18 @@ impl<'db> FunctionType<'db> {
     pub(crate) fn has_relation_to_impl(
         self,
         db: &'db dyn Db,
         other: Self,
         inferable: InferableTypeVars<'_, 'db>,
         relation: TypeRelation<'db>,
         relation_visitor: &HasRelationToVisitor<'db>,
         disjointness_visitor: &IsDisjointVisitor<'db>,
     ) -> ConstraintSet<'db> {
-        // A function type is the subtype of itself, and not of any other function type. However,
-        // our representation of a function type includes any specialization that should be applied
-        // to the signature. Different specializations of the same function type are only subtypes
-        // of each other if they result in subtype signatures.
-        if matches!(
-            relation,
-            TypeRelation::Subtyping | TypeRelation::Redundancy | TypeRelation::SubtypingAssuming(_)
-        ) && self.normalized(db) == other.normalized(db)
-        {
-            return ConstraintSet::from(true);
-        }
-
         if self.literal(db) != other.literal(db) {
             return ConstraintSet::from(false);
         }

This does seem to fix the perf regression (and not break tests) locally. I'm curious to see what CodSpeed says. If CodSpeed is green, I will probably land this PR as-is (reverting this experimental commit of course) and fork off an issue to track this regression.

(btw I'm sure there's a more direct way to run CodSpeed on a random branch, either locally or in the cloud, so if anyone knows the non-dumb workflow for this please do let me know)

AlexWaygood

The performance experiment looks like it was pretty successful to me!! It totally solved the regression and had zero impact on both our test suite and the primer report for this PR. Great job 😃

This looks ready to go now

crates/ty_python_semantic/src/types/typed_dict.rs

crates/ty_python_semantic/src/types/class.rs


carljm · 2025-12-10T19:57:36Z

If the "experimental" commit fixed the pydantic regression and didn't add regressions anywhere else (which is what it looks like to me in CodSpeed), then I don't see any reason to revert that change here or wait for later to re-evaluate in astral-sh/ty#1845 -- I think we should just include that change in this PR. What's the downside?

carljm · 2025-12-10T20:03:28Z

> (btw I'm sure there's a more direct way to run CodSpeed on a random branch, either locally or in the cloud, so if anyone knows the non-dumb workflow for this please do let me know)

AFAIK you have to make a PR, but it could be a separate draft PR (possibly based on an existing PR) if you want to "hide" it more.

AlexWaygood · 2025-12-10T20:06:55Z

> If the "experimental" commit fixed the pydantic regression and didn't add regressions anywhere else (which is what it looks like to me in CodSpeed), then I don't see any reason to revert that change here or wait for later to re-evaluate in astral-sh/ty#1845 -- I think we should just include that change in this PR. What's the downside?

Wait, yeah, same comment! Why revert a successful experiment? If we don't introduce a performance regression in the first place, there's no need to create a followup issue. It seemed like that commit had zero downsides!


oconnor663 requested review from AlexWaygood, MichaReiser, carljm, dcreager and sharkdp as code owners December 4, 2025 04:04

oconnor663 changed the title ~~add SyntheticTypedDictType and implement normalized and is_equivalent_to~~ [ty] add SyntheticTypedDictType and implement normalized and is_equivalent_to Dec 4, 2025

oconnor663 added the ty Multi-file analysis & type inference label Dec 4, 2025

oconnor663 commented Dec 4, 2025

crates/ty_python_semantic/src/types/typed_dict.rs (resolved)

oconnor663 commented Dec 4, 2025

crates/ty_python_semantic/src/types/display.rs (resolved)

oconnor663 commented Dec 4, 2025

crates/ty_python_semantic/src/types.rs (outdated, resolved)

chatgpt-codex-connector bot reviewed Dec 4, 2025

crates/ty_python_semantic/src/types/typed_dict.rs (outdated, resolved)

oconnor663 commented Dec 4, 2025

crates/ty_python_semantic/src/types/typed_dict.rs (resolved)

oconnor663 commented Dec 4, 2025

crates/ty_python_semantic/src/types/class.rs (outdated, resolved)

oconnor663 commented Dec 4, 2025

oconnor663 added the ecosystem-analyzer label Dec 4, 2025

oconnor663 force-pushed the synthesized_typeddict branch from e197647 to 9dba965 December 5, 2025 05:33

AlexWaygood reviewed Dec 5, 2025

AlexWaygood approved these changes Dec 7, 2025

sharkdp removed their request for review December 8, 2025 09:55

oconnor663 mentioned this pull request Dec 8, 2025

[ty] bump dependencies to pull in Salsa support for ordermap #21854

Merged

MichaReiser reviewed Dec 9, 2025

AlexWaygood approved these changes Dec 10, 2025

crates/ty_python_semantic/src/types/typed_dict.rs (outdated, resolved)

crates/ty_python_semantic/src/types/class.rs (resolved)

This was referenced Dec 10, 2025

Pydantic perf regression related to normalizing TypedDicts astral-sh/ty#1845

Closed

support TypedDict (and Required, NotRequired, ReadOnly) astral-sh/ty#154

Open

add SyntheticTypedDictType and implement normalized and `is_equivalent_to`

fd7b929

oconnor663 force-pushed the synthesized_typeddict branch from ee5e060 to fd7b929 December 10, 2025 19:53

put the experimental change back but actually land it this time

c127766

oconnor663 enabled auto-merge (squash) December 10, 2025 20:35

oconnor663 merged commit 1b44d7e into main Dec 10, 2025
41 checks passed

oconnor663 deleted the synthesized_typeddict branch December 10, 2025 20:36

oconnor663 mentioned this pull request Dec 17, 2025

Nested Pydantic schema causes slow performance astral-sh/ty#2026

Closed

codspeed-hq bot commented Dec 4, 2025

Merging #21784 will not alter performance
