@colesbury
Contributor

@colesbury colesbury commented Jan 21, 2026

Description

The creation of the iterator class needs to be synchronized.
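To illustrate what "synchronized" means for a lazily created class, here is a minimal sketch in standard C++. It is not pybind11 code: `TypeRecord`, `TypeRegistry`, and `distinct_records_seen` are hypothetical names, and a plain `std::mutex` stands in for whatever lock the real fix uses. The point is only that the lookup and the creation sit under one lock, so concurrent callers all end up with the same record.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a Python type object; not a pybind11 type.
struct TypeRecord {
    std::string name;
};

// Hypothetical registry holding one lazily created record per name.
class TypeRegistry {
public:
    // Returns the record for `name`, creating it at most once.
    // Lookup and creation happen under the same lock, so two threads
    // cannot both observe "missing" and both create a record.
    const TypeRecord *get_or_create(const std::string &name) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = records_.find(name);
        if (it == records_.end()) {
            ++creations_;
            it = records_.emplace(name, std::make_unique<TypeRecord>(TypeRecord{name})).first;
        }
        return it->second.get();
    }

    // Number of records that were actually created.
    int creations() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return creations_;
    }

private:
    mutable std::mutex mutex_;
    std::map<std::string, std::unique_ptr<TypeRecord>> records_;
    int creations_ = 0;
};

// Starts `num_threads` threads that all request the same type and
// returns how many distinct records they observed.
inline std::size_t distinct_records_seen(TypeRegistry &registry, int num_threads) {
    std::vector<const TypeRecord *> seen(static_cast<std::size_t>(num_threads), nullptr);
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; ++i) {
        threads.emplace_back([&registry, &seen, i] {
            seen[static_cast<std::size_t>(i)] = registry.get_or_create("iterator");
        });
    }
    for (auto &t : threads) {
        t.join();
    }
    std::set<const TypeRecord *> unique(seen.begin(), seen.end());
    return unique.size();
}
```

Calling `distinct_records_seen(registry, 8)` on a fresh registry should report a single record. Without the lock, two threads could each miss the lookup and each create their own record.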

Fixes #5970

Suggested changelog entry:

  • Fix race condition in py::make_key_iterator with free-threaded Python

colesbury and others added 2 commits January 21, 2026 16:46
The creation of the iterator class needs to be synchronized.
struct internals {
#ifdef Py_GIL_DISABLED
-    pymutex mutex;
+    pyrecursive_mutex mutex;
Contributor Author

@henryiii @rwgk - I could use some advice here. I probably should have used some sort of recursive mutex here from the start -- it's pretty difficult to do the locking for make_iterator_impl without it.

I think changing this would require bumping PYBIND11_INTERNALS_VERSION, at least for Py_GIL_DISABLED builds. Is that acceptable?

Alternatively, maybe we should use something like Py_BEGIN_CRITICAL_SECTION_MUTEX, which supports recursion and eliminates some potential lock ordering deadlocks if you call into Python. The downside is that it is 3.14+ only.
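As a rough illustration of why recursion support matters here, the following sketch uses only standard C++. It assumes, hypothetically, that the outer function must hold the lock across its check-and-create step while calling a helper that takes the same lock. `Internals`, `register_type`, and `make_iterator_like` are made-up names, not pybind11 functions, and whether `make_iterator_impl` re-enters in exactly this way is an assumption.

```cpp
#include <mutex>

// Hypothetical stand-in for shared state guarded by one lock.
struct Internals {
    std::recursive_mutex mutex;
    int registered = 0;
};

// Inner helper: takes the lock itself so it is safe to call directly.
inline void register_type(Internals &internals) {
    std::lock_guard<std::recursive_mutex> guard(internals.mutex);
    ++internals.registered;
}

// Outer helper: holds the lock across the check and the creation, and
// calls register_type(), which locks again on the same thread.
inline int make_iterator_like(Internals &internals) {
    std::lock_guard<std::recursive_mutex> guard(internals.mutex);
    if (internals.registered == 0) {
        register_type(internals); // re-entrant acquisition; fine for a recursive mutex
    }
    return internals.registered;
}
```

With a non-recursive `std::mutex`, the nested acquisition in `register_type` would be undefined behavior and typically deadlocks. That leaves the choice between restructuring the code so the lock is never re-entered and using a lock that tolerates recursion.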

Collaborator

> I think changing this would require bumping PYBIND11_INTERNALS_VERSION, at least for Py_GIL_DISABLED builds. Is that acceptable?

After the v3.0.2 release, yes. We already have three other PRs that need an internals version bump.

I was planning to release today (PR #5969), but there was a showstopper. We're waiting for a fix for the segfault.

My thinking: the race isn't new, therefore it would seem reasonable to defer fixing it until after the v3.0.2 release, when we have a window of opportunity to bump the internals version.

Caveat: I don't have enough background to decide between the internals version bump and the critical section alternative.
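For background on why swapping the mutex type would call for an internals version bump, here is a small sketch in standard C++. The types are hypothetical stand-ins, not the real `pymutex`, `pyrecursive_mutex`, or `internals` definitions, and the field layout is an assumption chosen only to show the effect: when a member grows, every member after it moves, so two extension modules compiled against different definitions would disagree about where the shared fields live.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins; these are not the real pybind11 or CPython types.
struct small_mutex {
    std::uint8_t bits; // a one-byte lock word
};

struct recursive_mutex_like {
    small_mutex mutex;
    std::uint64_t owner_thread; // which thread currently holds the lock
    std::size_t level;          // how many times that thread re-acquired it
};

// Two versions of the same shared struct, differing only in the lock type.
struct internals_v1 {
    small_mutex mutex;
    void *registered_types;
};

struct internals_v2 {
    recursive_mutex_like mutex;
    void *registered_types;
};

// Offset of the member that follows the lock in each version.
constexpr std::size_t offset_v1 = offsetof(internals_v1, registered_types);
constexpr std::size_t offset_v2 = offsetof(internals_v2, registered_types);
```

In this sketch `offset_v1` and `offset_v2` differ, which is the kind of mismatch a version bump is meant to guard against. A critical-section approach that keeps the existing member type would leave the layout unchanged.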

Collaborator

Py_BEGIN_CRITICAL_SECTION_MUTEX sounds fine; I'd be okay with deprecating and removing Python 3.13t support. Maybe we could just fix this bug on 3.14t, and then drop 3.13t in 3.1?

We are thinking about dropping 3.13t in cibuildwheel too.

Collaborator

It's also fine to make the lock recursive in 3.1; we have several ABI bumps coming up.

Contributor Author

Great, I think dropping 3.13t makes sense. I switched to PyCriticalSection_BeginMutex.

I saw there's also a py::scoped_critical_section. Not sure if you'd want to combine them.

@ikrommyd

Thanks for working on this so quickly! I can try it with the full library that I discovered the issue with once it is in a ready-to-review state 😃

@colesbury colesbury marked this pull request as ready for review January 22, 2026 17:48
@colesbury
Contributor Author

@ikrommyd - if you can run the tests with the full library, that would be great.

@ikrommyd

ikrommyd commented Jan 23, 2026

Looks good from my side! (If I check out the pybind11 master branch, tests/test_core.py fails with the original error reported in the issue: TypeError: Object of type 'iterator' is not an instance of 'iterator'.)
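One plausible reading of that error message, offered here as an assumption rather than a confirmed diagnosis, is that two threads each created their own class named `iterator`, so an object built from one fails the type check against the other. The sketch below reproduces that shape deterministically in standard C++. `TypeRecord`, `Instance`, `is_instance`, and `describe_mismatch` are hypothetical names, not pybind11 or CPython APIs.

```cpp
#include <string>

// Hypothetical stand-in for a Python type object; identity is its address.
struct TypeRecord {
    std::string name;
};

// Hypothetical stand-in for an object that remembers its type.
struct Instance {
    const TypeRecord *type;
};

// Identity-based check: same name is not enough, it must be the same record.
inline bool is_instance(const Instance &obj, const TypeRecord &type) {
    return obj.type == &type;
}

// Builds a message in the style of the reported TypeError.
inline std::string describe_mismatch(const Instance &obj, const TypeRecord &expected) {
    return "Object of type '" + obj.type->name + "' is not an instance of '" + expected.name
           + "'";
}
```

Given two separate records that are both named `iterator`, an instance of the first is not an instance of the second, and the resulting message names `iterator` on both sides, matching the confusing text in the report.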

@henryiii
Collaborator

@rwgk what should we do with 3.13t here? Okay to drop it in a patch release, or should we gate this for 3.14t+ only (fine to leave the bug unfixed on 3.13t), and drop in the next minor release instead? CPython 3.13t was always experimental, while 3.14t is not.

Development

Successfully merging this pull request may close these issues.

[BUG]: make_key_iterator type creation does not appear to be thread-safe

4 participants