-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Fix race condition with py::make_key_iterator in free threading #5971
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
The creation of the iterator class needs to be synchronized.
include/pybind11/detail/internals.h
Outdated
| struct internals { | ||
| #ifdef Py_GIL_DISABLED | ||
| pymutex mutex; | ||
| pyrecursive_mutex mutex; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@henryiii @rwgk - I could use some advice here. I probably should have used some sort of recursive mutex here from the start -- it's pretty difficult to do the locking for make_iterator_impl without it.
I think changing this would require bumping PYBIND11_INTERNALS_VERSION, at least for Py_GIL_DISABLED builds. Is that acceptable?
Alternatively, maybe we should use something like Py_BEGIN_CRITICAL_SECTION_MUTEX, which supports recursion and eliminates some potential lock ordering deadlocks if you call into Python. The downside is that it is 3.14+ only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think changing this would require bumping
PYBIND11_INTERNALS_VERSION, at least forPy_GIL_DISABLEDbuilds. Is that acceptable?
After the v3.0.2 release, yes. We already have three other PRs that need an internals version bump.
I was planning to release today (PR #5969), but there was a show stopper. We're waiting for a fix for the segfault.
My thinking: the race isn't new, therefore it would seem reasonable to defer fixing it until after the v3.0.2 release, when we have a window of opportunity to bump the internals version.
Caveat: I don't have enough background to decide between the internals version bump and the critical section alternative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Py_BEGIN_CRITICAL_SECTION_MUTEX sounds fine, I'd be okay to deprecate and remove Python 3.13t support. Maybe we could just fix this bug on 3.14t, and then drop 3.13t in 3.1?
We are thinking about dropping 3.13t in cibuildwheel too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also fine to make the lock recursive in 3.1, we have several ABI bumps coming up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, I think dropping 3.13t makes sense. I switched to PyCriticalSection_BeginMutex.
I saw there's also a py::scoped_critical_section. Not sure if you'd want to combine them.
|
Thanks for working on this so quickly! I can try it with the full library that I discovered the issue with once it is in a ready-to-review state 😃 |
|
@ikrommyd - if you can run test with the full library, that would be great. |
|
@rwgk what should we do with 3.13t here? Okay to drop it in a patch release, or should we gate this for 3.14t+ only (fine to leave the bug unfixed on 3.13t), and drop in the next minor release instead? CPython 3.13t was always experimental, while 3.14t is not. |


Description
The creation of the iterator class needs to be synchronized.
Fixes #5970
Suggested changelog entry:
py::make_key_iteratorwith free threaded Python