-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Update json cache to eliminate race condition #3597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #3597 +/- ##
===========================================
- Coverage 93.13% 92.62% -0.51%
===========================================
Files 68 68
Lines 15336 15576 +240
===========================================
+ Hits 14283 14428 +145
- Misses 1053 1148 +95 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the tradeoffs of making sure the cache key is always unique to avoid this race condition entirely?
| temp_fd, temp_path = tempfile.mkstemp( | ||
| dir=self._working_dir, suffix='.tmp' | ||
| ) | ||
| with os.fdopen(temp_fd, 'w') as f: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So @ubaskota and I chatted about we are removing the nested os.open(full_key, os.O_WRONLY | os.O_CREAT, 0o600), but I want to record it here for posterity:
The inner os.open that was formerly on line 3582 was necessary because Python has less restrictive default permissions for a newly created file. Now that we are using tempfile.mkstemp, the default permissions are already owner read/write for owner only, so we no longer need the nested open call.
| dir=self._working_dir, suffix='.tmp' | ||
| ) | ||
| with os.fdopen(temp_fd, 'w') as f: | ||
| temp_fd = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What specific errors are we trying to protect against by setting temp_fd and temp_path to None?
If we don't have a clear idea of the new failure modes we are accounting for by adding this logic, we should remove it. If we do and these None checks are necessary/valuable, we should document that here so it's easier to maintain going forward.
Description of changes:
Use
os.replace()for atomic file operations to eliminate race conditions during concurrent access to the file cache. This ensures that writes are either fully completed or not applied at all, preventing partial writes that could leave the JSON file in an invalid state when multiple processes access the cache simultaneously.Tests:
I verified that all tests including the the newly added unit tests pass.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.