@chrishavlin
Collaborator

Exploring some dask + yt interactions

@chrishavlin marked this pull request as draft on June 13, 2024 19:58
Comment on lines +42 to +45
# KeyError: 'e34a252e426f6cc81be22b03b77786ea'

fname = 'IsolatedGalaxy/galaxy0030/galaxy0030'
ds = yt.load(fname)
Collaborator Author

@chrishavlin Jun 13, 2024

@matthewturk this file is one example of some confusing behavior with yt+dask (don't bother with the other files in this PR, for now at least). If you run this script as-is with python dask_failure_in_param_store.py 3 1 (the args are the number of workers and the threads per worker), you get this key error.

It relates to the dask worker state (see this SO answer: https://stackoverflow.com/questions/75837897/dask-worker-has-different-imports-than-main-thread). If you move the ds = yt.load(fname) call up before the __main__ block, the code runs OK.
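To make the state dependence concrete, here is a minimal stdlib-only sketch of the behavior described above. It is a toy model, not yt's or dask's actual implementation: `InMemoryStore`, `ToyProcess`, and the `module_level_loads` argument are hypothetical names, the use of SHA-256 for the key is arbitrary, and the model assumes that a worker re-runs the script's module-level code but not its `__main__` block (as the linked SO answer suggests).

```python
import hashlib


class InMemoryStore:
    """Hypothetical per-process registry: hash -> dataset file name."""

    def __init__(self):
        self._records = {}

    def register(self, fname):
        # The key depends only on the file name, so every process computes the same one.
        key = hashlib.sha256(fname.encode()).hexdigest()
        self._records[key] = fname
        return key

    def reconstruct(self, key):
        # Raises KeyError if this process never registered the dataset.
        return self._records[key]


class ToyProcess:
    """Models one process (driver or worker) with its own private store."""

    def __init__(self, module_level_loads=()):
        self.store = InMemoryStore()
        # Assumption: module-level loads are repeated in every process,
        # while loads inside the __main__ block happen only in the driver.
        for fname in module_level_loads:
            self.store.register(fname)


FNAME = "IsolatedGalaxy/galaxy0030/galaxy0030"

# Case 1: the load happens inside __main__, so only the driver knows the hash.
driver = ToyProcess()
key = driver.store.register(FNAME)
worker = ToyProcess()
try:
    worker.store.reconstruct(key)
    case1 = "ok"
except KeyError:
    case1 = "KeyError"

# Case 2: the load happens at module level, so the worker registers it too.
worker2 = ToyProcess(module_level_loads=[FNAME])
case2 = worker2.store.reconstruct(key)

print(case1)
print(case2)
```

In this model only the hash crosses the process boundary, so whether the lookup succeeds depends entirely on what the receiving process happened to run at import time.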

Collaborator Author

ideally things wouldn't be quite so state dependent...

Collaborator Author

oh and in this particular case, if you use the on-disk ParameterFileStore, the code also runs with the load in the __main__ block (after fixing the bug I mentioned on slack), because all the workers can access the on-disk hashes and args for reconstructing the dataset.
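The reason an on-disk store sidesteps the problem can be sketched with a small stdlib-only model. This is not yt's ParameterFileStore: `OnDiskStore`, its JSON file layout, and the SHA-256 key are assumptions made for illustration. The point is only that a record written by one process is visible to any other process that can read the same file.

```python
import hashlib
import json
import os
import tempfile


class OnDiskStore:
    """Hypothetical store that persists hash -> load arguments in a JSON file."""

    def __init__(self, path):
        self.path = path

    def _read(self):
        # An absent file means nothing has been registered yet.
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)

    def register(self, fname, **load_kwargs):
        key = hashlib.sha256(fname.encode()).hexdigest()
        records = self._read()
        records[key] = {"fname": fname, "kwargs": load_kwargs}
        with open(self.path, "w") as f:
            json.dump(records, f)
        return key

    def reconstruct_args(self, key):
        # Any process that can read the file can recover the load arguments.
        return self._read()[key]


store_path = os.path.join(tempfile.mkdtemp(), "parameter_files.json")

# Driver: registers the dataset inside what would be the __main__ block.
driver_store = OnDiskStore(store_path)
key = driver_store.register("IsolatedGalaxy/galaxy0030/galaxy0030")

# Worker: a separate object sharing no memory with the driver, only the file.
worker_store = OnDiskStore(store_path)
record = worker_store.reconstruct_args(key)
print(record["fname"])
```

This sketch does no file locking, so concurrent writers could clobber each other; a real store would need to handle that.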
