Skip to content

Conversation

@jl-wynen
Copy link
Member

@jl-wynen jl-wynen commented Nov 5, 2025

This is far from complete. The main problem is indexing because it is heavily overloaded and mypy complains about code that uses ScippIndex. The worst offenders are to_canonical_select and convert_label_index_to_positional. I am not even convinced that these functions are correct in all cases. The culprit for this is ultimately #283

@jl-wynen
Copy link
Member Author

jl-wynen commented Nov 6, 2025

The build is failing because I changed some functions to no longer modify the input children dict. To illustrate, this code is from the docs and works:

import scippnexus as snx
from scippnexus import data
filename = data.get_path('PG3_4844_event.nxs')
with snx.File(filename) as f:
    tree = f['entry/instrument'][()]
    d = f['entry/instrument/bank102']['events'][...]

However, this code fails:

import scippnexus as snx
from scippnexus import data
filename = data.get_path('PG3_4844_event.nxs')
with snx.File(filename) as f:
    d = f['entry/instrument/bank102']['events'][...]

Note the missing f['entry/instrument'][()]. The reason the first example works is that loading the entire instument assembles the event data object and stores it in the bank102 group. And that group is cached in f. So then we can access the 'events'. But if we don't first load the whole instrument, 'events' has not been assembled and we cannot access it.

In a sense this is a feature of the caching mechanism. But I think it is bad that a call to load some data can depend on previous load calls. So I am in favour of removing this line from the docs.

For reference, these are the contents of bank102:

$ ls bank102
azimuthal_angle   event_id            event_time_zero   pixel_id       x_pixel_offset
data_x_y          event_index         local_name        polar_angle    y_pixel_offset
distance          event_time_offset   origin/           total_counts

@SimonHeybrock
Copy link
Member

In a sense this is a feature of the caching mechanism. But I think it is bad that a call to load some data can depend on previous load calls. So I am in favour of removing this line from the docs.

For reference, these are the contents of bank102:

$ ls bank102
azimuthal_angle   event_id            event_time_zero   pixel_id       x_pixel_offset
data_x_y          event_index         local_name        polar_angle    y_pixel_offset
distance          event_time_offset   origin/           total_counts

I agree, I don't think supporting this was intentional for the "embedded" NXevent_data. Please remove/update.

@jl-wynen jl-wynen merged commit 0af5ee9 into main Dec 18, 2025
5 checks passed
@jl-wynen jl-wynen deleted the type-hints branch December 18, 2025 15:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants