@david-waterworth david-waterworth commented Feb 23, 2023

This fixes an inconsistency: _roll_out_time_series always names the resulting tuple index "id", even if the user passes an alternative id column via the column_id parameter.
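
To make the inconsistency concrete, here is a toy sketch; it is not the tsfresh implementation. The helper `tag_window` and its `id_column` parameter are hypothetical, and the frame with a "device" id column is invented for illustration. It only shows the difference between writing the (id, shift) tuples to a hard-coded "id" column and writing them back under the name the user passed as column_id.

```python
import pandas as pd


def tag_window(df, column_id, shift, id_column):
    """Hypothetical stand-in for the rolling step.

    Replaces each id with an (id, shift) tuple and stores the
    tuples in the column named `id_column`.
    """
    out = df.copy()
    tagged = out[column_id].map(lambda value: (value, shift))
    out = out.drop(columns=[column_id])
    out[id_column] = tagged
    return out


df = pd.DataFrame(
    {"device": ["a", "a", "b"], "time": [0, 1, 0], "value": [1.0, 2.0, 3.0]}
)

# Behaviour described as inconsistent: the tuples land in "id",
# although the user asked for column_id="device".
before = tag_window(df, column_id="device", shift=1, id_column="id")

# Behaviour this PR aims for: the tuples keep the user's column name.
after = tag_window(df, column_id="device", shift=1, id_column="device")
```

In the first call the caller has to know about the hard-coded "id" name; in the second the output uses the same id column name as the input.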

@david-waterworth
Author

Now that I've made this change, I think the return from roll_time_series also needs to change: the map_reduce output may no longer contain an "id" column, yet the return statement always sorts by "id".

I'll also add tests before this is ready to merge.

shifted_chunks = distributor.map_reduce(
    _roll_out_time_series,
    data=range_of_shifts,
    chunk_size=chunksize,
    function_kwargs=kwargs,
)

distributor.close()

df_shift = pd.concat(shifted_chunks, ignore_index=True)

return df_shift.sort_values(by=["id", column_sort or "sort"])

@nils-braun
Collaborator

> Now that I've made this change, I think the return from roll_time_series also needs to change: the map_reduce output may no longer contain an "id" column, yet the return statement always sorts by "id".

Correct - you would want to also change this to column_id.
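
A minimal sketch of what that change could look like, assuming the concatenated frame carries its tuple ids in the column named by column_id. The wrapper `sort_rolled` and the sample frame are hypothetical and only isolate the final sort; the fallback to a "sort" column is taken from the snippet above.

```python
import pandas as pd


def sort_rolled(df_shift, column_id, column_sort=None):
    # Sort by the caller's id column instead of a hard-coded "id".
    return df_shift.sort_values(by=[column_id, column_sort or "sort"])


# Illustrative rolled output whose id column is named "device".
df_shift = pd.DataFrame(
    {
        "device": [("b", 1), ("a", 1), ("a", 1)],
        "time": [0, 1, 0],
        "value": [3.0, 2.0, 1.0],
    }
)

result = sort_rolled(df_shift, column_id="device", column_sort="time")
```

With the hard-coded "id", the same call would raise a KeyError, because this frame has no "id" column.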

Thanks for your effort!
