@minnerbe minnerbe commented Dec 15, 2025

This PR proposes using pixi to manage the Python environment. Pixi has been widely adopted internally and is much more robust than the current Poetry-based environment management. The migration serves two purposes:

  • The environment from this PR should satisfy the needs of both the main and the multisem branch so that they can be consolidated more easily.
  • This PR essentially decouples the environment changes from the functionality added in FIBSEM metadata visualization #133.

As described below, the changes were successfully tested on production data. I'd welcome any feedback, @trautmane. My plan is to merge this and #133 early in the new year if there are no concerns from your side.

Main changes

  • Drop dask_janelia, which hasn't been updated in 4 years and prevents other packages from being upgraded. Instead, copy its functionality and slightly update it to work with a newer dask_jobqueue version. This should only be a temporary solution until we (hopefully) settle on a more standardized approach to handling dask internally.
  • Update some libraries (most notably pydantic, dask_jobqueue, and fibsem-tools), including all necessary code changes.

Tests

The changes were tested using the following test configuration:

{
    "transfer_id": "fibsem::test_mi_hela-L63-1::jeiss4.hhmi.org",
    "scope_data_set": {
        "host": "jeiss4.hhmi.org",
        "root_dat_path": "/cygdrive/e/Images/Cell",
        "root_keep_path": "/cygdrive/d/UploadFlags",
        "data_set_id": "xxx",
        "rows_per_z_layer": 1,
        "columns_per_z_layer": 1,
        "first_dat_name": "Merlin-4238_25-10-31_220039_0-0-0.dat",
        "last_dat_name": "Merlin-4238_25-11-01_025932_0-0-0.dat",
        "dat_x_and_y_nm_per_pixel": 8,
        "dat_z_nm_per_pixel": 8,
        "dat_tile_overlap_microns": 2
    },
    "cluster_root_paths": {
        "raw_dat": "/nrs/fibsem/data/test_mi_hela-L63-1/dat",
        "raw_h5": "/nrs/fibsem/data/test_mi_hela-L63-1/raw",
        "align_h5": "/nrs/fibsem/data/test_mi_hela-L63-1/align",
        "export_n5": "/nrs/fibsem/data/test_mi_hela-L63-1/export.n5"
    },
    "archive_root_paths": {
        "raw_h5": "/nrs/fibsem/data/test_mi_hela-L63-1/archive"
    },
    "max_mipmap_level": 7,
    "render_data_set": {
        "owner": "fibsem",
        "project": "test_mi_hela_L63_1",
        "stack": "v1_acquire",
        "restart_context_layer_count": 1,
        "mask_width": 100,
        "mask_height": 0,
        "connect": {
            "host": "10.40.3.113",
            "port": 8080,
            "web_only": true,
            "validate_client": false,
            "client_scripts": "/groups/flyTEM/flyTEM/render/bin",
            "memGB": "1G"
        }
    },
    "transfer_tasks": [
        "GENERATE_CLUSTER_H5_RAW",
        "GENERATE_CLUSTER_H5_ALIGN",
        "REMOVE_DAT_AFTER_H5_CONVERSION",
        "ARCHIVE_H5_RAW",
        "IMPORT_H5_ALIGN_INTO_RENDER",
        "APPLY_FIBSEM_CORRECTION_TRANSFORM",
        "EXPORT_PREVIEW_VOLUME"
    ],
    "cluster_job_project_for_billing": "fibsem",
    "number_of_dats_converted_per_hour": 80,
    "number_of_preview_workers": 10
}
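
A configuration like the one above can be sanity-checked with the standard library before any job is launched. The following is only a sketch: the helper name check_transfer_info and the individual checks are assumptions made for illustration, and they are not the pydantic-based validation that janelia_emrp itself performs.

```python
import json
from pathlib import Path, PurePosixPath


def check_transfer_info(info: dict) -> list[str]:
    """Return human-readable problems found in a volume transfer info dict.

    Hypothetical helper for illustration; the rules below are assumptions,
    not the checks that janelia_emrp applies.
    """
    problems = []

    # The first and last dat names are expected to be .dat file names.
    scope = info.get("scope_data_set", {})
    for key in ("first_dat_name", "last_dat_name"):
        name = scope.get(key, "")
        if not name.endswith(".dat"):
            problems.append(f"scope_data_set.{key} should end with .dat: {name!r}")

    tasks = info.get("transfer_tasks", [])
    if len(tasks) != len(set(tasks)):
        problems.append("transfer_tasks contains duplicates")

    # Assumption: archiving raw h5 files needs a configured archive path.
    if "ARCHIVE_H5_RAW" in tasks and "raw_h5" not in info.get("archive_root_paths", {}):
        problems.append("ARCHIVE_H5_RAW requested but archive_root_paths.raw_h5 is missing")

    # In the test configuration, all cluster paths share one parent directory.
    parents = {
        str(PurePosixPath(path).parent)
        for path in info.get("cluster_root_paths", {}).values()
    }
    if len(parents) > 1:
        problems.append(f"cluster_root_paths span several directories: {sorted(parents)}")

    return problems


def check_transfer_info_file(path: Path) -> list[str]:
    """Load a JSON file and run check_transfer_info on its content."""
    with open(path, encoding="utf-8") as handle:
        return check_transfer_info(json.load(handle))
```

Calling check_transfer_info_file(Path("volume_transfer_info.json")) returns an empty list when none of the assumed rules is violated.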

and the corresponding commands:

pixi run python src/python/janelia_emrp/fibsem/dat_converter.py \
    --volume_transfer_info volume_transfer_info.json \
    --first_dat Merlin-4238_25-10-31_220039_0-0-0.dat \
    --last_dat Merlin-4238_25-11-01_025932_0-0-0.dat \
    --num_workers 10

pixi run python src/python/janelia_emrp/fibsem/h5_to_render.py \
    --volume_transfer_info volume_transfer_info.json \
    --num_workers 3

pixi run python src/python/janelia_emrp/fibsem/h5_archivist.py \
    --volume_transfer_dir test_mi_hela-L63-1/

This resulted in this render stack and the following state on disk:

du -sh /nrs/fibsem/data/test_mi_hela-L63-1/*
# 4.0G  /nrs/fibsem/data/test_mi_hela-L63-1/align
# 12G   /nrs/fibsem/data/test_mi_hela-L63-1/archive
# 0     /nrs/fibsem/data/test_mi_hela-L63-1/dat
# 1.0K  /nrs/fibsem/data/test_mi_hela-L63-1/fetch_data.sh
# 0     /nrs/fibsem/data/test_mi_hela-L63-1/raw
# 2.0K  /nrs/fibsem/data/test_mi_hela-L63-1/volume_transfer_info.json
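
The listing above shows the expected end state of the test: dat and raw are empty (presumably due to the REMOVE_DAT_AFTER_H5_CONVERSION and ARCHIVE_H5_RAW tasks), while align and archive hold data. A minimal sketch of the same check in Python follows. The function names are hypothetical, and the sketch sums apparent file sizes rather than the disk usage that du reports.

```python
from pathlib import Path


def directory_size_bytes(directory: Path) -> int:
    """Sum the apparent sizes of all regular files below directory."""
    return sum(p.stat().st_size for p in directory.rglob("*") if p.is_file())


def check_transfer_state(root: Path) -> dict[str, bool]:
    """Report whether the data set directory matches the expected end state.

    Hypothetical helper: dat and raw should be empty, align and archive
    should contain data.
    """
    return {
        "dat_is_empty": directory_size_bytes(root / "dat") == 0,
        "raw_is_empty": directory_size_bytes(root / "raw") == 0,
        "align_has_data": directory_size_bytes(root / "align") > 0,
        "archive_has_data": directory_size_bytes(root / "archive") > 0,
    }
```

For the test run, check_transfer_state(Path("/nrs/fibsem/data/test_mi_hela-L63-1")) should report True for every entry if the state on disk matches the listing.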

TODOs

To make all bash scripts compatible with the environment changes, I need to

  • substitute all conda activate janelia_emrp_3_12 with (?)
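
Since the replacement for the activation line is still undecided, a first step could be to list every affected script. The sketch below only locates the occurrences and does not rewrite anything. The function name and the assumption that the scripts use the .sh extension are illustrative and not taken from the repository.

```python
from pathlib import Path

# The activation line named in the TODO; its replacement is still open.
OLD_ACTIVATION = "conda activate janelia_emrp_3_12"


def find_activation_lines(root: Path, pattern: str = "*.sh") -> list[tuple[Path, int, str]]:
    """List (file, line number, stripped line) for each line with OLD_ACTIVATION.

    Assumption: bash scripts can be found by the *.sh glob pattern.
    """
    hits = []
    for script in sorted(root.rglob(pattern)):
        if not script.is_file():
            continue
        text = script.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if OLD_ACTIVATION in line:
                hits.append((script, number, line.strip()))
    return hits
```

Running find_activation_lines(Path(".")) from the repository root yields the list of lines that need to change once the replacement is settled.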

Once this PR is merged, I will

  • Delete the test_mi_hela-L63-1 render project
  • Delete all data at /nrs/fibsem/data/test_mi_hela-L63-1

@minnerbe minnerbe requested a review from trautmane December 15, 2025 01:47