8 changes: 8 additions & 0 deletions docs/source/opensoundscape.ml.rst
@@ -92,6 +92,14 @@ opensoundscape.ml.utils module
:undoc-members:
:show-inheritance:

opensoundscape.ml.export module
-------------------------------

.. automodule:: opensoundscape.ml.export
:members:
:undoc-members:
:show-inheritance:

Module contents
---------------

134 changes: 129 additions & 5 deletions docs/tutorials/train_cnn.ipynb
@@ -1246,6 +1246,123 @@
"scores_df.head()"
]
},
{
"cell_type": "markdown",
"id": "8aea4709",
"metadata": {},
"source": [
"## Saving and exporting the model\n",
"\n",
"There are a few different ways we can save the trained model depending on downstream use cases. In general, you can simply use the `model.save(path)` function to save the model in a JSON format that can be reloaded by OpenSoundscape (`opso.load_model(path)` or `CNN.load(path)`). To load the raw dictionary of saved content, use `torch.load(path,weights_only=False)`. \n",
"\n",
"When you want to continue training from a saved model file, it is helpful to use `model.save(pickle=True)` which saves a compressed version of the entire Python object, rather than a JSON-like format. Unlike the default saving method, this saved object retains temporary model training states like the optimizer and learning rate scheduler states. However, note that when you saved a pickled model object, you could encounter issues re-loading it in different Python environments or different versions of OpenSoundscape. After saving a pickled model, you can reload it in the same way as normal: `opso.load_model(path)` or `CNN.load(path)`. \n",
"\n",
"### ONNX Export\n",
"ONNX (Open Neural Network Exchange) is a cross-platform format for representing neural networks on a wide range of hardware and operating systems. Exporting a model to ONNX is useful for edge computing and other inference-only applications where you want independence from PyTorch. Note that not all models can be exported to ONNX. In particular, the supported method for creating a model that can be exported to ONNX is to initialize the model with the TorchSpectrogramPreprocessor class as the preprocessor. This preprocessor is designed for compatability with PyTorch's `torch.onnx.export` function. Here's an example:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "a8d82b80",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/SML161/miniconda3/envs/opso_dev/lib/python3.13/site-packages/onnxscript/converter.py:457: DeprecationWarning: Expression.__init__ got an unexpected keyword argument 'lineno'. Support for arbitrary keyword arguments is deprecated and will be removed in Python 3.15.\n",
" expr = ast.Expression(expr, lineno=expr.lineno, col_offset=expr.col_offset)\n",
"/Users/SML161/miniconda3/envs/opso_dev/lib/python3.13/site-packages/onnxscript/converter.py:457: DeprecationWarning: Expression.__init__ got an unexpected keyword argument 'col_offset'. Support for arbitrary keyword arguments is deprecated and will be removed in Python 3.15.\n",
" expr = ast.Expression(expr, lineno=expr.lineno, col_offset=expr.col_offset)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"[torch.onnx] Obtain model graph for `ONNXModel()` with `torch.export.export(..., strict=False)`...\n",
"[torch.onnx] Obtain model graph for `ONNXModel()` with `torch.export.export(..., strict=False)`... ✅\n",
"[torch.onnx] Run decomposition...\n",
"[torch.onnx] Run decomposition... ✅\n",
"[torch.onnx] Translate the graph into ONNX...\n",
"[torch.onnx] Translate the graph into ONNX... ✅\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/SML161/miniconda3/envs/opso_dev/lib/python3.13/site-packages/onnx/reference/ops/op_range.py:13: DeprecationWarning: Conversion of an array with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you extract a single element from your array before performing this operation. (Deprecated NumPy 1.25.)\n",
" return (np.arange(starts, ends, steps).astype(starts.dtype),)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Applied 9 of general pattern rewrite rules.\n"
]
}
],
"source": [
"from opensoundscape import CNN, preprocessors\n",
"\n",
"model = CNN(\n",
" architecture=\"efficientnet_b0\",\n",
" classes=[0, 1, 2, 3],\n",
" sample_duration=3,\n",
" preprocessor_cls=preprocessors.TorchSpectrogramPreprocessor,\n",
" sample_rate=32000,\n",
")\n",
"onnx_program = model.save_onnx(\"./opso_efficientnet.onnx\", activation_layer=\"sigmoid\")"
]
},
{
"cell_type": "markdown",
"id": "088f7597",
"metadata": {},
"source": [
"This saves an ONNX program for inference (prediction). The program produces three outputs by default: the pre-processed audio sample (a spectrogram), the embedding layer outputs, and the final classifier outputs. You can turn off any of these outputs using the arguments to `.save_onnx()`. If `model.network.classifier_layer` is not set, the function will not know which layer to use for embeddings, and will instead create a program that only exports the pre-processed sample and the final classifier outputs. \n",
"\n",
"You can also directly use the lower-level functions `opso.export.to_onnx_program` to export custom model classes, or inspect the code in that function to build a custom onnx export method. \n",
"\n",
"The ONNX program can be run in various ways once it is exported. In Python, you can run onnx programs using the onnx_runtime package. Here's a sample script:\n",
"\n",
"\n",
"```python\n",
"import onnx, onnxruntime\n",
"import numpy as np\n",
"\n",
"combined_model = onnx.load(\"opso_efficientnet.onnx\")\n",
"output_names = [node.name for node in combined_model.graph.output]\n",
"\n",
"onnx.checker.check_model(combined_model)\n",
"\n",
"\n",
"EP_list = [\"CPUExecutionProvider\"] # [\"CUDAExecutionProvider\", \"CPUExecutionProvider\"]\n",
"ort_session = onnxruntime.InferenceSession(\"opso_efficientnet.onnx\", providers=EP_list)\n",
"\n",
"# make up some random inputs\n",
"audio_samples_per_input = (\n",
" combined_model.graph.input[0].type.tensor_type.shape.dim[2].dim_value\n",
")\n",
"batch_size = 3\n",
"input_batched = np.random.rand(batch_size, 1, audio_samples_per_input).astype(\n",
" np.float32\n",
")\n",
"\n",
"# compute ONNX Runtime output prediction\n",
"ort_inputs = {ort_session.get_inputs()[0].name: input_batched}\n",
"ort_outs = ort_session.run(None, ort_inputs)\n",
"\n",
"# restore the name-value dictionary mapping of outputs\n",
"outs_dict = {name: ort_outs[i] for i, name in enumerate(output_names)}\n",
"print(f\"shape of outputs for inference on one batch of batch size {batch_size}:\")\n",
"print({k: v.shape for k, v in outs_dict.items()})\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "b1946c1f",
@@ -1255,24 +1372,31 @@
"\n",
"For guidance on how to use machine learning classifiers, see the Classifieres 101 Guide on opensoundscape.org and the tutorial on predicting with pre-trained CNNs.\n",
"\n",
"For transfer learning from pre-trained CNNs, see the transfer learning tutorial notebook.\n",
"\n",
"**Clean up:** Run the following cell to delete the files created in this tutorial. However, these files are used in other tutorials, so you may wish not to delete them just yet."
"**Clean up:** Run the following cell to delete the files created in this tutorial."
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"id": "440ca518-abcd-4bac-94e8-12ff8b8e46b1",
"metadata": {},
"outputs": [],
"source": [
"import shutil\n",
"\n",
"from pathlib import Path\n",
"# uncomment to remove the training files\n",
"# shutil.rmtree('./annotated_data')\n",
"\n",
"shutil.rmtree(\"./wandb\")\n",
"shutil.rmtree(\"./model_training_checkpoints\")\n",
"if Path(\"./wandb\").exists():\n",
" shutil.rmtree(\"./wandb\")\n",
"if Path(\"./model_training_checkpoints\").exists():\n",
" shutil.rmtree(\"./model_training_checkpoints\")\n",
"try:\n",
" Path(\"opso_efficientnet.onnx\").unlink()\n",
"except:\n",
" pass\n",
"try:\n",
" Path(\"annotation_Files.zip\").unlink()\n",
"except:\n",
41 changes: 31 additions & 10 deletions opensoundscape/annotations.py
@@ -1092,9 +1092,13 @@ def clip_labels(
annotations result in at least one clip being labeled 1
(if there are no gaps between clips).
full_duration: The amount of time (seconds) to split into clips for each file
float or `None`; if `None`, attempts to get each file's duration
any of float, list, or `None`
- if `None`, attempts to get each file's duration
using `librosa.get_duration(path=file)` where file is the value
of `audio` for each row of self.df
- if float: uses this fixed duration for all files
- if list: should be the same length as `audio_files`, giving the
duration for each file
class_subset: list of classes for one-hot labels. If None, classes will
be all unique values of self.df['annotation']
audio_files: list of audio file paths (as str or pathlib.Path)
@@ -1159,15 +1163,32 @@ class names for each clip; also returns a second value, the list of class names
argument to `clip_labels()` will avoid the attempt to get
audio file durations from file paths."""
) from exc
else: # use fixed full_duration for all files
# make a clip df, will be re-used for each file
clip_df_template = generate_clip_times_df(
full_duration=full_duration, clip_duration=clip_duration, **kwargs
)
# make a clip df for all files
clip_df = pd.concat([clip_df_template] * len(audio_files))
# add file column, repeating value of file across each clip
clip_df["file"] = np.repeat(audio_files, len(clip_df_template))
else:
# full_duration is fixed for all files, or is a list of durations for each file
if isinstance(full_duration, (list, np.ndarray)):
assert len(full_duration) == len(audio_files), (
"`full_duration` should be a float or a list/array "
"with same length as `audio_files`, or None. Length did not match audio_files."
)
else:
# same duration for all files
assert isinstance(full_duration, (int, float)), (
"`full_duration` should be a float or a list/array "
f"with same length as `audio_files`, or None. Got type {type(full_duration)}."
)
full_duration = [full_duration] * len(audio_files)

dfs = []
for i, file in enumerate(audio_files):

df = generate_clip_times_df(
full_duration=full_duration[i],
clip_duration=clip_duration,
**kwargs,
)
df["file"] = file
dfs.append(df)
clip_df = pd.concat(dfs).reset_index(drop=True)
clip_df = clip_df.set_index(["file", "start_time", "end_time"])

# call labels_on_index with clip_df
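A hedged sketch of how the updated `full_duration` argument to `clip_labels()` might be used (the file names, durations, and `BoxedAnnotations` constructor arguments are illustrative and may differ from the real API):

```python
from opensoundscape import BoxedAnnotations

# hypothetical annotation and audio file pairs
annotations = BoxedAnnotations.from_raven_files(
    ["rec1.selections.txt", "rec2.selections.txt"],
    annotation_column="annotation",
    audio_files=["rec1.wav", "rec2.wav"],
)

# float: one fixed duration (in seconds) applied to every file
labels = annotations.clip_labels(clip_duration=3, full_duration=60)

# list: per-file durations, same length and order as audio_files
labels = annotations.clip_labels(clip_duration=3, full_duration=[60, 45])

# None: each file's duration is read with librosa.get_duration
labels = annotations.clip_labels(clip_duration=3, full_duration=None)
```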
1 change: 1 addition & 0 deletions opensoundscape/ml/__init__.py
@@ -6,6 +6,7 @@
from . import safe_dataset
from . import sampling
from . import utils
from . import export
import torch.multiprocessing

# using 'file_system' avoids errors with "Too many open files",