Recording fails with video_encoding_batch_size > 1 #2404

@F-Fer

Description

System Info

- lerobot version: 0.4.0
- Platform: Linux-6.8.0-86-generic-x86_64-with-glibc2.35
- Python version: 3.11.13
- Huggingface Hub version: 0.35.3
- Datasets version: 4.1.1
- Numpy version: 2.3.4
- PyTorch version: 2.7.1+cu126
- Is PyTorch built with CUDA support?: True
- Cuda version: 12.6
- GPU model: Quadro M2000
- Using GPU in script?: no

Information

  • One of the scripts in the examples/ folder of LeRobot
  • My own task or dataset (give details below)

Reproduction

Set video_encoding_batch_size to a value larger than 1 (e.g. 4) and run the recording script:

lerobot-record \
    --dataset.video_encoding_batch_size=4 \
    --robot.type=so101_follower \
    --robot.port=/dev/tty.usbmodem585A0076841 \
    --robot.id=my_awesome_follower_arm \
    --robot.cameras="{ front: {type: opencv, index_or_path: 0, width: 1920, height: 1080, fps: 30}}" \
    --teleop.type=so101_leader \
    --teleop.port=/dev/tty.usbmodem58760431551 \
    --teleop.id=my_awesome_leader_arm \
    --display_data=true \
    --dataset.repo_id=${HF_USER}/record-test \
    --dataset.num_episodes=5 \
    --dataset.single_task="Grab the black cube"

I got the following error:

INFO 2025-11-07 09:54:26 llo/gello.py:79 gello_teleop Gello connected.
INFO 2025-11-07 09:54:26 ls/utils.py:227 Recording episode 0
Right arrow key pressed. Exiting loop...
^[[CINFO 2025-11-07 09:54:33 ls/utils.py:227 Reset the environment
Right arrow key pressed. Exiting loop...
^[[CRight arrow key pressed. Exiting loop...
Map: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 175/175 [00:00<00:00, 1294.68 examples/s]
INFO 2025-11-07 09:54:40 eo_utils.py:634 Exception occurred. Encoding remaining episodes before exit...
INFO 2025-11-07 09:54:40 eo_utils.py:640 Encoding remaining 1 episodes, from episode 0 to 0
INFO 2025-11-07 09:54:40 dataset.py:1182 Batch encoding 4 videos for episodes 0 to 0
Traceback (most recent call last):
File "/home/finn/dev/lerobot_ur5e_gello/scripts/record.py", line 447, in record
dataset.meta.episodes = load_episodes(dataset.meta.root)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/finn/dev/lerobot_ur5e_gello/.venv/lib/python3.11/site-packages/lerobot/datasets/utils.py", line 365, in load_episodes
episodes = load_nested_dataset(local_dir / EPISODES_DIR)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/finn/dev/lerobot_ur5e_gello/.venv/lib/python3.11/site-packages/lerobot/datasets/utils.py", line 121, in load_nested_dataset
datasets = Dataset.from_parquet([str(path) for path in paths], features=features)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/finn/dev/lerobot_ur5e_gello/.venv/lib/python3.11/site-packages/datasets/arrow_dataset.py", line 1303, in from_parquet
).read()
^^^^^^
File "/home/finn/dev/lerobot_ur5e_gello/.venv/lib/python3.11/site-packages/datasets/io/parquet.py", line 61, in read
self.builder.download_and_prepare(
File "/home/finn/dev/lerobot_ur5e_gello/.venv/lib/python3.11/site-packages/datasets/builder.py", line 894, in download_and_prepare
self._download_and_prepare(
File "/home/finn/dev/lerobot_ur5e_gello/.venv/lib/python3.11/site-packages/datasets/builder.py", line 948, in _download_and_prepare
split_generators = self._split_generators(dl_manager, **split_generators_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/finn/dev/lerobot_ur5e_gello/.venv/lib/python3.11/site-packages/datasets/packaged_modules/parquet/parquet.py", line 60, in _split_generators
self.info.features = datasets.Features.from_arrow_schema(pq.read_schema(f))
^^^^^^^^^^^^^^^^^
File "/home/finn/dev/lerobot_ur5e_gello/.venv/lib/python3.11/site-packages/pyarrow/parquet/core.py", line 2393, in read_schema
file = ParquetFile(
^^^^^^^^^^^^
File "/home/finn/dev/lerobot_ur5e_gello/.venv/lib/python3.11/site-packages/pyarrow/parquet/core.py", line 328, in init
self.reader.open(
File "pyarrow/_parquet.pyx", line 1656, in pyarrow._parquet.ParquetReader.open
File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Parquet magic bytes not found in footer. Either the file is corrupted or this is not a parquet file.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/finn/dev/lerobot_ur5e_gello/scripts/record.py", line 474, in
main()
File "/home/finn/dev/lerobot_ur5e_gello/scripts/record.py", line 470, in main
record()
File "/home/finn/dev/lerobot_ur5e_gello/.venv/lib/python3.11/site-packages/lerobot/configs/parser.py", line 233, in wrapper_inner
response = fn(cfg, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/finn/dev/lerobot_ur5e_gello/scripts/record.py", line 393, in record
with VideoEncodingManager(dataset):
File "/home/finn/dev/lerobot_ur5e_gello/.venv/lib/python3.11/site-packages/lerobot/datasets/video_utils.py", line 644, in exit
self.dataset._batch_save_episode_video(start_ep, end_ep)
File "/home/finn/dev/lerobot_ur5e_gello/.venv/lib/python3.11/site-packages/lerobot/datasets/lerobot_dataset.py", line 1186, in _batch_save_episode_video
chunk_idx = self.meta.episodes[start_episode]["data/chunk_index"]
~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not subscriptable
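
The first exception suggests that load_episodes() hit a file under the dataset's episodes metadata directory that is not valid parquet, presumably because the episode metadata was still being written when the exception handling kicked in. A quick way to check which file pyarrow chokes on, using the same read_schema call that appears in the traceback (the episodes path below is a guess based on the default lerobot cache layout; adjust it to your dataset root):

from pathlib import Path

import pyarrow.parquet as pq

# Guessed default cache location for this repo_id; adjust to your dataset root.
# load_episodes() reads everything under <root>/meta/episodes.
episodes_dir = Path.home() / ".cache/huggingface/lerobot/<HF_USER>/record-test/meta/episodes"

for path in sorted(p for p in episodes_dir.rglob("*") if p.is_file()):
    try:
        pq.read_schema(str(path))  # same call that raises in the traceback
        print(f"OK      {path}")
    except Exception as err:
        # Files failing here are what trigger "Parquet magic bytes not found".
        print(f"INVALID {path}: {err}")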

Expected behavior

Recording should work as usual, with the episode videos encoded in batches instead of one episode at a time.
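
From the second traceback, the direct cause of the crash is that VideoEncodingManager.__exit__ calls _batch_save_episode_video while self.meta.episodes is still None (the load_episodes() call raised before the attribute was ever populated). Purely as a sketch of the failure mode, not the actual lerobot code, a guard along these lines would at least surface a clearer error than the TypeError:

# Sketch only, not the real implementation of
# LeRobotDataset._batch_save_episode_video (lerobot_dataset.py:1186).
def _batch_save_episode_video(self, start_episode, end_episode):
    if self.meta.episodes is None:
        # Hypothetical guard: episode metadata was never loaded (e.g. because
        # load_episodes() failed earlier), so chunk indices cannot be looked up.
        raise RuntimeError(
            "Episode metadata is unavailable; cannot batch-encode videos "
            f"for episodes {start_episode} to {end_episode}."
        )
    chunk_idx = self.meta.episodes[start_episode]["data/chunk_index"]
    ...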

Labels

bug, dataset, performance
