Error converting np float32

Setup: Nvidia Jetson, using ROS2 to collect data. When I collect data on my local system over SSH, it works fine, but on the Jetson I get this error:

  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Could not convert 0.037997484 with type numpy.float32: did not recognize Python value type when inferring an Arrow data type

Versions on the Jetson:
(lerobot) root@dusty-1000052:/home/dusty/lerobot/lerobot-clutterbot# pip freeze | grep -E 'pyarrow|datasets|numpy|protobuf'
datasets==3.6.0
numpy @ file:///croot/numpy_and_numpy_base_1672336188316/work
protobuf==6.31.1
pyarrow==20.0.0
(lerobot) root@dusty-1000052:/home/dusty/lerobot/lerobot-clutterbot# python -c "import numpy; print(numpy.__version__)"
1.23.5

Detailed error log:

    response = fn(cfg, *args, **kwargs)
  File "/home/dusty/lerobot/lerobot/src/lerobot/record.py", line 380, in record
    dataset.save_episode()
  File "/home/dusty/lerobot/lerobot/src/lerobot/datasets/lerobot_dataset.py", line 850, in save_episode
    self._save_episode_table(episode_buffer, episode_index)
  File "/home/dusty/lerobot/lerobot/src/lerobot/datasets/lerobot_dataset.py", line 896, in _save_episode_table
    ep_dataset = datasets.Dataset.from_dict(episode_dict, features=self.hf_features, split="train")
  File "/root/miniconda3/envs/lerobot/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 940, in from_dict
    pa_table = InMemoryTable.from_pydict(mapping=mapping)
  File "/root/miniconda3/envs/lerobot/lib/python3.10/site-packages/datasets/table.py", line 758, in from_pydict
    return cls(pa.Table.from_pydict(*args, **kwargs))
  File "pyarrow/table.pxi", line 1982, in pyarrow.lib._Tabular.from_pydict
  File "pyarrow/table.pxi", line 6379, in pyarrow.lib._from_pydict
  File "pyarrow/array.pxi", line 405, in pyarrow.lib.asarray
  File "pyarrow/array.pxi", line 255, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 117, in pyarrow.lib._handle_arrow_array_protocol
  File "/root/miniconda3/envs/lerobot/lib/python3.10/site-packages/datasets/arrow_writer.py", line 311, in __arrow_array__
    out = pa.array(cast_to_python_objects(data, only_1d_for_numpy=True, optimize_list_casting=False))
  File "pyarrow/array.pxi", line 375, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 45, in pyarrow.lib._sequence_to_array
  File "pyarrow/error.pxi", line 155, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Could not convert 0.037997484 with type numpy.float32: did not recognize Python value type when inferring an Arrow data type


datasets==3.6.0

I think this is probably the culprit. The combination of an older datasets library and an older Arrow library seems to have resurfaced some ancient issues. If you can upgrade the library, that would be the quickest fix; if not, try one of the old workarounds.
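For reference, one workaround of that kind is to convert NumPy values to built-in Python types before the data reaches Arrow. Below is a minimal sketch, not a confirmed fix for this case. `to_native` is a hypothetical helper name (it is not part of datasets, PyArrow, or LeRobot), and it relies only on the fact that NumPy scalars and arrays both expose a `.tolist()` method that returns plain Python objects.

```python
def to_native(value):
    """Recursively replace array-like values with built-in Python types.

    Hypothetical helper: anything exposing .tolist() (NumPy scalars and
    arrays do) is converted via that method; dicts, lists and tuples are
    walked recursively; every other value is returned unchanged.
    """
    if isinstance(value, dict):
        return {key: to_native(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Tuples come back as lists, which Arrow accepts as sequences.
        return [to_native(item) for item in value]
    tolist = getattr(value, "tolist", None)
    if callable(tolist):
        # np.float32(0.5).tolist() gives a Python float,
        # np.arange(3).tolist() gives a list of Python ints.
        return tolist()
    return value
```

In the traceback above, that would mean applying it to `episode_dict` just before the `datasets.Dataset.from_dict(...)` call, which requires patching `lerobot_dataset.py` locally. Whether that is enough here is an assumption; it hides the symptom rather than explaining why the Jetson environment behaves differently from the local one.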

Thanks for the reply. I tried upgrading, but it still doesn't work. I'm also running it with the LeRobot library, which requires datasets 3.6. I'm still getting the same issue after the upgrade, so I guess 4.0 is still compatible.


Hmm, maybe try upgrading PyArrow together with datasets?
The datasets library depends on it too.

pip install -U "pyarrow>=21.0.0" datasets
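After an upgrade like this, it can help to confirm which versions the running interpreter actually sees, since `pip freeze` and the active conda environment do not always agree. A small sketch using only the standard library; `installed_versions` is a hypothetical helper name, and the package list is just the one from this thread.

```python
from importlib import metadata


def installed_versions(names):
    """Return {package: version or None} as seen by the running interpreter."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(["pyarrow", "datasets", "numpy"]).items():
        print(f"{name}: {version}")
```

Run it with the same `python` that launches the recording script, so the result reflects that environment and not another one on the PATH.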