The dataset viewer is not working.
The image I uploaded opens without issues when tested locally, but the dataset viewer returns the error below once it is uploaded.
Is there any code or tool for packaging images into Parquet files locally?
Error details:
Error code: StreamingRowsError
Exception: OSError
Message: image file is truncated (20 bytes not processed)
Traceback: Traceback (most recent call last):
  File "/src/services/worker/src/worker/utils.py", line 99, in get_rows_or_raise
    return get_rows(
  File "/src/libs/libcommon/src/libcommon/utils.py", line 197, in decorator
    return func(*args, **kwargs)
  File "/src/services/worker/src/worker/utils.py", line 77, in get_rows
    rows_plus_one = list(itertools.islice(ds, rows_max_number + 1))
  File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 2097, in __iter__
    example = _apply_feature_types_on_example(
  File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 1635, in _apply_feature_types_on_example
    decoded_example = features.decode_example(encoded_example, token_per_repo_id=token_per_repo_id)
  File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/features/features.py", line 2044, in decode_example
    return {
  File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/features/features.py", line 2045, in <dictcomp>
    column_name: decode_nested_example(feature, value, token_per_repo_id=token_per_repo_id)
  File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/features/features.py", line 1405, in decode_nested_example
    return schema.decode_example(obj, token_per_repo_id=token_per_repo_id)
  File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/features/image.py", line 188, in decode_example
    image.load()  # to avoid "Too many open files" errors
  File "/src/services/worker/.venv/lib/python3.9/site-packages/PIL/ImageFile.py", line 288, in load
    raise OSError(msg)
OSError: image file is truncated (20 bytes not processed)
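The traceback shows the viewer failing inside `image.load()`, which forces a full decode. Pillow's `Image.open` on its own only reads the header, so a local test that just opens the file may not notice a truncated image. Below is a minimal sketch of a local check that mirrors that decode step, assuming Pillow is installed. The function name `find_unreadable_images` and the `images` folder are hypothetical, chosen only for illustration.

```python
from pathlib import Path

from PIL import Image


def find_unreadable_images(folder, patterns=("*.jpg", "*.jpeg", "*.png")):
    """Return (path, error message) pairs for images Pillow cannot fully decode.

    Hypothetical helper, not part of datasets or the viewer.
    """
    bad = []
    for pattern in patterns:
        for path in sorted(Path(folder).rglob(pattern)):
            try:
                with Image.open(path) as img:
                    # Force a full decode, like the image.load() call in the traceback.
                    img.load()
            except OSError as err:
                # Truncated and unidentified files both surface as OSError.
                bad.append((str(path), str(err)))
    return bad


if __name__ == "__main__":
    # "images" is an assumed local folder name.
    for path, message in find_unreadable_images("images"):
        print(f"{path}: {message}")
```

Any file this reports would likely need to be re-exported or re-downloaded before it is uploaded again.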
cc @lhoestq