Introducing data manipulation features (append, insert, delete)

Hi everyone,

I’d like to introduce an open-source ML storage framework (GitHub - google/space: Unified storage framework for the entire machine learning lifecycle) that provides data manipulation, materialized views, and version management features for popular ML datasets.

It supports lightweight conversion to and from HuggingFace datasets by reusing Parquet files, so you can easily modify or incrementally transform data. Here is an example notebook: notebooks/huggingface_conversion.ipynb in the google/space repository on GitHub
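
To give a rough idea of why reusing files can make conversion and edits cheap, here is a toy sketch in plain Python. It is not Space's API or storage format: `ManifestDataset`, its methods, the `id` key, and the JSON Lines files standing in for Parquet files are all hypothetical names made up for illustration. The point is only that "conversion" can mean recording which files already exist, while appends and deletes touch metadata or add new files instead of rewriting old ones.

```python
import json
import tempfile
from pathlib import Path


class ManifestDataset:
    """Toy dataset (hypothetical): a list of immutable data files plus deleted keys."""

    def __init__(self, files=None, deleted=None):
        self.files = list(files or [])
        self.deleted = set(deleted or [])

    @classmethod
    def from_existing_files(cls, paths):
        # "Conversion" only records the paths; no rows are copied or rewritten.
        return cls(files=paths)

    def append(self, rows, directory):
        # New rows go into a new file; existing files stay untouched.
        path = Path(directory) / f"part-{len(self.files):05d}.jsonl"
        path.write_text("".join(json.dumps(r) + "\n" for r in rows))
        self.files.append(path)

    def delete(self, keys):
        # A delete is recorded in metadata instead of rewriting data files.
        self.deleted.update(keys)

    def read(self):
        # Stream rows from every file, skipping keys marked as deleted.
        for path in self.files:
            for line in Path(path).read_text().splitlines():
                row = json.loads(line)
                if row["id"] not in self.deleted:
                    yield row


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a file that an existing dataset already owns.
        existing = Path(tmp) / "existing.jsonl"
        existing.write_text('{"id": 1, "text": "a"}\n{"id": 2, "text": "b"}\n')

        ds = ManifestDataset.from_existing_files([existing])
        ds.append([{"id": 3, "text": "c"}], tmp)
        ds.delete([2])
        return [row["id"] for row in ds.read()]
```

Calling `demo()` wraps one pre-existing file, appends a row, deletes a row by key, and returns the ids that remain visible. For the real API and workflow, please refer to the notebook linked above.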

Your feedback will be very helpful. Thank you!
