I noticed a couple of Spaces where persistence appears to work, saving new records to a CSV file with HF Datasets. I tried to reproduce it, but the data does not get saved. Can you provide insight into how to use persistence when you modify data in memory and then want to write the cached version back to a persistent shared Dataset?
Here was my last attempt:
Here are my last three failed attempts, as Datasets:
These two Spaces appear to have it working, yet I cannot see how. Is it a key/secret or something?
Maybe @abidlabs or @osanseviero can help! It would be cool to document how to do this in Streamlit in addition to Gradio.
To get it working, one option is to use the huggingface_hub library (a Python wrapper around the Hugging Face Hub public APIs). app.py · julien-c/persistent-data at main and app.py · chrisjay/afro-speech at main are working examples that use it as follows:
- Create or clone a repo using the `Repository` class (see app.py · julien-c/persistent-data at main). Both examples use a token, `HF_TOKEN`, which is passed to the Space as a secret from the Hub. Note that they also specify a local directory.
- Save your data in the local directory from above. For example, the first Space appends each new record to a CSV file.
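The two steps above can be sketched as follows. This is a minimal sketch, not the exact code from either Space: the dataset repo id `your-username/my-data`, the `records.csv` filename, and the CSV schema are assumptions; the `HF_TOKEN` secret name matches what the Spaces use.

```python
import csv
import os

DATA_DIR = "data"
CSV_PATH = os.path.join(DATA_DIR, "records.csv")

def append_row(path, row):
    """Append one record (a dict) to a CSV file, writing a header first if the file is new."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    file_exists = os.path.isfile(path)
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        if not file_exists:
            writer.writeheader()
        writer.writerow(row)

# HF_TOKEN is passed to the Space as a secret in its Hub settings.
HF_TOKEN = os.environ.get("HF_TOKEN")

if HF_TOKEN:
    # Imported here so the sketch still runs where huggingface_hub is not installed.
    from huggingface_hub import Repository

    # Clone (or reuse) the dataset repo in a local directory.
    repo = Repository(
        local_dir=DATA_DIR,
        clone_from="your-username/my-data",  # hypothetical dataset repo id
        repo_type="dataset",
        use_auth_token=HF_TOKEN,
    )
    # Append the new record locally, then commit and push it back to the Hub.
    append_row(CSV_PATH, {"name": "alice", "score": 0.9})  # example schema
    repo.push_to_hub(commit_message="Append new record")
```

Since the clone-append-push cycle goes through git, every submission ends up versioned in the dataset repo's history.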
huggingface_hub also has an `upload_file` method, which might be more intuitive: it uploads one file at a time to a given dataset repo. See app.py · chrisjay/afro-speech at main.