Upload File API for Saving to Persistent Datasets on HF Spaces

I am trying to demonstrate loading and writing a dataset from a Space. The app persists a history of messages, and the Space loads my token from its secrets, which gives write access so the information can be saved back to the dataset.

I would like to use the example in a class I teach on Fridays as a persistent dataset example. URL is here: Memory Shared - a Hugging Face Space by awacke1

My code errors here:

def store_message(name: str, message: str):
    if name and message:
        with open(DATA_FILE, "a") as csvfile:
            writer = csv.DictWriter(csvfile, fieldnames=["name", "message", "time"])
            writer.writerow(
                {"name": name, "message": message, "time": str(datetime.now())}
            )
        commit_url = upload_file(
            DATA_FILE,
            path_in_repo=DATA_FILENAME,
            repo_id=DATASET_REPO_ID,
            token=HF_TOKEN,
        )

        print(commit_url)

    return generate_html()

I set up the parameters used here:
DATASET_REPO_URL = "https://huggingface.co/datasets/awacke1/data.csv"
DATASET_REPO_ID = "awacke1/data.csv"
DATA_FILENAME = "data.csv"
DATA_FILE = os.path.join("data", DATA_FILENAME)
HF_TOKEN = os.environ.get("HF_TOKEN")

The stack trace is shown below. I am pretty sure one of the variables I pass to the upload_file API is incorrect.

Cloning https://huggingface.co/datasets/awacke1/data.csv into local empty directory.
IMPORTANT: You are using gradio version 2.4.2, however version 2.5.1 is available, please upgrade.

Running on local URL: http://0.0.0.0:7860/

To create a public link, set share=True in launch().
Traceback (most recent call last):
  File "/home/user/.local/lib/python3.8/site-packages/gradio/networking.py", line 193, in predict
    prediction, durations = app.interface.process(raw_input)
  File "/home/user/.local/lib/python3.8/site-packages/gradio/interface.py", line 363, in process
    predictions, durations = self.run_prediction(
  File "/home/user/.local/lib/python3.8/site-packages/gradio/interface.py", line 332, in run_prediction
    prediction = predict_fn(*processed_input)
  File "app.py", line 78, in store_message
    commit_url = upload_file(
  File "/home/user/.local/lib/python3.8/site-packages/huggingface_hub/hf_api.py", line 1346, in upload_file
    raise err
  File "/home/user/.local/lib/python3.8/site-packages/huggingface_hub/hf_api.py", line 1337, in upload_file
    r.raise_for_status()
  File "/home/user/.local/lib/python3.8/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/api/awacke1/data.csv/upload/main/data.csv

Can you point out what I need to change in the parameters below so that the upload succeeds and returns a valid commit URL?
commit_url = upload_file(
    DATA_FILE,
    path_in_repo=DATA_FILENAME,
    repo_id=DATASET_REPO_ID,
    token=HF_TOKEN,
)

I think you’re simply missing repo_type="dataset" in upload_file :wink:
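
For reference, here is a minimal sketch of what the corrected call could look like. The URL in the traceback (https://huggingface.co/api/awacke1/data.csv/upload/main/data.csv) has no datasets/ segment, which is consistent with the upload being sent to the default repo type (a model repo) instead of the dataset. The constants are copied from the question. The helpers build_upload_kwargs and push_data_file are hypothetical names introduced only for this illustration, and the upload_fn parameter is an assumption added so the call can be tried without network access.

```python
import os
from typing import Any, Callable, Dict, Optional

# Values copied from the question.
DATASET_REPO_ID = "awacke1/data.csv"
DATA_FILENAME = "data.csv"
DATA_FILE = os.path.join("data", DATA_FILENAME)
HF_TOKEN = os.environ.get("HF_TOKEN")


def build_upload_kwargs() -> Dict[str, Any]:
    """Collect the keyword arguments for upload_file (hypothetical helper)."""
    return {
        "path_or_fileobj": DATA_FILE,
        "path_in_repo": DATA_FILENAME,
        "repo_id": DATASET_REPO_ID,
        # The missing argument: without it the repo is treated as a model repo.
        "repo_type": "dataset",
        "token": HF_TOKEN,
    }


def push_data_file(upload_fn: Optional[Callable[..., Any]] = None) -> Any:
    """Upload the CSV; upload_fn can be swapped for a stub when testing offline."""
    if upload_fn is None:
        # Imported lazily so the sketch loads even without huggingface_hub installed.
        from huggingface_hub import upload_file
        upload_fn = upload_file
    return upload_fn(**build_upload_kwargs())
```

Inside store_message, the existing upload_file(...) call could then be replaced with commit_url = push_data_file(), or you can just add repo_type="dataset" to the original call directly.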
