Saving a pandas DataFrame to a Hub dataset gives an error

I was looking for a way to write a log file after my model makes a prediction, but got stuck because the file system does not support append mode.
So I tried loading the log into a DataFrame and saving it back: reading the file works, but saving it fails.
I am following the guide "Interact with the Hub through the Filesystem API".

from huggingface_hub import HfFileSystem
import pandas as pd

token = "hf_xxxx"

fs = HfFileSystem(token=token)

df = pd.read_csv("hf://datasets/sujitb/data/querylog.csv")

Up to this point everything works fine.

df.to_csv("hf://datasets/sujitb/data/querylog.csv")

This throws an error:

RepositoryNotFoundError: 401 Client Error. (Request ID: Root=1-66098a8a-2f8d371169e6f6991a126d89;3366d0ed-0e17-4061-8f41-845dfdc872a4)

Repository Not Found for url: https://huggingface.co/api/datasets/sujitb/data/preupload/main.
Please make sure you specified the correct repo_id and repo_type.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.
Note: Creating a commit assumes that the repo already exists on the Huggingface Hub. Please use create_repo if it’s not the case.

Has anyone faced this or got around this issue?

Hi! You need to log in with the huggingface-cli login command, or pass your token via storage_options (it's just a dict containing your token):

token = "hf_xxxx"
fs = HfFileSystem(token=token)
df.to_csv("hf://datasets/sujitb/data/querylog.csv", storage_options=fs.storage_options)

Otherwise, df.to_csv creates a brand-new HfFileSystem() without a token to upload the file, and the upload fails.
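
If you would rather not hard-code the token in the script, here is a minimal sketch of building that dict from the environment. The helper name hub_storage_options is made up for illustration, and reading the token from an HF_TOKEN environment variable is an assumption about how your setup stores it.

```python
import os


def hub_storage_options(token=None):
    """Return a storage_options dict for hf:// paths (hypothetical helper)."""
    # Assumption: when no token is passed in, it is stored in the HF_TOKEN
    # environment variable. Adjust this to however you keep your secret.
    token = token or os.environ.get("HF_TOKEN")
    if not token:
        raise ValueError("No token given and HF_TOKEN is not set")
    # pandas forwards this dict to the filesystem it creates for the hf:// path.
    return {"token": token}
```

You would then call df.to_csv("hf://datasets/sujitb/data/querylog.csv", storage_options=hub_storage_options()).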


Thanks a lot for your response! Passing storage_options solved my problem. I just passed a dict {"token": "xxxxxxxxxxx"} and it worked.
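
For anyone who lands here with the same logging goal: since append mode is not available, one possible workaround is to read the file, add the new row in memory, and write the whole file back. Below is a rough sketch of that idea, not an official recipe. The function name append_log_row and the example column names are hypothetical, and it assumes the CSV already exists at the given path.

```python
import pandas as pd


def append_log_row(path, row, storage_options=None):
    """Read the CSV at `path`, add `row` as the last line, and write it back.

    `row` is a dict mapping column names to values (hypothetical schema).
    For hf:// paths, pass storage_options={"token": "hf_xxxx"}.
    """
    df = pd.read_csv(path, storage_options=storage_options)
    df = pd.concat([df, pd.DataFrame([row])], ignore_index=True)
    # index=False stops pandas from writing the row index as an extra
    # unnamed column, which would otherwise pile up on every round trip.
    df.to_csv(path, index=False, storage_options=storage_options)
    return df
```

Usage would look something like append_log_row("hf://datasets/sujitb/data/querylog.csv", {"query": "...", "prediction": "..."}, storage_options={"token": "hf_xxxx"}). Note that this rewrites the entire file on each call, so two processes logging at the same time could overwrite each other's rows.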
