I have a dataset that I want to update once in a while. When I call `dataset.push_to_hub(repo_id=f"{COMPANY_NAME}/{dataset_name}", private=True, token=os.environ['HUGGINGFACE_TOKEN'], split=split)`, it
- either silently does not update the dataset, even though I called `datasets.disable_caching()`
- or raises `ValueError: Split train already present` in `SplitInfo`.
Is there a simple way to force update an already present dataset split? Ideally with `push_to_hub()`, but any simple Python code will do, as long as I can update private datasets in `COMPANY_NAME`.