Topic | Replies | Views | Activity
Loading a fraction of data | 5 | 5273 | May 12, 2023
Can I get the access list of my dataset? | 0 | 184 | May 30, 2023
RFC: Licensing datasets that alter existing datasets | 0 | 343 | May 29, 2023
How can I load the (labels of the) imagenet val dataset? | 0 | 932 | May 29, 2023
Length error using `map` with datasets | 2 | 753 | May 29, 2023
Load the Flores Data set | 4 | 1174 | May 25, 2023
DatasetDict.map function generated very big cache files from a relatively small data | 3 | 1724 | May 25, 2023
Writing load_dataset loading script with multiple code files | 3 | 301 | May 25, 2023
AttributeError: module 'dill._dill' has no attribute 'log' | 3 | 2161 | May 25, 2023
'list' as a feature in huggingface dataset | 1 | 1134 | May 25, 2023
How to use load_dataset to load my own local dataset? | 1 | 909 | May 24, 2023
When is the cache written to file? | 3 | 260 | May 22, 2023
Authentication Error Datasets | 1 | 1251 | May 21, 2023
Understanding the `Datasets` cache system | 2 | 3254 | May 19, 2023
Serially creating a very large dataset using from_generator(), slower each iteration (slows to >2-3s per example!) | 1 | 767 | May 18, 2023
How to concatenate 100s of small datasets into a very large dataset? *Without* loading into memory? | 1 | 430 | May 18, 2023
Generate dataset with empty features | 2 | 1855 | May 17, 2023
ConnectionError: Couldn't reach 'database' while doing distributed training | 0 | 463 | May 17, 2023
IterableDataset compute feature mean and create histogram | 2 | 440 | May 15, 2023
I uploaded a dataset through huggface web interface. But i can't load it! | 3 | 1001 | May 14, 2023
Data_files not working with custom loading script and remote dataset | 3 | 769 | May 12, 2023
Column lengths mismatch in IterableDataset | 2 | 980 | May 12, 2023
Data files not working with custom loading script and dataset | 3 | 1302 | May 2, 2023
Seeding everything to get the same masked words | 1 | 575 | May 12, 2023
Cannot access RedPajama-Data-1T-Sample sub-file | 2 | 378 | May 12, 2023
Loading a h5ad file with HF datasets | 1 | 422 | May 8, 2023
KeyError: "length" - load_from_disk Training Model on AWS SageMaker | 4 | 2238 | May 5, 2023
Prepare data for pretraining T5 model | 1 | 1073 | May 4, 2023
Couldn't find 'my_dataset' on the Hugging Face Hub | 4 | 3259 | May 2, 2023
How to return custom `token_type_ids` from a tokenizer? | 0 | 307 | May 2, 2023