| Topic | Replies | Views | Activity |
|---|---|---|---|
| Loading train and test splits with `audiofolder` | 5 | 1711 | February 10, 2024 |
| Three-way Random Split | 2 | 2399 | March 19, 2021 |
| Create custom splits | 3 | 2072 | June 1, 2023 |
| Huggingface datasets streaming problem | 6 | 1567 | July 27, 2022 |
| Structure-agnostic Knowledge Graph Extracting LLM | 0 | 730 | February 6, 2024 |
| How to load large-scale text-image pair dataset | 4 | 1032 | February 7, 2025 |
| Question about loading wikipedia datset | 2 | 2368 | November 11, 2020 |
| UnicodeEncodeError: surrogates not allowed | 2 | 2333 | May 26, 2024 |
| Loading dataset from disk taking more time than expected | 0 | 718 | March 14, 2022 |
| OverflowError: There was an overflow with type <class 'list'>. Try to reduce writer_batch_size to have batches smaller than 2GB | 2 | 2327 | May 14, 2022 |
| Slow DataLoader with big batch_size | 4 | 1785 | October 5, 2023 |
| Access denied when reading files in dataset | 4 | 1774 | September 23, 2021 |
| Custom dataset for Mask2Former finetuning | 2 | 2287 | November 23, 2023 |
| Datasets.load_metric("cer") does not work | 2 | 2280 | November 17, 2021 |
| [SOLVED] Dataset.map() is frozen on ELI5 | 2 | 2272 | August 24, 2020 |
| TypeError in load_dataset related to a sequence of strings | 3 | 1955 | October 3, 2022 |
| Splitting dataset from generator | 3 | 1940 | January 26, 2023 |
| Which tokenizer does "rouge" metric uses under the hood? | 2 | 2227 | July 11, 2022 |
| How to use load_dataset the dataset downloaded via snapshot_download? | 4 | 1720 | July 8, 2024 |
| How to use Datasets in a distributed system? | 4 | 1717 | March 22, 2023 |
| RuntimeError: Error while uploading 'data/train-00040-of-00157-15109dabc9b3967a.parquet' to the Hub | 2 | 394 | November 28, 2024 |
| ImportError: cannot import name 'CommitInfo' from 'huggingface_hub' | 0 | 680 | July 11, 2024 |
| Creating a Dataset object from large pandas dataframe | 3 | 1893 | July 21, 2022 |
| How to access order of shards in streaming IterableDataset | 1 | 1498 | October 6, 2022 |
| Use Git to download datasets but fails to load | 4 | 1666 | March 8, 2024 |
| Streaming and creating refactored dataset with shards using Generator | 4 | 292 | October 30, 2024 |
| How do I set feature type when loading dataset(ClassLabel etc)? | 2 | 2091 | January 19, 2022 |
| Access to gated repositories | 6 | 243 | January 10, 2025 |
| Transform list-like elements to rows | 2 | 1173 | May 14, 2021 |
| WER Metric running out of Memory | 3 | 1792 | April 30, 2021 |
| Read CSV multi threading | 5 | 1460 | July 21, 2021 |
| Seeing AttributeError: 'Dataset' object has no attribute 'reshape' when using "dataset.get_nearest_examples" | 3 | 1789 | June 28, 2023 |
| Problem with Dataset Preview with audio files | 7 | 1261 | April 17, 2025 |
| Uploading large datasets iteratively | 4 | 1594 | October 30, 2023 |
| Datasets not behaving as expected after random data augmentation with map | 7 | 1260 | September 23, 2021 |
| Which URLs should be reachable to work with Huggingface hub | 2 | 2056 | January 26, 2022 |
| Does masked language model training script does random shuffle on the dataset? | 4 | 1593 | October 29, 2021 |
| Cache is not being loaded when code is called from a Jupyter notebook | 5 | 1447 | September 8, 2022 |
| How do I get the dataset loader working with multiple versions? | 4 | 1584 | November 8, 2022 |
| Import Error: Need to install datasets | 1 | 2502 | November 25, 2021 |
| ArrowNotImplementedError when loading json dataset | 3 | 1759 | December 17, 2021 |
| Trojan in common_voice dataset? | 8 | 1171 | June 30, 2022 |
| Create multiple dataset subsets at the same time | 0 | 111 | December 8, 2024 |
| Calculate custom dataset size | 3 | 1744 | June 15, 2022 |
| DatasetDict.map function generated very big cache files from a relatively small data | 3 | 1740 | May 25, 2023 |
| Splitting dataset via length | 3 | 1740 | September 1, 2022 |
| Error when downloading own dataset with git lfs files | 4 | 1547 | June 22, 2022 |
| AttributeError: 'DatasetDict' object has no attribute 'to' | 1 | 2431 | November 29, 2022 |
| When using Dataset.map to tokenize a dataset, the speed slows down as the progress approaches 100% | 3 | 966 | December 23, 2024 |
| Rouge implementation of Huggingface Datasets | 2 | 1981 | November 18, 2021 |