Joining datasets by column & best practices for multi-view datasets
|
|
3
|
2937
|
May 13, 2024
|
Commas in transcription Audio dataset?
|
|
1
|
114
|
May 13, 2024
|
Skipping in streaming mode takes forever
|
|
4
|
736
|
May 13, 2024
|
How to mark unknown values in ClassLabel with negative numbers?
|
|
2
|
125
|
May 13, 2024
|
Can the parquet-converter bot handle list[str] dtypes found in a webdataset?
|
|
4
|
163
|
May 12, 2024
|
Medical QnA dataset with context
|
|
0
|
216
|
May 12, 2024
|
Question answering model, unanswerable question answer format
|
|
3
|
280
|
May 8, 2024
|
How the presence of multiple values in certain entries affect the quality of the chatbot's
|
|
0
|
75
|
May 7, 2024
|
Load dataset from files already downloaded
|
|
1
|
135
|
May 6, 2024
|
Why are dict objects added to all keys for all records?
|
|
3
|
162
|
May 6, 2024
|
Videos for training data
|
|
1
|
383
|
May 6, 2024
|
Dataset revision number
|
|
8
|
831
|
May 6, 2024
|
Don't know how to split imdb to train, test, validation
|
|
0
|
331
|
May 6, 2024
|
How can you use downloaded dataset in streaming mode offline?
|
|
0
|
218
|
May 5, 2024
|
Fastest way to do inference on a large dataset in huggingface?
|
|
5
|
3285
|
May 3, 2024
|
How to split large metadata.jsonl for ImageFolder?
|
|
2
|
229
|
May 2, 2024
|
How to use join operations like merge in Datasets
|
|
0
|
149
|
May 2, 2024
|
Any workaround for push_to_hub() limits?
|
|
9
|
2167
|
May 2, 2024
|
Not able to push dataset/model with write token
|
|
1
|
214
|
April 30, 2024
|
Keeping IterableDataset node-wise split fixed during DDP
|
|
8
|
1939
|
April 29, 2024
|
Error DEBUG:filelock:Attempting to acquire lock
|
|
0
|
941
|
April 27, 2024
|
Ds.map(): optimizing PIL Image processing as tensorflow tensor
|
|
2
|
1363
|
April 27, 2024
|
Standard way to upload huge dataset
|
|
5
|
603
|
April 26, 2024
|
I need to create my own dataset based on mlabonne/orpo-dpo-mix-40k, but when I do and create a dataset for ORPO training it gives an error
|
|
2
|
603
|
April 26, 2024
|
Loading images directly in data folder
|
|
2
|
753
|
April 26, 2024
|
Cannot use load_dataset in Google Colab
|
|
6
|
1868
|
April 26, 2024
|
Loading local dataset shows fewer classes than are present
|
|
0
|
197
|
April 25, 2024
|
Loading a large dataset occupies ~2GB on each GPU
|
|
0
|
102
|
April 24, 2024
|
Loading imagenet dataset shuts down kernel
|
|
0
|
133
|
April 24, 2024
|
Map with num_proc over 1 fails
|
|
1
|
166
|
April 24, 2024
|