Joining datasets by column & best practices for multi-view datasets
|
|
3
|
3048
|
May 13, 2024
|
Commas in transcription Audio dataset?
|
|
1
|
115
|
May 13, 2024
|
Skipping in Streaming mode takes forever
|
|
4
|
792
|
May 13, 2024
|
How to mark unknown values in ClassLabel with negative numbers?
|
|
2
|
125
|
May 13, 2024
|
Can the parquet-converter bot handle list[str] dtypes found in a webdataset?
|
|
4
|
164
|
May 12, 2024
|
Medical QnA dataset with context
|
|
0
|
227
|
May 12, 2024
|
Question answering model, unanswerable question answer format
|
|
3
|
284
|
May 8, 2024
|
How the presence of multiple values in certain entries affects the quality of the chatbot's
|
|
0
|
75
|
May 7, 2024
|
Load dataset from files already downloaded
|
|
1
|
142
|
May 6, 2024
|
Why are dict objects added to all keys for all records?
|
|
3
|
167
|
May 6, 2024
|
Videos for training data
|
|
1
|
403
|
May 6, 2024
|
Dataset revision number
|
|
8
|
889
|
May 6, 2024
|
Don't know how to split imdb to train, test, validation
|
|
0
|
344
|
May 6, 2024
|
How can you use downloaded dataset in streaming mode offline?
|
|
0
|
228
|
May 5, 2024
|
Fastest way to do inference on a large dataset in huggingface?
|
|
5
|
3366
|
May 3, 2024
|
How to split large metadata.jsonl for ImageFolder?
|
|
2
|
233
|
May 2, 2024
|
How to use Join operations like merge in Datasets
|
|
0
|
152
|
May 2, 2024
|
Any workaround for push_to_hub() limits?
|
|
9
|
2233
|
May 2, 2024
|
Not able to push dataset/model with write token
|
|
1
|
221
|
April 30, 2024
|
Keeping IterableDataset node-wise split fixed during DDP
|
|
8
|
2022
|
April 29, 2024
|
Ds.map(): optimizing PIL Image processing as tensorflow tensor
|
|
2
|
1369
|
April 27, 2024
|
Standard way to upload huge dataset
|
|
5
|
623
|
April 26, 2024
|
I need to create my own dataset based on mlabonne/orpo-dpo-mix-40k, but when I do and create a dataset for ORPO training, it gives an error
|
|
2
|
607
|
April 26, 2024
|
Loading images directly in data folder
|
|
2
|
766
|
April 26, 2024
|
Cannot use load_dataset in Google Colab
|
|
6
|
1928
|
April 26, 2024
|
Loading local dataset shows fewer classes than are present
|
|
0
|
200
|
April 25, 2024
|
Loading a large dataset occupies ~2GB on each GPU
|
|
0
|
103
|
April 24, 2024
|
Loading imagenet dataset shuts down kernel
|
|
0
|
133
|
April 24, 2024
|
Map with num_proc over 1 fails
|
|
1
|
177
|
April 24, 2024
|
Two labels, getting converted to 0 and 247
|
|
0
|
101
|
April 24, 2024
|