| Topic | Replies | Views | Activity |
|---|---|---|---|
| Pre-training datasets for base and roberta | 0 | 234 | May 12, 2022 |
| Make text data continuous from DatasetDict | 1 | 259 | May 11, 2022 |
| Index retrieval speed varies considerably with dataset size | 2 | 416 | May 9, 2022 |
| Loader for dataset with multiple source files in one split | 1 | 274 | May 9, 2022 |
| Small python dataset | 1 | 267 | May 7, 2022 |
| HuggingFace dataset error: RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor | 3 | 635 | May 6, 2022 |
| What does the wikipedia dataset with the specific language and date mean? | 1 | 248 | May 5, 2022 |
| Script to prepare and load my own data into a DataSet | 1 | 232 | May 4, 2022 |
| OpenbookQA and CommonsenseQA data format issues | 2 | 222 | May 4, 2022 |
| Add Sequence(feature=ClassLabel(...), ...) to an existing dataset | 1 | 293 | May 2, 2022 |
| Converting string label to int | 4 | 2961 | May 2, 2022 |
| In-memory dataset to disk for caching operations | 1 | 212 | May 2, 2022 |
| `push_to_hub` a dataset dict with subsets and splits (e.g., GLUE) | 2 | 305 | April 26, 2022 |
| Running out of disk space | 1 | 335 | April 26, 2022 |
| MyPy and DatasetDict. Error: Incompatible return value type (got "Union[DatasetDict, Dataset, IterableDatasetDict, IterableDataset]", expected "DatasetDict") | 2 | 262 | April 26, 2022 |
| Wav2vec2 pretraining on own wav files | 2 | 327 | April 24, 2022 |
| Datasets map is slower than pandas apply | 0 | 237 | April 23, 2022 |
| Dataset.map hangs on tokenization (relatively small dataset) | 2 | 357 | April 22, 2022 |
| When calling load_metric('rouge') what file is downloaded (and where do I find it)? | 1 | 305 | April 22, 2022 |
| Map() function doesn't process | 2 | 293 | April 21, 2022 |
| Common Voice 8.0.0 en using all available RAM | 6 | 458 | April 19, 2022 |
| BigPatent - cased version | 2 | 513 | April 14, 2022 |
| Split DataFrame into validation and train split | 2 | 488 | April 11, 2022 |
| Saving a dataset to disk after select copies the data | 8 | 405 | April 7, 2022 |
| Flatten List of features | 1 | 492 | April 7, 2022 |
| Download only a subset of a split | 4 | 356 | April 7, 2022 |
| UnicodeDecodeError when loading Multi Lingual text file | 1 | 490 | April 7, 2022 |
| Can't automatically load_dataset due to network | 1 | 398 | April 7, 2022 |
| Pushing multiple splits of dataset to a single repo of Hub | 1 | 495 | April 7, 2022 |
| Representing nested dictionary with different keys | 5 | 361 | April 7, 2022 |