Getting nan with deepspeed | 0 | 846 | March 20, 2021
Choosing correct seq2seq model | 1 | 1620 | March 19, 2021
Maybe there is a bug in BertTokenizer? | 0 | 377 | March 19, 2021
Difference between setting label index to -100 & setting attention mask to 0 | 5 | 2912 | March 17, 2021
Model training in Multi GPU | 1 | 1817 | March 17, 2021
Wav2Vec2 For Swedish | 6 | 952 | March 17, 2021
Extracting the output of hidden BERT layers and re-training the BERT model on custom datasets | 0 | 807 | March 17, 2021
Tf transformers. New transformer based library for tensorflow and Albert joint model | 0 | 224 | March 17, 2021
Missing `model_type` key in config.json of TinyBERT | 4 | 6687 | March 17, 2021
Problem with torch.multiprocessing and Roberta | 2 | 2596 | March 14, 2021
New model output types | 7 | 5721 | March 11, 2021
Weights of pre-trained BERT model not initialized | 2 | 2067 | March 11, 2021
Hyperparameter search | 0 | 434 | March 10, 2021
Can't reproduce xlm-roberta-large finetuned result on XNLI | 2 | 1897 | March 10, 2021
OOM issues with save_pretrained models | 0 | 1052 | March 9, 2021
Parameter groups and GPT2 LayerNorm | 3 | 638 | March 9, 2021
OOM run_seq2seq.py from checkpoint | 0 | 186 | March 8, 2021
fine-tune Pegasus with xsum using Colab but generation results have no difference | 0 | 988 | March 8, 2021
Different doc with BertForPretraining and TFBertForPretraining | 2 | 280 | March 7, 2021
Recommended way to perform batch inference for generation | 0 | 2494 | March 6, 2021
Cache T5 encoder results within batch when training | 0 | 483 | March 6, 2021
Can I train pytorch T5 on TPU with variable batch shape? | 2 | 296 | March 6, 2021
Saving memory with run_mlm.py with wikipedia datasets | 0 | 720 | March 4, 2021
Hyperparameter_search does not log params after first trial | 0 | 326 | March 4, 2021
ASR hypotheses rescoring with perplexity score | 0 | 1193 | March 4, 2021
Bert followed by a GRU | 1 | 1192 | March 3, 2021
Warning when adding compute_metrics function to Trainer | 9 | 4792 | March 3, 2021
Multilabel sequence classification with Roberta value error expected input batch size to match target batch size | 1 | 4206 | March 2, 2021
Workflow: how to avoid dummy_pt_objects.py in IDE search results? | 9 | 1945 | February 26, 2021
Error using `max_length` in transformers | 3 | 2695 | February 26, 2021