Topic | Replies | Views | Activity
HF Trainer: HF Trainer causes a problem while fine-tuning T5 (T5 doesn't generate EOS token at the proper point) | 0 | 824 | March 6, 2022
How to continue BERT training | 1 | 1355 | March 4, 2022
Model.generate() OOM on 1 of 2 GPUs? | 4 | 1698 | March 4, 2022
What are the goals in Positional Embedding methods? | 2 | 507 | March 3, 2022
Any examples on VisualBERTforMultipleChoice | 1 | 418 | March 3, 2022
After vocabulary extension the tokenizer keeps on running | 0 | 321 | March 2, 2022
How to use only one BERT to do generation task with 'past_key_values' mechanism? | 2 | 797 | March 1, 2022
Use Trainer API with two validation sets | 1 | 1873 | February 28, 2022
How to remove input from generated text in GPTNeo? | 0 | 987 | March 1, 2022
Word embedding with BERT | 0 | 628 | February 28, 2022
Self-attention masking for T5 encoder? | 0 | 1708 | February 27, 2022
BERT for NextSentencePrediction train and inference problem, thanks | 0 | 636 | February 25, 2022
Add_tokens + finetune | 0 | 533 | February 25, 2022
BertPreTrainedModel and RobertaPreTrainedModel work, however PreTrainedModel does not work | 0 | 1068 | February 25, 2022
Errors when training on multi node single gpu | 1 | 1771 | February 25, 2022
DistilHubert: PyTorch to ONNX conversion issue | 3 | 739 | February 24, 2022
Pipeline text classification with two sequences for each example | 2 | 748 | February 24, 2022
Self-pretrained model predicts token with -1 index gap | 0 | 669 | February 22, 2022
Huge disparity between CPU and GPU memory usage? | 0 | 406 | February 22, 2022
MarianMT training produces "▁" in results | 1 | 325 | February 21, 2022
Random seed for weight initialization and data order | 0 | 1238 | February 21, 2022
Use asr-wav2vec2-commonvoice-fr model offline | 1 | 937 | February 21, 2022
Get embedding from finetuned BertForSequenceClassification model | 1 | 3731 | February 19, 2022
Transformer for TF 1.15.0? | 2 | 1514 | February 18, 2022
Errors when fine-tuning using Keras | 0 | 670 | February 18, 2022
Errors while fine-tuning using Keras | 2 | 1203 | February 18, 2022
Which model of transformers to use if I want to do multiclassification of a pair of sentences containing a questionnaire | 0 | 252 | February 18, 2022
Two transformers in one model | 0 | 244 | February 17, 2022
Sentence length influence on similarity | 1 | 396 | February 17, 2022
Callbacks for logging results to GPT2 | 1 | 465 | February 16, 2022