Training fails on multiple gpu throwing cuda runtime errors | 0 | 922 | September 30, 2022
Training Reproducibility when resuming from checkpoint | 0 | 354 | September 30, 2022
Problem in loading an old sentence classification roberta model generated using transformer version 3.0.2 with new library | 0 | 644 | September 30, 2022
GLUE-STS Finetune Error | 0 | 394 | September 30, 2022
Problems with a custom model using a transformer base model in the evaluation phase (eval_strategy) | 0 | 856 | September 30, 2022
Useful compute_metrics functions for perplexity | 0 | 641 | September 29, 2022
How to change the Text embedder(Layoutlmv2Tokenizer) in LayoutLMv2 model? | 3 | 522 | September 29, 2022
M2M100 training does not improve model performance | 0 | 303 | September 29, 2022
Constant output predictions on test data | 0 | 509 | September 29, 2022
ELECTRA TF2 => PT Convert Problem | 0 | 224 | September 28, 2022
Multi-instance transformers | 0 | 245 | September 27, 2022
Evaluation results (metric) during training is different from the evaluation results at the end | 4 | 3256 | September 26, 2022
Speeding up Tokenization on large text corpus | 0 | 446 | September 26, 2022
RuntimeError: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous | 1 | 8338 | September 26, 2022
How to take ensemble of T5ForConditionalGeneration? | 0 | 284 | September 25, 2022
New Layer in BERT | 0 | 200 | September 25, 2022
Further train a fine tuned wav2vec model | 2 | 535 | September 25, 2022
How to parallelize model in order version | 0 | 221 | September 24, 2022
GPT-J generating chatbot response | 2 | 2686 | September 23, 2022
Why is transformer decoder always generating output of same length as gold labels? | 0 | 574 | September 23, 2022
Create a Few Shots NER | 0 | 997 | September 22, 2022
How to generate text with T5Model other than T5ForConditionalGeneration? | 0 | 301 | September 22, 2022
How can I train M2M-100 or NLLB-200 on my parallel bilingual corpus? | 0 | 793 | September 22, 2022
Fine-Tuning DeBERTa Produces Non-Results | 3 | 3132 | September 21, 2022
How to map generated characters to tokens? | 0 | 484 | September 21, 2022
T5 model fine-tuning in the stsb dataset generates wrong outputs | 2 | 934 | September 21, 2022
Why model.generate does encoding multiple times | 1 | 565 | September 20, 2022
Baseline vs language-specific finetuned model for multilingual speech recognition | 0 | 314 | September 20, 2022
Can Mac M1 GPU be used to train HappyGeneration | 0 | 609 | September 20, 2022
Load a cached custom model in offline mode | 1 | 10418 | September 19, 2022