Continue pre-training Greek BERT with domain specific dataset | 10 | 4671 | January 4, 2023
Covid-19 - TPU V3-1024 - T5 11B: Tensorflow to Pytorch conversion failed | 1 | 556 | January 3, 2023
MSc Computer Science and SwDesign in an Open Source Project: Transformers | 0 | 289 | January 3, 2023
Closed end text generation | 0 | 453 | January 3, 2023
How to calculate perplexity from the `generate` function? | 2 | 2319 | January 2, 2023
Huggingface Trainer eval while training | 1 | 733 | December 31, 2022
Validation Loss for VITMAE | 1 | 591 | December 30, 2022
Deploying PyTorch ViT to Vertex AI using model artifacts | 0 | 334 | December 29, 2022
How can I use class_weights when training? | 19 | 30820 | December 29, 2022
Train MarianMT from scratch using transformers | 0 | 325 | December 28, 2022
Huggingface LR Decay Schedulers Spend the first epoch w/ an LR of 0 | 1 | 793 | December 27, 2022
[google/flan-t5-xl] Scores in each result | 0 | 219 | December 27, 2022
`BertEmbeddings` contains positional embedding? | 2 | 3171 | December 27, 2022
'Bert' object has no attribute 'config' | 0 | 1232 | December 26, 2022
Export whisper large model to ONNX and prediction | 0 | 551 | December 26, 2022
Generation but constraining first few tokens | 0 | 738 | December 25, 2022
`KeyError: 'eval_loss'` when using Trainer with BertForQA | 7 | 7362 | September 14, 2022
Increasing eval batch size in trainer api causes size mismatch during evaluation | 0 | 497 | December 24, 2022
How do tokenizer(text_target=text) work | 0 | 450 | December 24, 2022
Train a large transformer with Custom Tokenizer/Data | 0 | 355 | December 23, 2022
Unmasker probabilities for all tokens in sequence | 0 | 223 | December 23, 2022
Encoder decoder model | 0 | 292 | December 23, 2022
Sentences' embeddings from BERT cross-encoder | 0 | 279 | December 22, 2022
Role of attention mask in base Bert | 0 | 333 | December 22, 2022
Is it possible to use Decision Transformers on text? | 0 | 232 | December 22, 2022
When i am using summarizer model, i am getting error anyone can fix my error | 0 | 265 | December 21, 2022
HF Trainer downstream evaluation on multiple GPUS | 1 | 1090 | December 21, 2022
Train T5 model with two different datasets | 0 | 327 | December 21, 2022
What are the best transformers for summarization of big texts in pt-BR? | 0 | 333 | December 21, 2022
Decode whisper logits to transcript using forward instead of generate method | 3 | 1855 | December 20, 2022