BART for Portuguese
|
|
7
|
1689
|
October 20, 2020
|
`add_prefix_space=True` option for the BPE tokenizer
|
|
0
|
1657
|
October 19, 2020
|
Are the weights of the maskedLM head of the `BertForMaskedLM` model pre-trained?
|
|
0
|
417
|
October 19, 2020
|
How to fine-tune the output head of the pre-trained Transformer models?
|
|
0
|
487
|
October 19, 2020
|
Adding a new model to Transformers with additional dependencies
|
|
15
|
1456
|
October 19, 2020
|
More complex training setups
|
|
4
|
1014
|
October 18, 2020
|
Why do different tokenizers use different vocab files?
|
|
0
|
1771
|
October 18, 2020
|
Training GPT2 on CPUs?
|
|
4
|
1664
|
October 17, 2020
|
For the logits from HuggingFace Transformer models, can the sum of the elements of the logit vector be greater than 1?
|
|
1
|
1608
|
October 16, 2020
|
Clarification for the forward function of the SequenceSummary class from modeling_utils.py
|
|
0
|
368
|
October 16, 2020
|
Do I need to apply the softmax function to my logit before calculating the CrossEntropyLoss?
|
|
1
|
3211
|
October 15, 2020
|
Keeping some tokens untranslated
|
|
0
|
557
|
October 15, 2020
|
Getting predictions
|
|
1
|
284
|
October 15, 2020
|
Distillation: create student model from a different base model than teacher
|
|
9
|
2054
|
October 14, 2020
|
Is there any way to control the input of a `Longformer` layer?
|
|
1
|
253
|
October 14, 2020
|
I'm getting "nan" value for loss, while following a tutorial from the documentation
|
|
0
|
662
|
October 14, 2020
|
[RFC] Transformers Pipeline v2
|
|
4
|
1849
|
October 14, 2020
|
Finetuning Pegasus for summarization task
|
|
3
|
1045
|
October 14, 2020
|
Warning occurred when trying to load checkpoint to continue training
|
|
5
|
2268
|
October 13, 2020
|
Longformer for sequence classification
|
|
5
|
471
|
October 13, 2020
|
Using LongformerForMultipleChoice for processing multiple-choice questions with the 4 options
|
|
1
|
658
|
October 13, 2020
|
PPLM runtime error with fine-tuned model
|
|
1
|
557
|
October 12, 2020
|
T5 fine tuning, loss difference when using labels and decoder_input_ids
|
|
2
|
1165
|
October 12, 2020
|
Strange error when using the Longformer (HuggingFace developers, please reply)
|
|
8
|
1796
|
October 12, 2020
|
Do I need token_type_ids for BertForSequenceClassification?
|
|
2
|
213
|
October 12, 2020
|
How do we insert our own datasets in DPR / RAG retrieval Q&A models?
|
|
1
|
1636
|
October 11, 2020
|
How to get cross-attention values of T5?
|
|
2
|
3817
|
October 9, 2020
|
Further Pretrain Basic BERT for sequence classification
|
|
4
|
1777
|
October 9, 2020
|
SavedModel export for DistilBERT is failing
|
|
9
|
507
|
October 9, 2020
|
Looking for tool class to do predictions
|
|
3
|
548
|
October 9, 2020
|