Topic | Replies | Views | Activity
Bfloat16 conversion results in significantly slower computation for various transformer models | 0 | 1419 | December 20, 2021
Why is using my DistilBERT model for inference so slow? | 0 | 921 | June 18, 2021
"table-question-answering" is not an available task under pipeline | 6 | 2721 | January 21, 2021
Finetuned model generating test label exactly | 0 | 462 | October 15, 2020
Tricks to control/suppress logging output | 3 | 5546 | August 29, 2020
Performing Back Translation with T5 network | 4 | 1521 | August 1, 2020
The Zero Shot demonstration site is broken | 1 | 540 | July 17, 2020
ICLR 2020 highlights - Yacine | 1 | 1748 | July 11, 2020
Paper Discussion: Weight Poisoning Attacks on Pre-trained Models | 0 | 1029 | July 8, 2020
About the Tokenizers category | 1 | 312 | July 7, 2020
About the Transformers category | 1 | 242 | July 7, 2020
Different results predicting from trainer and model | 6 | 7962 | December 20, 2021
GPT-2 trained models output repeated "!" | 2 | 2796 | December 20, 2021
Question about supported framework | 2 | 340 | June 18, 2021
AttributeError: 'FlaubertForSequenceClassification' object has no attribute 'predict' | 2 | 3219 | December 20, 2021
How to define the compute_metrics() function in Trainer? | 3 | 16512 | December 20, 2021
How to add special tokens to a pretrained model? | 0 | 387 | June 18, 2021
Masked Language Modeling (MLM) using TFBertForMaskedLM (Tensorflow) | 4 | 590 | January 21, 2021
NER fine-tuning | 1 | 4742 | December 20, 2021
Tensorflow h5 file doesn't contain network, it only includes weights? | 1 | 586 | December 20, 2021
Cannot download translation models in Colab | 4 | 2856 | June 18, 2021
Difference between language modeling scripts | 1 | 476 | December 20, 2021
Translating multiple languages to English (Tensorflow) - repost | 1 | 750 | December 20, 2021
Model Parallelism, how to parallelize transformer? | 3 | 12725 | June 18, 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity | 1 | 1608 | January 20, 2021
Do I need to apply the softmax function to my logit before calculating the CrossEntropyLoss? | 1 | 3238 | October 15, 2020
How to interpret logit score from Hugging Face binary classification model and convert it to probability score | 0 | 1518 | December 20, 2021
GPT-2 finetuning with Openwebtext: Socket timeout | 0 | 1104 | December 19, 2021
Fine tuning LaBSE 2 model? | 0 | 1425 | June 18, 2021
Getting a list of config parameters | 0 | 285 | December 19, 2021