Topic | Replies | Views | Date
@sshleifer Progress Update Aug 4 -> Aug 19 | 5 | 502 | August 19, 2020
Albert MLM is slow | 0 | 735 | August 19, 2020
Marian Deprecation Warning | 0 | 240 | August 18, 2020
Can we resize embedding with embedding weighted initialized differently? | 0 | 1351 | August 18, 2020
Best models for seq2seq tasks | 3 | 1098 | August 16, 2020
How to load a google's bert ckpt using tf2 | 3 | 1309 | August 14, 2020
Is there any pretraining script for BART? | 0 | 1217 | August 14, 2020
Write With Transformers XLNet Broken | 6 | 445 | August 13, 2020
How to do selective masking in Language modeling | 3 | 524 | August 13, 2020
Masked language modeling loss | 1 | 4449 | August 13, 2020
GPT2 Implementation from scratch | 0 | 391 | August 11, 2020
Addition for Migration Documentation | 0 | 234 | August 10, 2020
Language pair with multiple models on the model hub? | 1 | 336 | August 10, 2020
Any Pre-trained reformer model available for classification fine tuning | 4 | 1174 | August 10, 2020
Looking for translation mechanism (es-en,en-es) | 1 | 533 | August 10, 2020
How to use `.modules()` command to get all the parameters that pertains to the uppermost layer of `roberta-large` model? | 1 | 4065 | August 10, 2020
How to obtain GPT2 output after softmax layer, along with the gradient information? | 0 | 401 | August 8, 2020
Tiny mBART doc/info | 14 | 2183 | August 7, 2020
NER on multiple languages | 1 | 2818 | August 6, 2020
DPRQuestionEncoder | 5 | 1004 | August 5, 2020
Static type checking(with mypy): What's the official position? | 9 | 4928 | August 4, 2020
Robertaforquestionanswering | 1 | 2177 | August 3, 2020
What is the purpose of the additional dense layer in classification heads? | 9 | 10023 | August 3, 2020
Search query autocomplete from the queries I have in my data | 0 | 1658 | July 31, 2020
Untrained models produce inconsistent outputs | 3 | 1155 | July 30, 2020
Unbalanced training with BERT | 0 | 697 | July 27, 2020
IDE Tips for reading abstracted code | 7 | 1008 | July 27, 2020
Albert LM on WikiText2 | 0 | 770 | July 27, 2020
`tpu_cores` can only be 1, 8 or [<1-8>] | 5 | 714 | July 25, 2020
BERT performs worse than other implementations? | 0 | 775 | July 24, 2020