| Topic | Replies | Views | Activity |
|---|---|---|---|
| Is there an easy way to apply layer-wise decaying learning rate in huggingface trainer for RobertaMaskedForLM? | 3 | 2942 | April 5, 2022 |
| The discussion is about entity recognition and coreference resolution | 0 | 718 | March 25, 2022 |
| GPT2 for QA Pair Generation | 9 | 8605 | March 23, 2022 |
| Converting Test Case Description into Test case Steps | 0 | 780 | March 4, 2022 |
| Best Pre-training Strategy | 0 | 745 | March 3, 2022 |
| Relative Position Representation/Encoding for Transformer | 0 | 1929 | February 22, 2022 |
| How find idea for academic thesis? | 2 | 879 | February 19, 2022 |
| Extractive oracle | 0 | 813 | February 9, 2022 |
| A Survey to Understand Challenges of Deploying Text Classification | 2 | 943 | February 8, 2022 |
| Question Answering model on mathematical domain for the greek language | 0 | 813 | February 1, 2022 |
| Finetuning German BERT for QA on biomedical domain | 2 | 1016 | January 30, 2022 |
| [Suggestions and Guidance]Finetuning Bert models for Next word Prediction | 4 | 4896 | January 26, 2022 |
| Suggestions for an open source tagging tool to build custom LayoutLMv2 datasets | 0 | 910 | January 25, 2022 |
| Paper Notes: Deepspeed Mixture of Experts | 2 | 2206 | January 20, 2022 |
| How does the vocabulary size count towards total parameter size of a model? | 0 | 2314 | January 18, 2022 |
| Guide: The best way to calculate the perplexity of fixed-length models | 9 | 9452 | December 16, 2021 |
| Few shot automatic moderation | 0 | 666 | November 20, 2021 |
| Let's Make an Ethics Chat Bot that's Not Racist! | 0 | 733 | November 16, 2021 |
| New Paper: Masked Autoencoders Are Scalable Vision Learners | 0 | 1375 | November 14, 2021 |
| Improving performance of Wav2Vec2 fine tuning with word piece vocabulary | 5 | 2994 | October 27, 2021 |
| [Help needed] Extending Trainer for Meta learning | 3 | 1572 | October 19, 2021 |
| Detection Transformer (DETR) for text detection in documents | 0 | 2030 | September 29, 2021 |
| Summarization for downstream task | 0 | 657 | September 15, 2021 |
| [Call for participation] Interactive Grounded Language Understanding in a Collaborative Environment (IGLU) Competition@NeurIPS2021 | 0 | 727 | September 9, 2021 |
| Implementing a custom Attention Transformer | 5 | 3182 | September 6, 2021 |
| Collaborative Training Experiment Round 2 with Yandex and HuggingFace | 0 | 566 | September 1, 2021 |
| Tutorial / codebase for models interacting while training? | 0 | 494 | August 29, 2021 |
| 10_000 samples & 10_000 labels | 0 | 510 | July 31, 2021 |
| Best way to infer continuously with Transformer? | 0 | 557 | July 26, 2021 |
| The (hidden) meaning behind the embedding of the padding token? | 2 | 6276 | July 14, 2021 |