Incorporating structural information in a Transformer?
|
|
0
|
718
|
April 6, 2022
|
Can you use both copy mechanism and BPE for an NMT task?
|
|
0
|
712
|
April 6, 2022
|
Is there an easy way to apply a layer-wise decaying learning rate in the Hugging Face Trainer for RobertaMaskedForLM?
|
|
3
|
2914
|
April 5, 2022
|
The discussion is about entity recognition and coreference resolution
|
|
0
|
718
|
March 25, 2022
|
GPT2 for QA Pair Generation
|
|
9
|
8597
|
March 23, 2022
|
Converting Test Case Description into Test Case Steps
|
|
0
|
780
|
March 4, 2022
|
Best Pre-training Strategy
|
|
0
|
744
|
March 3, 2022
|
Relative Position Representation/Encoding for Transformer
|
|
0
|
1923
|
February 22, 2022
|
How to find an idea for an academic thesis?
|
|
2
|
877
|
February 19, 2022
|
Extractive oracle
|
|
0
|
813
|
February 9, 2022
|
A Survey to Understand Challenges of Deploying Text Classification
|
|
2
|
943
|
February 8, 2022
|
Question Answering model in the mathematical domain for the Greek language
|
|
0
|
813
|
February 1, 2022
|
Fine-tuning German BERT for QA in the biomedical domain
|
|
2
|
1012
|
January 30, 2022
|
[Suggestions and Guidance] Fine-tuning BERT models for next-word prediction
|
|
4
|
4863
|
January 26, 2022
|
Suggestions for an open source tagging tool to build custom LayoutLMv2 datasets
|
|
0
|
910
|
January 25, 2022
|
Paper Notes: Deepspeed Mixture of Experts
|
|
2
|
2197
|
January 20, 2022
|
How does the vocabulary size count towards total parameter size of a model?
|
|
0
|
2301
|
January 18, 2022
|
Guide: The best way to calculate the perplexity of fixed-length models
|
|
9
|
9351
|
December 16, 2021
|
Few-shot automatic moderation
|
|
0
|
666
|
November 20, 2021
|
Let's Make an Ethics Chat Bot that's Not Racist!
|
|
0
|
732
|
November 16, 2021
|
New Paper: Masked Autoencoders Are Scalable Vision Learners
|
|
0
|
1372
|
November 14, 2021
|
Improving performance of Wav2Vec2 fine-tuning with word-piece vocabulary
|
|
5
|
2976
|
October 27, 2021
|
[Help needed] Extending Trainer for Meta learning
|
|
3
|
1571
|
October 19, 2021
|
Detection Transformer (DETR) for text detection in documents
|
|
0
|
2022
|
September 29, 2021
|
Summarization for downstream task
|
|
0
|
657
|
September 15, 2021
|
[Call for participation] Interactive Grounded Language Understanding in a Collaborative Environment (IGLU) Competition@NeurIPS2021
|
|
0
|
725
|
September 9, 2021
|
Implementing a custom Attention Transformer
|
|
5
|
3161
|
September 6, 2021
|
Collaborative Training Experiment Round 2 with Yandex and HuggingFace
|
|
0
|
565
|
September 1, 2021
|
Tutorial / codebase for models interacting while training?
|
|
0
|
494
|
August 29, 2021
|
10_000 samples & 10_000 labels
|
|
0
|
509
|
July 31, 2021
|