Text generation conditioned on numbers | 0 | 404 | May 26, 2022
Distilling T5-small for summarization | 0 | 462 | May 25, 2022
Decoder generate with prompts of variable lengths? | 0 | 664 | May 25, 2022
The best way to load pretrained weights then continue training BERT? | 1 | 840 | May 25, 2022
Change Positional Embedding in T5 from Relative to Absolute | 0 | 680 | May 25, 2022
Pegasus max_token_len restriction | 0 | 368 | May 25, 2022
Is it possible to modify the zero-shot classifier? | 0 | 291 | May 24, 2022
Zero-Shot Classification | 0 | 343 | May 24, 2022
How to order sentences based on pairwise probabilities? | 0 | 293 | May 24, 2022
Inference result is SequenceClassifierOutput instance? | 0 | 419 | May 24, 2022
Training and evaluation loss go down, but WER score stays the same | 0 | 373 | May 23, 2022
EncoderDecoderModel for Machine Translation | 0 | 444 | May 21, 2022
Cannot use the new model built | 1 | 312 | May 21, 2022
The best way to install and edit the transformers package locally? | 2 | 1485 | May 21, 2022
How to add additional module to BERT architecture, then load the original weight and use it | 0 | 465 | May 20, 2022
Logging which decoder selected in generation | 0 | 339 | May 19, 2022
How to input word2vec embeddings to gpt2 model? | 0 | 636 | May 17, 2022
ValueError: Mixed precision training with AMP or APEX (`--fp16` or `--bf16`) and half precision evaluation (`--fp16_full_eval` or `--bf16_full_eval`) can only be used on CUDA devices | 0 | 1969 | May 17, 2022
Problem with Adding LayerNorm after BART's Encoder for Summarization | 0 | 392 | May 16, 2022
How to log the eval metrics every `eval_steps` to a file? | 1 | 646 | May 16, 2022
How to use 1 model for 2 downstream tasks? | 0 | 337 | May 16, 2022
How to represent paginated documents as a single training data instance | 2 | 616 | May 16, 2022
Why is deberta-v3-large model twice as large on disk after MLM finetuning? (notebook to reproduce) | 0 | 397 | May 16, 2022
Memory efficiency when using soft prompts | 0 | 383 | May 15, 2022
Issues with building extensions in DeepSpeed | 7 | 10296 | May 14, 2022
Show and/or delete cached language models | 0 | 1140 | May 14, 2022
Can I use Transformer-XL for text classification task? | 1 | 318 | May 14, 2022
Attribute Error reported when loading training_args.bin | 1 | 1415 | May 13, 2022
Why is the CrossEntropyLoss ignore_index set to sequence length in BertForQuestionAnswering | 0 | 791 | May 13, 2022
Train a new tokenizer from scratch | 4 | 1723 | November 10, 2020