🤗Transformers

Topic	Replies	Views	Activity
Text generation conditioned on numbers 🤗Transformers	0	404	May 26, 2022
Distilling T5-small for summarization 🤗Transformers	0	462	May 25, 2022
Decoder generate with prompts of variable lengths? 🤗Transformers	0	664	May 25, 2022
The best way to load pretrained weights then continue training BERT? 🤗Transformers	1	840	May 25, 2022
Change Positional Embedding in T5 from Relative to Absolute 🤗Transformers	0	680	May 25, 2022
Pegasus max_token_len restriction 🤗Transformers	0	368	May 25, 2022
Is it possible to modify the zero-shot classifier? 🤗Transformers	0	291	May 24, 2022
Zero-Shot Classification 🤗Transformers	0	343	May 24, 2022
How to order sentences based on pairwise probabilities? 🤗Transformers	0	293	May 24, 2022
Inference result is SequenceClassifierOutput instance? 🤗Transformers	0	419	May 24, 2022
Training and evaluation loss go down, but WER score stays the same 🤗Transformers	0	373	May 23, 2022
EncoderDecoderModel for Machine Translation 🤗Transformers	0	444	May 21, 2022
Cannot use the new model built 🤗Transformers	1	312	May 21, 2022
The best way to install and edit the transformers package locally? 🤗Transformers	2	1485	May 21, 2022
How to add an additional module to the BERT architecture, then load the original weights and use them 🤗Transformers	0	465	May 20, 2022
Logging which decoder selected in generation 🤗Transformers	0	339	May 19, 2022
How to input word2vec embeddings to gpt2 model? 🤗Transformers	0	636	May 17, 2022
ValueError: Mixed precision training with AMP or APEX (`--fp16` or `--bf16`) and half precision evaluation (`--fp16_full_eval` or `--bf16_full_eval`) can only be used on CUDA devices 🤗Transformers	0	1969	May 17, 2022
Problem with Adding LayerNorm after BART's Encoder for Summarization 🤗Transformers	0	392	May 16, 2022
How to log the eval metrics every `eval_steps` to a file? 🤗Transformers	1	646	May 16, 2022
How to use 1 model for 2 downstream tasks? 🤗Transformers	0	337	May 16, 2022
How to represent paginated documents as a single training data instance 🤗Transformers	2	616	May 16, 2022
Why is deberta-v3-large model twice as large on disk after MLM finetuning? (notebook to reproduce) 🤗Transformers	0	397	May 16, 2022
Memory efficiency when using softprompts DeepSpeed	0	383	May 15, 2022
Issues with building extensions in DeepSpeed DeepSpeed	7	10296	May 14, 2022
Show and/or delete cached language models 🤗Transformers	0	1140	May 14, 2022
Can I use Transformer-XL for a text classification task? 🤗Transformers	1	318	May 14, 2022
Attribute Error reported when loading training_args.bin 🤗Transformers	1	1415	May 13, 2022
Why is the CrossEntropyLoss ignore_index set to sequence length in BertForQuestionAnswering 🤗Transformers	0	791	May 13, 2022
Train a new tokenizer from scratch 🤗Transformers	4	1723	November 10, 2020