| Topic | Replies | Views | Activity |
| --- | --- | --- | --- |
| Changing the default branch from master to main | 0 | 920 | March 21, 2022 |
| Value error : Connection error | 8 | 16219 | August 4, 2021 |
| Getting total train_runtime even if training stopped in the middle | 0 | 878 | March 20, 2022 |
| Freezing layers when using gradient checkpointing | 0 | 713 | March 20, 2022 |
| Domain-specific pre-training of GPT? Help! | 1 | 650 | March 18, 2022 |
| NER at the Inference Time | 0 | 441 | March 18, 2022 |
| About the Cross-attention Layer Shape in Encoder-Decoder Model | 1 | 1914 | March 18, 2022 |
| Transformer loss | 0 | 286 | March 17, 2022 |
| Error Loading google/bart-large or bart-xsum | 1 | 359 | March 17, 2022 |
| Batch_decode does not give the correct output as generate | 0 | 301 | March 17, 2022 |
| Sentiment Analysis Pipeline on single label function_to_apply not working | 1 | 1032 | March 17, 2022 |
| Using trainer to train a bart model on 4 gpus failed | 0 | 338 | March 16, 2022 |
| Pre-training a language model on a large dataset | 5 | 3887 | March 15, 2022 |
| Fnet with upper case | 0 | 277 | March 15, 2022 |
| Continue LM pretraining with run_mlm - loss function clarification | 0 | 460 | March 14, 2022 |
| Training arguments for flax | 0 | 253 | March 14, 2022 |
| Need help understanding input of model in generation | 0 | 251 | March 14, 2022 |
| How to extend the vocab of T5? | 0 | 432 | March 14, 2022 |
| Use tf.data.Data with HuggingFace datasets | 2 | 2641 | March 22, 2021 |
| Why we add math to word embedding | 0 | 262 | March 13, 2022 |
| Convert tokens and token-labels to string | 7 | 7631 | March 12, 2022 |
| BigBirdPegasus with attention_type="original_full" vs T5 | 0 | 254 | March 11, 2022 |
| NLP Pretrained model doesn't use GPU when making inference | 11 | 10152 | March 11, 2022 |
| Adding linear layer to transformer model (+ save_pretrained and load_pretrained) | 1 | 3784 | March 10, 2022 |
| Differences between transformers GPT2 and megatron-lm? | 0 | 382 | March 10, 2022 |
| When can we expect TPU Trainer? | 4 | 4067 | March 3, 2022 |
| Is there a way to get per word loss instead of the average loss for GPT model | 0 | 334 | March 7, 2022 |
| Ensemble decoding | 0 | 566 | March 7, 2022 |
| Torch JIT Training | 0 | 1166 | March 7, 2022 |
| BartForConditionalGeneration : lm_head layer dimension change | 0 | 445 | March 7, 2022 |