🤗Transformers

Topic	Replies	Views	Activity
No module named 'deepspeed.checkpoint.utils' DeepSpeed	6	2129	June 28, 2023
Forward() got an unexpected keyword argument 'image' 🤗Transformers	0	830	June 28, 2023
Key-value pair from attention layer of GPT2 🤗Transformers	0	327	June 28, 2023
Inference problem after loading a fine tuned T5 model for seq2seq method 🤗Transformers	0	366	June 28, 2023
Difference between using the Trainer class vs Accelerate library DeepSpeed	0	912	June 27, 2023
Finetuning Llama 13B with my own dataset 🤗Transformers	2	2798	June 27, 2023
Non-meaningful response from finetuned GPT-2 model 🤗Transformers	0	450	June 26, 2023
Distributed training with Sagemaker 🤗Transformers	0	305	June 26, 2023
Exhaustive list of changes across all touchpoints in the tokenization pipeline of LM training 🤗Transformers	0	288	June 26, 2023
Whisper fine-tuning on Librispeech makes WER worse 🤗Transformers	6	2484	June 26, 2023
AWS Lambda + Transformers + Docker = use High RAM for summarization model 🤗Transformers	1	596	June 26, 2023
How to catch up with the GPT2 based model. At each iteration the size of the model increases 🤗Transformers	0	292	June 26, 2023
T5 trained with seq2seq method 🤗Transformers	0	295	June 26, 2023
Multi Label Text Classification 🤗Transformers	2	1604	June 26, 2023
Inserting custom layer after embeddings layer in BERT 🤗Transformers	0	209	June 26, 2023
Faster Segment Anything: Towards Lightweight SAM for Mobile Applications 🤗Transformers	0	1272	June 26, 2023
AutoModelForCausalLM.from_pretrained unable to load model from Huggingface 🤗Transformers	1	3135	June 25, 2023
Training AutoModelForCausalLM in a Seq2Seq task 🤗Transformers	0	330	June 25, 2023
TFViT model keeps throwing error while training it using TFTrainer 🤗Transformers	0	331	June 24, 2023
Why there are chat and instruct models for 13B parameters? 🤗Transformers	0	639	June 23, 2023
Using Huggingface Trainer in Colab -> Disk Full 🤗Transformers	5	5183	June 23, 2023
Custom gradient accumulation scheme in Trainer 🤗Transformers	0	334	June 23, 2023
Can I compute `eval_loss` and `bleu` score simultaneously for decoder only transformers 🤗Transformers	0	438	June 23, 2023
Why is the repeating_penalty implemented using the full context rather than a generated token? 🤗Transformers	0	203	June 23, 2023
Transformers trying to use keras? 🤗Transformers	0	555	June 23, 2023
How to load a torch model with transformers? 🤗Transformers	5	17707	June 22, 2023
Summarization Evaluator Example 🤗Transformers	0	160	June 22, 2023
Hyperparameter optimization and load_best_model_at_end 🤗Transformers	2	889	June 22, 2023
Which is the correct bbox ocr level for LiLT? block level or word level? 🤗Transformers	0	353	June 22, 2023
How to set language in Whisper pipeline for audio transcription? 🤗Transformers	2	9305	June 22, 2023