| Topic | Replies | Views | Activity |
| --- | --- | --- | --- |
| The hidden_states when i use model.generate | 4 | 2158 | March 28, 2025 |
| Fixing the random seed in the Trainer does not produce the same results across runs | 5 | 17692 | March 27, 2025 |
| The size of tensor a (882) must match the size of tensor b (568) at non-singleton dimension 1 | 2 | 126 | March 27, 2025 |
| RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:7 and cuda:0! | 2 | 207 | March 25, 2025 |
| Unexpected behavior of load_best_model_at_end in Trainer (or am I doing it wrong?) | 2 | 71 | March 25, 2025 |
| Load_best_model_at_end doesn't work? | 1 | 122 | March 25, 2025 |
| Molformer model training error | 6 | 61 | March 25, 2025 |
| Extract Attention Weights from a Specific Layer and Head Efficiently | 1 | 228 | March 25, 2025 |
| Clarification on Commercial License Impact of LayoutLMv3ImageProcessor within UdopProcessor | 0 | 58 | March 24, 2025 |
| Runtime Error: Cuda Initialization | 13 | 387 | March 24, 2025 |
| Reasoning LLM Benchmarking | 2 | 2180 | March 24, 2025 |
| Web worker fails to process input data | 1 | 45 | March 22, 2025 |
| Adding dropout in custom model, but setting dropout through .from_pretrained() | 2 | 70 | March 21, 2025 |
| Multimodal training | 4 | 67 | March 21, 2025 |
| Target branch/tag/commit for automatic Hub pushes | 1 | 15 | March 21, 2025 |
| Two questions when I wraped the AutoModelForMaskedLM | 7 | 32 | March 21, 2025 |
| One question is about the pretrain method in Transformer packge ? | 1 | 204 | March 19, 2025 |
| Partially loss calculation with transformers LLM Trainer and DataCollator | 1 | 70 | March 19, 2025 |
| Custom VLM - Swapping a vision encoder from a VLM | 1 | 275 | March 19, 2025 |
| HFvalidationerror: Repo_id must be in the form repo_name | 8 | 30353 | March 19, 2025 |
| Fine-Tuning a Mamba Model with using Hugging Face Transformers | 1 | 278 | March 18, 2025 |
| TRL SFTTrainer 0.15 compute_token_accuracy error | 2 | 178 | March 18, 2025 |
| How can I set `max_memory` parameter while loading Quantized model with Model Pipeline class? | 2 | 61 | March 18, 2025 |
| E5 embedding models | 1 | 24 | March 17, 2025 |
| Using TableTransformer in Standalone Mode Without Hugging Face Hub Access | 1 | 52 | March 17, 2025 |
| The checkpoint you are trying to load has model type `gemma2` but Transformers does not recognize this architecture | 8 | 7297 | March 17, 2025 |
| Get each generated token last layer hidden state | 3 | 56 | March 16, 2025 |
| Why does automodelforcausallm.from_pretrained() work on base models and not instruct models? | 4 | 121 | March 15, 2025 |
| [Possibly] Forgotten TODO Comment for `TrainingArguments.default_optim` | 1 | 32 | March 14, 2025 |
| Metrics for Training Set in Trainer | 11 | 26929 | March 14, 2025 |