🤗Transformers

Topic	Replies	Views	Activity
Reducing unwanted generation in Gemma 3 🤗Transformers	7	218	April 5, 2025
Difference between pre-training and fine tuning with language modeling to instill new knowledge 🤗Transformers	3	76	April 3, 2025
What is the most efficient way to dynamically change context mid-generation? 🤗Transformers	4	31	April 2, 2025
🚀 Introducing FlashTokenizer: The World's Fastest CPU Tokenizer! 🤗Transformers	2	27	April 4, 2025
Using DistributedSampler with accelerate 🤗Transformers	4	107	April 2, 2025
ValueError: Could not interpret optimizer identifier 🤗Transformers	1	178	April 1, 2025
model_accepts_loss_kwargs detection based on **kwargs is too permissive 🤗Transformers	0	41	April 1, 2025
Limit mask size in Mask2Former results 🤗Transformers	1	25	April 1, 2025
Args in RewardConfig 🤗Transformers	1	15	April 1, 2025
FASTAI:TypeError: empty() received an invalid combination of arguments - got (tuple, dtype=NoneType, device=NoneType) 🤗Transformers	2	40	March 29, 2025
Optimize GPU Usage for Long-Context Training 🤗Transformers	2	60	March 28, 2025
The hidden_states when I use model.generate 🤗Transformers	4	1663	March 28, 2025
Fixing the random seed in the Trainer does not produce the same results across runs 🤗Transformers	5	17302	March 27, 2025
The size of tensor a (882) must match the size of tensor b (568) at non-singleton dimension 1 🤗Transformers	2	56	March 27, 2025
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:7 and cuda:0! 🤗Transformers	2	103	March 25, 2025
Unexpected behavior of load_best_model_at_end in Trainer (or am I doing it wrong?) 🤗Transformers	2	35	March 25, 2025
load_best_model_at_end doesn't work? 🤗Transformers	1	81	March 25, 2025
Molformer model training error 🤗Transformers	6	27	March 25, 2025
Extract Attention Weights from a Specific Layer and Head Efficiently 🤗Transformers	1	79	March 25, 2025
Clarification on Commercial License Impact of LayoutLMv3ImageProcessor within UdopProcessor 🤗Transformers	0	23	March 24, 2025
Runtime Error: Cuda Initialization 🤗Transformers	13	138	March 24, 2025
Reasoning LLM Benchmarking 🤗Transformers	2	714	March 24, 2025
Web worker fails to process input data 🤗Transformers	1	24	March 22, 2025
Adding dropout in custom model, but setting dropout through .from_pretrained() 🤗Transformers	2	48	March 21, 2025
Multimodal training 🤗Transformers	4	47	March 21, 2025
Target branch/tag/commit for automatic Hub pushes 🤗Transformers	1	13	March 21, 2025
Two questions when I wrapped the AutoModelForMaskedLM 🤗Transformers	7	27	March 21, 2025
One question about the pretrain method in the Transformer package? 🤗Transformers	1	202	March 19, 2025
Partial loss calculation with transformers LLM Trainer and DataCollator 🤗Transformers	1	67	March 19, 2025
Custom VLM - Swapping a vision encoder from a VLM 🤗Transformers	1	117	March 19, 2025