| Topic | Replies | Views | Date |
|---|---|---|---|
| ValueError: The model is quantized with QuantizationMethod.QUANTO and is not serializable | 1 | 318 | May 20, 2024 |
| "No token was detected" when using Hosted inference API | 3 | 751 | May 20, 2024 |
| Fine-tuning BERT for vulnerability detection with data sharing the same label | 0 | 93 | May 17, 2024 |
| TypeError: MistralModel.__init__() got an unexpected keyword argument 'safe_serialization' | 0 | 360 | May 17, 2024 |
| Training Longformer works on jupyter notebook but not with .py | 0 | 88 | May 17, 2024 |
| Mixtral-8x7B trained with `--load_in_4bit`, showed as Tensor type F32 | 3 | 151 | May 17, 2024 |
| Which is actually used to configure scheduler in deepspeed and TrainingArguments? | 0 | 92 | May 17, 2024 |
| How to control the GPU id for loading model weights when fintune Llama8B model with the Trainer? | 0 | 76 | May 17, 2024 |
| Cannot import name 'WhisperForAudioClassification | 0 | 139 | May 16, 2024 |
| Isn't KV cache influenced by position encoding in inference? | 3 | 850 | May 16, 2024 |
| ModuleNotFoundError: No module named 'transformers.agents' | 2 | 664 | May 16, 2024 |
| Why follow Flan-T5 template when T5 tokenizer ignores multiple newlines | 0 | 112 | May 15, 2024 |
| Decoder only model - how to have it not include the prompt in its output? | 3 | 581 | May 15, 2024 |
| ValueError: Unrecognized configuration class <class 'transformers.models.whisper.configuration_whisper.WhisperConfig'> | 0 | 238 | May 15, 2024 |
| KeyError: 'eval_qwk' when used get_peft_model | 0 | 112 | May 14, 2024 |
| Fedrated Learning using trainer | 0 | 72 | May 14, 2024 |
| Next sentence prediction on custom model | 3 | 3382 | May 14, 2024 |
| Llama 3 tokenizer prints cryptic message | 0 | 157 | May 13, 2024 |
| Whisper Inference RuntimeError: The expanded size of the tensor (3000) must match the existing size (3392) at non-singleton dimension 1. Target sizes: [80, 3000]. Tensor sizes: [80, 3392] | 1 | 712 | May 13, 2024 |
| HUBERT Implementation with increased vocabulary size | 0 | 85 | May 13, 2024 |
| Model Parralelism approach in Llama Code looks like very inefficient | 0 | 95 | May 13, 2024 |
| ValueError when training on a multi GPU setup and DPO | 0 | 238 | May 13, 2024 |
| Transformer shifting output question | 1 | 337 | May 13, 2024 |
| Not able to add data_collator to Trainer | 1 | 517 | May 13, 2024 |
| How can we automatically run the script with a token included in a script | 0 | 82 | May 13, 2024 |
| Index Error: Target {} is out of bounds | 0 | 263 | May 13, 2024 |
| Load_in_8bit vs. loading 8-bit quantized model | 6 | 6385 | May 13, 2024 |
| Convert Conv1D to nn.Linear | 2 | 922 | May 12, 2024 |
| SFTrainer doesn't show added column | 0 | 101 | May 12, 2024 |
| How can I keep use of the base model version for inference after fine-tuning | 1 | 93 | May 12, 2024 |