🤗Transformers

Topic	Replies	Views	Activity
Torchrun uses more vram than running the script with python directly	1	363	May 27, 2024
How can you switch between adapters in the inference model?	2	406	May 27, 2024
Need Help Improving Similarity Scores for Follow-up Detection Using BERT or similar	1	113	May 26, 2024
Fine tuning t5 to write like me	0	177	May 26, 2024
Is it possible to get the data that is seen by the model during training?	1	124	May 26, 2024
Parallelize Mistral/ llama2 output	1	154	May 25, 2024
Decision Transformer a question about the tutorial	0	127	April 15, 2024
Understanding the Decision Transformer	0	146	May 25, 2024
How to get the loss from the Trainer class?	0	167	May 25, 2024
Modify the model input format in a .tflite file generated by the run_image_classification.py script	0	111	May 24, 2024
How to get list of downloaded models names?	6	5034	May 24, 2024
Mistral load_in_8bit slow inference	0	252	May 24, 2024
Perplexity Calculation in run_clm.py	0	278	May 23, 2024
Can I dynamically add or remove LoRA weights in the transformer library like diffusers	3	942	May 23, 2024
Is it possible to generate more than one token when using a decoder only model via forward pass?	1	637	May 23, 2024
Trainer RuntimeError: The size of tensor a (462) must match the size of tensor b (448) at non-singleton dimension 1	17	45484	May 23, 2024
ValueError: too many values to unpack (expected 2) or not enough values to unpack (expected 2, got 1). T5ForConditionalGeneration	0	181	May 23, 2024
T5 tokenizer / ideal method of calculating max_sequence_length?	1	548	May 22, 2024
Pass input_embed to WhisperDecoder	0	83	May 22, 2024
How to fix ValueError: The model did not return a loss from the inputs?	1	616	May 22, 2024
Transformers.js went wrong during the model construction	0	484	May 21, 2024
System RAM gets full in sometime and ( VideoMAE ) training job is killed	0	65	May 21, 2024
What data batch does SFTTrainer looks at when resumed training	0	109	May 21, 2024
TypeError: LlamaForCausalLM.__init__() got an unexpected keyword argument 'load_in_4bit'	7	20551	October 7, 2023
ValueError: The model is quantized with QuantizationMethod.QUANTO and is not serializable	1	344	May 20, 2024
"No token was detected" when using Hosted inference API	3	756	May 20, 2024
Fine-tuning BERT for vulnerability detection with data sharing the same label	0	100	May 17, 2024
TypeError: MistralModel.__init__() got an unexpected keyword argument 'safe_serialization'	0	406	May 17, 2024
Training Longformer works on jupyter notebook but not with .py	0	91	May 17, 2024
Mixtral-8x7B trained with `--load_in_4bit`, showed as Tensor type F32	3	159	May 17, 2024