| Topic | Replies | Views | Activity |
| --- | --- | --- | --- |
| BartForConditionalGeneration: loss function diverges instead of converging | 0 | 118 | May 12, 2024 |
| Beam search error | 2 | 553 | May 12, 2024 |
| An error occurred: You have to specify input_ids | 0 | 272 | May 11, 2024 |
| How to change max_length of a fine tuned model | 4 | 11249 | May 11, 2024 |
| Phi3 Mini 4k Instruct Flash Attention not found | 4 | 4990 | May 11, 2024 |
| LayoutLMv3 inference - bboxes are incorrect | 0 | 111 | May 10, 2024 |
| How could I define a LogitsProcessorList with multi parameters? | 0 | 88 | May 10, 2024 |
| Argmax of Generation Probabilities doesn't match with Generated Sequence Tokens | 2 | 944 | May 10, 2024 |
| How could I fusion the logits from different models and then convert it to Token? | 0 | 97 | May 10, 2024 |
| Finetune_rag.py won't save checkpoints | 0 | 114 | May 9, 2024 |
| CLIP: The `backend_tokenizer` provided does not match the expected format | 3 | 231 | May 9, 2024 |
| What does the `use_cache` in `generate` actually do? | 1 | 2269 | May 9, 2024 |
| AWD-LSTM beats finetuned BERT as train ds decreases?! :person_shrugging:t4: | 2 | 126 | May 9, 2024 |
| How to count how many forward passes were done in model.generate when using assistant_model | 0 | 84 | May 9, 2024 |
| How to pass multiple datasets into Trainer for Knowledge distillation in NMT | 3 | 334 | May 9, 2024 |
| Trainer doesn't show the loss at each step | 20 | 35161 | May 9, 2024 |
| Lazy model initialization | 3 | 927 | May 8, 2024 |
| Getting zero gradients for image patch embeddings when implementing GRADCAM for ViLT | 0 | 91 | May 8, 2024 |
| Input to reshape is a tensor with 3763200 values, but the requested shape requires a multiple of 20384 | 0 | 86 | May 8, 2024 |
| Having multiple candidate labels in a zero shot classification model | 3 | 573 | May 8, 2024 |
| Why eval_accumulation_steps takes so much memory | 5 | 1430 | May 8, 2024 |
| Add metrics to object detection example | 12 | 3838 | May 8, 2024 |
| Runtime error: NotImplementedError: Cannot copy out of meta tensor; no data! | 0 | 1957 | May 7, 2024 |
| Llama-2 significantly slower than other models on huggingface | 2 | 974 | May 7, 2024 |
| Retraining the SAM model on the color image database in order to segment multiple classes in the image… | 0 | 346 | May 7, 2024 |
| Cuda Out of Memory when fine tuning llm model | 3 | 1149 | May 7, 2024 |
| Lower Memory Usage for TF GPT-J | 1 | 809 | May 7, 2024 |
| How to stream responses from AutoModelforCausalLM? | 0 | 435 | May 7, 2024 |
| Fine tuning T5 Encoder and T5 Decoder separately | 1 | 722 | May 6, 2024 |
| AttributeError: module 'torch' has no attribute 'chalf' | 8 | 996 | May 6, 2024 |