🤗Transformers

Topic	Category	Replies	Views	Activity
Source code of transformers models	🤗Transformers	2	2006	February 14, 2024
Using Quantization with fp16/bf16 Trainer flag	🤗Transformers	0	722	February 14, 2024
How to run Trainer-based script in Colab?	🤗Transformers	3	377	February 14, 2024
M2M model finetuning on multiple language pairs	🤗Transformers	4	1479	August 17, 2022
How retrieval loss is calculated in RAG model?	🤗Transformers	0	360	February 14, 2024
KV Cache size shrinks during Inference instead of growing. Can someone explain why?	🤗Transformers	0	421	February 14, 2024
"too many values to unpack (expected 4)" but pixel_values dimension is correct	🤗Transformers	2	441	February 14, 2024
Exact difference between Transformers' and Accelerate's DeepSpeed integrations?	DeepSpeed	5	835	February 13, 2024
ValueError: could not broadcast input array from shape (30,512,32128) into shape (30,512)	🤗Transformers	2	2467	February 13, 2024
Load checkpoint from Trainer	🤗Transformers	0	589	February 13, 2024
Inference just halts, no error, how to troubleshoot	🤗Transformers	7	1231	February 13, 2024
Enabling load_in_8bit makes inference much slower	🤗Transformers	3	1809	February 13, 2024
Tutorial: Fine-tuning with custom datasets – sentiment, NER, and question answering	🤗Transformers	19	12891	February 12, 2024
Clarifying AutoModelForMultipleChoice	🤗Transformers	0	142	February 12, 2024
Conda install -c huggingface or conda-forge?	🤗Transformers	4	3107	February 11, 2024
MultiModel Inferencing	🤗Transformers	0	109	February 11, 2024
Possible to use transformers with GGML-style quantization?	🤗Transformers	0	117	February 10, 2024
Time series for prediction: generate RuntimeError	🤗Transformers	1	375	February 10, 2024
Why run_glue.py does change the Tiny BERT Model?	🤗Transformers	0	132	February 10, 2024
Understanding how changing bnb_4bit_compute_dtype affects outputs	🤗Transformers	1	4793	February 10, 2024
Whisperx training hf_tokenizer	🤗Transformers	0	246	February 9, 2024
Hyper Parameter Optimization with Optuna backend timeout when using Pytorch DDP	🤗Transformers	0	577	February 9, 2024
How to apply SpecAugment to a Whisper?	🤗Transformers	3	1419	February 9, 2024
AttributeError: 'ChatPromptValue' object has no attribute 'size'	🤗Transformers	9	2353	February 9, 2024
How can I disable log history from getting printed every logging_steps	🤗Transformers	0	627	February 8, 2024
RL for LLM keeps outputting NaN	🤗Transformers	0	205	February 8, 2024
T5 Model Evaluation on Generation	🤗Transformers	0	429	February 8, 2024
ModuleNotFoundError: No module named 'open_flamingo'	🤗Transformers	1	403	February 8, 2024
Setting up a timeseries transformer	🤗Transformers	3	1263	February 8, 2024
Training loss is zero from the first step and model generation is empty after training?	🤗Transformers	0	365	February 8, 2024