Source code of transformers models | 2 | 2006 | February 14, 2024
Using Quantization with fp16/bf16 Trainer flag | 0 | 722 | February 14, 2024
How to run Trainer-based script in Colab? | 3 | 377 | February 14, 2024
M2M model finetuning on multiple language pairs | 4 | 1479 | August 17, 2022
How retrieval loss is calculated in RAG model? | 0 | 360 | February 14, 2024
KV Cache size shrinks during Inference instead of growing. Can someone explain why? | 0 | 421 | February 14, 2024
"too many values to unpack (expected 4)" but pixel_values dimension is correct | 2 | 441 | February 14, 2024
Exact difference between Transformers' and Accelerate's DeepSpeed integrations? | 5 | 835 | February 13, 2024
ValueError: could not broadcast input array from shape (30,512,32128) into shape (30,512) | 2 | 2467 | February 13, 2024
Load checkpoint from Trainer | 0 | 589 | February 13, 2024
Inference just halts, no error, how to troubleshoot | 7 | 1231 | February 13, 2024
Enabling load_in_8bit makes inference much slower | 3 | 1809 | February 13, 2024
Tutorial: Fine-tuning with custom datasets - sentiment, NER, and question answering | 19 | 12891 | February 12, 2024
Clarifying AutoModelForMultipleChoice | 0 | 142 | February 12, 2024
Conda install -c huggingface or conda-forge? | 4 | 3107 | February 11, 2024
MultiModel Inferencing | 0 | 109 | February 11, 2024
Possible to use transformers with GGML-style quantization? | 0 | 117 | February 10, 2024
Time series for prediction: generate RuntimeError | 1 | 375 | February 10, 2024
Why run_glue.py does change the Tiny BERT Model? | 0 | 132 | February 10, 2024
Understanding how changing bnb_4bit_compute_dtype affects outputs | 1 | 4793 | February 10, 2024
Whisperx training hf_tokenizer | 0 | 246 | February 9, 2024
Hyper Parameter Optimization with Optuna backend timeout when using Pytorch DDP | 0 | 577 | February 9, 2024
How to apply SpecAugment to a Whisper? | 3 | 1419 | February 9, 2024
AttributeError: 'ChatPromptValue' object has no attribute 'size' | 9 | 2353 | February 9, 2024
How can I disable log history from getting printed every logging_steps | 0 | 627 | February 8, 2024
RL for LLM keeps outputting NaN | 0 | 205 | February 8, 2024
T5 Model Evaluation on Generation | 0 | 429 | February 8, 2024
ModuleNotFoundError: No module named 'open_flamingo' | 1 | 403 | February 8, 2024
Setting up a timeseries transformer | 3 | 1263 | February 8, 2024
Training loss is zero from the first step and model generation is empty after training? | 0 | 365 | February 8, 2024