CUDA Out Of Memory when training a DETR Object detection model with compute_metrics
|
|
3
|
116
|
July 17, 2025
|
Function/tool calling using Transformer models
|
|
5
|
988
|
July 17, 2025
|
Pytorch Language Modeling Example for Seq2Seq Models
|
|
0
|
12
|
July 16, 2025
|
My Spaces stay stuck on "Starting" despite a Pro subscription and GPU hosting
|
|
2
|
46
|
July 14, 2025
|
Fine-tune for function call on Meta-Llama-3.1-8B-Instruct
|
|
6
|
115
|
July 15, 2025
|
Object detection resolution fine-tuning
|
|
1
|
37
|
July 14, 2025
|
deBERTa v3 implementation in HuggingFace (with RTD training)
|
|
5
|
340
|
July 12, 2025
|
Make repo-consistency fails even for intentional tweaks in a copied model
|
|
9
|
42
|
July 11, 2025
|
When fine-tuning an object detection model, which parameters do we update?
|
|
1
|
36
|
July 10, 2025
|
Understanding T5 with custom embedding
|
|
3
|
26
|
July 9, 2025
|
How can I make the LLM to forget the knowledge?
|
|
3
|
68
|
July 9, 2025
|
Can I use a custom attention layer while still leveraging a pre-trained BERT model?
|
|
4
|
27
|
July 8, 2025
|
OneFormer ID/Labels for FineTuning
|
|
13
|
38
|
July 8, 2025
|
How to save the best trial's model using `trainer.hyperparameter_search`
|
|
6
|
2715
|
July 8, 2025
|
This is my fine tuning trocr code why is it not working anyone please help me I really need your help I am working on new language
|
|
9
|
69
|
July 8, 2025
|
Accuracy decreasing after saving/reloading my model
|
|
3
|
10
|
July 8, 2025
|
Java version of transformers library?
|
|
2
|
179
|
July 7, 2025
|
[Trainer] Evaluation loss changes with batch size
|
|
2
|
32
|
July 7, 2025
|
What bounding boxes format does Grounding DINO use?
|
|
1
|
19
|
July 5, 2025
|
Latest Docker update to transformers-pytorch-gpu causing failure
|
|
1
|
50
|
July 4, 2025
|
Is the Trainer slower than customised loops?
|
|
3
|
51
|
July 4, 2025
|
Donut Pre-Train on new Language
|
|
4
|
2370
|
July 1, 2025
|
Segfault during PyTorch + Transformers inference on Apple Silicon M4 (libomp.dylib crash on LayerNorm)
|
|
5
|
84
|
June 30, 2025
|
Use ReduceLROnPlateau with deepspeed
|
|
4
|
26
|
June 26, 2025
|
Verification of script to train a LLM on supervised data
|
|
5
|
31
|
June 25, 2025
|
How to use lr_scheduler_kwargs param in TrainingArguments?
|
|
6
|
73
|
June 25, 2025
|
CUDA out of memory when using Trainer with compute_metrics
|
|
25
|
46684
|
June 25, 2025
|
How to decode CSM tokens into audio tensors for streaming
|
|
1
|
31
|
June 23, 2025
|
API hasn't been working for 2 days
|
|
1
|
27
|
June 21, 2025
|
Whisper warning about not predicting end of a timestamp
|
|
1
|
1573
|
June 20, 2025
|