Fine-tuning Whisper on sound-event-detection dataset | 0 | 62 | December 20, 2024
The model trained in PyTorch produces inconsistent predictions for the same image when processed individually versus in a batch | 4 | 33 | December 20, 2024
How to Ensure Each Process Reads Its Own Dataset and Trains Correctly When Using Trainer? | 0 | 12 | December 20, 2024
How to count input tokens in vision model? | 4 | 511 | December 19, 2024
Could not load model meta-llama/Llama-2-7b-chat-hf with any of the following classes | 22 | 48702 | December 19, 2024
Ensure the sentence is complete during generation | 5 | 6948 | December 19, 2024
Unbelievable Error: Help ME! | 6 | 533 | December 18, 2024
Most efficient way to move padding tokens to the right side of a tensor? | 2 | 110 | December 18, 2024
How to beat LSTM in time series regression preferably with transformer? | 0 | 65 | December 17, 2024
Transformers.js Chat Completion on SmolLM2 using | 0 | 55 | December 17, 2024
Using IterableDataset with Trainer - `IterableDataset` has no len() | 7 | 13963 | December 17, 2024
Qwen based AI assistant randomly having absolute, utter, complete 'mental breakdowns'?? (Inference API) | 2 | 94 | December 17, 2024
KV caching for varying length texts | 1 | 149 | December 16, 2024
Positional encoding | 3 | 110 | December 16, 2024
Repeated training runs out of GPU memory | 3 | 209 | December 16, 2024
Index out of range in transformer summarization | 2 | 71 | December 16, 2024
What loss type should be used to train vision-llm with auto regression style? | 1 | 64 | December 15, 2024
CUDA out of memory when using Trainer with compute_metrics | 24 | 44473 | December 13, 2024
Interrupting run to trigger checkpoint? | 0 | 6 | December 13, 2024
Not able to access after login through Hugging Face Hub in Google Colab | 1 | 115 | December 13, 2024
Solution for Fine Tuning the BLIP Model | 0 | 71 | December 13, 2024
In SpeechSeq2Seq models, is it possible to pass decoder_input_ids for each sample during the training time using Hugging Face Trainer? | 0 | 24 | December 12, 2024
How to Load Llama-3.3-70B-Instruct Model in Float8 Precision? | 1 | 247 | December 11, 2024
Llama 3.1 torch.compile & static cache | 2 | 221 | December 9, 2024
Padding side in instruction fine-tuning using SFTT | 1 | 1103 | December 9, 2024
Transformers Pretrained model import | 3 | 147 | December 9, 2024
CUDA error: device-side assert triggered on device_map="auto" | 4 | 1592 | December 8, 2024
Pretrain model not accepting optimizer | 30 | 4508 | December 7, 2024
How to use I-JEPA for image classification | 4 | 1844 | December 6, 2024
Albert pre-train convergence problem | 1 | 631 | December 6, 2024