| Topic | Replies | Views | Activity |
|---|---|---|---|
| Fine-tuning DistilGPT2 on custom data, training Accuracy 100%, output is garbage | 4 | 2151 | January 31, 2024 |
| Getting this 'AttributeError: 'list' object has no attribute 'get'' error when trying to fine tune wav2vec2 model | 0 | 794 | January 31, 2024 |
| GPT2 returns sequence of <\|endoftext\|> after finetuning | 2 | 254 | January 31, 2024 |
| Customizing generation config for Trainer's training loop evaluation | 1 | 2130 | January 31, 2024 |
| Early Stopping with GPT from AutoModelForCausalLM | 1 | 718 | January 30, 2024 |
| Finetuning llama-2 for classification | 2 | 1934 | January 29, 2024 |
| How can I use the ONNX model? | 2 | 1695 | January 29, 2024 |
| Best Practices for Optimizing Model Training | 0 | 310 | January 29, 2024 |
| While fine tuning the quantized model of sarvamai/OpenHathi-7B-Hi-v0.1-Base model getting memory error | 0 | 230 | January 29, 2024 |
| Summarization for survey open end questions | 0 | 227 | January 29, 2024 |
| Memory consumption qlora with gradient checkpointing | 0 | 435 | January 28, 2024 |
| Fine tuning bert with tensorflow huggingface transformers | 1 | 201 | January 28, 2024 |
| PPOTrainer: Output generated during training different than that during inference | 1 | 435 | January 27, 2024 |
| Dataset.transform() hangs indefinitely while finetuning the stable diffusion XL | 3 | 8099 | January 27, 2024 |
| TrOCR training error | 0 | 248 | January 26, 2024 |
| More processes than GPUs with DeepSpeed launcher | 0 | 232 | January 25, 2024 |
| Saving unique weights while training on multiple GPU - Trainer | 0 | 260 | January 25, 2024 |
| Regarding the problem of starcoderbase training, the reasoning becomes slower after training | 1 | 288 | January 25, 2024 |
| How to reset parameters from AutoModelForSequenceClassification? | 1 | 477 | January 25, 2024 |
| Transformer is unable to download into a space | 1 | 812 | January 25, 2024 |
| How does the Transformer handle different batch sizes? | 3 | 3685 | January 24, 2024 |
| Saving fine-tuned MT5ForSequenceClassification | 5 | 391 | January 24, 2024 |
| Error: RuntimeError: Could not infer dtype of DatasetInfo | 0 | 557 | January 24, 2024 |
| Fine tuning for Llama2 based model with LoftQ quantization | 7 | 2382 | January 24, 2024 |
| Deleting tokens from a Seq2Seq model | 0 | 188 | January 24, 2024 |
| FSDP training not saving the best checkpoint and load from checkpoint fails | 0 | 816 | January 23, 2024 |
| T5: classification using text2text? | 18 | 21299 | January 23, 2024 |
| Device_map="auto" in MIG Instance | 0 | 555 | January 23, 2024 |
| Training of GPT hangs during Checkpoint stage | 0 | 138 | January 23, 2024 |
| How to convert zephyr-7b-beta / pytorch_model to .onnx format? | 0 | 127 | January 23, 2024 |