Topic | Replies | Views | Activity
Wav2vec2 model tuning on Mac Silicon | 2 | 626 | December 27, 2023
Discrepancy in Model Inference: Local vs. Hugging Face Model Hub | 1 | 819 | December 27, 2023
Shape mismatch between labels and logits | 1 | 1702 | December 27, 2023
Warning when using ESM pre-trained model | 2 | 1710 | December 26, 2023
Set_input_embeddings() values not being saved with save_pretrained() | 3 | 450 | December 26, 2023
Poor Real-Time Performance of Whisper Models Fine-Tuned on Synthetic Data #198 | 0 | 141 | December 25, 2023
The best way to modify a transformers model with minimal modifications | 0 | 680 | December 25, 2023
T5/mT5 model distillation | 1 | 998 | December 25, 2023
Deepspeed script launcher vs accelerate script launcher for TRL | 0 | 370 | December 25, 2023
Best practice to run DeepSpeed | 2 | 1566 | December 25, 2023
Progress bar for HF pipelines | 9 | 18799 | December 24, 2023
How to make Data Loader for "Multi-Head" Regression which can be used with Trainer | 0 | 296 | December 24, 2023
Depth estimation on MPS device? | 0 | 240 | December 24, 2023
Training On Mac M3 Max.. blazing fast but | 3 | 8169 | December 24, 2023
Using past_key_values to provide context to decoder results in same output | 0 | 705 | December 23, 2023
How to apply decoding method and penalty | 1 | 241 | December 23, 2023
Runtime Error: Trainer API Dataloader Using CPU but Expecting CUDA | 2 | 1816 | December 22, 2023
Unable to update the weights / learn anything | 2 | 605 | December 22, 2023
VisualBert model producing RuntimeError | 7 | 459 | December 22, 2023
Fine-Tune dit-large-finetuned-rvlcdip | 3 | 244 | December 21, 2023
Functorch with transformers | 2 | 267 | December 21, 2023
Loading Vision Transformer Model After Changing Its Classifier Head | 2 | 955 | December 21, 2023
How to generate without decoding? | 1 | 372 | December 13, 2023
Image size understanding in DinoV2 | 2 | 4231 | December 21, 2023
Llama2 fine-tunning with PEFT QLora and testing the model | 13 | 15394 | December 21, 2023
A fine tuned Llama2-chat model can't answer questions from the dataset | 0 | 309 | December 20, 2023
Training Arguments to do pure bf16 training? | 0 | 2073 | December 20, 2023
Using text-generation pipeline for Llama-2-7b-chat-hf setting high T doesn't change output | 1 | 3662 | December 20, 2023
Gradient clipping on Transformers | 0 | 265 | December 20, 2023
Avoid loading checkpoint shards for each inference | 2 | 2452 | December 19, 2023