| Topic | Replies | Views | Activity |
|---|---|---|---|
| How to combine LoRA and gradient_checkpointing in Whisper? | 1 | 4911 | August 17, 2023 |
| DataCollatorWithPadding - Minimum Pad Argument | 1 | 220 | August 17, 2023 |
| LLaMA2 7B uses > 128 GB of GPU RAM and fails with OOM or Loss Scale Minimum | 3 | 5595 | August 17, 2023 |
| Handling OOMs nicely when enabling batching in general | 1 | 255 | August 17, 2023 |
| The LayoutLM Installation Fails | 2 | 1568 | August 17, 2023 |
| Profiling models on execution time and memory | 1 | 1183 | August 17, 2023 |
| Any advice on LLM inference over a large dataset? | 0 | 786 | August 16, 2023 |
| Framework for Continual Pretraining | 0 | 1290 | August 16, 2023 |
| Is it possible to use MLM for pretraining a causal LM such as MPT or Falcon? If so, has anyone tried it? Are there any relevant code bases I can use? | 0 | 214 | August 16, 2023 |
| Is there an alternative to translate audio from any language to French using Whisper? | 0 | 217 | August 16, 2023 |
| Push_to_hub failing due to unknown error | 0 | 375 | August 16, 2023 |
| Reusable custom heads, wrappers, and `from_pretrained`? | 0 | 277 | August 16, 2023 |
| How to run trainer.py with megatron_lm_plugin | 0 | 254 | August 15, 2023 |
| Zero Shot Classification using multiGPU | 1 | 645 | August 14, 2023 |
| Very Slow Fine Tuning Performance for Speech? | 3 | 665 | August 14, 2023 |
| How does from_pretrained work with ZeRO=3? | 0 | 682 | August 14, 2023 |
| What's the difference between QLoRA and autoGPTQ? | 0 | 536 | August 13, 2023 |
| Unauthorized access to file Transformers.js | 0 | 666 | August 13, 2023 |
| Machine translation to SQL-like output | 0 | 403 | August 13, 2023 |
| How to correctly evaluate a Masked Language Model? | 3 | 4488 | August 11, 2023 |
| Get the positive score in a classification task by using a generative model | 0 | 408 | August 12, 2023 |
| ZeRO3 with int8 training | 0 | 889 | August 11, 2023 |
| How to generate and output one word at a time instead of the whole answer at once, which takes a long time | 0 | 454 | August 11, 2023 |
| Finetuned EncoderDecoder (RoBERTa): How to score decoder output confidence/context? | 0 | 482 | April 11, 2022 |
| Question answering task with Falcon model fails with "TypeError: forward() got an unexpected keyword argument 'token_type_ids'" | 0 | 1958 | August 10, 2023 |
| RuntimeError: grad can be implicitly created only for scalar outputs | 0 | 1062 | August 10, 2023 |
| Gradient_checkpointing control | 0 | 1140 | August 10, 2023 |
| KeyError Convert SWIN to PyTorch | 0 | 210 | August 9, 2023 |
| How to log a new tensor variable in TrainingArguments | 0 | 191 | August 9, 2023 |
| Different generations during test time and validation time | 0 | 166 | August 9, 2023 |