Processing the [-100] Mask in SFT | 2 | 1112 | April 9, 2024
Server-side Audio Processing in Node.js | 0 | 106 | April 8, 2024
Subject: Issues with Custom Model Saving Behavior Using Trainer Class in LVLM Training | 0 | 118 | April 8, 2024
Speeding up the inference for marian MT | 4 | 2736 | April 8, 2024
Loading only pre-trained backbone for Mask2Former | 0 | 204 | April 8, 2024
Hardware Requirements \| Fine tuning Pegasus Large | 1 | 978 | April 8, 2024
Slower train with collator for completion only | 1 | 1221 | April 7, 2024
How does Gemini 1.5 achieve 10M context window? | 0 | 311 | April 7, 2024
How to run hf MoE series model in an expert parallel manner? | 0 | 325 | April 7, 2024
What should I do if I want to use model from DeepSpeed | 5 | 1628 | April 6, 2024
Setting up separate device for validation in Trainer? | 0 | 99 | April 6, 2024
Langchain & SentenceTransformerEmbeddings error while passing the embeded function to chromadb | 0 | 767 | April 5, 2024
Stopping criteria for batch | 7 | 4147 | April 5, 2024
Finetuning DPR on Custom Dataset | 4 | 2874 | April 5, 2024
Problem with transformer optimizer assertion error | 1 | 408 | April 4, 2024
Always getting RuntimeError: CUDA out of memory with Trainer | 10 | 6884 | April 4, 2024
[Maybe Bug] When using EarlyStopping Callbacks with Seq2SeqTraininer, training didn't stop | 3 | 1518 | April 4, 2024
Problem with EarlyStoppingCallback | 13 | 10532 | April 4, 2024
FREQUENT LOSS SPIKING in CONTINUE TRAINING LLM | 2 | 1045 | April 4, 2024
What is the data file format of `run_ner.py`? | 2 | 319 | April 4, 2024
Custom loss weight for train a different weight for validation | 0 | 176 | April 4, 2024
Unable to load a model with added special token | 1 | 563 | April 3, 2024
[Request] Provide better examples for each model and task existing in the library [/Request] | 0 | 115 | April 3, 2024
Transformers crashes when using mlflow-skinny | 1 | 115 | April 3, 2024
How to compare the meaning of documents | 2 | 974 | April 3, 2024
CUDA not working with asr pipeline | 0 | 162 | April 3, 2024
Name is not correct | 0 | 105 | April 3, 2024
How to train an EncoderDecoderModel with different pretrained encoder and decoder? | 2 | 417 | April 2, 2024
Get scores from Whisper using ASR pipeline | 2 | 3664 | April 2, 2024
TrOCR expects square images though lines are rectangle images | 0 | 113 | April 2, 2024