Topic | Replies | Views | Activity
Text classification training on long text | 3 | 5111 | June 18, 2024
Accelerate - WeightedRandomSampler Dataloader | 1 | 280 | June 18, 2024
Accelerate socket timeout on multi-node LLM training | 0 | 342 | June 14, 2024
How to ensure my custom Trainer is using my custom TrainerState and TrainerControl? | 1 | 364 | June 14, 2024
FSDP with Trainer class: AlgorithmError: ValueError('Cannot flatten integer dtype tensors'), exit code: 1 | 0 | 581 | June 13, 2024
Fine-tuning `mistral-7B` for classification with QLoRA using peft | 2 | 483 | June 13, 2024
Using LLM cache | 0 | 107 | June 12, 2024
Finetuning with SFTtrainer | 1 | 446 | June 12, 2024
Which weights change when fine-tunning a pre-trained model? | 3 | 852 | June 11, 2024
Creating a docvqa dataset - gt_parses | 1 | 559 | June 11, 2024
Deployment of finetuned Mistral for Classification and Generation | 4 | 349 | June 10, 2024
How to fix RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn | 1 | 1777 | June 10, 2024
Error Training Vision Encoder Decoder for Image Captioning | 8 | 2939 | June 8, 2024
Inference after QLoRA fine-tuning | 8 | 6352 | June 7, 2024
SAMModel output size different to the input | 2 | 238 | June 6, 2024
MaskFormer Jagged Edges Issues of output masks | 1 | 230 | June 5, 2024
4:3 to 16:9 Infill | 0 | 111 | June 4, 2024
Deploying Whisper Based Live Transcription for 1000 Concurrent users | 0 | 378 | June 1, 2024
Running multiple instances on GPU | 0 | 185 | June 1, 2024
Quantization not yet implemented | 0 | 98 | June 1, 2024
Fine-tuning Mistral/Mixtral for sequence classification on long context | 2 | 2621 | May 29, 2024
Need help about the using of transformers GPT2 for training | 0 | 112 | May 29, 2024
How to use GPT4 with trl PPO script | 0 | 168 | May 28, 2024
Way to fine tune pre trained model & get the embeddings | 2 | 3612 | May 28, 2024
Security of the LLM applications | 1 | 167 | May 26, 2024
Forward method inconsistent for time series transformer | 0 | 95 | May 26, 2024
Evaluating RAG only with open-source | 1 | 627 | May 24, 2024
Accessing certain hidden layer layer outputs | 0 | 155 | May 22, 2024
VisEncoderDecoderModel generate text incomplete when predict image with long text label | 0 | 89 | May 21, 2024
Inference time in TGI quantization | 0 | 165 | May 21, 2024