Intermediate

Topic	Replies	Views	Activity
Text classification training on long text	3	5111	June 18, 2024
Accelerate - WeightedRandomSampler Dataloader	1	280	June 18, 2024
Accelerate socket timeout on multi-node LLM training	0	342	June 14, 2024
How to ensure my custom Trainer is using my custom TrainerState and TrainerControl?	1	364	June 14, 2024
FSDP with Trainer class: AlgorithmError: ValueError('Cannot flatten integer dtype tensors'), exit code: 1	0	581	June 13, 2024
Fine-tuning `mistral-7B` for classification with QLoRA using peft	2	483	June 13, 2024
Using LLM cache	0	107	June 12, 2024
Finetuning with SFTtrainer	1	446	June 12, 2024
Which weights change when fine-tuning a pre-trained model?	3	852	June 11, 2024
Creating a docvqa dataset - gt_parses	1	559	June 11, 2024
Deployment of finetuned Mistral for Classification and Generation	4	349	June 10, 2024
How to fix RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn	1	1777	June 10, 2024
Error Training Vision Encoder Decoder for Image Captioning	8	2939	June 8, 2024
Inference after QLoRA fine-tuning	8	6352	June 7, 2024
SAMModel output size different to the input	2	238	June 6, 2024
MaskFormer Jagged Edges Issues of output masks	1	230	June 5, 2024
4:3 to 16:9 Infill	0	111	June 4, 2024
Deploying Whisper Based Live Transcription for 1000 Concurrent users	0	378	June 1, 2024
Running multiple instances on GPU	0	185	June 1, 2024
Quantization not yet implemented	0	98	June 1, 2024
Fine-tuning Mistral/Mixtral for sequence classification on long context	2	2621	May 29, 2024
Need help about the using of transformers GPT2 for training	0	112	May 29, 2024
How to use GPT4 with trl PPO script	0	168	May 28, 2024
Way to fine tune pre trained model & get the embeddings	2	3612	May 28, 2024
Security of the LLM applications	1	167	May 26, 2024
Forward method inconsistent for time series transformer	0	95	May 26, 2024
Evaluating RAG only with open-source	1	627	May 24, 2024
Accessing certain hidden layer outputs	0	155	May 22, 2024
VisEncoderDecoderModel generate text incomplete when predict image with long text label	0	89	May 21, 2024
Inference time in TGI quantization	0	165	May 21, 2024