| Topic | Replies | Views | Activity |
|---|---|---|---|
| PerceiverModel training logits does not require grad and does not have a grad_fn | 0 | 235 | December 5, 2023 |
| PEFT fine-tuning as slow as full model fine-tuning | 3 | 1615 | December 6, 2023 |
| What can you fine tune with 2x A6000s? | 1 | 356 | December 5, 2023 |
| AutoModelForCausalLM.from_pretrained refuses to load safetensors weights | 0 | 962 | December 5, 2023 |
| Running low on GPU memory on a cluster with ESM2 lowest config | 2 | 397 | December 5, 2023 |
| Similarity search based on multiple text attributes | 0 | 408 | December 4, 2023 |
| Predictions format sent to compute_metrics depends on model used | 0 | 215 | December 4, 2023 |
| LLM Inference hosting issue | 2 | 393 | December 4, 2023 |
| How to get vocabulary embedding matrix from an LLM? | 1 | 398 | December 1, 2023 |
| How to implement LoRA with PyTorch? | 0 | 1195 | November 30, 2023 |
| (IMPOSSIBLE) LORA Finetuning with BASIC custom dataset | 2 | 485 | November 30, 2023 |
| [LMM Fine Tuning] Supervised Fine Tuning Trainer (SFTTrainer) vs transformers Trainer | 1 | 1688 | November 29, 2023 |
| Advice Needed for Training an Imbalanced Dataset AI Model: lr, Epochs, and neuronal Architecture | 3 | 573 | November 27, 2023 |
| Running SDXL diffusers in a container on Python running Ubuntu 22.04, system RAM not being released | 0 | 987 | November 27, 2023 |
| Refine BERT to pay more attention to key words | 0 | 322 | November 24, 2023 |
| BERT Text classification | 7 | 565 | November 24, 2023 |
| You should probably TRAIN this model on a down-stream task with BertForQuestionAnswering | 3 | 8164 | November 21, 2023 |
| Converting HFT Checkpoint to CoreML | 0 | 216 | November 20, 2023 |
| A standard way to have the `generate` method of the `GenerateMixin` only output the generated tokens | 0 | 634 | November 19, 2023 |
| 4-bit quantization | 0 | 473 | November 18, 2023 |
| New Model Architecture | 0 | 196 | November 16, 2023 |
| Need help with making a Nepali-to-English Translator | 0 | 353 | November 16, 2023 |
| How to fine-tune an LLM to support function calling | 0 | 883 | November 15, 2023 |
| Vector DB - Exhaustive search in RAG | 0 | 334 | November 14, 2023 |
| Loading Fine tuned whisper model (LOCAL) | 0 | 761 | November 13, 2023 |
| Setting local path for Dataset: Fine-tuning Whisper model | 2 | 1024 | November 13, 2023 |
| Continue pre-training BERT | 5 | 2508 | November 13, 2023 |
| Batch (List of Prompts) for Inference Client feature | 0 | 237 | November 9, 2023 |
| How to train translation model WITHOUT VALIDATION data? | 0 | 350 | November 8, 2023 |
| Loading a LoRA adapter trained on quantized model on a non-quantized model | 0 | 1393 | November 7, 2023 |