Fine-tuning a sentence transformer model with my own data
|
|
2
|
3027
|
April 17, 2024
|
DeepSpeed giving Assertion Error
|
|
2
|
2962
|
July 22, 2023
|
What is the limit of grad accumulation?
|
|
2
|
2906
|
May 4, 2021
|
Difference between GAT and Transformer?
|
|
0
|
886
|
April 7, 2022
|
What is the difference between triplet loss and contrastive loss?
|
|
1
|
1975
|
June 18, 2022
|
Loading models sometimes maxes DISK%, then crashes
|
|
2
|
2867
|
October 8, 2020
|
Why is it so slow to access data through iteration with a Hugging Face dataset?
|
|
2
|
2841
|
July 21, 2022
|
How to fine-tune an LLM to support function calling
|
|
0
|
874
|
November 15, 2023
|
How to convert text to word embeddings using BERT's pretrained model 'faster'?
|
|
1
|
3466
|
January 4, 2021
|
Summariser pipeline giving different results on same model with fixed seed
|
|
0
|
870
|
August 17, 2022
|
Run training script in DDP using GLOO
|
|
1
|
1941
|
August 17, 2022
|
Fine-tuning a QA model on the SQuAD 2 dataset with more than one answer
|
|
2
|
878
|
November 6, 2024
|
Using GPT-Neo-125M with ONNX
|
|
3
|
1348
|
July 5, 2022
|
Model validation failed - Target is multiclass but average='binary'
|
|
2
|
2705
|
January 21, 2024
|
Specify attention masks for some heads in multi-head attention
|
|
3
|
2335
|
November 17, 2020
|
BPEDecoder no spaces after special tokens
|
|
4
|
2032
|
April 19, 2023
|
Perplexity from fine-tuned GPT2LMHeadModel with and without lm_head as a parameter
|
|
4
|
2030
|
May 10, 2022
|
Fine-tuning Mistral/Mixtral for sequence classification on long context
|
|
2
|
2606
|
May 29, 2024
|
Convert models to Longformer
|
|
3
|
2189
|
February 1, 2021
|
FineTune LLM for regex
|
|
3
|
2139
|
April 21, 2024
|
Load Custom Model
|
|
8
|
1424
|
November 21, 2022
|
Image similarity
|
|
2
|
2435
|
March 31, 2023
|
Segmentation fault (Core dumped) with datasets
|
|
2
|
2417
|
July 9, 2021
|
Deploying Seq2Seq using ONNX on GPU
|
|
0
|
743
|
March 24, 2022
|
ValueError: You should supply an encoding or a list of encodings to this method that includes input_ids, but you provided ['tokens', 'id', 'space_after', 'ner_tags', 'ner_ids']
|
|
2
|
2391
|
April 21, 2023
|
Add_faiss_index with multiple columns
|
|
0
|
731
|
August 19, 2023
|
Converting GPT2 to JavaScript?
|
|
1
|
1630
|
April 17, 2021
|
Combine multiple LoRAs for group photo?
|
|
1
|
515
|
January 3, 2025
|
How to exclude layers in weight decay
|
|
1
|
2874
|
October 18, 2021
|
Accelerated Inference API not taking parameters?
|
|
5
|
1633
|
October 26, 2022
|
Push model to Hugging Face Hub without Trainer
|
|
7
|
1406
|
May 14, 2024
|
Linear learning rate despite lr_scheduler_type="polynomial"
|
|
4
|
1763
|
September 2, 2021
|
TGI and turn off Flash Attention v2
|
|
4
|
1748
|
August 23, 2024
|
DPO training data format
|
|
7
|
1375
|
September 23, 2024
|
Using TRL on TPU
|
|
1
|
155
|
February 11, 2025
|
Batched BertForMaskedLM inference loss issue
|
|
0
|
688
|
February 23, 2022
|
Preprocessing for T5 Denoising
|
|
1
|
2713
|
May 20, 2021
|
GPTQ+PEFT model running very slowly at inference
|
|
4
|
1686
|
October 24, 2023
|
Open-LLM-Leaderboard for dummies
|
|
3
|
327
|
December 30, 2024
|
How to generate on multiple GPUs
|
|
3
|
1836
|
August 30, 2022
|
Multinode DeepSpeed T5 Experiment Issues with Hf-Trainer
|
|
2
|
1159
|
August 3, 2022
|
AttributeError: LayoutLMTokenClassification object has no attribute 'config'
|
|
3
|
1775
|
August 13, 2022
|
How to concatenate the word embedding for special tokens and words
|
|
1
|
2510
|
June 13, 2021
|
Properly loading a fine tuned model from directory
|
|
2
|
2040
|
August 25, 2020
|
How to continue to pre-train gpt2?
|
|
2
|
2031
|
July 1, 2023
|
Cache Proxy - Like with Docker Registries
|
|
1
|
444
|
October 21, 2024
|
DeBERTaV3 ONNX conversion error
|
|
2
|
2028
|
July 25, 2022
|
I fine-tuned a LLaMA 7B on a custom dataset; the response from inference generation starts well, then words start to run together without spaces
|
|
4
|
1544
|
July 19, 2023
|
What is the official way to run a wandb sweep with hugging face (HF) transformers?
|
|
2
|
1995
|
July 25, 2023
|
Fine-tuning Llama2-70B using 4-bit quantization on multi-GPU using DeepSpeed ZeRO
|
|
1
|
2405
|
March 19, 2024
|