Topic | Replies | Views | Activity
Persistent models | 3 | 420 | August 29, 2022
Sentence similarity - how to train it dynamically | 0 | 837 | September 18, 2023
(IMPOSSIBLE) LORA Finetuning with BASIC custom dataset | 2 | 481 | November 30, 2023
Cache for custom data loader | 1 | 586 | September 23, 2022
Getting entity offset from ONNX outputs | 1 | 584 | April 28, 2022
Fine Tuning bart-large-mnli on only Entailments | 0 | 824 | August 1, 2022
Freezing layers with SFTTrainer | 2 | 267 | March 8, 2025
How to convert ViTForMaskedImageModeling outputs to image | 1 | 581 | August 23, 2022
BART from finetuned BERT | 2 | 472 | September 9, 2021
CUDA OOM. Is it possible to distribute the usage of memory across 2 GPUs evenly? | 1 | 324 | August 9, 2023
Unable to deploy fine tuned model | 5 | 187 | March 11, 2025
Regression with multiple targets | 0 | 814 | May 12, 2022
Evaluation step takes longer than training step | 0 | 812 | October 23, 2023
Question regarding adding a 4080 (and 3080?) to a 4090 rig for AI | 2 | 466 | May 15, 2024
How to implement early stopping in BERT fine tuning for token classification | 0 | 807 | February 24, 2024
Autotrain training data format (text column) | 0 | 805 | November 3, 2023
Function Calling and RAG Features Using Open-Source LLMs | 0 | 804 | December 21, 2023
Rope Factor issues with meta-llama/Meta-Llama-3.1-70B | 3 | 401 | August 31, 2024
Train loss is not decreasing on siamese model based on xlm-roberta | 1 | 567 | February 22, 2024
Implementation of QA-LoRA | 2 | 461 | June 25, 2024
Replacing the LlamaDecoderLayer Class Hugging Face With New LongNet | 0 | 793 | March 30, 2024
Generate embeddings with custom (non-text) dataset | 0 | 790 | January 24, 2023
How to fine-tune to 3 very different sized datasets (very large to very small) | 0 | 785 | February 24, 2023
Joining SpeechEncoderDecoder embedding chunks for processing longer audio | 1 | 554 | June 10, 2022
Dedicated endpoint getting 429 errors | 4 | 197 | May 21, 2025
Creating a docvqa dataset - gt_parses | 1 | 553 | June 11, 2024
Blip2 with a new LLM | 0 | 778 | August 15, 2023
TypeError: unhashable type: 'list', When trying to create a knowledge graph from a list of documents using `convert_to_graph_documents` | 1 | 310 | October 9, 2024
What hardware do you use to train your models? Cloud or local? | 0 | 772 | October 31, 2022
What is the best approach to let an LLM learn a company's internal legacy system | 6 | 166 | April 8, 2025
Missing files? Missing config.json File After AutoTrain on Hugging Face | 1 | 97 | October 15, 2024
Fine-tuning `mistral-7B` for classification with QLoRA using peft | 2 | 444 | June 13, 2024
Visual Tokenization / Masking In BEIT & LayoutLMv3 | 1 | 542 | December 23, 2022
Identifying max_steps for generativeText Dataset For Next SentencePrediction | 0 | 766 | November 5, 2021
Speech to text using whisper timestamped and gradio | 2 | 442 | October 3, 2023
Resuming accelerate-based pretraining with different batch size | 0 | 764 | January 31, 2023
Deployment of finetuned Mistral for Classification and Generation | 4 | 340 | June 10, 2024
Help with DeepSeek-V3-0324 Model Download | 5 | 173 | April 4, 2025
Whisper V3 finetuning with qlora | 0 | 134 | October 10, 2024
Loading Fine tuned whisper model (LOCAL) | 0 | 748 | November 13, 2023
Finetuning T5 for Summarisation - Poor results | 1 | 525 | April 28, 2024
Want to run kohya_ss from command prompt instead of browser | 8 | 139 | April 14, 2025
Pipelines for multiple inputs don't produce reliable results | 2 | 427 | October 3, 2021
Simplifying Hugging Face Spaces API calls in Flutter using hugging_face_chat_gradio package | 4 | 33 | June 8, 2025
Adding another head to Vision encoder decoder model | 4 | 328 | May 7, 2024
When using an SDXL base and refiner, should LORAs be sent to both? | 0 | 733 | December 30, 2023
Multi GPU HF trainer in Jupyter Notebook | 1 | 92 | November 19, 2024
Longformer seemingly initializing global attention mask for every step | 0 | 730 | October 25, 2021
BERT Cross Validation with Tensorflow Text Classification | 0 | 727 | January 9, 2022
Multiple responses with async generate in TGI | 1 | 514 | April 23, 2024