Intermediate

Topic	Replies	Views	Activity
Persistent models	3	420	August 29, 2022
Sentence similarity - how to train it dynamically	0	837	September 18, 2023
(IMPOSSIBLE) LORA Finetuning with BASIC custom dataset	2	481	November 30, 2023
Cache for custom data loader	1	586	September 23, 2022
Getting entity offset from ONNX outputs	1	584	April 28, 2022
Fine Tuning bart-large-mnli on only Entailments	0	824	August 1, 2022
Freezing layers with SFTTrainer	2	267	March 8, 2025
How to convert ViTForMaskedImageModeling outputs to image	1	581	August 23, 2022
BART from finetuned BERT	2	472	September 9, 2021
CUDA OOM. Is it possible to distribute the usage of memory across 2 GPUs evenly?	1	324	August 9, 2023
Unable to deploy fine tuned model	5	187	March 11, 2025
Regression with multiple targets	0	814	May 12, 2022
Evaluation step takes longer than training step	0	812	October 23, 2023
Question regarding adding a 4080 (and 3080?) to a 4090 rig for AI	2	466	May 15, 2024
How to implement early stopping in bert fine tuning for token classification	0	807	February 24, 2024
Autotrain training data format (text column)	0	805	November 3, 2023
Function Calling and RAG Features Using Open-Source LLMs	0	804	December 21, 2023
Rope Factor issues with meta-llama/Meta-Llama-3.1-70B	3	401	August 31, 2024
Train loss is not decreasing on siamese model based on xlm-roberta	1	567	February 22, 2024
Implementation of QA-LoRA	2	461	June 25, 2024
Replacing the LlamaDecoderLayer Class Hugging Face With New LongNet	0	793	March 30, 2024
Generate embeddings with custom (non-text) dataset	0	790	January 24, 2023
How to fine-tune to 3 very different sized datasets (very large to very small)	0	785	February 24, 2023
Joining SpeechEncoderDecoder embedding chunks for processing longer audio	1	554	June 10, 2022
Dedicated endpoint getting 429 errors	4	197	May 21, 2025
Creating a docvqa dataset - gt_parses	1	553	June 11, 2024
Blip2 with a new LLM	0	778	August 15, 2023
TypeError: unhashable type: 'list', When trying to create a knowledge graph from a list of documents using `convert_to_graph_documents`	1	310	October 9, 2024
What hardware do you use to train your models? Cloud or local?	0	772	October 31, 2022
What is the best approach to let LLM to learn company internal legacy system	6	166	April 8, 2025
Missing files? Missing config.json File After AutoTrain on Hugging Face	1	97	October 15, 2024
Fine-tuning `mistral-7B` for classification with QLoRA using peft	2	444	June 13, 2024
Visual Tokenization / Masking In BEIT & LayoutLMv3	1	542	December 23, 2022
Identifying max_steps for generativeText Dataset For Next SentencePrediction	0	766	November 5, 2021
Speech to text using whisper timestamped and gradio	2	442	October 3, 2023
Resuming accelerate-based pretraining with different batch size	0	764	January 31, 2023
Deployment of finetuned Mistral for Classification and Generation	4	340	June 10, 2024
Help with DeepSeek-V3-0324 Model Download	5	173	April 4, 2025
Whisper V3 finetuning with qlora	0	134	October 10, 2024
Loading Fine tuned whisper model (LOCAL)	0	748	November 13, 2023
Finetuning T5 for Summarisation - Poor results	1	525	April 28, 2024
Want to run kohya_ss from command prompt instead of browser	8	139	April 14, 2025
Pipelines for multiple inputs don't produce reliable results	2	427	October 3, 2021
Simplifying Hugging Face Spaces API calls in Flutter using hugging_face_chat_gradio package	4	33	June 8, 2025
Adding another head to Vision encoder decoder model	4	328	May 7, 2024
When using an SDXL base and refiner, should LORAs be sent to both?	0	733	December 30, 2023
Multi GPU HF trainer in Jupyter Notebook	1	92	November 19, 2024
Longformer seemingly initializing global attention mask for every step	0	730	October 25, 2021
BERT Cross Validation with Tensorflow Text Classification	0	727	January 9, 2022
Multiple responses with async generate in TGI	1	514	April 23, 2024