Prakash Hinduja Geneva, Switzerland - How to fine-tune a model on custom dataset in HF?

prakashhindujageneva · June 5, 2025, 5:01am

I’m trying to fine-tune a Hugging Face model on my own custom dataset but running into difficulties. I’ve followed some guides, but things aren’t working as expected—especially with formatting and training errors. Has anyone successfully done this and can offer advice, best practices, or share a working example? Any help would be appreciated!

Regards
Prakash Hinduja Geneva, Switzerland

John6666 · June 5, 2025, 6:02am

Without the exact error messages I can only answer in general terms, but when fine-tuning on a custom dataset isn’t working, the first thing to check is collate_fn (the DataCollator). This function takes the list of examples drawn from your dataset and converts it into the batch the model actually receives, so if it doesn’t match the dataset’s columns and format, training will fail. On the other hand, even with a fairly messy dataset, a correctly written collate_fn can often make it work.
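
To make the collate_fn point concrete, here is a minimal sketch of a padding collator. It assumes each dataset row is already tokenized into "input_ids" and "labels" lists of equal length (for example, a causal LM setup); those column names, the pad id of 0, and the helper name pad_batch are my assumptions, not something from your dataset. The label pad value -100 is the index that PyTorch's cross-entropy loss ignores by default.

```python
from typing import Dict, List


def pad_batch(features: List[Dict[str, List[int]]],
              pad_token_id: int = 0,
              label_pad_id: int = -100) -> Dict[str, List[List[int]]]:
    """Pad every example to the length of the longest one in the batch."""
    max_len = max(len(f["input_ids"]) for f in features)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        ids = list(f["input_ids"])
        n_pad = max_len - len(ids)
        batch["input_ids"].append(ids + [pad_token_id] * n_pad)
        # 1 marks real tokens, 0 marks padding
        batch["attention_mask"].append([1] * len(ids) + [0] * n_pad)
        # padded label positions are excluded from the loss
        batch["labels"].append(list(f["labels"]) + [label_pad_id] * n_pad)
    return batch


def collate_fn(features):
    """Turn a list of dataset rows into a dict of tensors for the model."""
    import torch  # imported here so pad_batch stays dependency-free

    return {k: torch.tensor(v, dtype=torch.long)
            for k, v in pad_batch(features).items()}
```

You would then pass it as Trainer(..., data_collator=collate_fn). If your columns are named differently, or your labels are a single class id per example rather than one per token, the function has to change accordingly, which is exactly the kind of mismatch that tends to cause the errors you describe.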

Another potential issue is that the model may be untrainable. For example, all layers may be frozen.
https://stackoverflow.com/questions/76879872/how-to-use-huggingface-hf-trainer-train-with-custom-collate-function
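
For the frozen-layers possibility mentioned above, a quick check is to count how many parameters have requires_grad set. This is only a sketch: summarize_trainable is a hypothetical helper name, and it assumes the model is a regular PyTorch module, so that named_parameters(), numel() and requires_grad are available.

```python
def summarize_trainable(model):
    """Return (trainable, total) parameter counts and the names of frozen parameters."""
    trainable, total, frozen = 0, 0, []
    for name, param in model.named_parameters():
        n = param.numel()
        total += n
        if param.requires_grad:
            trainable += n
        else:
            frozen.append(name)
    return trainable, total, frozen


# Usage with an already loaded model (not run here):
# trainable, total, frozen = summarize_trainable(model)
# print(f"{trainable}/{total} parameters are trainable")
#
# To unfreeze everything:
# for p in model.parameters():
#     p.requires_grad = True
```

If the trainable count comes back as 0, the optimizer has nothing to update and the loss will not improve however the data is formatted.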

prakashhindujageneva · June 6, 2025, 6:42am

Thank you, John6666, for your valuable reply.

Warm regards,
Prakash Hinduja
Geneva, Switzerland
