I’m trying to fine-tune a Hugging Face model on my own custom dataset but running into difficulties. I’ve followed some guides, but things aren’t working as expected, especially around data formatting and errors during training. Has anyone done this successfully who can offer advice or best practices, or share a working example? Any help would be appreciated!
Without the actual error messages I can only give general advice, but when fine-tuning on a custom dataset isn’t working, the first thing to check is collate_fn (the DataCollator, if you are using Trainer). This function takes the list of examples that make up each batch and converts it into the inputs the model actually receives, so if it doesn’t match your dataset’s fields and shapes, training will fail regardless of everything else. Conversely, even a fairly messy dataset can often be made to work if collate_fn is written to handle it.
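To make that concrete, here is a minimal sketch of what a collate function does. It is not taken from your setup: I’m assuming each example is a dict with an already-tokenized "input_ids" list and an integer "label", and PAD_TOKEN_ID is a placeholder you would replace with your tokenizer’s pad_token_id. It is written in plain Python so the padding logic is easy to inspect.

```python
from typing import Any, Dict, List

# Assumption: placeholder value; use tokenizer.pad_token_id for your model.
PAD_TOKEN_ID = 0


def collate_fn(examples: List[Dict[str, Any]]) -> Dict[str, List]:
    """Turn a list of variable-length examples into one rectangular batch."""
    # Pad every sequence to the longest one in this batch.
    max_len = max(len(ex["input_ids"]) for ex in examples)
    batch: Dict[str, List] = {"input_ids": [], "attention_mask": [], "labels": []}
    for ex in examples:
        ids = list(ex["input_ids"])
        n_pad = max_len - len(ids)
        batch["input_ids"].append(ids + [PAD_TOKEN_ID] * n_pad)
        # 1 marks real tokens, 0 marks padding the model should ignore.
        batch["attention_mask"].append([1] * len(ids) + [0] * n_pad)
        batch["labels"].append(int(ex["label"]))
    return batch


# Hypothetical examples, only to show the shapes involved.
examples = [
    {"input_ids": [101, 7592, 102], "label": 1},
    {"input_ids": [101, 7592, 2088, 999, 102], "label": 0},
]
batch = collate_fn(examples)
print(batch)
```

In real training you would wrap each list in torch.tensor(...) before returning, then pass the function as collate_fn= to a PyTorch DataLoader or as data_collator= to Trainer. If your data is already tokenized and only needs padding, the built-in DataCollatorWithPadding from transformers may be enough. Either way, printing one batch like this before training is a quick way to spot a mismatch between the dataset and what the model expects.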