Dear HF community, I need your help to confirm whether I am using the right approach to fine-tune a pretrained LLM for text generation. I want to fine-tune an LLM on a text file containing stories. The end goal of this project is for the fine-tuned model to answer questions about the stories, summarize a story in a few lines, and so on. Here is the approach I followed.
- Installed all the latest Python libraries
- Loaded the Mistral 7B model from HF and quantized it using bitsandbytes with torch_dtype=torch.bfloat16
- Created a LoraConfig with r=8 and wrapped the base model into a PEFT model with task_type=CAUSAL_LM
- Initialized the tokenizer and assigned its eos_token_id to the tokenizer's pad_token_id
- Loaded my stories dataset (a text file) and tokenized the strings with padding=True and truncation=True; I did not specify a max_length.
- Tokenization produced only input_ids and attention_mask, so I added a "labels" field and assigned it the value of input_ids (i.e., labels and input_ids are identical). I did this because an earlier fine-tuning attempt with a different LM failed, asking for a labels value in the tokenizer output.
- I created TrainingArguments, initialized the Trainer with a train dataset (80%) and a test dataset (20%), and ran the train() method.
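For reference, the steps above can be sketched roughly as follows (a minimal sketch, not my exact code: the model id, max_length=512, the data file name, and the training hyperparameters are assumptions I'm filling in for illustration):

```python
def add_labels(batch):
    # For causal-LM fine-tuning, labels are a copy of input_ids; the model
    # shifts them internally when computing the next-token loss.
    batch["labels"] = [ids.copy() for ids in batch["input_ids"]]
    return batch


def build_trainer(data_file="stories.txt"):  # hypothetical file name
    # Heavy imports are kept inside the function so the helper above
    # stays usable without the full ML stack installed.
    import torch
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              BitsAndBytesConfig, Trainer, TrainingArguments)

    model_id = "mistralai/Mistral-7B-v0.1"  # assumed checkpoint
    bnb = BitsAndBytesConfig(load_in_4bit=True,
                             bnb_4bit_compute_dtype=torch.bfloat16)
    model = AutoModelForCausalLM.from_pretrained(model_id,
                                                 quantization_config=bnb)
    model = prepare_model_for_kbit_training(model)
    model = get_peft_model(model, LoraConfig(r=8, task_type="CAUSAL_LM"))

    tok = AutoTokenizer.from_pretrained(model_id)
    tok.pad_token = tok.eos_token  # the tokenizer ships without a pad token

    ds = load_dataset("text", data_files=data_file)["train"]
    # padding to a fixed max_length keeps every example the same size,
    # so the default collator can stack them without a custom padder
    ds = ds.map(lambda b: tok(b["text"], truncation=True,
                              padding="max_length", max_length=512),
                batched=True, remove_columns=["text"])
    ds = ds.map(add_labels, batched=True)
    split = ds.train_test_split(test_size=0.2)

    args = TrainingArguments(output_dir="out", num_train_epochs=3,
                             per_device_train_batch_size=1,
                             gradient_accumulation_steps=8, bf16=True)
    return Trainer(model=model, args=args,
                   train_dataset=split["train"],
                   eval_dataset=split["test"])
```

One caveat with copying input_ids verbatim into labels: loss is then also computed on the pad tokens. The usual refinement is to set the label to -100 at padded positions so they are ignored by the loss.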
The good news is that the fine-tuning runs for a couple of hours on Google Colab and finishes. However, I don't think the model is answering my questions properly from the story dataset I provided.
My questions are:
- Is my approach correct?
- If it is, what is going wrong here, since the results look random?
- If it is not, what would be the right approach to meet the objective stated in the first paragraph?
The code is not private. I can definitely share it here if needed, but it's a bit lengthy.
Thanks for your help.