Fine-tune T5 model for Causal Language Modeling (CLM)

Dear all,
I am new to NLP and have some questions that I will try to explain clearly.

My goal is to fine-tune the t5-base model on a specific corpus with a causal language modeling objective. I found this document, which uses AutoModelForCausalLM, but that class simply does not include the T5 series of models.
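
For reference, this is roughly what I tried; in my environment it fails because T5 is an encoder-decoder model whose config is not in the AutoModelForCausalLM mapping (the exact error text may differ between transformers versions):

```python
from transformers import AutoModelForCausalLM

# Raises ValueError: T5Config is not supported by AutoModelForCausalLM,
# since T5 is an encoder-decoder model rather than a decoder-only one.
model = AutoModelForCausalLM.from_pretrained("t5-base")
```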

So my questions are:

  1. How should I fine-tune the T5 model with a CLM objective? In my understanding, CLM is the process of predicting token_2 from token_1, then token_3 from token_1 and token_2, and so on until the end of the input sequence, so I am confused about how to implement this process myself.

  2. I tried to split one of my training examples into something like this (t_i == token_i, 1 == eos_token):
   | input_ids                    | labels                        |
   | ---------------------------- | ----------------------------- |
   | [t1, 1, 1, 1, 1, 1, ...]     | [t1, t2, 1, 1, 1, 1, ...]     |
   | [t1, t2, 1, 1, 1, 1, ...]    | [t1, t2, t3, 1, 1, 1, ...]    |
   | [t1, t2, t3, 1, 1, 1, ...]   | [t1, t2, t3, t4, 1, 1, ...]   |
   | [t1, t2, t3, t4, 1, 1, ...]  | [t1, t2, t3, t4, t5, 1, ...]  |

    The first problem is obvious: the expanded dataset is much larger, so fine-tuning takes much more time. The second problem is that this scheme feels strange, and I don't know whether it actually fulfills the CLM objective. This is the only idea I could come up with; does it work? A minimal sketch of what I mean follows below.
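
To make the idea concrete, here is a minimal sketch of how I would build these expanded pairs. The helper name `expand_example`, the fixed `max_length`, and padding everything with `eos_token_id` (== 1 for t5-base) are my own assumptions, not from any official recipe:

```python
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")

def expand_example(text, max_length=16):
    # Turn one sequence into growing (input_ids, labels) prefix pairs,
    # padded to max_length with eos_token_id as in the table above.
    ids = tokenizer(text, add_special_tokens=False).input_ids
    assert len(ids) < max_length
    eos = tokenizer.eos_token_id  # == 1 for t5-base
    pairs = []
    for i in range(1, len(ids)):
        input_ids = ids[:i] + [eos] * (max_length - i)
        labels = ids[:i + 1] + [eos] * (max_length - i - 1)
        pairs.append({"input_ids": input_ids, "labels": labels})
    return pairs

pairs = expand_example("hello , how are you ?")
```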

Thanks!!


As a supplement, I loaded the model with `T5ForConditionalGeneration.from_pretrained("t5-base")`.
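
For completeness, this is how I am calling it now. As far as I understand from the transformers documentation, when you pass labels, T5ForConditionalGeneration shifts them right internally to build decoder_input_ids and computes the cross-entropy loss over all target positions in one forward pass. Using the same token ids as both input and labels is my own guess at a CLM-style setup, so please correct me if that is wrong:

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

batch = tokenizer("hello , how are you ?", return_tensors="pt")

# Passing labels makes the model shift them right internally to build
# decoder_input_ids, then score every target position in one pass.
outputs = model(input_ids=batch.input_ids,
                attention_mask=batch.attention_mask,
                labels=batch.input_ids)
print(outputs.loss)
```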