Fine-tuning T5 Model on a Book for Unsupervised Learning

I’m working on a project where I aim to fine-tune a T5 model or any other encoder-decoder model using an unsupervised learning approach to transfer knowledge from specific books.

My main goal is to train a model that becomes an expert on the book’s topic. However, I’m uncertain about the specific fine-tuning process to follow and which approach would yield the best results.

Here are my specific questions:

1. Given that I’m using an encoder-decoder model, which fine-tuning pipeline should I choose?
2. I’ve heard that Masked Language Modeling (MLM) works well for teaching encoder models new knowledge. Is MLM suitable for an encoder-decoder architecture like T5, or should I consider other methods, such as T5’s own span-corruption objective? (I’ve sketched roughly what I have in mind below.)
3. Are there any potential pitfalls I should be aware of when fine-tuning on a single, specific book?
4. What suggestions do you have for optimizing the fine-tuning process so that the model really becomes an expert on the book’s topic?
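
To make question 2 concrete, here is a rough sketch of the pipeline I’ve been imagining: chunk the book, apply T5-style span corruption (replacing spans with sentinel tokens), and fine-tune on that denoising objective with the Hugging Face transformers library. The checkpoint name, `book.txt` path, and all hyperparameters are placeholders, and I’m not at all sure this is the right objective, which is exactly what I’m asking about:

```python
# Rough sketch (not a tested recipe): T5-style span corruption on a book.
# Assumes Hugging Face transformers + PyTorch; "book.txt", the checkpoint
# name, and the hyperparameters below are placeholders.
import random

import torch
from torch.utils.data import DataLoader, Dataset
from transformers import T5ForConditionalGeneration, T5TokenizerFast

MODEL_NAME = "t5-small"   # placeholder checkpoint
BOOK_PATH = "book.txt"    # placeholder path to the plain-text book
CHUNK_WORDS = 200         # words per training example
MASK_PROB = 0.15          # fraction of words to corrupt (T5 pretraining default)
SPAN_LEN = 3              # corrupted-span length (T5 pretraining mean)

tokenizer = T5TokenizerFast.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)


def corrupt(words):
    """T5-style span corruption: replace word spans with <extra_id_i> sentinels
    in the input; the target lists each sentinel followed by the hidden words."""
    n_spans = max(1, int(len(words) * MASK_PROB / SPAN_LEN))
    starts = sorted(random.sample(range(len(words) - SPAN_LEN), n_spans))
    inp, tgt, pos, sid = [], [], 0, 0
    for s in starts:
        if s < pos:                      # skip spans overlapping a previous one
            continue
        inp += words[pos:s] + [f"<extra_id_{sid}>"]
        tgt += [f"<extra_id_{sid}>"] + words[s:s + SPAN_LEN]
        pos, sid = s + SPAN_LEN, sid + 1
    inp += words[pos:]
    return " ".join(inp), " ".join(tgt)


class BookDataset(Dataset):
    def __init__(self, path):
        words = open(path, encoding="utf-8").read().split()
        # fixed-size word chunks; the short tail chunk is dropped for simplicity
        self.chunks = [words[i:i + CHUNK_WORDS]
                       for i in range(0, len(words) - CHUNK_WORDS + 1, CHUNK_WORDS)]

    def __len__(self):
        return len(self.chunks)

    def __getitem__(self, idx):
        src, tgt = corrupt(self.chunks[idx])
        x = tokenizer(src, truncation=True, max_length=512,
                      padding="max_length", return_tensors="pt")
        y = tokenizer(tgt, truncation=True, max_length=128,
                      padding="max_length", return_tensors="pt")
        labels = y.input_ids.squeeze(0)
        labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in loss
        return {"input_ids": x.input_ids.squeeze(0),
                "attention_mask": x.attention_mask.squeeze(0),
                "labels": labels}


loader = DataLoader(BookDataset(BOOK_PATH), batch_size=8, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
for epoch in range(3):
    for batch in loader:
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

The span-corruption objective above is just my stand-in for MLM, since that is how T5 itself was pretrained; I don’t know whether it’s actually the best way to transfer a book’s knowledge into the model.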

Thanks for any advice!
