Using T5 pre-trained weight for Text style transfer

Kyoritzu · March 20, 2021, 4:28pm

Hi, I am trying to build a model similar to that of Riley et al., 2020 ([2010.03802] TextSETTR: Label-Free Text Style Extraction and Tunable Targeted Restyling). Their model uses the pre-trained weights of the T5 model.

My approach is similar to theirs: I am trying to create a model with an encoder-decoder structure in which both the encoder and the decoder are initialized with the T5 weights. Like Riley's model, mine will also include a “style extractor” that has the same structure as the encoder and also needs to be initialized with the T5 weights.

I am able to access the weights using the from_pretrained() and state_dict() functions. The problem I am stuck on is loading the weights into my own model, since (from my understanding) the model needs to have the same structure as the T5 model to be able to load them. Any tips on this front?

kav24 · June 25, 2021, 8:18pm

Were you able to figure this out? I am trying to implement a similar model and I am having a hard time understanding how the pieces connect.
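
For anyone landing here with the same question, one way the pieces could connect is to reuse the T5 modules directly instead of re-implementing the architecture and copying a state_dict key by key. The sketch below is only an illustration under stated assumptions, not the TextSETTR authors' code: StyleTransferModel and extract_style are hypothetical names, mean-pooling the extractor output into one style vector is an assumption, and a tiny randomly initialized T5 stands in for the real checkpoint so the snippet runs offline. In practice the stand-in would be replaced by something like T5ForConditionalGeneration.from_pretrained("t5-small").

```python
import copy

import torch
from torch import nn
from transformers import T5Config, T5ForConditionalGeneration


class StyleTransferModel(nn.Module):
    """Hypothetical wrapper: a T5 encoder-decoder plus a style extractor."""

    def __init__(self, t5):
        super().__init__()
        # Encoder and decoder come straight from the loaded T5 model,
        # so they already carry its weights.
        self.t5 = t5
        # The style extractor has the same structure as the encoder.
        # deepcopy gives it the same initial weights but separate
        # parameters, so it can diverge from the encoder during training.
        self.style_extractor = copy.deepcopy(t5.encoder)

    def extract_style(self, input_ids, attention_mask=None):
        hidden = self.style_extractor(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state
        # Assumption: pool over the sequence dimension to get one vector
        # per example. This simple mean does not exclude padding positions.
        return hidden.mean(dim=1)


# Tiny random T5 as an offline stand-in for a pre-trained checkpoint.
config = T5Config(
    vocab_size=128, d_model=32, d_kv=8, d_ff=64, num_layers=2, num_heads=4
)
t5 = T5ForConditionalGeneration(config)

model = StyleTransferModel(t5)
ids = torch.randint(0, config.vocab_size, (2, 5))
style = model.extract_style(ids)
print(style.shape)  # torch.Size([2, 32])
```

If only the encoder is needed, T5EncoderModel.from_pretrained() is another option, since it loads just the encoder part of a T5 checkpoint. How the style vector is then combined with the encoder output before decoding is not covered by this sketch.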
