How does fine-tuning a transformer (T5) work?

Ah, I’m definitely not the right person to answer this, but I think you should be able to just replace model.lm_head with something like:

import torch.nn as nn

# The final layer must still map back to the vocabulary size
# (model.lm_head.out_features) so the model keeps producing vocab logits.
model.lm_head = nn.Sequential(
    nn.Linear(in_features=model.lm_head.in_features, out_features=<SOMENUMBER>, bias=False),
    nn.Linear(in_features=<SOMENUMBER>, out_features=model.lm_head.out_features, bias=False)
)
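As a quick sanity check (a minimal sketch; t5-small and the toy input are just placeholders for illustration), you can run a forward pass and confirm the swapped head still produces vocabulary logits:

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model = T5ForConditionalGeneration.from_pretrained("t5-small")
tokenizer = T5Tokenizer.from_pretrained("t5-small")

# ... replace model.lm_head as above ...

inputs = tokenizer("translate English to German: Hello", return_tensors="pt")
out = model(input_ids=inputs.input_ids, labels=inputs.input_ids)
print(out.logits.shape)  # (batch, seq_len, vocab_size) if the head is wired correctly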

And you can add other stuff in there too, I think, as long as the final output layer matches the expected dimensions (unless you want to change that too; see How do I change the classification head of a model? - #19 by nielsr).
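For example (just a sketch of the idea; the hidden size of 1024, the ReLU, and the dropout rate are all arbitrary choices, not anything T5-specific), a head with a non-linearity in between might look like:

model.lm_head = nn.Sequential(
    nn.Linear(model.lm_head.in_features, 1024, bias=False),  # 1024 is a made-up intermediate size
    nn.ReLU(),
    nn.Dropout(p=0.1),
    nn.Linear(1024, model.lm_head.out_features, bias=False)  # back to vocab size
)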

If you’re just adding layers to the head, I don’t think you need to edit the source code. If you needed to change stuff within the network, e.g. to make changes to the T5Block, I think that’s when you would dive into the source code.
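For reference (assuming the standard T5ForConditionalGeneration layout in transformers), the encoder and decoder are stacks of T5Block modules held in nn.ModuleLists, so you can at least inspect them before deciding whether a source-level change is needed:

# Encoder and decoder are stacks of T5Block modules
print(type(model.encoder.block[0]))  # a T5Block
print(model.encoder.block[0])        # shows the self-attention + feed-forward sub-layers

# Swapping one of these in place only works if the replacement keeps
# T5Block's forward() signature; otherwise you're back to editing the source.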

I’m curious to learn more about this though, so do update this post with anything new you find out!
