I have 2 questions regarding fine-tuning T5:
Is there any way to change the lm_head on T5ForConditionalGeneration to initialize it from scratch to support a new vocabulary size?
I did it by changing the T5ForConditionalGeneration code and adding a new layer called final_layer, but I was wondering if there is an easier way.
Does the T5 generate method use teacher forcing or not?
When you modify the vocab, you also need to resize the token embeddings. The right way to do this is:
- Add the new tokens to the tokenizer: tokenizer.add_tokens(list_of_new_tokens)
- Resize the token embeddings: model.resize_token_embeddings(len(tokenizer))
Teacher forcing is used while training.
generate does not use teacher forcing, since generate isn't used during training and is meant for generating after training.
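The difference is easiest to see with a toy stand-in for the decoder (nothing T5-specific here; predict_next is a made-up lookup table, not a real model):

```python
# Toy next-token "model": given the previous token, predict the next.
# It makes a deliberate mistake after "a", so the two modes diverge.
def predict_next(prev_token):
    table = {"<s>": "a", "a": "x", "b": "c", "x": "x"}  # gold says a -> b
    return table[prev_token]

gold = ["a", "b", "c"]  # reference target sequence

# Teacher forcing (training): the decoder input at each step is the
# GOLD previous token, regardless of earlier predictions.
teacher_forced = [predict_next(prev) for prev in ["<s>"] + gold[:-1]]

# Free running (generate): the decoder input is the model's OWN
# previous prediction, so one early mistake propagates.
free_running = []
prev = "<s>"
for _ in range(len(gold)):
    prev = predict_next(prev)
    free_running.append(prev)

# teacher_forced recovers after the mistake; free_running does not.
```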
Thanks @valhalla for your explanation.
To confirm my understanding:
Resizing the embedding will add extra rows for the new tokens, which are initialised with random weights, correct?
T5 will use teacher forcing during training. Is there any way to disable teacher forcing in the library, or do I have to implement it myself by feeding the model one output at a time, sequentially?
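To make the first question concrete, this is the behaviour I would expect, hand-rolled in plain torch (a sketch of what the resize does, not the transformers implementation):

```python
import torch
import torch.nn as nn

old_vocab, new_vocab, dim = 5, 8, 4
old_emb = nn.Embedding(old_vocab, dim)  # stands in for the pretrained embedding

# The larger embedding starts out fully randomly initialised ...
new_emb = nn.Embedding(new_vocab, dim)
with torch.no_grad():
    # ... then the pretrained rows are copied in, so only the 3 rows
    # for the new tokens keep their random initialisation.
    new_emb.weight[:old_vocab] = old_emb.weight
```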
Here’s what I used to add some tokens:

from transformers import T5Tokenizer
from transformers import T5ForConditionalGeneration

local_dir = "./cryptic_special"
model_name = "t5-small"
special_tokens = ["<DEFN>"]  # list truncated in the original post; put all of your special tokens here

tokenizer_special = T5Tokenizer.from_pretrained(model_name, additional_special_tokens=special_tokens)
model = T5ForConditionalGeneration.from_pretrained(model_name)
model.resize_token_embeddings(len(tokenizer_special))

# save the resized model and tokenizer so the fine-tuning script can load them from local_dir
tokenizer_special.save_pretrained(local_dir)
model.save_pretrained(local_dir)
Then you just adapt the fine-tune script to point to local_dir (for both the model and the tokenizer).
Thanks a lot for the example.
Perfect, thanks for the explanation.
This didn’t work for me. How can you reload the model once you’ve resized the embedding?
The rest of the model resizes, but it seems the lm_head will not, e.g.:
size mismatch for lm_head.weight: copying a param with shape
torch.Size([32128, 768]) from checkpoint, the shape in current model is
Disregard this, it was a bug that was fixed in: