Variable num_predict in target_mapping for XLNet

maveriq · January 1, 2021, 11:26am

I have a pretraining task for XLNet. One of the inputs to XLNetLMHeadModel is target_mapping, which has shape (batch_size, num_predict, seq_len).

I want to predict all tokens in an input sentence, which means num_predict will vary within a batch for sentences of different lengths. This leads to an error while building a data_loader in PyTorch. Can anyone suggest a workaround for this problem?

Thanks

BramVanroy · January 2, 2021, 8:18am

I might be wrong here, but I would assume that in the language modeling task, num_predict is actually the size of the vocabulary, because for each mask you try to predict the highest-probability token in the vocab (in MLM). seq_len is the max length of sequences you want to be able to model. If a sentence is shorter, then you just pad it.

maveriq · January 2, 2021, 9:34am

Dear @BramVanroy, thank you for the reply, but num_predict is not the size of the vocabulary. It’s the number of predictions to be made for that particular input sentence. seq_len is not the issue because, as you pointed out, it can be padded. But I am not sure whether the same is true for num_predict, hence this question.
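
One possible workaround, sketched here under assumptions rather than taken from the thread: pad num_predict the same way seq_len is padded. Each real token gets a one-hot row in target_mapping, padded prediction slots get all-zero rows, and the matching labels are set to -100, which is the value PyTorch's CrossEntropyLoss ignores by default. The helper name `build_padded_targets`, the `pad_id=0` placeholder, and right-side padding are illustrative choices, and the sketch does not build perm_mask. Plain lists are used so the shapes are easy to inspect; in practice they would be converted to tensors inside a collate function.

```python
IGNORE_INDEX = -100  # label value skipped by CrossEntropyLoss's default ignore_index


def build_padded_targets(batch_token_ids, pad_id=0):
    """Pad input_ids, target_mapping and labels to common batch shapes.

    batch_token_ids: list of token-id lists of differing lengths (hypothetical input).
    pad_id: placeholder padding id; use the tokenizer's real pad token id.

    Returns nested lists with shapes
      input_ids      (batch_size, max_len)
      target_mapping (batch_size, max_len, max_len)  # num_predict padded to max_len
      labels         (batch_size, max_len)
    """
    max_len = max(len(ids) for ids in batch_token_ids)
    input_ids, target_mapping, labels = [], [], []
    for ids in batch_token_ids:
        n = len(ids)
        pad = max_len - n
        input_ids.append(list(ids) + [pad_id] * pad)
        # Row i is one-hot at position i for real tokens and all-zero for
        # padded prediction slots, so padded slots select no position.
        rows = [
            [1.0 if (i == j and i < n) else 0.0 for j in range(max_len)]
            for i in range(max_len)
        ]
        target_mapping.append(rows)
        # Padded prediction slots get IGNORE_INDEX so they add nothing to the loss.
        labels.append(list(ids) + [IGNORE_INDEX] * pad)
    return input_ids, target_mapping, labels


batch = [[11, 12, 13], [21, 22, 23, 24, 25]]
input_ids, target_mapping, labels = build_padded_targets(batch)
```

Because every example now has the same num_predict, the default batching no longer sees ragged shapes. Whether padding should go on the left or the right depends on the tokenizer's padding side, so that part should be matched to the actual setup.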

BramVanroy · January 2, 2021, 9:58am

You are absolutely right.

But if you want to predict all tokens, can’t you just leave target_mapping at its default (None)? From the docs:

If target_mapping is None, then num_predict corresponds to sequence_length.

Or is your point that this mapping does not take into account the different sequence lengths in the batch? If that is the question, I cannot help with that. I do not have enough experience with XLNet. Perhaps someone else can chime in.
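
If target_mapping is left as None, the docs quoted above imply that labels have one entry per position, so the differing sentence lengths could be handled on the label side. The following is a minimal sketch under that assumption, not a confirmed recipe: padded positions get a label of -100 (ignored by PyTorch's CrossEntropyLoss by default) and an attention_mask of 0. The helper name `pad_labels_and_mask` and the right-side padding are hypothetical choices made for illustration.

```python
IGNORE_INDEX = -100  # label value skipped by CrossEntropyLoss's default ignore_index


def pad_labels_and_mask(batch_token_ids):
    """Build per-position labels and an attention mask for a ragged batch.

    batch_token_ids: list of token-id lists of differing lengths (hypothetical input).
    Returns (labels, attention_mask), both of shape (batch_size, max_len).
    """
    max_len = max(len(ids) for ids in batch_token_ids)
    labels, attention_mask = [], []
    for ids in batch_token_ids:
        pad = max_len - len(ids)
        # Real positions keep their token id as the label; padding is ignored.
        labels.append(list(ids) + [IGNORE_INDEX] * pad)
        # 1 marks a real token, 0 marks padding.
        attention_mask.append([1] * len(ids) + [0] * pad)
    return labels, attention_mask


batch = [[11, 12, 13], [21, 22, 23, 24, 25]]
labels, attention_mask = pad_labels_and_mask(batch)
```

This only addresses the shape mismatch within a batch. It says nothing about which positions each token may attend to, which is controlled separately through perm_mask.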
