I have a pretraining task for XLNet. One of the inputs to XLNetLMHeadModel is target_mapping, which has shape (batch_size, num_predict, seq_len).
I want to predict all tokens in each input sentence, which means num_predict varies within a batch for sentences of different lengths. This causes an error when the PyTorch DataLoader tries to batch the samples (the default collation cannot stack tensors of different sizes). Can anyone suggest a workaround for this problem?
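One direction I've considered is a custom collate_fn that pads every target_mapping to the batch's largest num_predict with all-zero rows, so stacking succeeds. Here is a minimal sketch of that idea (ToyDataset and pad_collate are hypothetical names I made up for illustration) — though I'm unsure whether all-zero padding rows are actually valid input for XLNetLMHeadModel:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class ToyDataset(Dataset):
    # Stand-in dataset: each sample holds a target_mapping of shape
    # (num_predict, seq_len), where num_predict varies per sample.
    def __init__(self, mappings):
        self.mappings = mappings

    def __len__(self):
        return len(self.mappings)

    def __getitem__(self, idx):
        return {"target_mapping": self.mappings[idx]}

def pad_collate(batch):
    # Pad every target_mapping to the batch's largest num_predict
    # with all-zero rows, so they can be stacked into one tensor.
    max_predict = max(item["target_mapping"].shape[0] for item in batch)
    seq_len = batch[0]["target_mapping"].shape[1]
    padded = torch.zeros(len(batch), max_predict, seq_len)
    for i, item in enumerate(batch):
        n = item["target_mapping"].shape[0]
        padded[i, :n] = item["target_mapping"]
    return {"target_mapping": padded}

seq_len = 8
# One-hot mappings with varying num_predict (3, 5, 2), as a toy example.
data = [torch.eye(n, seq_len) for n in (3, 5, 2)]
loader = DataLoader(ToyDataset(data), batch_size=3, collate_fn=pad_collate)
batch = next(iter(loader))
print(batch["target_mapping"].shape)  # torch.Size([3, 5, 8])
```

With this, the DataLoader no longer errors out, but I don't know how the model treats the zero-padded prediction rows, which is part of what I'm asking about.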