Overlapping data between pre-training and fine-tuning stages

Hi

I am currently pre-training a RoBERTa model on my own data. If some of the data used for the language-modeling task during pre-training is also used during fine-tuning, does this bias the results at the end of the fine-tuning process?
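To make the overlap concrete, here is a rough sketch of the kind of check I have in mind, assuming one example per line in plain-text files (the file names are just placeholders for my own data):

```python
# Rough sketch: measure exact-line overlap between the pre-training corpus
# and the fine-tuning set. File names below are placeholders, not real paths.
from pathlib import Path

def load_lines(path):
    # One training example per line; strip whitespace and drop empty lines.
    return {
        line.strip()
        for line in Path(path).read_text(encoding="utf-8").splitlines()
        if line.strip()
    }

pretrain = load_lines("pretrain_corpus.txt")   # text used for masked-LM pre-training
finetune = load_lines("finetune_train.txt")    # labeled text used for fine-tuning

overlap = pretrain & finetune
print(f"{len(overlap)} of {len(finetune)} fine-tuning examples "
      f"({len(overlap) / len(finetune):.1%}) also appear in the pre-training corpus")
```

In my case this overlap is non-trivial, hence the question about whether it skews the fine-tuned model's results.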
