Best Pre-training Strategy

Hey community, I hope your models are converging fast :smile:

I’m trying to pre-train a BERT model on short query sentences/words, and I’m wondering what’s the best pre-training strategy to adopt in this situation?
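For reference, my baseline would be BERT’s standard masked-language-model objective (mask ~15% of tokens, then 80% `[MASK]` / 10% random / 10% unchanged). Here’s a rough self-contained sketch of that masking rule as I understand it (function and parameter names are my own, not from any library):

```python
import random

def mask_tokens(token_ids, vocab_size, mask_id, mask_prob=0.15, seed=None):
    """BERT-style MLM corruption.

    ~15% of positions are selected for prediction; of those,
    80% become [MASK], 10% a random token, 10% stay unchanged.
    Returns (inputs, labels), with labels = -100 at unselected
    positions (the convention for "ignore in the loss").
    """
    rng = random.Random(seed)
    inputs = list(token_ids)
    labels = [-100] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if rng.random() < mask_prob:
            labels[i] = tok  # predict the original token here
            r = rng.random()
            if r < 0.8:
                inputs[i] = mask_id                    # 80%: [MASK]
            elif r < 0.9:
                inputs[i] = rng.randrange(vocab_size)  # 10%: random token
            # else: 10% keep the original token
    return inputs, labels
```

My worry is that on very short queries this masks only one or two tokens per example, so I’m not sure plain MLM is the right fit.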

Thanks in advance.