Referring to this Colab notebook: the RobertaConfig is initialized as follows:
from transformers import RobertaConfig

config = RobertaConfig(
    vocab_size=52_000,            # size of the trained tokenizer's vocabulary
    max_position_embeddings=514,  # why 514 rather than 512?
    num_attention_heads=12,
    num_hidden_layers=6,
    type_vocab_size=1,            # RoBERTa uses a single token type
)
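For what it's worth, the same two-position gap appears in the pretrained roberta-base checkpoint, so it looks deliberate rather than a typo in the notebook. A quick check (a minimal sketch, assuming transformers is installed and can download the checkpoint):

from transformers import AutoTokenizer, RobertaConfig

# The published roberta-base config also reserves 514 positions...
config = RobertaConfig.from_pretrained("roberta-base")
print(config.max_position_embeddings)  # 514

# ...while its tokenizer caps inputs at 512 tokens.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
print(tokenizer.model_max_length)  # 512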
Why do we set max_position_embeddings to 514 when the maximum sequence length in the notebook is 512?
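For context, the 512 I'm referring to is the truncation limit the notebook applies when setting up the tokenizer, roughly like the sketch below (the vocab/merges paths are placeholders, not necessarily the notebook's exact ones):

from tokenizers import ByteLevelBPETokenizer

# Hypothetical paths; substitute the files produced by the tokenizer training step.
tokenizer = ByteLevelBPETokenizer(
    "./EsperBERTo/vocab.json",
    "./EsperBERTo/merges.txt",
)
# The notebook caps encoded sequences at 512 tokens here.
tokenizer.enable_truncation(max_length=512)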