BERT: What is the shape of each Transformer encoder block's output, including the final hidden state?

In the Transformer paper (Vaswani et al.), the encoder's output dimension is d_model = 512. Is the hidden size in BERT (denoted H in the BERT paper) the same quantity as d_model in the Transformer? If so, does that mean BERT-base simply changes this number from 512 to 768?
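
To make the question concrete, here is a minimal sketch of how I currently understand the shapes. It assumes that H in BERT plays the same role as d_model in the Transformer, which is exactly the point I want confirmed. The names `EncoderConfig` and `layer_output_shapes` are hypothetical helpers written for illustration, not part of any library; the layer, width, and head counts are the base configurations reported in the two papers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderConfig:
    """Hypothetical container for the encoder hyperparameters being compared."""
    name: str
    num_layers: int   # N in Vaswani et al., L in the BERT paper
    hidden_size: int  # d_model in Vaswani et al., H in the BERT paper (assumed equivalent)
    num_heads: int    # h in Vaswani et al., A in the BERT paper


def layer_output_shapes(cfg, batch_size, seq_len):
    """Return the output shape of every encoder block.

    Assumption: each block maps (batch, seq_len, hidden) to the same shape,
    so the final hidden state has the same shape as any intermediate one.
    """
    return [(batch_size, seq_len, cfg.hidden_size) for _ in range(cfg.num_layers)]


# Base configurations as reported in the two papers.
TRANSFORMER_BASE = EncoderConfig("transformer-base", num_layers=6, hidden_size=512, num_heads=8)
BERT_BASE = EncoderConfig("bert-base", num_layers=12, hidden_size=768, num_heads=12)

if __name__ == "__main__":
    for cfg in (TRANSFORMER_BASE, BERT_BASE):
        shapes = layer_output_shapes(cfg, batch_size=2, seq_len=128)
        per_head = cfg.hidden_size // cfg.num_heads
        print(cfg.name, "final hidden state:", shapes[-1], "per-head size:", per_head)
```

If this reading is right, only the last dimension differs between the two models (512 versus 768), and the per-head size comes out the same in both because the number of heads grows along with the width.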