What are the ns and nd variables?

In transformers/src/transformers/modeling_gpt2.py:

What are the nd and ns variables at line 150?

def _attn(self, q, k, v, attention_mask=None, head_mask=None, output_attentions=False):
    w = torch.matmul(q, k)
    if self.scale:
        w = w / (float(v.size(-1)) ** 0.5)
    nd, ns = w.size(-2), w.size(-1)
    mask = self.bias[:, :, ns - nd : ns, :ns]

Because the GPT model performs self-attention, aren't nd and ns always the same?
What is the meaning of "ns - nd : ns"?

As you can see at line 187,

query, key, value = x.split(self.split_size, dim=2)

the query and key should have the same sequence length, so nd and ns should be equal.
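To illustrate what I mean, here is a minimal shape-only sketch (n_ctx and seq_len are made-up values, and bias just mimics the lower-triangular buffer the Attention module registers). When nd == ns, the slice ns - nd : ns seems to reduce to the plain causal mask over the whole sequence:

```python
import torch

n_ctx = 8    # hypothetical max context length (GPT2Config.n_ctx)
seq_len = 5  # hypothetical current sequence length

# Mimic the registered causal-mask buffer: lower-triangular ones
# of shape (1, 1, n_ctx, n_ctx), like self.bias in the module.
bias = torch.tril(torch.ones(n_ctx, n_ctx)).view(1, 1, n_ctx, n_ctx)

# q and k both come from the same x via x.split(...), so in plain
# self-attention both cover seq_len tokens:
nd, ns = seq_len, seq_len

mask = bias[:, :, ns - nd : ns, :ns]  # ns - nd == 0 here
print(mask.shape)                                          # (1, 1, 5, 5)
print(torch.equal(mask, bias[:, :, :seq_len, :seq_len]))   # True
```

So in this case the slice looks like a no-op, which is why I'm asking what situation makes ns - nd nonzero.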

Thank you