DistilBERT multiclass classification example

def forward(self, input_ids, attention_mask):
    # Run DistilBERT; output_1[0] is the same tensor as
    # output_1.last_hidden_state, shape (batch_size, seq_len, hidden_dim)
    output_1 = self.l1(input_ids=input_ids, attention_mask=attention_mask)
    hidden_state = output_1[0]
    # Keep only the first token's hidden vector for each sequence:
    # shape (batch_size, hidden_dim)
    pooler = hidden_state[:, 0]
    logits = self.classifier(pooler)
    return logits

Why is the slicing hidden_state[:, 0] needed, and what does it signify? I'm unable to understand this step.
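A toy sketch of what that slice does, using NumPy in place of the actual PyTorch tensors (the shapes batch_size=2, seq_len=4, hidden_dim=3 are made-up illustration values): last_hidden_state holds one hidden vector per token, and [:, 0] selects only the vector at position 0 of each sequence, which for BERT-style models is the special [CLS] token conventionally used as a whole-sequence summary for classification.

```python
import numpy as np

batch_size, seq_len, hidden_dim = 2, 4, 3  # toy values, not DistilBERT's real dims

# Stand-in for last_hidden_state: one hidden vector per token,
# shape (batch_size, seq_len, hidden_dim)
hidden_state = np.arange(batch_size * seq_len * hidden_dim, dtype=np.float32)
hidden_state = hidden_state.reshape(batch_size, seq_len, hidden_dim)

# [:, 0] keeps every sequence in the batch but only its first token
# (the [CLS] position), dropping the seq_len axis
pooler = hidden_state[:, 0]

print(hidden_state.shape)  # (2, 4, 3)
print(pooler.shape)        # (2, 3)
```

So the classifier head receives one fixed-size vector per example rather than one per token, which is what a per-sequence (multiclass) prediction requires.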