Wav2vec - <s></s> tokens

The wav2vec 2.0 base 960h model never seems to return a beginning-of-sentence (<s>) or end-of-sentence (</s>) token when I use greedy decoding. So far I also haven’t seen it return the apostrophe (’) or the unknown token. Is that expected? I can’t find this discussed anywhere. Or is the audio I’m feeding in too difficult for the model to determine the BOS/EOS? If so, can someone provide a counter-example?
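
For reference, here is a minimal sketch of the kind of check I mean: take the argmax over the vocabulary at every frame (greedy decoding) and count which tokens ever get picked. The vocabulary, its id ordering, and the toy logits below are made up for illustration and are not read from the real model; `greedy_ids`, `token_counts`, and `one_hot` are hypothetical helper names.

```python
from collections import Counter

# Hypothetical toy vocabulary. The ids are assumed for illustration only,
# not taken from the real model's vocab file.
VOCAB = {0: "<pad>", 1: "<s>", 2: "</s>", 3: "<unk>", 4: "|", 5: "'", 6: "A", 7: "B"}


def greedy_ids(logits):
    """Return the argmax token id for each frame (greedy decoding)."""
    return [max(range(len(frame)), key=frame.__getitem__) for frame in logits]


def token_counts(logits, vocab=VOCAB):
    """Count how often each vocabulary token is the per-frame argmax."""
    counts = Counter(vocab[i] for i in greedy_ids(logits))
    return {tok: counts.get(tok, 0) for tok in vocab.values()}


def one_hot(i, size=len(VOCAB)):
    """Build a fake logit frame whose argmax is token id i."""
    return [1.0 if j == i else 0.0 for j in range(size)]


# Fake (frames x vocab) logits standing in for real model output.
toy_logits = [one_hot(i) for i in [0, 6, 6, 0, 4, 7, 0]]

counts = token_counts(toy_logits)
never_predicted = [tok for tok, n in counts.items() if n == 0]
print(never_predicted)
```

With real audio, `toy_logits` would be replaced by the model's per-frame logits converted to nested lists, and `VOCAB` by the id-to-token mapping from the model's tokenizer. In my runs, the tokens listed above are the ones that end up in `never_predicted`.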