What is the expected latency of DeBERTa when doing batch inference?

We are using DeBERTa for sentence embedding. The latency for embedding one sentence is about 30 ms, and when I switch to batch inference, the latency increases linearly with the batch size. Is this expected? We are running on an A10 GPU with Triton Inference Server.

for batch in dataloader:
    tokens = self._tokenizer(
        batch,
        truncation=True,
        padding=True,
        return_tensors="np",
        max_length=self._max_len,
    )
    # Running inference on tokens
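
For context, the per-batch call looks roughly like the sketch below. This is a minimal reconstruction, not our exact code: the model name "deberta_embedder" and the tensor names "input_ids" / "attention_mask" / "embeddings" are assumptions, and we use the standard tritonclient HTTP client to send the tokenized batch and time the request.

import time
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

def infer_batch(tokens, model_name="deberta_embedder"):
    # Tokenizer output with return_tensors="np" is already numpy; cast to int64
    input_ids = tokens["input_ids"].astype(np.int64)
    attention_mask = tokens["attention_mask"].astype(np.int64)

    # Declare the input tensors expected by the Triton model
    inputs = [
        httpclient.InferInput("input_ids", list(input_ids.shape), "INT64"),
        httpclient.InferInput("attention_mask", list(attention_mask.shape), "INT64"),
    ]
    inputs[0].set_data_from_numpy(input_ids)
    inputs[1].set_data_from_numpy(attention_mask)

    outputs = [httpclient.InferRequestedOutput("embeddings")]

    # Time a single synchronous inference request for this batch
    start = time.perf_counter()
    result = client.infer(model_name, inputs, outputs=outputs)
    latency_ms = (time.perf_counter() - start) * 1000
    print(f"batch_size={input_ids.shape[0]} latency={latency_ms:.1f} ms")

    return result.as_numpy("embeddings")

With this timing in place we see the per-batch latency grow roughly in proportion to the batch size, rather than staying close to the single-sentence 30 ms.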