Reduce inference time with batches

Hi everyone.

I want to further reduce the inference time of my BERT model.
Here is the code I am currently using:

import torch

for sentence in list(data_dict.values()):
    # tokenize one sentence at a time
    tokens = {'input_ids': [], 'attention_mask': []}
    new_tokens = tokenizer.encode_plus(sentence, max_length=512,
                                       truncation=True, padding='max_length',
                                       return_tensors='pt',
                                       return_attention_mask=True)
    tokens['input_ids'].append(new_tokens['input_ids'][0])
    tokens['attention_mask'].append(new_tokens['attention_mask'][0])

    # reformat the lists of tensors into single tensors
    tokens['input_ids'] = torch.stack(tokens['input_ids'])
    tokens['attention_mask'] = torch.stack(tokens['attention_mask'])

    # forward pass for a single sentence
    outputs = model(**tokens)
    embeddings = outputs[0]  # last hidden state for this sentence

How can I pass the sentences to the model in batches, like during training, rather than one at a time or the whole dataset at once?
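
To illustrate what I am after, here is a rough sketch of the direction I had in mind (my own guess, not tested). I am assuming the tokenizer can take a list of sentences directly, and the batch size of 16 is an arbitrary value I picked for illustration:

import torch

sentences = list(data_dict.values())
batch_size = 16  # arbitrary value, just for illustration

all_embeddings = []
model.eval()
with torch.no_grad():  # no gradients needed at inference time
    for i in range(0, len(sentences), batch_size):
        batch = sentences[i:i + batch_size]
        # tokenize the whole batch in one call instead of sentence by sentence
        tokens = tokenizer(batch, max_length=512, truncation=True,
                           padding='max_length', return_tensors='pt',
                           return_attention_mask=True)
        outputs = model(**tokens)
        all_embeddings.append(outputs[0])  # last hidden state for the batch

embeddings = torch.cat(all_embeddings, dim=0)

Is something along these lines the right approach, or is there a better way?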