Hi Great Community,
Is it possible to have a progress bar to track the tokenisation process when calling the following method?
tokenizer(large_batch, padding=True, truncation=True, max_length=512)
I would also like a progress bar for tokenizing! Maybe a verbose setting? Did you ever hear back about this?
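The tqdm library lets you show a progress bar while tokenizing. Note that the loop below tokenizes one text at a time, so it returns a list of separate encodings rather than one padded batch, and it is usually slower than a single batched call.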
from tqdm import tqdm

# Assumes `tokenizer` is an already-loaded Hugging Face tokenizer.
def tokenizer_with_progress(large_batch):
    tokenized_texts = []
    for text in tqdm(large_batch, desc="Tokenizing", unit="text"):
        # Each text is encoded on its own, so padding=True adds no padding here.
        tokenized_texts.append(tokenizer(text, padding=True, truncation=True, max_length=512))
    return tokenized_texts

train_encodings = tokenizer_with_progress(training_sentences)
test_encodings = tokenizer_with_progress(testing_sentences)
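If you want to keep batched tokenization and still see progress, one option is to split the input into chunks and wrap the chunk loop in tqdm. This is only a rough sketch: tokenize_in_chunks and chunk_size are names I made up, and the function assumes the tokenizer returns a dict-like object whose values are lists with one entry per text (which is what a Hugging Face tokenizer gives you when return_tensors is not set).

```python
try:
    from tqdm import tqdm
except ImportError:
    # Fallback so the sketch still runs without tqdm (no progress bar shown).
    def tqdm(iterable, **kwargs):
        return iterable


def tokenize_in_chunks(tokenizer, texts, chunk_size=1000, **tokenizer_kwargs):
    """Tokenize `texts` chunk by chunk, showing one progress tick per chunk.

    `tokenize_in_chunks` and `chunk_size` are hypothetical names for this sketch.
    """
    merged = {}
    starts = range(0, len(texts), chunk_size)
    for start in tqdm(starts, desc="Tokenizing", unit="chunk"):
        chunk = texts[start:start + chunk_size]
        encoded = tokenizer(chunk, **tokenizer_kwargs)
        # Append this chunk's rows to the running result, key by key.
        for key, rows in encoded.items():
            merged.setdefault(key, []).extend(rows)
    return merged
```

You would call it as tokenize_in_chunks(tokenizer, training_sentences, padding="max_length", truncation=True, max_length=512). I used padding="max_length" there on purpose: with padding=True each chunk is padded only to its own longest text, so rows from different chunks could end up with different lengths. The result is a plain dict of lists, not a BatchEncoding.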