Model inference on tokenized dataset

Well, it doesn’t seem like I can edit the original post anymore, but after testing four additional variants, here is a version that I believe is working (it will be another 10 hours before it finishes and I can confirm):

import warnings

from tqdm.auto import tqdm
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset

# device=0 puts the pipeline on the GPU; otherwise it will only use the CPU.
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer, device=0)

# Hide the large number of deprecation warnings.
warnings.filterwarnings("ignore", category = DeprecationWarning)

preds = []
# GPU RAM usage continues to grow through inference :( Something is not being deleted correctly.
for outputs in tqdm(pipe(KeyDataset(raw_dataset, "text"), batch_size=128),
                    total=len(raw_dataset)):
    preds.append(outputs)

I haven’t been able to get a version working with the pretokenized version of the dataset. It’s also unfortunate that GPU RAM usage grows over time, because I have to keep the batch size small enough that there is still memory to spare near the end of the loop, or it fails with an out-of-memory error. I don’t know whether this is because outputs needs to be explicitly deleted from GPU RAM or something else.
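In case it helps anyone, here is a rough, untested sketch of the direction I’d try next for the pretokenized dataset: skip the pipeline and run the model directly on batches, moving the results to the CPU immediately and dropping the GPU tensors each step. It assumes tokenized_dataset was tokenized with padding so every row is the same length, has "input_ids" and "attention_mask" columns, and that model is the same sequence-classification model as above; this is a sketch, not a confirmed fix for the memory growth.

import torch

# Untested sketch: manual batched inference over the pretokenized dataset.
# Assumes all rows were padded to the same length during tokenization.
tokenized_dataset.set_format(type="torch", columns=["input_ids", "attention_mask"])
loader = torch.utils.data.DataLoader(tokenized_dataset, batch_size=128)

model.to("cuda")
model.eval()

preds = []
with torch.no_grad():
    for batch in tqdm(loader):
        batch = {k: v.to("cuda") for k, v in batch.items()}
        logits = model(**batch).logits
        # Move results off the GPU right away (this collects predicted class ids,
        # not the label/score dicts the pipeline returns) and drop the GPU tensors.
        preds.extend(logits.argmax(dim=-1).cpu().tolist())
        del batch, logits
torch.cuda.empty_cache()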
