I just trained a `BertForSequenceClassification` classifier but ran into problems when trying to predict.
When I use the `predict` method of the trainer on encodings I precomputed, I'm able to obtain predictions for ~350 samples from the test set in under 20 seconds.
However, when I load the model from storage and use a pipeline, the code runs for more than 10 minutes, and even adding batching doesn't seem to help:
```python
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer,
                      max_length=512, truncation=True)
classifier(X_test.Text.to_list(), batch_size=10)
```
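For reference, here is a minimal self-contained sketch of the pipeline setup (the tiny random model and vocabulary are placeholders, not my actual model). One thing I noticed while building it: the pipeline's `device` argument defaults to `-1` (CPU) unless set explicitly.

```python
import os
import tempfile

from transformers import (BertConfig, BertForSequenceClassification,
                          BertTokenizerFast, pipeline)

# Placeholder model/tokenizer built offline, purely illustrative.
vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "hello", "world"]
tmp_dir = tempfile.mkdtemp()
vocab_file = os.path.join(tmp_dir, "vocab.txt")
with open(vocab_file, "w") as f:
    f.write("\n".join(vocab))

tokenizer = BertTokenizerFast(vocab_file=vocab_file)
config = BertConfig(vocab_size=len(vocab), hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64, num_labels=2)
model = BertForSequenceClassification(config)  # randomly initialized

# device=-1 is the default (CPU); device=0 would use the first GPU.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer,
                      device=-1, max_length=512, truncation=True)
results = classifier(["hello world"] * 4, batch_size=2)
print(len(results))  # one {"label": ..., "score": ...} dict per input
```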
What can explain this difference?