I have a custom dataset and I trained a DistilBERT model on it, and now I would like to run inference. In my inference .py file, I load the model with the default DistilBERT weights from the Hub. Then, if my fine-tuned weights are present in the directory, I load them on top of the model (this should simply overwrite the default weights, and I get no errors). Finally, I instantiate the Trainer and call its predict function so I can work with batches the same way as during training.
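For context, here is a minimal sketch of the "load defaults, then overwrite with fine-tuned weights" pattern I mean, using a toy torch module instead of DistilBERT (the file name is illustrative, not my actual code):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for the model initialised from the default Hub weights.
model = nn.Linear(4, 2)

# Stand-in for the fine-tuned checkpoint saved after training.
finetuned = nn.Linear(4, 2)
torch.save(finetuned.state_dict(), "finetuned_weights.pt")

# load_state_dict replaces the default parameters in place;
# strict=True raises if any key is missing, so no error means a clean match.
state = torch.load("finetuned_weights.pt")
model.load_state_dict(state, strict=True)

overwritten = torch.equal(model.weight, finetuned.weight)
```

With `strict=True`, silence really does mean every parameter was overwritten, which matches the "no errors" behavior I see.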
Now I have a very large dataset, and inference takes a long time. But I noticed something interesting:
A subset of 5,000 samples is inferred in 36 seconds with batch size 64
A subset of 50,000 samples is inferred in 12.36 minutes with batch size 64
So that's roughly a 21x time increase for a 10x larger dataset. Does anyone know why?
I’m running this on a simple laptop with an RTX 3060 GPU.
This behavior seems to be correlated with the type of dataset: I tested another dataset of the same size and it was 8 times faster. So I suspect it's something related to the dataset itself. Maybe the tokenizer? The complexity of the sentences in the other dataset? I'm using a pre-trained fast tokenizer.
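One plausible mechanism for this (an assumption on my part, not something I've profiled): batches are padded to the longest sequence they contain, and self-attention cost grows roughly quadratically with that padded length, so a dataset with longer or more variable sentences does strictly more work per batch even at the same sample count. A toy cost model with made-up sequence lengths illustrates the effect:

```python
import random

random.seed(0)

def padded_attention_cost(lengths, batch_size=64):
    """Rough proxy: each batch costs batch_size * (padded max length)**2."""
    cost = 0
    for i in range(0, len(lengths), batch_size):
        batch = lengths[i:i + batch_size]
        cost += len(batch) * max(batch) ** 2
    return cost

# Two hypothetical datasets of identical size but different sentence lengths.
short_docs = [random.randint(10, 40) for _ in range(5000)]
long_docs = [random.randint(100, 400) for _ in range(5000)]

ratio = padded_attention_cost(long_docs) / padded_attention_cost(short_docs)
```

If something like this is the cause, sorting the dataset by length before batching (so each batch pads to a similar length) would reduce wasted computation.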