Is Transformers using GPU by default?

I’m instantiating a model with this

tokenizer = AutoTokenizer.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")
model = AutoModelForSequenceClassification.from_pretrained("nlptown/bert-base-multilingual-uncased-sentiment")

Then I run a for loop to get predictions over 10k sentences on a G4 instance (T4 GPU). GPU usage (averaged by minute) is a flat 0.0%. What is wrong? How do I use the GPU with Transformers?

Like with every PyTorch model, you need to move it to the GPU yourself, as well as your batches of inputs — Transformers does not do this by default.


You can take a look at this issue: How to make transformers examples use GPU? · Issue #2704 · huggingface/transformers · GitHub. It includes an example of how to put your model on the GPU.

import torch
from transformers import AutoTokenizer, BertModel

device    = "cuda:0" if torch.cuda.is_available() else "cpu"
sentence  = 'Hello World!'
tokenizer = AutoTokenizer.from_pretrained('bert-large-uncased')
model     = BertModel.from_pretrained('bert-large-uncased').to(device)

inputs    = tokenizer(sentence, return_tensors="pt").to(device)
outputs   = model(**inputs)
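For the 10k-sentence case in the question, the same idea applies, but you'll want to batch the inputs rather than feed one sentence at a time — single-sentence calls leave the GPU mostly idle. Here is a hedged sketch using the nlptown model from the question; the batch size of 32 is an assumption, tune it to what fits in the T4's memory:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification


def chunks(items, size):
    """Yield successive batches of `size` items from a list."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def predict(sentences, batch_size=32):
    """Run batched sentiment classification on GPU when one is available."""
    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    name = "nlptown/bert-base-multilingual-uncased-sentiment"
    tokenizer = AutoTokenizer.from_pretrained(name)
    # Move the model to the GPU once, before the loop.
    model = AutoModelForSequenceClassification.from_pretrained(name).to(device)
    model.eval()

    preds = []
    with torch.no_grad():  # inference only, no gradients needed
        for batch in chunks(sentences, batch_size):
            # Tokenize the whole batch and move the tensors to the same device.
            inputs = tokenizer(batch, padding=True, truncation=True,
                               return_tensors="pt").to(device)
            logits = model(**inputs).logits
            # Predicted class index (1–5 star rating) per sentence.
            preds.extend(logits.argmax(dim=-1).cpu().tolist())
    return preds
```

With batching plus `torch.no_grad()`, GPU utilization should be clearly non-zero while the loop runs; you can confirm with `nvidia-smi` in another terminal.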