I have an NLP model trained with PyTorch that I want to run on a Jetson Xavier. I installed jetson-stats to monitor CPU and GPU usage. When I run the Python script, only the CPU cores are under load; the GPU bar does not move. I searched Google with keywords like "How to check if pytorch is using the GPU?" and read the results on stackoverflow.com and elsewhere. Following the advice given there to people facing similar issues, I confirmed that CUDA is available and that there is a CUDA device on my Jetson Xavier. However, I don't understand why the GPU bar stays flat while the CPU core bars max out.
I don't want to use the CPU; it takes far too long to compute. As far as I can tell, the model runs on the CPU, not the GPU. How can I be sure, and if it is using the CPU, how can I switch it to the GPU?
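For reference, this is roughly the check those answers suggest, and it all comes back positive for me:

import torch

print(torch.cuda.is_available())      # prints True, so CUDA itself is visible
print(torch.cuda.get_device_name(0))  # the Xavier's CUDA device is listed

# but a tensor created without .cuda()/.to("cuda") stays on the CPU by default:
x = torch.zeros(1)
print(x.device)  # cpu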
Note: the model comes from the Hugging Face transformers library. I have tried calling the cuda() method on the model (model.cuda()). In that scenario the GPU is used, but I cannot get an output from the model; it raises an exception.
model.cuda() or model.to("cuda") puts the model on the GPU, but you also need to put the inputs on the GPU when the model is there; maybe that's why you are getting the exception.
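For example, a rough sketch (the model name here is just a placeholder; substitute your own):

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # placeholder model
model = AutoModel.from_pretrained("bert-base-uncased").cuda()   # weights now on GPU

inputs = tokenizer("some text", return_tensors="pt")
inputs = {k: v.cuda() for k, v in inputs.items()}  # move every input tensor to GPU too
outputs = model(**inputs)                          # both sides on the same device now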
Could you post your code snippet and the exception, so we can take a look?
Here are the exception and the code.
Expected object of device type cuda but got device type cpu for argument #3 'index' in call to _th_index_select
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline
import torch

BERT_DIR = "savasy/bert-base-turkish-squad"

tokenizer = AutoTokenizer.from_pretrained(BERT_DIR)
model = AutoModelForQuestionAnswering.from_pretrained(BERT_DIR)
nlp = pipeline("question-answering", model=model, tokenizer=tokenizer)

def infer(question, corpus):
    try:
        ans = nlp(question=question, context=corpus)
        return ans["answer"], ans["score"]
    except:
        # swallow any failure and return an empty answer
        return None, 0
If you are using pipeline, then you won't need to put the model on the GPU manually; pipeline can handle that via the device parameter. Just pass the GPU device number and it should work. Also, you can just pass BERT_DIR to the model parameter; pipeline can load the model itself. Try this and let me know.
nlp = pipeline("question-answering", model=BERT_DIR, device=0)
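For completeness, usage would look something like this (the question and context strings below are placeholders):

from transformers import pipeline

BERT_DIR = "savasy/bert-base-turkish-squad"

# device=0 selects the first CUDA device; pipeline loads both the model and the
# tokenizer from BERT_DIR and runs them there
nlp = pipeline("question-answering", model=BERT_DIR, device=0)

ans = nlp(question="Placeholder question?", context="Placeholder context.")
print(ans["answer"], ans["score"])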
Thank you valhalla. It works on the GPU now.