Hi,
I’m running a simple pipeline on Google Colab, but GPU usage stays at 0% while performing inference on a large number of text inputs (according to the Colab resource monitor).
Here’s what I’ve tried (imports included for completeness):

import torch
from transformers import pipeline

model = pipeline("feature-extraction", device=torch.device("cuda"))
# and, alternatively:
model = pipeline("feature-extraction", device=0)
I can confirm that I’m on a T4 GPU runtime. Neither variant appears to use the GPU.
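For reference, here’s a minimal sanity-check snippet I’d expect to run cleanly on a T4 runtime before blaming the pipeline (just a sketch; the device index 0 is the default single-GPU case):

```python
import torch

# Confirm PyTorch can see the GPU at all.
print(torch.cuda.is_available())   # should print True on a T4 runtime
print(torch.version.cuda)          # CUDA version PyTorch was built against

if torch.cuda.is_available():
    # Name of the first visible GPU, e.g. "Tesla T4" on Colab.
    print(torch.cuda.get_device_name(0))
```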
Transformers version: 4.37.2
CUDA version (output of nvcc --version):
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0