Hello
I am using the Llama2-70b chat model. My PC has an Nvidia T1000 GPU and an i7-12700 CPU.
When I run the model, the GPU is not being used. A simple query, such as translating a sentence to French, takes about 30 minutes to produce output. CPU utilization is at 100%, whereas GPU usage is about 1%. Can somebody help me with this? I have already installed CUDA and added its paths.
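
In case it helps with diagnosing this, here is a rough back-of-the-envelope sketch of how much memory the model weights alone would need at different precisions. It assumes 70 billion parameters (taken from the model name) and that the T1000 has 4 GB or 8 GB of VRAM depending on the variant. The function name and the list of precisions are only illustrative, and the estimate ignores activations and the KV cache.

```python
# Rough estimate of memory needed for model weights only.
# Assumptions (not measured on my machine): 70e9 parameters, and
# typical bytes-per-parameter for fp16, int8, and 4-bit quantization.

def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Return the approximate size of the weights in GiB."""
    return n_params * bytes_per_param / 2**30

N_PARAMS = 70e9  # assumed from the "70b" in the model name

if __name__ == "__main__":
    for label, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
        size = weight_memory_gib(N_PARAMS, bytes_per_param)
        print(f"{label}: ~{size:.0f} GiB for weights alone")
```

If these numbers are roughly right, even a 4-bit version of the weights would be several times larger than the VRAM on this card, so most of the model may be running from system RAM on the CPU, which might explain the usage figures above.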