I’m having a lot of trouble getting the GPT-J-6B model to run (on either GPU or CPU). No matter what I do, generation fails with an error stating that the probability tensor contains invalid values (I suspect NaN).
System (VM) Specs:
Ryzen 3800X (12 cores allocated to VM)
64GB RAM (48GB allocated to VM)
AMD Radeon RX 6900 XT (passed through to VM)
I have installed and configured ROCm and PyTorch ROCm.
Below is how I’m (attempting to) load the model on a GPU, and the error message I’m getting:
# Filename: test.py
from transformers import GPTJForCausalLM, AutoTokenizer
import torch

# Load the float16 branch of the checkpoint and move it to the GPU
# (ROCm builds of PyTorch expose the GPU under the "cuda" device name).
model = GPTJForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6b",
    revision="float16",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6b")

prompt = "This is a test prompt that"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")

# Sample a continuation and decode it back to text.
gen_tokens = model.generate(input_ids, do_sample=True, temperature=0.9, max_length=50)
gen_text = tokenizer.batch_decode(gen_tokens)
print(gen_text)
After loading this in a python3 terminal with “import test” I receive this error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/webapp/test.py", line 11, in <module>
    gen_tokens = model.generate(input_ids, do_sample=True, temperature=0.9, max_length=100)
  File "/home/gigglez/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/gigglez/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 1452, in generate
    return self.sample(
  File "/home/gigglez/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 2504, in sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either 'inf', 'nan' or element < 0
Everything I can find on this error suggests that something in the conversion to float16 is making the values in the tensors so small or large that they become NaN. But I’m not doing any manipulation of the model, and I’m loading it exactly as documented in the HuggingFace transformers guide. The only exception is that I had to add the low_cpu_mem_usage=True parameter to GPTJForCausalLM.from_pretrained(), as suggested in this thread, to deal with an issue where the system was running out of RAM.
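To show what I mean by the float16 explanation, here is a minimal sketch of how I understand an overflow could turn into NaN and then trigger the same kind of sampling error. It runs on CPU with toy tensors, not the real model, and the variable names are only for illustration. I’m assuming the NaN arises from something like inf - inf somewhere in the forward pass or softmax; I haven’t confirmed that this is what happens in my case.

```python
import torch

# float16 can only represent finite magnitudes up to 65504.
largest = torch.finfo(torch.float16).max

# Doubling the largest finite float16 value overflows to inf.
x = torch.tensor([largest], dtype=torch.float16)
overflowed = x * 2

# inf - inf is undefined, so the result is NaN.
not_a_number = overflowed - overflowed

# Sampling from probabilities that contain NaN is rejected by torch.multinomial.
probs = torch.tensor([float("nan"), 0.5])
try:
    torch.multinomial(probs, num_samples=1)
    sampling_error = None
except RuntimeError as err:
    sampling_error = str(err)

print(largest, overflowed, not_a_number)
print(sampling_error)
```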
Why do these tensors contain invalid values?
Do I need to be using some specific version of the transformers library?
How do I inspect the invalid values?
Is there something I can do to fix/edit the invalid values?
Do AMD GPUs (or does ROCm) have some issue that prevents the transformers library from functioning?
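For the inspection question, this is roughly the kind of check I have in mind; a sketch only. find_nonfinite is a hypothetical helper I wrote for this post (it is not part of transformers or PyTorch), and the tensors below are stand-ins so the snippet runs without the model.

```python
import torch

def find_nonfinite(named_tensors):
    """Return {name: count} for every tensor that contains NaN or inf values."""
    bad = {}
    for name, tensor in named_tensors:
        # torch.isfinite is False for NaN, +inf and -inf.
        count = int((~torch.isfinite(tensor)).sum())
        if count:
            bad[name] = count
    return bad

# Stand-in tensors; with the real model these would be weights or logits.
example = [
    ("clean", torch.ones(3)),
    ("broken", torch.tensor([1.0, float("nan"), float("inf")])),
]
report = find_nonfinite(example)
print(report)
```

With the model loaded, I would expect find_nonfinite(model.named_parameters()) to tell me whether the weights themselves are already bad after loading, and find_nonfinite([("logits", model(input_ids).logits)]) to tell me whether the invalid values only appear during the forward pass.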