Issues running GPT-J-6B

Hello, I’ve spent days trying to figure out an issue I’ve been having with GPT-J while running the example code below.

from transformers import GPTJForCausalLM, AutoTokenizer
import torch

model = GPTJForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B", revision="float16", torch_dtype=torch.float16, low_cpu_mem_usage=True
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

prompt = (
    "In a shocking finding, scientists discovered a herd of unicorns living in a remote, "
    "previously unexplored valley, in the Andes Mountains. Even more surprising to the "
    "researchers was the fact that the unicorns spoke perfect English."
)

input_ids = tokenizer(prompt, return_tensors="pt").input_ids

gen_tokens = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.9,
    max_length=50,
)
gen_text = tokenizer.batch_decode(gen_tokens)[0]

error:

    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

This one error has tormented me for so long. I have a GTX 1060 GPU with CUDA 11.7 installed. I’ve looked everywhere and have yet to find a fix, and I can’t simply run the model on my CPU since it ends up running out of memory.

Thanks

You cannot run float16 on the CPU; half-precision ops like LayerNorm are not implemented there, which is exactly what the error is telling you. float16 does work on the GPU, so move your model to CUDA after loading it (and move your input_ids to the same device, or generate will complain about a device mismatch):

model = model.to("cuda")
input_ids = input_ids.to("cuda")

It will then run on the GPU.
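
Putting it together, here is a minimal sketch of the corrected script, assuming a CUDA-capable GPU with enough free VRAM (GPT-J-6B in float16 needs roughly 12 GB for the weights alone, so a 1060 may still be tight):

from transformers import GPTJForCausalLM, AutoTokenizer
import torch

# Load the float16 checkpoint, as in the original example.
model = GPTJForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B", revision="float16", torch_dtype=torch.float16, low_cpu_mem_usage=True
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

# Move the weights to the GPU, where half-precision kernels are available.
model = model.to("cuda")

prompt = "In a shocking finding, scientists discovered a herd of unicorns"  # prompt shortened here

# The input tensors must live on the same device as the model.
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")

gen_tokens = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.9,
    max_length=50,
)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)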