Hello, I’ve spent days trying to figure out an issue I’ve been having with GPT-J while running the example code:
from transformers import GPTJForCausalLM, AutoTokenizer
import torch

model = GPTJForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    revision="float16",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

prompt = (
    "In a shocking finding, scientists discovered a herd of unicorns living in a remote, "
    "previously unexplored valley, in the Andes Mountains. Even more surprising to the "
    "researchers was the fact that the unicorns spoke perfect English."
)

input_ids = tokenizer(prompt, return_tensors="pt").input_ids

gen_tokens = model.generate(
    input_ids,
    do_sample=True,
    temperature=0.9,
    max_length=50,
)
gen_text = tokenizer.batch_decode(gen_tokens)
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
This one error has tormented me for so long. I have a GTX 1060 GPU with CUDA 11.7 installed. I’ve looked everywhere and have yet to find a fix for it, and I can’t simply run it on my CPU, since it ends up running out of memory.
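
A note on the likely cause, offered as an assumption rather than a confirmed diagnosis: the snippet never moves the model or `input_ids` to the GPU, so generation runs on the CPU in float16, and the error message suggests the installed PyTorch has no float16 LayerNorm kernel for the CPU. The sketch below shows the selection logic only. `choose_device_and_dtype` is a hypothetical helper, not part of transformers or PyTorch.

```python
def choose_device_and_dtype(cuda_available: bool) -> tuple[str, str]:
    """Pick a (device, dtype name) pair that avoids float16 math on the CPU.

    Hypothetical helper for illustration; not a transformers or PyTorch API.
    """
    if cuda_available:
        # CUDA kernels support float16, so half precision is fine on the GPU.
        return "cuda", "float16"
    # Assumption: this PyTorch build lacks a float16 LayerNorm kernel on CPU,
    # which is what the RuntimeError above points to.
    return "cpu", "float32"


# Intended use with the snippet above (not run here; needs torch and the model):
#   device, dtype_name = choose_device_and_dtype(torch.cuda.is_available())
#   model = model.to(device, dtype=getattr(torch, dtype_name))
#   input_ids = input_ids.to(device)  # inputs must be on the same device as the model
```

Even with the model on the GPU, memory may still be the limiting factor: 6 billion parameters at 2 bytes each come to roughly 12 GB of float16 weights, which is more than a GTX 1060 provides.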