Hello! I’m a CS student and a junior developer (mostly backend, but I can handle some full-stack tasks), and I’m currently choosing a topic for my thesis. It will be on ML, and at the moment I find GPT-J (and GPT-3, but that’s not the topic here) really fascinating. I’m trying to run text generation on my local machine, but my ML experience is pretty basic (mostly classifiers) and I’m having trouble running the GPT-J 6B model locally. This might also be due to my medium-low-spec PC (GPU: AMD RX 480 4 GB, 16 GB RAM, CPU: AMD Ryzen 5 3600 6-core @ 3.60 GHz; from what I’ve read online, I might not even be able to run GPT-J locally).
So, my first question is: can I actually run GPT-J locally? Even if it’s slow; speed is not my goal right now. Yes, I’ve also considered Colab, but for now I need it to work without an internet connection, so locally.
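For context, here is my rough estimate of how much memory the weights alone would need (my own back-of-the-envelope math using an approximate 6B parameter count, so take it with a grain of salt; activations and overhead add more on top):

```python
# Rough memory footprint of just the weights of a ~6B-parameter model.
# (Back-of-the-envelope numbers, not measured on my machine.)

def weight_gib(n_params: int, bytes_per_param: int) -> float:
    # GiB needed to hold n_params parameters at the given precision.
    return n_params * bytes_per_param / 2**30

fp16_gib = weight_gib(6_000_000_000, 2)  # half precision: ~11 GiB
fp32_gib = weight_gib(6_000_000_000, 4)  # single precision: ~22 GiB
print(f"fp16: {fp16_gib:.1f} GiB, fp32: {fp32_gib:.1f} GiB")
```

So fp16 weights would already be far beyond my 4 GB of VRAM, and fp32 would be beyond my 16 GB of RAM without swap, which is part of why I'm unsure this can work at all.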
Second question (assuming the answer to the first is “yes, you can run it locally”):
I’m trying to run it with this test code:
from transformers import GPTJForCausalLM, AutoTokenizer
import torch
model = GPTJForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", revision="float16", torch_dtype=torch.float16, low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
context = """In a shocking finding, scientists discovered a herd of unicorns living in a remote,
previously unexplored valley, in the Andes Mountains. Even more surprising to the
researchers was the fact that the unicorns spoke perfect English."""
input_ids = tokenizer(context, return_tensors="pt").input_ids
gen_tokens = model.generate(input_ids, do_sample=True, temperature=0.9, max_length=100,)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)
It downloaded the 6B model, but then it got stuck in the generate call. The error looks like a coding mistake, which is odd because it’s raised inside the library, not in my code. The error is:
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
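My reading of this error (not certain): with revision="float16" the weights load in half precision, and since the model ends up on CPU (stock PyTorch CUDA builds don’t cover an AMD RX 480), PyTorch hits a missing CPU kernel for float16 LayerNorm. A sketch of what I’m planning to try, loading in float32 on CPU (the model-loading part is untested end to end on my machine):

```python
import torch

# Hypothesis: "LayerNormKernelImpl not implemented for 'Half'" means the
# model ran on CPU, where PyTorch has no float16 LayerNorm kernel.

def pick_dtype(cuda_available: bool) -> torch.dtype:
    # Half precision only makes sense on a CUDA GPU; on CPU, fall back
    # to float32 (which needs roughly twice the memory for the weights).
    return torch.float16 if cuda_available else torch.float32

def load_model():
    # Lazy import: the checkpoint download is huge, so this is just the
    # shape of the call, not something I've verified runs on my specs.
    from transformers import GPTJForCausalLM
    return GPTJForCausalLM.from_pretrained(
        "EleutherAI/gpt-j-6B",
        torch_dtype=pick_dtype(torch.cuda.is_available()),
        low_cpu_mem_usage=True,
    )
```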
I also tried running it with TensorFlow:
from transformers import GPTJForCausalLM, AutoTokenizer
import tensorflow as tf
model = GPTJForCausalLM.from_pretrained("EleutherAI/gpt-j-6B", revision="float16", torch_dtype=tf.float16, low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
context = """In a shocking finding, scientists discovered a herd of unicorns living in a remote,
previously unexplored valley, in the Andes Mountains. Even more surprising to the
researchers was the fact that the unicorns spoke perfect English."""
input_ids = tokenizer(context, return_tensors="pt").input_ids
print ("Generating...")
gen_tokens = model.generate(input_ids, do_sample=True, temperature=0.9, max_length=100,)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)
I got a different error:
File "filePath\venv\lib\site-packages\transformers\modeling_utils.py", line 1044, in _set_default_torch_dtype
    if not dtype.is_floating_point:
AttributeError: 'DType' object has no attribute 'is_floating_point'
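From poking at that line of modeling_utils.py, my understanding (again, not certain) is that I mixed frameworks: torch_dtype expects a torch.dtype, and I passed TensorFlow’s tf.float16, which is a different DType object without the attribute transformers checks. GPTJForCausalLM is the PyTorch class either way, so I suspect the call should look like this (untested sketch; a genuinely TensorFlow model would presumably need a TF model class instead):

```python
import torch

# torch dtypes do expose the attribute transformers checks in
# _set_default_torch_dtype; TensorFlow's tf.float16 doesn't.
assert torch.float16.is_floating_point
assert not torch.int64.is_floating_point

def load_model_fp32():
    # Lazy import to avoid the big checkpoint download here.
    from transformers import GPTJForCausalLM
    return GPTJForCausalLM.from_pretrained(
        "EleutherAI/gpt-j-6B",
        torch_dtype=torch.float32,  # a real torch dtype; CPU-safe precision
        low_cpu_mem_usage=True,
    )
```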
I honestly don’t know where to start with running GPT-J locally: I’ve tried a bunch of online guides (including Hugging Face’s example code), but I still get errors.
Thanks anyway!