Hi there, I’m trying to use optimum-cli to export a Neuron version of the
meta-llama/Llama-2-13b-hf model, but the export fails with a Python
AttributeError.
I’m running Ubuntu 20.04 with PyTorch 1.13 on an inf2.8xlarge instance.
This is the command, run from within the aws_neuron_venv_pytorch venv:
optimum-cli export neuron -m meta-llama/Llama-2-13b-hf --sequence_length 4096 --batch_size 1 llama-2-13b-neuron-4096-1
The output:
/usr/lib/python3.8/runpy.py:127: RuntimeWarning: 'optimum.exporters.neuron.__main__' found in sys.modules after import of package 'optimum.exporters.neuron', but prior to execution of 'optimum.exporters.neuron.__main__'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/optimum/exporters/neuron/__main__.py", line 342, in <module>
    main()
  File "/opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/optimum/exporters/neuron/__main__.py", line 325, in main
    input_shapes = normalize_input_shapes(task, args)
  File "/opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/optimum/exporters/neuron/__main__.py", line 109, in normalize_input_shapes
    mandatory_axes = neuron_config_constructor.func.get_mandatory_axes_for_task(task)
AttributeError: type object 'LLamaNeuronConfig' has no attribute 'get_mandatory_axes_for_task'
Traceback (most recent call last):
  File "/opt/aws_neuron_venv_pytorch/bin/optimum-cli", line 8, in <module>
    sys.exit(main())
  File "/opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/optimum/commands/optimum_cli.py", line 163, in main
    service.run()
  File "/opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/optimum/commands/export/neuronx.py", line 155, in run
    subprocess.run(full_command, shell=True, check=True)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'python3 -m optimum.exporters.neuron -m meta-llama/Llama-2-13b-hf --sequence_length 4096 --batch_size 1 llama-2-13b-neuron-4096-1' returned non-zero exit status 1.
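In case it helps with triage: the crash happens inside normalize_input_shapes, which reaches through a functools.partial (via its .func attribute) to call get_mandatory_axes_for_task as a classmethod on the config class, and the class it finds doesn't define that method. A minimal sketch of the same failure pattern (the class below is a hypothetical stand-in I wrote for illustration, not the real optimum class):

```python
import functools

# Hypothetical stand-in for the config class; it deliberately lacks the
# classmethod that the exporter expects to find.
class LlamaNeuronConfigSketch:
    def __init__(self, task: str):
        self.task = task

# The exporter builds the config through functools.partial, then reaches the
# underlying class via the partial's .func attribute to call a classmethod.
neuron_config_constructor = functools.partial(
    LlamaNeuronConfigSketch, task="text-generation"
)

try:
    neuron_config_constructor.func.get_mandatory_axes_for_task("text-generation")
except AttributeError as exc:
    # prints: type object 'LlamaNeuronConfigSketch' has no attribute
    # 'get_mandatory_axes_for_task'
    print(exc)
```

So it looks like a mismatch between the installed optimum/optimum-neuron versions (the CLI expects a config class API that my installed class doesn't have), rather than anything specific to my command line.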