Optimum-cli export error exporting Llama 2 HF to inf2

Hi there, I’m trying to use the optimum-cli command to export a Neuron version of the meta-llama/Llama-2-13b-hf model, but the export fails with an AttributeError.

I’m running on Ubuntu 20.04 with PyTorch 1.13 on an inf2.8xlarge instance.

The command I run from within the aws_neuron_venv_pytorch venv:

optimum-cli export neuron -m meta-llama/Llama-2-13b-hf --sequence_length 4096 --batch_size 1 llama-2-13b-neuron-4096-1
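In case version information helps, here is the stdlib-only snippet I use to list the packages installed in the venv (the package names are just my guesses at what's relevant here; adjust as needed):

```python
# Hedged sketch: print versions of the packages I suspect matter for the
# export. Uses only the standard library (importlib.metadata, Python 3.8+).
from importlib.metadata import version, PackageNotFoundError


def pkg_version(name):
    """Return the installed version of `name`, or 'not installed'."""
    try:
        return version(name)
    except PackageNotFoundError:
        return "not installed"


for name in ("optimum", "optimum-neuron", "transformers", "torch", "torch-neuronx"):
    print(f"{name}: {pkg_version(name)}")
```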

The output:

/usr/lib/python3.8/runpy.py:127: RuntimeWarning: 'optimum.exporters.neuron.__main__' found in sys.modules after import of package 'optimum.exporters.neuron', but prior to execution of 'optimum.exporters.neuron.__main__'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/optimum/exporters/neuron/__main__.py", line 342, in <module>
    main()
  File "/opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/optimum/exporters/neuron/__main__.py", line 325, in main
    input_shapes = normalize_input_shapes(task, args)
  File "/opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/optimum/exporters/neuron/__main__.py", line 109, in normalize_input_shapes
    mandatory_axes = neuron_config_constructor.func.get_mandatory_axes_for_task(task)
AttributeError: type object 'LLamaNeuronConfig' has no attribute 'get_mandatory_axes_for_task'
Traceback (most recent call last):
  File "/opt/aws_neuron_venv_pytorch/bin/optimum-cli", line 8, in <module>
    sys.exit(main())
  File "/opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/optimum/commands/optimum_cli.py", line 163, in main
    service.run()
  File "/opt/aws_neuron_venv_pytorch/lib/python3.8/site-packages/optimum/commands/export/neuronx.py", line 155, in run
    subprocess.run(full_command, shell=True, check=True)
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'python3 -m optimum.exporters.neuron -m meta-llama/Llama-2-13b-hf --sequence_length 4096 --batch_size 1 llama-2-13b-neuron-4096-1' returned non-zero exit status 1.
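For what it's worth, the failing line looks up a classmethod directly on the config type, so any optimum-neuron version whose LLamaNeuronConfig predates get_mandatory_axes_for_task would fail exactly like this. A minimal stand-in (not optimum's actual class, just a hypothetical stub) reproduces the same error shape:

```python
# Hypothetical stub, NOT optimum's real class: it only illustrates that
# looking up a missing classmethod on a type raises this same AttributeError.
class LLamaNeuronConfigStub:
    pass


try:
    LLamaNeuronConfigStub.get_mandatory_axes_for_task("text-generation")
except AttributeError as exc:
    # Message mirrors the traceback: type object '...' has no attribute '...'
    print(exc)
```

Which makes me suspect a version mismatch between optimum and optimum-neuron in this venv rather than a problem with the model itself.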