The first model is Llava-llama-3-8b-v1-1. The conversion script looks like this:
cat conv.py
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import onnx
from onnxruntime import InferenceSession
from PIL import Image
def convert_to_onnx(model, tokenizer, model_name='test.onnx'):
    dummy_input = tokenizer("Inserting some dummy text here since I don't have your actual input.",
                            return_tensors='pt').input_ids
    torch.onnx.export(
        model,                    # model being run
        dummy_input,              # model input (or a tuple for multiple inputs)
        model_name,               # where to save the model (can be a file or file-like object)
        export_params=True,       # store the trained parameter weights inside the model file
        opset_version=11,         # the ONNX version to export the model to
        do_constant_folding=True  # whether to execute constant folding for optimization
    )
    print(f"Model converted to {model_name}.")

def process_model(onnx_model, image, question):
    print("\nQuestion: " + question)
    # Placeholder for actual InferenceSession and processing
    session = InferenceSession(onnx_model)
    input_name = session.get_inputs()[0].name
    #image_flags = torch.Tensor([0, 1, 2]) # Example initialization
    #image_flags = image_flags.squeeze(-1)
    # Placeholder: Convert your image into the required input format
    # result = session.run(None, {input_name: image}) [ adjust this according to your input type ]
    # print(result)

path = "xtuner/llava-llama-3-8b-v1_1"
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
convert_to_onnx(model, tokenizer)
image = Image.open("tmp2.jpg")
image.show()
question = "Please describe the image."
process_model("test.onnx", image, question)
Running conv.py produces the following output:
[…]
model-00009-of-00009.safetensors: 100%|██████████| 1.05G/1.05G [02:28<00:00, 7.08MB/s]
Downloading shards: 100%|██████████| 9/9 [37:50<00:00, 252.25s/it]
Loading checkpoint shards: 100%|██████████| 9/9 [00:31<00:00, 3.50s/it]
generation_config.json: 100%|██████████| 126/126 [00:00<00:00, 715kB/s]
tokenizer_config.json: 100%|██████████| 51.0k/51.0k [00:00<00:00, 909kB/s]
tokenizer.json: 100%|██████████| 9.08M/9.08M [00:01<00:00, 6.14MB/s]
special_tokens_map.json: 100%|██████████| 301/301 [00:00<00:00, 1.78MB/s]
/home/martin/esn_vqa/lib/python3.12/site-packages/transformers/models/llama/modeling_llama.py:726: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if sequence_length != 1:
Traceback (most recent call last):
  File "/home/martin/esn_vqa/conv_moon.py", line 35, in <module>
    convert_to_onnx(model, tokenizer)
  File "/home/martin/esn_vqa/conv_moon.py", line 10, in convert_to_onnx
    torch.onnx.export(
  File "/home/martin/esn_vqa/lib/python3.12/site-packages/torch/onnx/__init__.py", line 375, in export
    export(
  File "/home/martin/esn_vqa/lib/python3.12/site-packages/torch/onnx/utils.py", line 502, in export
    _export(
  File "/home/martin/esn_vqa/lib/python3.12/site-packages/torch/onnx/utils.py", line 1564, in _export
    graph, params_dict, torch_out = _model_to_graph(
                                    ^^^^^^^^^^^^^^^^
  File "/home/martin/esn_vqa/lib/python3.12/site-packages/torch/onnx/utils.py", line 1113, in _model_to_graph
    graph, params, torch_out, module = _create_jit_graph(model, args)
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/martin/esn_vqa/lib/python3.12/site-packages/torch/onnx/utils.py", line 997, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/martin/esn_vqa/lib/python3.12/site-packages/torch/onnx/utils.py", line 904, in _trace_and_get_graph_from_model
    trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/martin/esn_vqa/lib/python3.12/site-packages/torch/jit/_trace.py", line 1500, in _get_trace_graph
    outs = ONNXTracedModule(
           ^^^^^^^^^^^^^^^^^
  File "/home/martin/esn_vqa/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/martin/esn_vqa/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/martin/esn_vqa/lib/python3.12/site-packages/torch/jit/_trace.py", line 139, in forward
    graph, out = torch._C._create_graph_by_tracing(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/martin/esn_vqa/lib/python3.12/site-packages/torch/jit/_trace.py", line 133, in wrapper
    out_vars, _ = _flatten(outs)
                  ^^^^^^^^^^^^^^
RuntimeError: Only tuples, lists and Variables are supported as JIT inputs/outputs. Dictionaries and strings are also accepted, but their usage is not recommended. Here, received an input of unsupported type: DynamicCache
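The failure appears to come from the traced forward returning a DynamicCache object (the key/value cache) among its outputs, which the JIT tracer cannot flatten. Below is a minimal sketch of one possible workaround, assuming it is enough to keep cache objects out of the traced outputs: wrap the model so forward returns only the logits and passes use_cache=False. The LogitsOnly wrapper and the opset choice are assumptions I have not verified against this checkpoint.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "xtuner/llava-llama-3-8b-v1_1"
model = AutoModelForCausalLM.from_pretrained(path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)

class LogitsOnly(torch.nn.Module):
    """Hypothetical wrapper: expose only the logits tensor so the JIT tracer
    never sees the DynamicCache object in the model's outputs."""
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_ids):
        # use_cache=False asks the model not to build a past-key-values cache
        outputs = self.model(input_ids=input_ids, use_cache=False)
        return outputs.logits

dummy_input = tokenizer("Some dummy text.", return_tensors="pt").input_ids
torch.onnx.export(
    LogitsOnly(model).eval(),
    dummy_input,
    "test.onnx",
    export_params=True,
    opset_version=14,        # assumption: opset 11 may be too old for this attention implementation
    do_constant_folding=True
)

Whether that would be enough for the full multimodal model is another question, since a text-only dummy input never exercises the vision tower.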