I am currently using the ONNX export of MusicGen to generate music. However, I ran into a problem: how should I construct the inputs when the model expects inputs like this?
=== Model: decoder_model_merged.onnx ===
Inputs:
Name: encoder_attention_mask Shape: [‘batch_size’, ‘encoder_sequence_length’] DType: int64
Name: input_ids Shape: [‘total_batch_size_x_num_codebooks’, ‘decoder_sequence_length’] DType: int64
Name: encoder_hidden_states Shape: [‘total_batch_size’, ‘encoder_sequence_length’, 768] DType: float32
Name: past_key_values.0.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.0.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.0.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.0.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.1.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.1.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.1.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.1.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.2.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.2.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.2.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.2.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.3.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.3.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.3.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.3.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.4.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.4.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.4.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.4.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.5.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.5.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.5.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.5.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.6.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.6.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.6.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.6.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.7.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.7.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.7.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.7.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.8.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.8.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.8.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.8.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.9.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.9.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.9.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.9.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.10.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.10.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.10.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.10.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.11.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.11.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.11.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.11.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.12.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.12.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.12.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.12.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.13.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.13.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.13.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.13.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.14.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.14.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.14.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.14.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.15.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.15.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.15.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.15.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.16.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.16.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.16.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.16.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.17.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.17.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.17.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.17.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.18.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.18.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.18.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.18.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.19.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.19.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.19.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.19.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.20.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.20.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.20.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.20.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.21.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.21.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.21.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.21.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.22.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.22.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.22.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.22.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.23.decoder.key Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.23.decoder.value Shape: [‘total_batch_size’, 16, ‘past_decoder_sequence_length’, 64] DType: float32
Name: past_key_values.23.encoder.key Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: past_key_values.23.encoder.value Shape: [‘total_batch_size’, 16, ‘encoder_sequence_length_out’, 64] DType: float32
Name: use_cache_branch Shape: [1] DType: bool
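For reference, here is a minimal sketch of how the symbolic dimensions in the listing above could be resolved to concrete shapes for the first decoding step. The function name `first_step_shapes` is hypothetical. It assumes that `total_batch_size` equals `batch_size` (no classifier-free guidance doubling), that MusicGen uses 4 codebooks, that the first step feeds a single decoder token, and that the unused past tensors can be dummies of length 1. None of these assumptions is confirmed by the listing itself; the layer count (24), head count (16), head size (64), and hidden size (768) are read from it.

```python
from typing import Dict, List, Tuple

def first_step_shapes(
    batch_size: int,
    encoder_len: int,
    num_codebooks: int = 4,    # assumption: MusicGen codebook count
    num_layers: int = 24,      # from the listing: layers 0..23
    num_heads: int = 16,       # from the listing
    head_dim: int = 64,        # from the listing
    hidden_size: int = 768,    # from the listing
    dummy_past_len: int = 1,   # assumption: length of the placeholder cache
) -> Dict[str, Tuple[List[int], str]]:
    """Map every decoder input name to a (shape, dtype) pair for step one."""
    shapes: Dict[str, Tuple[List[int], str]] = {
        "encoder_attention_mask": ([batch_size, encoder_len], "int64"),
        # assumption: one start token per codebook on the first step
        "input_ids": ([batch_size * num_codebooks, 1], "int64"),
        "encoder_hidden_states": ([batch_size, encoder_len, hidden_size], "float32"),
        "use_cache_branch": ([1], "bool"),
    }
    for layer in range(num_layers):
        for kind in ("decoder", "encoder"):
            for part in ("key", "value"):
                name = f"past_key_values.{layer}.{kind}.{part}"
                # Placeholder cache; assumed to be ignored when the cache branch is off.
                shapes[name] = ([batch_size, num_heads, dummy_past_len, head_dim], "float32")
    return shapes

shapes = first_step_shapes(batch_size=1, encoder_len=12)
print(len(shapes), shapes["input_ids"])
```

Each entry could then be allocated with something like `numpy.zeros(shape, dtype=dtype)`, which leaves `use_cache_branch` as `False` for the first step, while `input_ids`, `encoder_attention_mask`, and `encoder_hidden_states` would be filled with the real start tokens and text encoder outputs.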
Also, when it comes to the second step of generation, I encounter this error:
2025-03-21 22:59:12.9606373 [E:onnxruntime:, sequential_executor.cc:572 onnxruntime::ExecuteKernel] Non-zero status code returned while running Reshape node. Name:'/decoder/model/decoder/layers.0/encoder_attn/Reshape_4' Status Message: D:\a_work\1\s\onnxruntime\core/providers/cpu/tensor/reshape_helper.h:30 onnxruntime::ReshapeHelper::ReshapeHelper i < input_shape.NumDimensions() was false. The dimension with value zero exceeds the dimension size of the input tensor.
2025-03-21 22:59:12.9689039 [E:onnxruntime:, sequential_executor.cc:572 onnxruntime::ExecuteKernel] Non-zero status code returned while running If node. Name:'optimum::if' Status Message: Non-zero status code returned while running Reshape node. Name:'/decoder/model/decoder/layers.0/encoder_attn/Reshape_4' Status Message: D:\a_work\1\s\onnxruntime\core/providers/cpu/tensor/reshape_helper.h:30 onnxruntime::ReshapeHelper::ReshapeHelper i < input_shape.NumDimensions() was false. The dimension with value zero exceeds the dimension size of the input tensor.
Traceback (most recent call last):
  File "d:\allcode\on-device-music-generator\text_code\musicgen_generate.py", line 444, in
    musicgen_onnx.test_onnx()
  File "d:\allcode\on-device-music-generator\text_code\musicgen_generate.py", line 132, in test_onnx
    logits, past_key_values = self.run_cached_step(decoder_inputs, past_key_values)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\allcode\on-device-music-generator\text_code\musicgen_generate.py", line 251, in run_cached_step
    outputs = self.sessions['decoder_merged'].run(None, decoder_inputs)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\31819\anaconda3\envs\all-stack-ai\Lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 270, in run
    return self._sess.run(output_names, input_feed, run_options)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running If node. Name:'optimum::if' Status Message: Non-zero status code returned while running Reshape node. Name:'/decoder/model/decoder/layers.0/encoder_attn/Reshape_4' Status Message: D:\a_work\1\s\onnxruntime\core/providers/cpu/tensor/reshape_helper.h:30 onnxruntime::ReshapeHelper::ReshapeHelper i < input_shape.NumDimensions() was false. The dimension with value zero exceeds the dimension size of the input tensor.
Does anyone know how to solve it?
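One possibility worth checking, which the log does not confirm: since the failing node is in `encoder_attn`, the encoder (cross-attention) cache fed into the cached step may be empty or malformed. The sketch below shows one way to carry the cache between steps so that the encoder key/value tensors from the first step are reused, rather than overwritten by whatever the cached branch returns. The helper `next_past` is hypothetical. It assumes the outputs are named `present.N.decoder.key` and so on, mirroring the `past_key_values.N...` input names, and that the cached branch returns placeholder encoder tensors.

```python
from typing import Any, Dict, List

def next_past(
    prev_feed: Dict[str, Any],
    output_names: List[str],
    outputs: List[Any],
    first_step: bool,
) -> Dict[str, Any]:
    """Build the past_key_values.* entries for the next decoder call."""
    # Pair each output tensor with its name (assumed "present.N.kind.part").
    present = dict(zip(output_names, outputs))
    past: Dict[str, Any] = {}
    for name, value in prev_feed.items():
        if not name.startswith("past_key_values."):
            continue  # skip input_ids, masks, use_cache_branch, ...
        if ".encoder." in name and not first_step:
            # Assumption: the encoder cache does not change after step one,
            # so keep the tensor that was fed in.
            past[name] = value
        else:
            # Decoder cache grows every step; on the first step the encoder
            # cache is also taken from the outputs.
            past[name] = present[name.replace("past_key_values", "present", 1)]
    return past

# Stand-in values (strings instead of arrays) to show the bookkeeping only.
feed = {
    "input_ids": "ids",
    "past_key_values.0.decoder.key": "dec_k_step1",
    "past_key_values.0.encoder.key": "enc_k_step1",
}
names = ["logits", "present.0.decoder.key", "present.0.encoder.key"]
outs = ["logits_step2", "dec_k_step2", "placeholder"]
print(next_past(feed, names, outs, first_step=False))
```

With a real session, the output names could be read with `[o.name for o in session.get_outputs()]`, and printing the shapes of the `present.*.encoder.*` outputs after each step would show whether this assumption holds for this export.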