AttributeError: 'NoneType' object has no attribute 'shape'

The following call raises AttributeError: 'NoneType' object has no attribute 'shape':

with torch.no_grad():
    output_ids = model.generate(
        input_ids=input_ids,
        images=image_tensor,
        max_new_tokens=256,
        do_sample=True
    )

print("Prompt:", prompt)
print("Type of input_ids:", input_ids.dtype)
print("Shape of input_ids before generate:", input_ids.shape)
print("Shape of image_tensor:", image_tensor.shape)
print("Type of image_tensor:", image_tensor.dtype)

All of these print statements give the expected output. Could anyone suggest what leads to the error? I am running an inference check after fine-tuning a multimodal model.


You’re probably passing images=image_tensor into a model that doesn’t expect or properly handle that argument. Inside generate(), Hugging Face or custom model code may expect the images argument to be consumed by prepare_inputs_for_generation() or forward(). If that step is not implemented correctly, it can return None, and the error is raised when .shape is accessed on that value.

Another possibility: does the model support multimodal input at all?
For example, a typical AutoModelForCausalLM does not support an images parameter.

Try this

output_ids = model.generate(input_ids=input_ids, max_new_tokens=256, do_sample=True)

If this works without error, your model or its generation path is not correctly set up to handle multimodal input.
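One way to check which image argument a model declares is to inspect its forward signature. The sketch below uses only the standard library; the three forward functions are hypothetical stand-ins, not real model code, and on a real model you would pass model.forward instead. Arguments hidden behind **kwargs are not counted, so an empty result is a hint rather than proof.

```python
import inspect


def explicit_params(fn):
    """Return the names of the explicitly declared parameters of fn."""
    return [
        name
        for name, param in inspect.signature(fn).parameters.items()
        if param.kind not in (param.VAR_POSITIONAL, param.VAR_KEYWORD)
    ]


def image_kwargs_accepted(fn, candidates=("images", "pixel_values")):
    """Return the candidate image argument names that fn declares explicitly."""
    params = explicit_params(fn)
    return [name for name in candidates if name in params]


# Hypothetical stand-ins for different kinds of forward() methods.
def text_only_forward(input_ids=None, attention_mask=None, **kwargs):
    return None


def images_style_forward(input_ids=None, images=None, **kwargs):
    return None


def pixel_values_style_forward(input_ids=None, pixel_values=None):
    return None


print(image_kwargs_accepted(text_only_forward))
print(image_kwargs_accepted(images_style_forward))
print(image_kwargs_accepted(pixel_values_style_forward))
```

On a loaded model, image_kwargs_accepted(model.forward) would show whether images, pixel_values, or neither is declared.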

Leave a like if this helped you at all :)

Fix:
You are passing images=image_tensor to model.generate, but your model or generation config likely does not support the images argument, or your fine-tuned model doesn’t handle multimodal input as expected.

Direct script correction:

Try this: remove images if the model doesn’t support multimodal inference.

output_ids = model.generate(
    input_ids=input_ids,
    max_new_tokens=256,
    do_sample=True
)

If your model does support images, make sure image_tensor is not None and is properly preprocessed. Otherwise, the error means image_tensor is None when accessed inside the generate function.

Check:

assert image_tensor is not None, "image_tensor is None"
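The same check can be applied to every input at once. This is a minimal sketch with a hypothetical helper, check_generate_inputs, and a stand-in FakeTensor class so it runs without torch. If it passes, as the print statements in the question suggest it would, the None value is being produced inside the model code rather than passed in by the caller.

```python
def check_generate_inputs(**named_inputs):
    """Raise if any input is None; otherwise return each input's shape."""
    missing = [name for name, value in named_inputs.items() if value is None]
    if missing:
        raise ValueError("these inputs are None: " + ", ".join(missing))
    # Objects without a .shape attribute are reported as None.
    return {
        name: getattr(value, "shape", None)
        for name, value in named_inputs.items()
    }


class FakeTensor:
    """Stand-in for a tensor; only carries a shape (assumption for the demo)."""

    def __init__(self, shape):
        self.shape = shape


shapes = check_generate_inputs(
    input_ids=FakeTensor((1, 42)),
    images=FakeTensor((1, 3, 336, 336)),
)
print(shapes)
```

With real tensors the call would be check_generate_inputs(input_ids=input_ids, images=image_tensor), placed just before model.generate.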

Solution provided by Triskel Data Deterministic AI.


My base model is liuhaotian/llava-v1.5-7b. I fine-tuned it and pushed it to Hugging Face. For the inference check I am passing both an image and text as input. Please guide me.


If you want to use images, it seems that you need to pass the pixel_values argument instead of the images argument for the LLaVA model.

https://stackoverflow.com/questions/1109422/getting-list-of-pixel-values-from-pil