classify_interface = gr.Interface(
    fn=predict_disease,
    inputs=gr.Image(type="pil", label="Image"),
    outputs=[gr.Label(num_top_classes=3, label="Predictions"),
             gr.Number(label="Prediction time (secs)")],
    examples=example_list,
    title=title_classify,
    description=description_classify,
    article=article_classify
)

attention_interface = gr.Interface(
    fn=plot_attention,
    inputs=[gr.Image(type="pil", label="Image"),
            gr.Dropdown(choices=["1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12"],
                        label="Attention Layer", value="6")],
    outputs=gr.Gallery(value=attention_maps, label="Attention Maps").style(grid=(3, 4)),
    examples=example_list,
    title=title_attention,
    description=description_attention,
    article=article_attention
)

demo = gr.TabbedInterface([classify_interface, attention_interface],
                          ["Identify Disease", "Visualize Attention Map"],
                          title="NatureAI Diagnostics🧑🩺")

if __name__ == "__main__":
    demo.launch()
So basically there are two tabs: on the first, the user sees the predictions for the uploaded image; on the second, depending on which encoder layer the user picks from the dropdown, the app returns a grid of images showing the attention weights of the transformer network for that layer.
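For context, predict_disease just has to return something matching the two outputs above, i.e. a dict of class probabilities for gr.Label and a float for gr.Number. A simplified sketch (not my real function, which runs the actual model; the class names and probabilities here are placeholders):

import time

def predict_disease(image):
    """Toy stand-in: return a label->probability dict and the prediction time in seconds."""
    start = time.time()
    # ... run the model on `image` here ...
    probs = {"healthy": 0.7, "rust": 0.2, "scab": 0.1}  # placeholder probabilities
    return probs, round(time.time() - start, 4)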
The main issue is this final error message:
File "app.py", line 88, in plot_attention
with nopdb.capture_call(vision_transformer.blocks[int(layer_num)-1].attn.forward) as attn_call:
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
(Unable to post the full screenshot since new users can only post 1 media attachment)
Basically, the default value for my dropdown, which I set with value="6", gets replaced with None. This is the plot_attention function:
def plot_attention(image, layer_num):
    """Given an input image, plot the average attention weight given to each image patch by each attention head."""
    attention_map_outputs = []
    input_data = data_transforms(image)
    # capture the attention weights computed inside the chosen encoder block
    with nopdb.capture_call(vision_transformer.blocks[int(layer_num)-1].attn.forward) as attn_call:
        predict_tensor(input_data)
    attn = attn_call.locals['attn'][0]
    with torch.inference_mode():
        # loop over attention heads
        for h_weights in attn:
            h_weights = h_weights.mean(axis=-2)  # average over all attention keys
            h_weights = h_weights[1:]  # skip the [class] token
            output_img = plot_weights(input_data, h_weights)
            attention_map_outputs.append(output_img)
    return attention_map_outputs
attention_maps = plot_attention(random_image, "6")
I added print(image.shape) and print(layer_num) at the start of the function to check whether it was actually receiving any input when I first called it with attention_maps = plot_attention(random_image, "6"); it turns out it was. But once the interface takes over, the default value gets set to None.
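As a stopgap I could guard against the None at the top of the function (just a sketch; falling back to layer 6 is my own assumption), but that would only hide the problem rather than explain why the dropdown default arrives as None:

def plot_attention(image, layer_num="6"):
    """Same function as above, with a guard against a missing dropdown value."""
    # Fall back to layer 6 if the dropdown value comes through as None
    if layer_num is None:
        layer_num = "6"
    layer_idx = int(layer_num) - 1  # encoder blocks are 0-indexed
    ...  # rest of the function unchanged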
When I try to launch it locally or on Colab, everything works fine. But the weird thing is that even when I manually set value=None on my local machine or on Colab, the app still works fine for some reason. I'm really new to Gradio, so I'm sincerely sorry if this is an obvious mistake I'm unable to see; I just wanted to try something out as a pet project and would really appreciate some help. Thank you.
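In case it helps to reproduce this without the model, the dropdown wiring on its own boils down to something like this (echo_layer is just a placeholder, not part of my app):

import gradio as gr

def echo_layer(layer_num):
    # Simply report whatever value the dropdown passes in, so it is visible in the UI
    return f"received: {layer_num!r}"

demo_check = gr.Interface(
    fn=echo_layer,
    inputs=gr.Dropdown(choices=["1", "2", "3"], label="Attention Layer", value="2"),
    outputs=gr.Textbox(label="Value seen by the function")
)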