PyTorch Image Models (timm)

In the VisionTransformer class, the default act_layer is None. If we do not provide it, this leads to a TypeError in Mlp, because none of the classes (Block, Mlp, or VisionTransformer) handle this case. The error message:
TypeError: 'NoneType' object is not callable
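
A minimal reproduction (assuming a timm version where the None default is passed through to Mlp unchanged):

from timm.models.vision_transformer import VisionTransformer

# With act_layer left as None, Mlp ends up calling None() and raises:
#   TypeError: 'NoneType' object is not callable
model = VisionTransformer()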


Fix:
Always set act_layer to a valid activation function (e.g., nn.GELU, nn.ReLU) when instantiating VisionTransformer.
Example:

import torch.nn as nn
from timm.models.vision_transformer import VisionTransformer

model = VisionTransformer(act_layer=nn.GELU)

If not set, you'll get TypeError: 'NoneType' object is not callable.

Solution provided by Triskel Data Deterministic AI.


Hello @mohitb1i ,

In which PyTorch version are you experiencing this error?


Machine Learning Engineer at RidgeRun.ai
Contact us: support@ridgerun.ai


I understand, but I am saying the default value of act_layer should be nn.GELU, or it should be set at instantiation, like:

block_fn(
    ...
    act_layer=act_layer or nn.GELU,
    ...
)
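
For illustration, the same fallback as a standalone snippet (make_block is a hypothetical helper, not timm code):

import torch.nn as nn

def make_block(act_layer=None):
    # Fall back to nn.GELU when no activation class is provided
    act_layer = act_layer or nn.GELU
    return act_layer()

print(make_block())         # a GELU instance
print(make_block(nn.ReLU))  # a ReLU instance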

No, it is the Vision Transformer code from Hugging Face's pytorch-image-models (timm) repository:

original repo

code of Vision Transformer


Upon reviewing the code, it appears that this behavior likely stems from the fact that the VisionTransformer class is not meant to be instantiated directly. Instead, the recommended approach is to use the timm.create_model function, which handles proper initialization of the available Vision Transformer variants. For example, calling models like vit_small_patch16_224 or vit_large_patch32_384 through timm.create_model returns a fully configured VisionTransformer instance.

However, if you choose to instantiate the VisionTransformer class directly, you are likely responsible for explicitly providing certain arguments, such as act_layer, as you noted earlier.
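
For example, a minimal sketch of that recommended path (vit_small_patch16_224 is one of the registered variants mentioned above):

import timm

# create_model looks up the registered variant and returns a fully
# configured VisionTransformer, including a valid activation layer.
model = timm.create_model('vit_small_patch16_224', pretrained=False)
print(type(model).__name__)  # VisionTransformer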


Machine Learning Engineer at RidgeRun.ai
Contact us: support@ridgerun.ai


import torch
import torch.nn as nn

class VisionTransformer(nn.Module):
    def __init__(self, act_layer=None, **kwargs):
        super().__init__()
        # Default to GELU if none provided
        if act_layer is None:
            act_layer = nn.GELU

        # Support both nn.ReLU and nn.ReLU() styles
        self.act = act_layer() if isinstance(act_layer, type) else act_layer

        # Example MLP block using the activation
        self.mlp = nn.Sequential(
            nn.Linear(768, 3072),
            self.act,
            nn.Linear(3072, 768)
        )

    def forward(self, x):
        return self.mlp(x)

Example usage:

if __name__ == "__main__":
    model = VisionTransformer()
    x = torch.randn(1, 768)
    out = model(x)
    print(out.shape)

Solution provided by Triskel Data Deterministic AI.


Thanks, it was an oversight.

