In the VisionTransformer class, the default act_layer is None. If we do not provide it, this leads to a TypeError in MLP, because none of the classes (Block, MLP, or VisionTransformer) handles this case. The resulting error message:
TypeError: 'NoneType' object is not callable
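For context, here is a minimal, self-contained sketch of the failure mode (the class below is a hypothetical stand-in, not the actual timm code): the MLP block calls act_layer() to construct the activation module, so a None default fails at construction time.

import torch.nn as nn

# Hypothetical minimal MLP mirroring the failing pattern (not the real timm code)
class MLP(nn.Module):
    def __init__(self, dim=768, hidden=3072, act_layer=None):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden)
        self.act = act_layer()  # TypeError here if act_layer is None
        self.fc2 = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.fc2(self.act(self.fc1(x)))

MLP()  # raises TypeError: 'NoneType' object is not callable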
Fix:
Always set act_layer to a valid activation function (e.g., nn.GELU, nn.ReLU) when instantiating VisionTransformer.
Example:
import torch.nn as nn
model = VisionTransformer(act_layer=nn.GELU)
If not set, you'll get TypeError: 'NoneType' object is not callable.
Solution provided by Triskel Data Deterministic AI.
Hello @mohitb1i,
In which PyTorch version are you experiencing this error?
Machine Learning Engineer at RidgeRun.ai
Contact us: support@ridgerun.ai
I understand, but I am saying the default value of act_layer should be nn.GELU, or it should simply be set at instantiation, like:
block_fn(
    ...
    act_layer=act_layer or nn.GELU,
    ...
)
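As a quick illustration of that idiom (using a hypothetical helper, not the actual block_fn signature):

import torch.nn as nn

def build_act(act_layer=None):
    # Falls back to nn.GELU when act_layer is None
    act_layer = act_layer or nn.GELU
    return act_layer()

print(build_act())         # GELU()
print(build_act(nn.ReLU))  # ReLU()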
No, it is the Vision Transformer code from Hugging Face, the original repo.
Upon reviewing the code, it appears that this behavior likely stems from the fact that the VisionTransformer class is not meant to be instantiated directly. Instead, the recommended approach is to use the timm.create_model function, which handles proper initialization of the available Vision Transformer variants. For example, calling models like vit_small_patch16_224 or vit_large_patch32_384 through timm.create_model returns a fully configured VisionTransformer instance.
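For instance, something along these lines should return a ready-to-use model (assuming timm is installed):

import timm

# Factory call; the returned object is a fully configured VisionTransformer
model = timm.create_model('vit_small_patch16_224', pretrained=False)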
However, if you choose to instantiate the VisionTransformer class directly, you are probably responsible for explicitly providing certain arguments, such as act_layer, as you noted earlier.
Machine Learning Engineer at RidgeRun.ai
Contact us: support@ridgerun.ai
import torch
import torch.nn as nn

class VisionTransformer(nn.Module):
    def __init__(self, act_layer=None, **kwargs):
        super().__init__()
        # Default to GELU if none provided
        if act_layer is None:
            act_layer = nn.GELU
        # Support both nn.ReLU and nn.ReLU() styles
        self.act = act_layer() if isinstance(act_layer, type) else act_layer
        # Example MLP block using the activation
        self.mlp = nn.Sequential(
            nn.Linear(768, 3072),
            self.act,
            nn.Linear(3072, 768),
        )

    def forward(self, x):
        return self.mlp(x)
Example usage:
if __name__ == "__main__":
    model = VisionTransformer()
    x = torch.randn(1, 768)
    out = model(x)
    print(out.shape)
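One nice side effect of the isinstance(act_layer, type) check is that callers can pass either the class (nn.GELU) or an already-constructed instance (nn.GELU()), and both work.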
Solution provided by Triskel Data Deterministic AI.
Thanks, it was an oversight.