Loading Hugging Face EfficientNet weights into an efficientnet_pytorch model with the same architecture gives different results

from efficientnet_pytorch import EfficientNet
from transformers import EfficientNetForImageClassification

# Target model (efficientnet_pytorch) and fine-tuned source model (Hugging Face)
eff = EfficientNet.from_pretrained('efficientnet-b7', num_classes=50)
hf_model = EfficientNetForImageClassification.from_pretrained("checkpoint-6500")

# Pair the parameters of both models by position and copy the Hugging Face values over
eff_state_dict = eff.state_dict()
for (k1, v1), (k2, v2) in zip(eff.named_parameters(), hf_model.named_parameters()):
    eff_state_dict[k1].copy_(v2.data)

eff.load_state_dict(eff_state_dict)

I want to convert the Hugging Face model to ONNX, but the aten::_convolution_mode operator is not currently supported for conversion. I had previously converted efficientnet-b7 from efficientnet_pytorch to ONNX and it worked. So I tried copying the trained EfficientNet-B7 weights into an efficientnet_pytorch model, but the outputs of the two models are different. What am I missing here?
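
One check that might narrow this down is whether pairing entries by position is valid at all. In PyTorch, named_parameters() does not include buffers such as the BatchNorm running_mean and running_var, while state_dict() does, so copying only parameters leaves those statistics untouched. The helper below is a minimal sketch, not a fix: pairing_report and the toy entry names are hypothetical, and it only compares (name, shape) pairs, so it needs neither library to run.

```python
from typing import Iterable, List, Tuple

# One entry is (name, shape), e.g. built from a real model with:
#   [(k, tuple(v.shape)) for k, v in model.state_dict().items()]
Entry = Tuple[str, Tuple[int, ...]]


def pairing_report(src: Iterable[Entry], dst: Iterable[Entry]) -> List[str]:
    """List the problems found when pairing two entry lists by position."""
    src, dst = list(src), list(dst)
    problems = []
    if len(src) != len(dst):
        problems.append(f"entry count differs: {len(src)} vs {len(dst)}")
    # zip stops at the shorter list, exactly like the copy loop in the question
    for i, ((n1, s1), (n2, s2)) in enumerate(zip(src, dst)):
        if s1 != s2:
            problems.append(f"position {i}: {n1} {s1} vs {n2} {s2}")
    return problems


# Toy data with made-up names: the source has a buffer the target list lacks.
toy_src = [("conv.weight", (8, 3, 3, 3)), ("bn.weight", (8,)), ("bn.running_mean", (8,))]
toy_dst = [("stem.weight", (8, 3, 3, 3)), ("bn0.weight", (8,))]
report = pairing_report(toy_src, toy_dst)
for line in report:
    print(line)
```

An empty report only means the counts and shapes line up. It does not prove the mapping is right, because entries such as a BatchNorm weight and bias share the same shape. Both models would also need to be in eval() mode and receive identically preprocessed inputs before their outputs can be compared.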