Potential discrepancy between the Hugging Face and timm weights for google/vit-base-patch16-224-in21k

If I understand correctly, Hugging Face’s google/vit-base-patch16-224-in21k corresponds to timm’s vit_base_patch16_224.augreg_in21k.

However, I found that the Hugging Face model has a pooler layer that the timm model doesn’t have.
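
One way to see which modules exist on only one side is to compare the top-level names in each state dict. The following is a minimal sketch, assuming `transformers` and `timm` are installed and that `ViTModel` is the class being loaded; the helper names `top_level_names` and `list_top_level_modules` are made up for illustration.

```python
def top_level_names(keys):
    """Return the sorted, de-duplicated first dotted component of each parameter name."""
    return sorted({key.split(".", 1)[0] for key in keys})


def list_top_level_modules():
    """Load both checkpoints and return their top-level state-dict names.

    Imports are local so that top_level_names() can be used without
    transformers or timm installed. Calling this downloads both checkpoints.
    """
    import timm
    from transformers import ViTModel

    hf_model = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k")
    timm_model = timm.create_model(
        "vit_base_patch16_224.augreg_in21k", pretrained=True
    )
    return (
        top_level_names(hf_model.state_dict().keys()),
        top_level_names(timm_model.state_dict().keys()),
    )


# Usage (requires network access):
# hf_names, timm_names = list_top_level_modules()
# print("Hugging Face:", hf_names)
# print("timm:", timm_names)
```

Any name that appears in only one of the two lists (for example a pooler on the Hugging Face side, or a classification head on the timm side) is a module the other model does not carry.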

I also compared some specific weights, e.g.,

  1. Hugging Face: embeddings.patch_embeddings.projection.weight
  2. timm: patch_embed.proj.weight

These two tensors are not equal.
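
For reference, the comparison can be reproduced roughly as follows. This is a sketch under assumptions, not necessarily the exact check I ran: it assumes `transformers` and `timm` are installed, loads the Hugging Face side with `ViTModel`, and uses a hypothetical pure-Python helper `max_abs_diff` so the arithmetic can be checked without PyTorch. In practice `torch.allclose` on the two tensors would be the usual choice.

```python
def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two flat sequences of floats."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def compare_patch_embeddings():
    """Return the max absolute difference between the two patch-embedding weights.

    Imports are local so that max_abs_diff() works without these packages.
    Calling this downloads both checkpoints.
    """
    import timm
    from transformers import ViTModel

    hf_sd = ViTModel.from_pretrained(
        "google/vit-base-patch16-224-in21k"
    ).state_dict()
    timm_sd = timm.create_model(
        "vit_base_patch16_224.augreg_in21k", pretrained=True
    ).state_dict()

    hf_w = hf_sd["embeddings.patch_embeddings.projection.weight"]
    timm_w = timm_sd["patch_embed.proj.weight"]
    if hf_w.shape != timm_w.shape:
        raise ValueError(f"shape mismatch: {hf_w.shape} vs {timm_w.shape}")

    # Flatten to plain Python lists so the helper above can compare them.
    return max_abs_diff(hf_w.flatten().tolist(), timm_w.flatten().tolist())


# Usage (requires network access):
# print(compare_patch_embeddings())
```

A result at the level of floating-point noise would suggest the two checkpoints share the same weights; a clearly larger value suggests they come from different training runs.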

Another, more minor, difference could be the eps value used by LayerNorm.
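
The eps values can be read directly off the loaded modules. Below is a small sketch; the helper `layernorm_eps_values` is hypothetical, and it matches modules by class name rather than with `isinstance(module, torch.nn.LayerNorm)`, which is a simplification so that it does not need to import PyTorch itself.

```python
def layernorm_eps_values(model):
    """Collect the distinct eps values of all submodules whose class is named LayerNorm."""
    return sorted(
        {
            module.eps
            for _, module in model.named_modules()
            if type(module).__name__ == "LayerNorm"
        }
    )


# Usage (assumes transformers and timm are installed; requires network access):
# import timm
# from transformers import ViTModel
# hf_model = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k")
# timm_model = timm.create_model("vit_base_patch16_224.augreg_in21k", pretrained=True)
# print("Hugging Face:", layernorm_eps_values(hf_model))
# print("timm:", layernorm_eps_values(timm_model))
```

If the two printed lists differ, the models would produce slightly different outputs even with identical weights.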

I’m wondering whether the Hugging Face checkpoint was converted from the correct weights.