Wrong citation in ViT model cards

Hey!

For ViT models such as “vit-base-patch32-384”, you included Wu et al.'s paper (“Visual Transformers: Token-based Image Representation and Processing for Computer Vision”) in the “BibTeX entry and citation info” section, which I think is an error. I believe the correct reference is Dosovitskiy et al.'s paper ([2010.11929] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale). Here is the link to one of the affected ViT models:
google/vit-base-patch32-384 · Hugging Face
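
In case it helps, here is a rough sketch of what the replacement BibTeX entry for the Dosovitskiy et al. paper could look like. The citation key is my own pick and the fields follow the usual arXiv export layout, so please double-check everything against the arXiv page for 2010.11929 before copying it into the model cards:

```bibtex
% Sketch only: the key "dosovitskiy2020image" is an arbitrary choice,
% and the fields should be verified against arXiv:2010.11929.
@misc{dosovitskiy2020image,
  title         = {An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
  author        = {Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and
                   Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and
                   Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and
                   Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
  year          = {2020},
  eprint        = {2010.11929},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}
```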