Image captioning for Spanish with pre-trained vision and text models
For this project, a pre-trained image model like ViT can be used as the encoder, and a pre-trained text model like BERT or GPT2 can be used as the decoder.
Model
Pre-trained ViT and BERT models can be found on the model hub. We could use multilingual BERT/RoBERTa models for the Spanish language.
Datasets
The WIT dataset can be used for this task. It has over 1M image-text pairs for Spanish.
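Since WIT is distributed as large TSV files covering many languages, a first preprocessing step would be filtering down to the Spanish pairs. A minimal sketch, assuming the TSV columns `language`, `image_url`, and `caption_reference_description` (the tiny in-memory sample below is illustrative, not real WIT data):

```python
import csv
import io

# A tiny in-memory sample mimicking WIT's TSV layout; the real dump is a set
# of large TSV files with (among others) `language`, `image_url`, and
# `caption_reference_description` columns.
SAMPLE_TSV = """language\timage_url\tcaption_reference_description
es\thttp://example.com/a.jpg\tUn gato sobre la mesa
en\thttp://example.com/b.jpg\tA cat on the table
es\thttp://example.com/c.jpg\tUna playa al atardecer
"""

def spanish_pairs(tsv_text):
    """Keep only the Spanish image-text pairs from a WIT-style TSV."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        (row["image_url"], row["caption_reference_description"])
        for row in reader
        if row["language"] == "es"
    ]
```

In practice the same filter would be applied while streaming over the full TSV shards rather than loading them into memory.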
Available training scripts
As this will be a Seq2Seq model, the run_summarization_flax.py script can be used for training this model with some modifications.
(Optional) Desired project outcome
The desired outcome is to see whether pre-trained vision and text models can be leveraged for image captioning, and to train captioning models for the Spanish language. This can be showcased with a Streamlit or Gradio app.
(Optional) Challenges
This model will require some modifications to the existing text models. Specifically, as this will be a seq2seq model, we’ll need to add a randomly initialized cross-attention layer in BERT or GPT2 to use it as a decoder in the encoder-decoder setting.
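To make the cross-attention idea concrete, here is a minimal single-head sketch in numpy: the decoder's hidden states act as queries over the image encoder's hidden states (e.g. ViT patch embeddings), with freshly random projection weights standing in for the newly added layer. This is a conceptual sketch, not the actual BERT/GPT2 implementation:

```python
import numpy as np

def cross_attention(decoder_states, encoder_states, d_k=None):
    """Single-head cross-attention: decoder queries attend over encoder keys/values.

    decoder_states: (tgt_len, d_model) hidden states from the text decoder
    encoder_states: (src_len, d_model) hidden states from the image encoder
    """
    d_model = decoder_states.shape[-1]
    d_k = d_k or d_model
    rng = np.random.default_rng(0)
    # Randomly initialized projections, mirroring the fresh cross-attention
    # weights that would be added to BERT/GPT2.
    W_q = rng.normal(0.0, 0.02, (d_model, d_k))
    W_k = rng.normal(0.0, 0.02, (d_model, d_k))
    W_v = rng.normal(0.0, 0.02, (d_model, d_k))

    Q = decoder_states @ W_q
    K = encoder_states @ W_k
    V = encoder_states @ W_v

    scores = Q @ K.T / np.sqrt(d_k)                  # (tgt_len, src_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over image patches
    return weights @ V                               # (tgt_len, d_k)
```

In the real model, these projections would be multi-headed, trained jointly with the rest of the network, and inserted between the self-attention and feed-forward blocks of each decoder layer.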
(Optional) Links to read upon
This Keras tutorial is an excellent example of how an image encoder and a Transformer decoder can be used for image captioning.
Hi,
I would like to join this project. I am not experienced enough with NLP to create a project on my own, but I do have experience with computer vision overall, and would like to see how well Spanish text is generated for the image captioning.
I am David and I am in the Amsterdam timezone (CET). I have some experience with other frameworks and with training/evaluating object detection networks, but not with NLP. I can contribute by putting my previous knowledge and discipline towards achieving this goal.
Hey @dmatos2012, don’t worry about experience. We always try to make things easier for everyone, and we have a super cool speaker lineup for getting familiar with JAX/Flax/Transformers. And we will try to answer all questions :)
@mrm8488 For image captioning it’ll be more like an encoder-decoder model: the encoder will be an image model, and the decoder can be any transformer model with cross-attention, which will take the `hidden_states` from the image model and generate text auto-regressively.
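That auto-regressive generation step can be sketched as a greedy decoding loop conditioned on the encoder's hidden states. The `toy_step_fn` below is a hypothetical stand-in for a real decoder-with-cross-attention, just so the loop is runnable:

```python
import numpy as np

def greedy_caption(encoder_states, step_fn, bos_id=0, eos_id=1, max_len=10):
    """Greedy auto-regressive decoding conditioned on image hidden states.

    step_fn(tokens, encoder_states) must return logits over the vocabulary
    for the next token; in the real model this is the decoder forward pass.
    """
    tokens = [bos_id]
    for _ in range(max_len):
        logits = step_fn(tokens, encoder_states)
        next_id = int(np.argmax(logits))   # greedy: pick the top token
        tokens.append(next_id)
        if next_id == eos_id:              # stop at end-of-sequence
            break
    return tokens

def toy_step_fn(tokens, encoder_states, vocab_size=5):
    """Toy stand-in for a decoder with cross-attention: scores each vocab
    entry by similarity between pooled image features and fixed embeddings."""
    rng = np.random.default_rng(len(tokens))  # deterministic per step
    pooled = encoder_states.mean(axis=0)
    embeddings = rng.normal(0.0, 1.0, (vocab_size, pooled.shape[0]))
    return embeddings @ pooled
```

In practice one would use beam search or sampling instead of pure greedy decoding, but the conditioning structure (image hidden states fed into every decoding step via cross-attention) is the same.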
Hey there, this is very interesting. I have some experience with NLP and computer vision, and have always wanted to get more experience with multi-modal (text + vision) models. Ever since I first saw the WIT dataset I have wanted to use it for a project, so this seems like a good opportunity.
If you want to know a little more about my background, check out my GitHub.
Hello @valhalla @patrickvonplaten,
Hope you are doing well…
I am interested in this project as well; there is so much to learn from doing it. It is similar to CLIP, but in CLIP we use encoders for both image & text, while in this project we will use an encoder for the image and a decoder for the text.
Also, I have a suggestion: we could create a separate thread indexing all the JAX/Flax learning resources together in one place. It would be useful for all of us.
Cheers,
Sri Lakshmi