Hey @dmatos2012, don’t worry about experience. We always try to make things easier for everyone, and we have a super cool speaker lineup for getting familiar with JAX/Flax/Transformers. And we will try to answer all questions :)
@mrm8488 For image captioning it’ll be more like an encoder-decoder model. The encoder will be an image model, and the decoder can be any transformer model with cross-attention, which will take the hidden_states from the image model and generate text auto-regressively.
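A minimal sketch of that setup (not an official recipe, just one way to wire it up), assuming a transformers version that ships `FlaxVisionEncoderDecoderModel`; the ViT encoder and GPT-2 decoder checkpoints are only example choices:

```python
import requests
from PIL import Image
from transformers import (
    AutoTokenizer,
    FlaxVisionEncoderDecoderModel,
    ViTFeatureExtractor,
)

# Pair a ViT image encoder with a GPT-2 decoder. The decoder's cross-attention
# layers are newly initialized, so the combined model needs fine-tuning on a
# captioning dataset (e.g. COCO) before the captions are meaningful.
model = FlaxVisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "google/vit-base-patch16-224-in21k", "gpt2"
)
feature_extractor = ViTFeatureExtractor.from_pretrained("google/vit-base-patch16-224-in21k")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# GPT-2 has no pad token, so reuse EOS; the decoder starts generating from BOS.
model.config.eos_token_id = tokenizer.eos_token_id
model.config.pad_token_id = tokenizer.eos_token_id
model.config.decoder_start_token_id = tokenizer.bos_token_id

# Encode an example image into pixel_values; the encoder turns these into
# hidden_states that the decoder attends to via cross-attention.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
pixel_values = feature_extractor(images=image, return_tensors="np").pixel_values

# Captions are then generated auto-regressively, conditioned on the image.
output_ids = model.generate(pixel_values, max_length=16, num_beams=4).sequences
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
```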