Many image captioning systems exist for English. In this project, we will develop an image captioning system for an Indian language.
If we have the time and resources, we can extend this to other languages as well.
The dataset can be created by translating the captions of Flickr30k or another existing image captioning dataset.
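As a rough sketch of the translation step, something like the following could work. Everything here is an assumption for illustration: the `(image_id, caption)` pair layout, the function names, and especially `placeholder_translate`, which only tags the text with a hypothetical `hi` language code and would have to be replaced by a real machine translation model or service.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def translate_captions(
    pairs: Iterable[Tuple[str, str]],
    translate: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Group captions by image and translate each one.

    `pairs` is assumed to yield (image_id, english_caption) tuples;
    the real dataset files may need their own parsing step first.
    """
    translated: Dict[str, List[str]] = {}
    for image_id, caption in pairs:
        translated.setdefault(image_id, []).append(translate(caption))
    return translated


def placeholder_translate(text: str) -> str:
    # Hypothetical stand-in: it does NOT translate anything, it only
    # marks the caption so the pipeline can be tested end to end.
    return f"[hi] {text}"


# Made-up example captions, not taken from any dataset.
example_pairs = [
    ("img1.jpg", "A dog runs."),
    ("img1.jpg", "A brown dog."),
    ("img2.jpg", "Two kids play."),
]
example_output = translate_captions(example_pairs, placeholder_translate)
```

Keeping the translator as a plain callable means the same loop can be reused for other target languages later by passing in a different function.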
Model: Vision Encoder Decoder model (see "Vision Encoder Decoder Models" in the transformers 4.12.2 documentation).
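To get a feel for the model class, here is a minimal sketch that wires a ViT encoder to a BERT decoder using `VisionEncoderDecoderModel` from `transformers` (it needs `torch` installed). The tiny config sizes, the token ids, and the random input are arbitrary assumptions chosen so that it runs quickly without downloading anything. The model is randomly initialised, so it cannot caption anything yet; for the actual project we would more likely start from pretrained checkpoints via `VisionEncoderDecoderModel.from_encoder_decoder_pretrained`.

```python
import torch
from transformers import (
    BertConfig,
    ViTConfig,
    VisionEncoderDecoderConfig,
    VisionEncoderDecoderModel,
)

# Tiny, arbitrary sizes so the example runs fast (assumptions, not tuned values).
encoder_config = ViTConfig(
    hidden_size=32, num_hidden_layers=1, num_attention_heads=2,
    intermediate_size=64, image_size=32, patch_size=16,
)
decoder_config = BertConfig(
    vocab_size=100, hidden_size=32, num_hidden_layers=1,
    num_attention_heads=2, intermediate_size=64,
)

# Marks the decoder as a decoder with cross-attention over the image features.
config = VisionEncoderDecoderConfig.from_encoder_decoder_configs(
    encoder_config, decoder_config
)
model = VisionEncoderDecoderModel(config=config)

# Needed so the labels can be shifted right into decoder inputs.
# The ids are placeholders; real values come from the tokenizer.
model.config.decoder_start_token_id = 1
model.config.pad_token_id = 0

pixel_values = torch.randn(1, 3, 32, 32)  # one fake RGB image
labels = torch.tensor([[5, 6, 7, 2]])     # one fake tokenised caption

model.eval()
with torch.no_grad():
    outputs = model(pixel_values=pixel_values, labels=labels)

logits_shape = tuple(outputs.logits.shape)  # (batch, caption length, vocab size)
```

For an Indian language, the decoder and its tokenizer would need to cover that language's script, which is a separate choice from the image encoder.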
To chat and organise with other people interested in this project, head over to our Discord and:
- Follow the instructions on the
- Join the
Just make sure you comment here to indicate that you’ll be contributing to this project.