Space for Captioning Image Datasets

I am designing a Space that will allow people to submit image captions. The captions will be stored in an external DB so I can then use them to train or fine-tune Stable Diffusion models using kohya_ss.

I plan on building the application in Gradio because it appears to have decent integration with Spaces.

The external DB will be something with a large free tier, like Firebase. DB secrets will be stored in a private dataset, as a workaround for HF not having secret management.

The main page of the space displays:

  • Image from configured dataset
  • Text box to enter caption
  • Submit Button
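
A rough sketch of how these three widgets might be wired together with Gradio Blocks. The `IMAGES` list, `show_image`, `on_submit`, and `build_ui` are hypothetical names used only for illustration, and the paths are placeholders that would need to point at real files from the configured dataset. Saving the caption is deliberately left out here.

```python
from typing import List, Tuple

# Hypothetical stand-in for the configured dataset: a list of image paths.
# These placeholder paths must be replaced with real files before launching.
IMAGES: List[str] = ["images/0001.png", "images/0002.png", "images/0003.png"]


def show_image(index: int) -> Tuple[str, str]:
    """Return the image reference at `index` (wrapping around) and an empty caption."""
    return IMAGES[index % len(IMAGES)], ""


def on_submit(index: int, caption: str) -> Tuple[int, str, str]:
    """Handle a submit click: move to the next image and clear the text box.

    The caption is ignored in this sketch; a real app would write it to the DB first.
    """
    next_index = (index + 1) % len(IMAGES)
    image_ref, empty_caption = show_image(next_index)
    return next_index, image_ref, empty_caption


def build_ui():
    """Assemble the image, caption box, and submit button."""
    import gradio as gr  # imported lazily so the helpers above work without Gradio

    with gr.Blocks() as demo:
        index = gr.State(0)  # per-session position in the dataset
        image = gr.Image(type="filepath", interactive=False, label="Image")
        caption = gr.Textbox(label="Caption")
        submit = gr.Button("Submit")

        # Show the first image when the page loads.
        demo.load(show_image, inputs=[index], outputs=[image, caption])
        # On submit, advance the index and refresh the image and text box.
        submit.click(on_submit, inputs=[index, caption], outputs=[index, image, caption])
    return demo
```

Calling `build_ui().launch()` would start the app locally; on a Space the same call typically sits at the bottom of `app.py`.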

Add Later:

  • Button to load next image they have not captioned yet
  • Progress bar showing % complete and images captioned / total images in the dataset
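
Both of these later additions come down to comparing the dataset's image list against the set of images the user has already captioned. A minimal sketch, assuming image references are plain strings and the captioned set has already been fetched from the DB (the function names are made up for illustration):

```python
from typing import Optional, Sequence, Set


def next_uncaptioned(
    image_refs: Sequence[str], captioned: Set[str], start: int = 0
) -> Optional[str]:
    """Return the first image at or after `start` (wrapping around) that is
    not in `captioned`, or None if every image already has a caption."""
    n = len(image_refs)
    for offset in range(n):
        ref = image_refs[(start + offset) % n]
        if ref not in captioned:
            return ref
    return None


def progress(image_refs: Sequence[str], captioned: Set[str]) -> str:
    """Format progress as a percentage plus captioned/total counts."""
    total = len(image_refs)
    done = sum(1 for ref in image_refs if ref in captioned)
    percent = 100.0 * done / total if total else 0.0
    return f"{percent:.0f}% complete ({done}/{total} images captioned)"
```

The string from `progress` could be shown in a text or Markdown component next to the image, and `next_uncaptioned` could back the "next image" button.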

When the user clicks submit, three values are stored in the database: a hash representing the user, the image ref, and the caption. The application then displays the next image in the dataset.

If the user has already submitted a caption for the image, the text box is prefilled with that caption from the DB.
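
The submit and prefill behavior could look roughly like the sketch below. It uses an in-memory SQLite table purely as a stand-in for the external DB (Firebase or similar would need its own client code), and the table layout, the salt, and all function names are assumptions for illustration only.

```python
import hashlib
import sqlite3


def user_hash(user_id: str, salt: str = "change-me") -> str:
    """Hash the user identifier so the raw value is never stored.
    The salt is a placeholder and would be kept with the other secrets."""
    return hashlib.sha256((salt + user_id).encode("utf-8")).hexdigest()


def open_store() -> sqlite3.Connection:
    """Create an in-memory table standing in for the external DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE captions ("
        " user_hash TEXT NOT NULL,"
        " image_ref TEXT NOT NULL,"
        " caption TEXT NOT NULL,"
        " PRIMARY KEY (user_hash, image_ref))"
    )
    return conn


def save_caption(conn: sqlite3.Connection, user_id: str, image_ref: str, caption: str) -> None:
    """Store (user hash, image ref, caption); resubmitting overwrites the old caption."""
    conn.execute(
        "INSERT OR REPLACE INTO captions (user_hash, image_ref, caption) VALUES (?, ?, ?)",
        (user_hash(user_id), image_ref, caption),
    )
    conn.commit()


def load_caption(conn: sqlite3.Connection, user_id: str, image_ref: str) -> str:
    """Return the user's existing caption for this image, or "" if there is none."""
    row = conn.execute(
        "SELECT caption FROM captions WHERE user_hash = ? AND image_ref = ?",
        (user_hash(user_id), image_ref),
    ).fetchone()
    return row[0] if row else ""
```

Keying the table on the (user hash, image ref) pair means each user has at most one caption per image, so the value returned by `load_caption` can be used directly as the text box's initial value.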

I am more than happy to release it as fully open source.
Can someone help me build the Gradio UI?

Thank you!