Dataloader with streaming dataset for image captioning (BLIP finetune)

Hi, I’m hoping to finetune BLIP on this dataset:

Instead of loading the entire dataset, I’d like to stream the data. That’s where I’m stuck: I’m not sure how to write the DataLoader.

I’m referencing this notebook on how to finetune, but I’m getting nowhere:

I hope I’ve given sufficient information. Thank you.

What about this feature of the datasets library?
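
Assuming the feature meant here is streaming mode (`load_dataset(..., streaming=True)`), a rough sketch of how it could be wired into a PyTorch `DataLoader` is below. The column names `"image"` and `"text"`, the helper names `make_collate_fn` and `build_streaming_loader`, and the shuffle buffer size are my own assumptions for illustration, not something taken from the dataset or the notebook, so adjust them to whatever your data actually contains.

```python
from typing import Any, Callable, Dict, List


def make_collate_fn(
    processor: Callable[..., Dict[str, Any]],
    image_key: str = "image",  # assumed column name
    text_key: str = "text",    # assumed column name
) -> Callable[[List[dict]], Dict[str, Any]]:
    """Build a collate_fn that turns raw streamed examples into one batch."""

    def collate(examples: List[dict]) -> Dict[str, Any]:
        images = [ex[image_key] for ex in examples]
        captions = [ex[text_key] for ex in examples]
        # The processor (e.g. a BlipProcessor) handles image preprocessing
        # and caption tokenization, padding captions to a common length.
        return processor(
            images=images, text=captions, padding=True, return_tensors="pt"
        )

    return collate


def build_streaming_loader(dataset_name: str, processor, batch_size: int = 8):
    """Hypothetical helper: stream a Hub dataset and wrap it in a DataLoader."""
    # Imported here so make_collate_fn can be used without these packages.
    from datasets import load_dataset
    from torch.utils.data import DataLoader

    # streaming=True returns an IterableDataset; examples are fetched lazily
    # instead of downloading the whole dataset first.
    stream = load_dataset(dataset_name, split="train", streaming=True)
    # Approximate shuffling through a fixed-size buffer (size is a guess).
    stream = stream.shuffle(seed=42, buffer_size=1_000)
    # Do not pass shuffle=True: DataLoader cannot shuffle an iterable dataset.
    return DataLoader(
        stream, batch_size=batch_size, collate_fn=make_collate_fn(processor)
    )
```

In the training loop you would then iterate over the loader as usual and, for captioning, probably pass `input_ids` as the labels as well. Note that a streamed dataset has no `len()`, so anything that needs the number of steps per epoch (such as a learning-rate scheduler) has to be given that number explicitly.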