Kosmos-2 Fine tuning

@ydshieh, I added the labels inside the model itself because I found a discussion that does it that way, e.g. Finetune BLIP on customer dataset #20893 - #2 by dxlong2000.

To be honest, I wasn’t too clear on that. Could you share a code snippet for creating the labels, if possible? If not, some Python code I can refer to would be enough, and I will take it from there.

This model is trained with next-token prediction, so I thought of using a data collator, but as of now there is none that supports multimodal inputs.
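
To make the question concrete, here is a rough sketch of how I currently understand label creation for next-token prediction. It is plain Python on lists so it is easy to read. The helper name `build_labels` is made up for illustration. I am assuming that the model shifts the labels internally (as Hugging Face causal LM heads usually do) and that the positions flagged in `image_embeds_position_mask` should be excluded from the loss; both assumptions need to be checked against the Kosmos-2 code.

```python
# Sketch only: labels are a copy of input_ids with positions that should not
# contribute to the loss replaced by -100.
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def build_labels(input_ids, attention_mask, image_embeds_position_mask=None):
    """Hypothetical helper: build next-token-prediction labels for a batch.

    input_ids, attention_mask: lists of lists of ints, one inner list per example.
    image_embeds_position_mask: optional, same shape, 1 where image embeddings
    are placed. Masking these positions is an assumption, not confirmed behaviour.
    """
    labels = []
    for i, (ids, mask) in enumerate(zip(input_ids, attention_mask)):
        if image_embeds_position_mask is not None:
            img_mask = image_embeds_position_mask[i]
        else:
            img_mask = [0] * len(ids)
        labels.append([
            tok if keep == 1 and img == 0 else IGNORE_INDEX
            for tok, keep, img in zip(ids, mask, img_mask)
        ])
    return labels


# Toy batch: one sequence of four tokens, the last one is padding and the
# second one is an image placeholder position.
example = build_labels(
    input_ids=[[5, 6, 7, 0]],
    attention_mask=[[1, 1, 1, 0]],
    image_embeds_position_mask=[[0, 1, 0, 0]],
)
```

If this is the right idea, the resulting lists would be converted to a tensor and passed as `labels` together with the other processor outputs.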

I really appreciate your help; it is always very useful.

Also, I think that if my code works, then @cdh’s code will work too, because we are both trying to achieve almost the same thing in different ways.