Hi All,
I’m trying to load a custom dataset for text-to-image fine-tuning, but I’m not sure how the data needs to be formatted. Right now I have a folder of PNG files and a CSV that maps PNG paths to captions, with columns named “image” and “text”. But it seems like a different format is needed.
Any help appreciated!
Gulp… yeah, RTFM: see the “Create an image dataset” doc page.
Apologies. Maybe my post will prevent further ones like it.
Actually, wait… I’ve created a file with the format given under “Image Captioning” on the doc page, but I’m hitting an error when running train_text_to_image.py:
Exception has occurred: ArrowInvalid
JSON parse error: Missing a name for object member. in row 0
Obviously something is wrong, but I’m not sure what…
I have a data folder with all my PNG files and my metadata.jsonl file, formatted as:
{"file_name": "something_1.png", "text": ["caption_1", "caption_2", ..., "caption_n"]}
{"file_name": "something_2.png", "text": ["caption_1", "caption_2", ..., "caption_n"]}
...
{"file_name": "something_n.png", "text": ["caption_1", "caption_2", ..., "caption_n"]}
What am I not understanding here… ?
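Okay, solved: I just had to serialize each line’s dict with json.dumps before writing it to metadata.jsonl…
import json
json_formatted = json.dumps(line_dict)  # line_dict holds one row's "file_name" and "text"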
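For anyone landing here later, a minimal sketch of the fix, under a few assumptions: write_metadata_jsonl, first_invalid_line, and the (file_name, captions) input shape are hypothetical names for illustration, not anything from the training script or the docs. The thread doesn't show what the broken lines looked like; one common way to hit this kind of parse error is writing str(some_dict), which uses Python-style single quotes that JSON does not accept.

```python
import json


def write_metadata_jsonl(records, out_path):
    """Write one JSON object per line to out_path.

    records: iterable of (file_name, captions) pairs (hypothetical input shape).
    """
    with open(out_path, "w", encoding="utf-8") as f:
        for file_name, captions in records:
            line_dict = {"file_name": file_name, "text": list(captions)}
            # json.dumps emits double-quoted keys and strings, which JSON requires;
            # str(line_dict) would emit Python-style single quotes instead.
            f.write(json.dumps(line_dict) + "\n")


def first_invalid_line(path):
    """Return the 0-based index of the first line that is not valid JSON, or None."""
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f):
            try:
                json.loads(line)
            except json.JSONDecodeError:
                return i
    return None
```

Running first_invalid_line on metadata.jsonl before kicking off training is a cheap way to find which row is malformed without waiting for the Arrow error.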