I have a custom GeneratorBasedBuilder subclass that I’m trying to get running. It seems like I’m almost there, but my trainer fails with a type error when it accesses a generated sample. More specifically, I’m working with the train_text_to_image.py script, and it fails in the preprocess_train(examples) function.
From what I understand, this is because the actual data is a PIL.Image, but my generator’s features dict declares "image": datasets.Value("string"), which is obviously incorrect. However, it doesn’t seem like I can indicate an Image as a type here, so how am I supposed to do this?
In my _generate_examples function I’m producing two images and a string for my three columns. This is my current features declaration:
features=datasets.Features(
    {
        "image": [Image.Image],
        "text": datasets.Value("string"),
        "conditioning_image": [Image.Image]
    }
)
In addition to the above, I’ve tried the original (template) datasets.Value("string"), Image (which fails with cannot pickle module), and Image.Image…
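The "cannot pickle module" failure can be reproduced without datasets at all, which suggests it comes from passing the imported Image module itself rather than a feature type. The sketch below is only an illustration: the standard-library json module stands in for the PIL Image module, and try_pickle is a hypothetical helper, not part of any library.

```python
import json
import pickle


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the error message."""
    try:
        pickle.dumps(obj)
        return None
    except TypeError as exc:
        return str(exc)


# `json` stands in for any imported module, such as `Image` after
# `from PIL import Image`. Module objects cannot be pickled.
module_error = try_pickle(json)

# A class defined at module level is pickled by reference, so this succeeds.
class_error = try_pickle(json.JSONDecoder)

print("module:", module_error)
print("class:", class_error)
```

This only shows where the pickling error comes from. A class that pickles fine is still not necessarily a valid feature type.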
I’m doing some preprocessing in my _generate_examples function—basically adding a random mask—which I’d rather not do at the caller, if possible…
My _generate_examples function yields:
yield idx, {"image": target_img, "text": prompts, "conditioning_image": cond_img}
In case that helps, target_img and cond_img are both of type PIL.Image.
Why is this so difficult?