I have a custom GeneratorBasedBuilder subclass that I'm trying to get running. It seems like I'm almost there, but my trainer fails with a type error when it tries to access a generated sample. To be more specific, I'm working with the train_text_to_image.py script, and it fails in the preprocess_train(examples) function…
From what I understand, this is because the actual data is a PIL.Image, but my generator's features dict declares the column as "image": datasets.Value("string"), which is obviously incorrect. However, it doesn't seem like I can specify an image as a type here, so how am I supposed to do this?
In my _generate_examples function I'm creating two images and a string as the values for my three columns. This is the features declaration I tried:
    features=datasets.Features(
        {
            "image": [Image.Image],
            "text": datasets.Value("string"),
            "conditioning_image": [Image.Image]
        }
    )
In addition to the above, I've tried the original (template) datasets.Value("string"), Image (which fails with "cannot pickle module"), and Image.Image…
I'm doing some preprocessing in my _generate_examples function (basically adding a random mask), which I'd rather not do at the caller, if possible.
My _generate_examples function yields:
    yield idx, {"image": target_img, "text": prompts, "conditioning_image": cond_img}
in case that helps. Both target_img and cond_img are of type PIL.Image.
Why is this difficult… ???