I am creating a dataset of multimodal documents, where each document contains a sequence of interleaved text and images. Each example in the dataset corresponds to a single document and has one entry for paragraphs and one entry for images.
For example, the text entry might look like [a, None, b, None] and the image entry might look like [None, img1, None, img2]. This structure is inspired by OBELISC (HuggingFaceM4/OBELISC · Datasets at Hugging Face).
How can I make the resulting Parquet file store the bytes of each Image object, so that I can read the dataset back in later?