When loading a dataset, I have the following Arrow schema: label: int, image: struct<bytes:binary, path:string>. When I use the load_dataset() method, the image column is automatically decoded into PIL images, and the path is lost. Is there a way to avoid that behavior?
In my example:
from datasets import load_dataset

batch_size = 32  # example value; not specified in the original post
dataset = load_dataset("chronopt-research/cropped-vggface2-224")
for i in range(0, len(dataset['train']), batch_size):
    batch = dataset['train'][i:i + batch_size]
    images = batch['image']  # Original 224x224 images
    labels = batch['label']  # Labels for each image
The images I get are only PIL image objects, which don’t include the path or file name from the original Arrow files.