Image classification using AutoTrain: dataset preparation, authentication, and column mapping

I’ve previously used AutoTrain successfully for image classification problems, but I am currently running into issues uploading data.

Authentication: Most of the datasets I want to test have been uploaded to the Hub but cannot be shared publicly. When I try to select these private datasets, I get an error.

I didn’t see an obvious way to pass in an auth token. Is this possible?

Column selection: When adding a public dataset (in this case, biglam/encyclopaedia_britannica_illustrated), the mapping options for image/labels differ from the underlying dataset. The original dataset exposes an ‘image’ and a ‘label’ column (plus some other metadata columns). When it is loaded in AutoTrain, the image column seems to be expanded to image.src, image.height, and image.width.

I’m unsure whether these are internal attributes or are meant to be publicly exposed. Choosing image.src, which I assumed would be the correct option for the image column in the mapping, results in an error when loading.

Under the training tab, an error is triggered when format_source is run:

Error type: InvalidColMappingError
Details: Column mapping {'label': 'target', 'image.src': 'image'} is invalid for data with columns ['image', 'label', 'id', 'meta'].
Column 'image.src' not found in data.

I assume this is because the internal loader is looking for image.src in the dataset and not finding it.
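To make that assumption concrete, here is a minimal sketch of the kind of check I imagine is failing. This is not AutoTrain's actual code: `check_column_mapping` and the `InvalidColMappingError` class below are hypothetical stand-ins I wrote for illustration, and only the column names and the mapping are taken from the error message above.

```python
class InvalidColMappingError(ValueError):
    """Hypothetical stand-in for the error AutoTrain reports; not its real class."""


def check_column_mapping(mapping, columns):
    """Raise if any source column named in the mapping is missing from the data."""
    missing = [src for src in mapping if src not in columns]
    if missing:
        raise InvalidColMappingError(
            f"Column mapping {mapping} is invalid for data with columns {columns}. "
            + " ".join(f"Column {name!r} not found in data." for name in missing)
        )


# Columns as listed in the error message.
columns = ["image", "label", "id", "meta"]

# The mapping offered by the UI fails, because 'image.src' is not a real column.
ui_mapping = {"label": "target", "image.src": "image"}
try:
    check_column_mapping(ui_mapping, columns)
    ui_mapping_ok = True
except InvalidColMappingError as err:
    ui_mapping_ok = False
    print(err)

# A mapping that uses the dataset's own column names passes the same check.
expected_mapping = {"label": "target", "image": "image"}
check_column_mapping(expected_mapping, columns)
```

If something like this is what happens internally, then mapping the plain image column rather than image.src would pass, but the UI does not offer that option.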

Apologies if this has been addressed before; I dug around for other issues but didn’t see anything related.

Tagging @abhishek, who might be the best person to address this.