ExitCode 1 ErrorMessage "KeyError: 'Image'" when using entry_point script in Huggingface Estimator

Hi there

It’s worth noting that being able to load the dataset in the notebook instance does NOT mean you can successfully load it with the training script.

The reason is that the training script runs on a separate EC2 instance that has no knowledge of your notebook instance. This is by design: you want a small, cheap notebook instance to orchestrate the data prep and training setup, but you (potentially) want a powerful, expensive instance to run the actual training. To learn more about training HF models on SageMaker, have a look at this example: https://github.com/huggingface/notebooks/tree/main/sagemaker/01_getting_started_pytorch

What does that mean for your particular case? Without seeing the notebook where you orchestrate the setup I can only guess, but it looks like you have either (a) not stored the dataset in the correct S3 bucket or (b) not told the training job the correct S3 path for the dataset.
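One way to narrow this down is to have the training script report what it actually received. The sketch below is not taken from your script. It assumes the notebook starts the job with something like huggingface_estimator.fit({"train": training_input_path}), where both names are placeholders, so that the data is passed as a channel called "train". SageMaker exposes each channel's local directory to the training container through an SM_CHANNEL_<NAME> environment variable, which is SM_CHANNEL_TRAIN for that channel. The function names parse_args and check_training_dir and the --training_dir argument are made up for illustration.

```python
import argparse
import os


def parse_args(argv=None):
    """Read the dataset directory the way an entry_point script commonly does."""
    parser = argparse.ArgumentParser()
    # Assumption: the notebook passed a channel named "train" to fit(),
    # so SageMaker sets SM_CHANNEL_TRAIN inside the training container.
    parser.add_argument(
        "--training_dir",
        type=str,
        default=os.environ.get("SM_CHANNEL_TRAIN"),
    )
    # Ignore hyperparameters this sketch does not know about.
    args, _ = parser.parse_known_args(argv)
    return args


def check_training_dir(training_dir):
    """Fail early with a readable message and return the files that were found."""
    if not training_dir:
        raise RuntimeError(
            "No training directory given: SM_CHANNEL_TRAIN is not set and "
            "--training_dir was not passed. Was a 'train' channel passed to fit()?"
        )
    if not os.path.isdir(training_dir):
        raise FileNotFoundError(f"Training directory does not exist: {training_dir}")
    files = sorted(os.listdir(training_dir))
    if not files:
        raise FileNotFoundError(
            f"Training directory is empty: {training_dir}. Check the S3 path."
        )
    return files
```

Calling check_training_dir(parse_args().training_dir) at the top of the training script and printing the result puts the file list into the job logs, before any dataset loading happens. If the directory is missing or empty, the S3 path is the likely culprit. If the files are there, printing the column names of the loaded dataset is a reasonable next step, since the KeyError could also come from a column that is named differently than the script expects.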

Again, check out @philschmid’s example notebook; it should give you an idea of how to pass dataset paths to the training job.

Cheers
Heiko