HuggingFace with Sagemaker tutorial doesn't work


I am going through the HuggingFace SageMaker tutorial that you provide on GitHub. I am working with notebook 1: 01_getting_started_pytorch. I am using SageMaker Studio with the image: Python 3 (PyTorch 1.6 Python 3.6 CPU Optimized).

When I try to download the imdb dataset I get this error:

Does anyone know how to fix this issue?

Thanks in advance.


Hey @Oigres,

Which datasets version do you have installed?
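One quick way to check from a notebook cell is sketched below. It is only an illustration: the helper name `installed_versions` is made up for this post, and the fallback to `pkg_resources` assumes an older interpreter such as the Python 3.6 image, where `importlib.metadata` is not available.

```python
# Report the installed versions of the packages the tutorial depends on.
try:
    # Available in the standard library from Python 3.8 onwards.
    from importlib.metadata import version, PackageNotFoundError
except ImportError:
    # Assumed fallback for older interpreters (e.g. Python 3.6), via setuptools.
    from pkg_resources import get_distribution
    from pkg_resources import DistributionNotFound as PackageNotFoundError

    def version(name):
        return get_distribution(name).version


def installed_versions(packages):
    """Return a dict mapping each package name to its version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


print(installed_versions(["datasets", "transformers", "sagemaker"]))
```

Run it after restarting the kernel, so that the versions reported are the ones the notebook will actually import.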

I tried to reproduce your error, but the notebook worked for me.

Could you try removing the cache at /root/.cache/huggingface?
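If you prefer to do this from a notebook cell rather than a terminal, here is a minimal sketch. The function name `clear_hf_cache` is hypothetical, and it assumes the kernel has permission to delete the directory; the default path is the one mentioned above.

```python
import shutil
from pathlib import Path


def clear_hf_cache(cache_dir="/root/.cache/huggingface"):
    """Delete the cache directory if it exists.

    Returns True if something was removed, False if the directory was not there.
    """
    path = Path(cache_dir)
    if not path.exists():
        return False
    # Removes the directory and everything inside it.
    shutil.rmtree(path)
    return True
```

Calling `clear_hf_cache()` with no arguments targets the default path. Cached downloads are deleted, so the dataset will be downloaded again the next time it is loaded.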


Thank you for your attention. I tried with these versions:

  1. Using the “pip install” that already comes with the notebook:

!pip install "sagemaker>=2.48.0" "transformers==4.6.1" "datasets[s3]==1.6.2" --upgrade

In this case the versions installed are: datasets 1.6.2 and transformers 4.6.1

  2. Using the “pip install” that is used in the YouTube video related to this tutorial:

!pip install transformers "datasets[s3]" --upgrade

In this case the versions installed are: datasets 1.11.0 and transformers 4.9.2

In both cases I got the same error.

I used the notebook as it is, with:

!pip install "sagemaker>=2.48.0" "transformers==4.6.1" "datasets[s3]==1.6.2" --upgrade

Have you tried restarting the kernel?
You might also need to use sudo rm -rf to remove the cache directory.


Thank you very much. You helped me a lot. It was one of those errors whose fix you can’t easily find out there.

Just one thing: to help others, could you edit your answer? I’m sure that where you wrote ~/.cache/huggingface you meant /root/.cache/huggingface.