Transformers 4.6.0 on SageMaker?

Hi all,

Is there a timeline for when Transformers 4.6.0 will be supported in the HuggingFace SDK on SageMaker?

I’ve recently been running into CUDA out-of-memory errors while training a DistilBERT model:

RuntimeError: CUDA out of memory. Tried to allocate 6.87 GiB (GPU 0; 15.78 GiB total capacity; 7.35 GiB already allocated; 2.79 GiB free; 11.78 GiB reserved in total by PyTorch)

It seems this has been acknowledged and fixed in a recent commit (also described here). The fix looks to be included in Transformers 4.6.0, and I can confirm that using this latest version (outside the HuggingFace SDK) resolves the OOM issue for me.

When configuring the HuggingFace estimator, the latest supported Transformers version appears to be 4.5.0:

ValueError: Unsupported huggingface version: 4.6.0. You may need to upgrade your SDK version (pip install -U sagemaker) for newer huggingface versions. Supported huggingface version(s): 4.4.2, 4.5.0, 4.4, 4.5.

Does anyone have an idea of when we can expect version 4.6.0 to be supported?

Thanks!

hey @nreamaroon,

We already opened a PR for a DLC with transformers 4.6.0. I hope we can get it merged as soon as possible.


Hi @philschmid -

Thank you for all your work on this. I have just tried to use Transformers version 4.6.0 in a HuggingFace estimator in SageMaker, but I get the same error as reported by @nreamaroon:

ValueError: Unsupported huggingface version: 4.6. You may need to upgrade your SDK version (pip install -U sagemaker) for newer huggingface versions. Supported huggingface version(s): 4.4.2, 4.5.0, 4.4, 4.5.

I saw that the PR that you referenced has now been merged so am wondering if there’s anything else I need to do in order to be able to use Transformers version 4.6.0?

Thanks again!

Hi @benG,

There were some issues with the Python SDK, but the change is now merged and should be released tomorrow. You can install the development version of the SageMaker SDK with pip install git+https://github.com/aws/sagemaker-python-sdk.git, or you can wait until tomorrow, run pip install -U sagemaker, and then define transformers_version='4.6' (i.e. 4.6.1).

from sagemaker.huggingface import HuggingFace

huggingface_estimator = HuggingFace(entry_point='train.py',
                                    source_dir='./scripts',
                                    instance_type='ml.p3.2xlarge',
                                    instance_count=1,
                                    role=role,
                                    transformers_version='4.6',
                                    pytorch_version='1.7',
                                    py_version='py36',
                                    hyperparameters=hyperparameters)

Hi @philschmid.

In the AWS SageMaker HF workshop notebooks, we need to install the following versions:
!pip install "sagemaker>=2.48.0" "transformers==4.6.1" "datasets[s3]==1.6.2"

Are they still the right versions?
How could we know in the future which compatible versions to install? Thanks.

Hey @pierreguillou,

Yes, those versions are still compatible, but you can upgrade them as you like. You can find an overview of all DLCs and their versions on the Reference page, which should be updated soon; I created a PR to update it: add new container version by philschmid · Pull Request #514 · huggingface/huggingface_hub · GitHub.
The latest Transformers version is currently 4.12.3. You can also check notebooks/sagemaker-notebook.ipynb at master · huggingface/notebooks · GitHub for a notebook example with newer versions.


Thanks @philschmid

Just to be sure, here is my understanding.

Notebook

In the notebook that you mentioned (notebooks/sagemaker-notebook.ipynb at master · huggingface/notebooks · GitHub), the library versions are the following:

  • sagemaker>=2.69.0
  • transformers==4.12.3
  • datasets==1.13

The installation is done by the following code:

!pip install "sagemaker>=2.69.0" "transformers==4.12.3" --upgrade
# using older dataset due to incompatibility of sagemaker notebook & aws-cli with > s3fs and fsspec to >= 2021.10
!pip install  "datasets==1.13" --upgrade

I noticed that, unlike the first notebook (used in your workshop), you no longer install datasets[s3] but datasets directly. I’m just curious why, and what the difference is between these two installs? Thanks.

Reference & PR

On the Reference page, the Transformers version only goes up to 4.6.1 (with compatible datasets 1.6.2), but your recent PR shows that the current training DLC ships
transformers 4.12.3, datasets 1.15.1, PyTorch 1.9.1, Python 3.8

Does it mean I could run the following code in a notebook instance of AWS SageMaker?

!pip install "sagemaker>=2.69.0" "transformers==4.12.3" --upgrade
!pip install  "datasets==1.15.1" --upgrade

If yes, I would also need to update PyTorch and Python, as I do not see this configuration among the available kernels. How can I do this? (Or do you recommend creating a new conda kernel through the Jupyter terminal? Could you give us the commands?)


The PR is an overview of the versions inside the DLCs, meaning the container used when you run huggingface_estimator.fit(). In the notebook you can install any version you want, but most of the time it makes sense to use the same versions you will use in the HuggingFace estimator.

Thanks for catching that pip installation: it should be datasets[s3], to make sure we install the S3 utilities needed to push our datasets to S3.

I’m currently getting this error with the SageMaker SDK when trying to use transformers==4.19.4. I’m hoping to use >=4.19.0 to take advantage of some DeBERTa models. :slight_smile:

What error are you getting?

This -

ValueError: Unsupported huggingface version: 4.19.4. You may need to upgrade your SDK version (pip install -U sagemaker) for newer huggingface versions. Supported huggingface version(s): 4.6.1, 4.10.2, 4.11.0, 4.12.3, 4.17.0, 4.6, 4.10, 4.11, 4.12, 4.17.

In my pipeline’s requirements.txt I have it using sagemaker==2.108.0.

There is currently no available DLC with this transformers version, but you can add a requirements.txt with the version you want to install.
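As a rough sketch of that setup (the exact pinned version and the scripts/ directory name are just the ones from this thread; SageMaker conventionally picks up a requirements.txt placed next to the entry point in source_dir and installs it before training starts):

```
scripts/
├── train.py
└── requirements.txt
```

where requirements.txt contains:

```
transformers==4.19.4
```

The estimator would then keep a supported transformers_version (e.g. 4.17) for the container image, while the requirements.txt upgrades the library inside the container at startup.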

Ah, ok. So even if the DLC/SageMaker SDK doesn’t have the version I need, I can just specify it myself as a requirement. Thanks!


Yes, that’s correct, but we are working on updating the version.


Good to hear, thank you!

We’re getting errors in our terragrunt apply when we add any transformers version to our requirements.txt, for some reason. Strangely, the build works fine when specifying any other packages besides transformers. We’ll keep digging on that unless you have any recommendations.