Transformers 4.6.0 on SageMaker?

Hi all,

Is there a timeline for when Transformers 4.6.0 will be supported in the HuggingFace SDK on SageMaker?

I’ve recently been running into CUDA out-of-memory errors while training a DistilBERT model:

RuntimeError: CUDA out of memory. Tried to allocate 6.87 GiB (GPU 0; 15.78 GiB total capacity; 7.35 GiB already allocated; 2.79 GiB free; 11.78 GiB reserved in total by PyTorch)
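To make the numbers in that message easier to read, here is a minimal sketch that pulls out the GiB figures and computes two derived values. The helper name `parse_cuda_oom` is made up for illustration, and the regular expressions assume the exact message format shown above (other PyTorch versions, or messages reported in MiB, may look different).

```python
import re


def parse_cuda_oom(message: str) -> dict:
    """Extract the GiB figures from a CUDA OOM message.

    Hypothetical helper: assumes the message layout quoted above.
    """
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) GiB",
        "total": r"([\d.]+) GiB total capacity",
        "allocated": r"([\d.]+) GiB already allocated",
        "free": r"([\d.]+) GiB free",
        "reserved": r"([\d.]+) GiB reserved",
    }
    stats = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find '{name}' in message")
        stats[name] = float(match.group(1))

    # How much more memory the failed allocation needed than was free.
    stats["shortfall"] = round(stats["requested"] - stats["free"], 2)
    # Memory PyTorch has reserved but is not currently using for tensors.
    stats["reserved_unallocated"] = round(stats["reserved"] - stats["allocated"], 2)
    return stats


OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 6.87 GiB (GPU 0; 15.78 GiB total "
    "capacity; 7.35 GiB already allocated; 2.79 GiB free; 11.78 GiB reserved "
    "in total by PyTorch)"
)

stats = parse_cuda_oom(OOM_MESSAGE)
print(stats)
```

For this message the failed allocation asked for roughly 4 GiB more than was free, while about 4.4 GiB was reserved by PyTorch without being allocated to tensors.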

This seems to have been acknowledged and fixed in a recent commit, and a similar problem is described here. The fix is also included in Transformers 4.6.0, and I can confirm that using this latest version (without the HuggingFace SDK) resolves the OOM issues for me.

When configuring the HuggingFace estimator, however, it seems the latest supported version of Transformers is 4.5.0:

ValueError: Unsupported huggingface version: 4.6.0. You may need to upgrade your SDK version (pip install -U sagemaker) for newer huggingface versions. Supported huggingface version(s): 4.4.2, 4.5.0, 4.4, 4.5.
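As a quick way to see which versions an installed SDK accepts, the list can be read straight out of that error text. This is only a sketch: `supported_versions` and `latest_full_version` are hypothetical helper names, and the parsing assumes the message wording shown above, which may change between SDK releases.

```python
from typing import List


def supported_versions(error_message: str) -> List[str]:
    """Return the versions listed after 'Supported huggingface version(s):'."""
    marker = "Supported huggingface version(s):"
    _, found, tail = error_message.partition(marker)
    if not found:
        raise ValueError("message does not list supported versions")
    # Drop the trailing period of the sentence, then split the comma list.
    return [v.strip() for v in tail.strip().rstrip(".").split(",")]


def latest_full_version(versions: List[str]) -> str:
    """Pick the highest major.minor.patch entry, ignoring aliases like '4.5'."""
    full = [v for v in versions if v.count(".") == 2]
    return max(full, key=lambda v: tuple(int(part) for part in v.split(".")))


ERROR_MESSAGE = (
    "Unsupported huggingface version: 4.6.0. You may need to upgrade your SDK "
    "version (pip install -U sagemaker) for newer huggingface versions. "
    "Supported huggingface version(s): 4.4.2, 4.5.0, 4.4, 4.5."
)

versions = supported_versions(ERROR_MESSAGE)
print(versions)
print(latest_full_version(versions))
```

The short entries (4.4, 4.5) appear to be aliases for the full releases, so the sketch compares only the three-part versions.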

Does anyone have an idea of when we can expect version 4.6.0 to be supported?

Thanks!

hey @nreamaroon,

We have already opened a PR for a DLC with transformers 4.6.0. I hope we can get it merged as soon as possible.


Hi @philschmid -

Thank you for all your work on this. I have just tried to use Transformers version 4.6.0 in a HuggingFace estimator in SageMaker, but I get the same error as reported by @nreamaroon:

ValueError: Unsupported huggingface version: 4.6. You may need to upgrade your SDK version (pip install -U sagemaker) for newer huggingface versions. Supported huggingface version(s): 4.4.2, 4.5.0, 4.4, 4.5.

I saw that the PR you referenced has now been merged, so I am wondering if there’s anything else I need to do in order to use Transformers version 4.6.0?

Thanks again!

Hi @benG,

There were some issues with the Python SDK, but the fix is now merged and should be released tomorrow. You can either install the new SageMaker SDK from source with pip install git+https://github.com/aws/sagemaker-python-sdk.git, or wait until tomorrow and run pip install -U sagemaker. Then define transformers_version='4.6' (4.6.1):

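from sagemaker.huggingface import HuggingFace

# `role` (IAM execution role) and `hyperparameters` (dict passed to train.py)
# are assumed to be defined earlier in your notebook or script.
huggingface_estimator = HuggingFace(entry_point='train.py',
                            source_dir='./scripts',
                            instance_type='ml.p3.2xlarge',
                            instance_count=1,
                            role=role,
                            transformers_version='4.6',
                            pytorch_version='1.7',
                            py_version='py36',
                            hyperparameters=hyperparameters)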