Setting up environment in Sagemaker Studio

oschan77 · March 4, 2023, 3:40am

Hello everyone,

I am currently working on finetuning a DETR object detection model in Sagemaker Studio using a Hugging Face Estimator. I have installed the libraries with pip, as shown below:

!pip install transformers datasets huggingface_hub evaluate timm albumentations wandb sagemaker

However, when I tried to check the versions of Pytorch and transformers inside the SageMaker Studio notebook, I got the following error:

import torch
print(torch.__version__)
>>>ModuleNotFoundError: No module named 'torch'

!transformers-cli
>>>None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.

I am new to Sagemaker (I usually use Colab) and I am wondering what could be going wrong in the settings of the environment. Do I need to install Pytorch myself or should I change the settings inside Sagemaker? Or should I look into Deep Learning Containers? I would greatly appreciate any guidance or advice on how to resolve this issue. Thank you very much in advance for your help!

philschmid · March 6, 2023, 8:30am

It seems that you don’t have PyTorch installed. You can either use a PyTorch kernel or pip install torch.
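To see which packages the current notebook kernel can actually import, a quick check along these lines may help. This is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of any SageMaker or Hugging Face API.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the current kernel cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names taken from this thread; an empty list means both are importable.
print(missing_modules(["torch", "transformers"]))
```

If `torch` shows up in the result, switching to a PyTorch kernel or running pip install torch in that kernel should resolve it.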

oschan77 · March 7, 2023, 6:15am

Thank you so much for your help! Just a follow-up question: do I need to explicitly install packages for train.py? For instance, I installed albumentations in the SageMaker notebook. However, when I import albumentations in train.py, there is a ModuleNotFoundError.

requirements.txt

transformers==4.17.0
datasets[s3]==1.18.4
huggingface_hub
timm
evaluate
albumentations
wandb
sagemaker
ipywidgets==7.0

train.py

import albumentations

>>>ModuleNotFoundError: No module named 'albumentations'

Do you have any idea about this issue? Now I am using subprocess.check_call to run pip install inside train.py as a workaround. Thanks!

import subprocess
subprocess.check_call(["pip", "install", "albumentations"])

philschmid · March 7, 2023, 7:52am

By train.py, do you mean when using the HuggingFace estimator?

oschan77 · March 7, 2023, 7:56am

Yes, this is my estimator.

huggingface_estimator = HuggingFace(
  entry_point='train.py',
  source_dir='./scripts',
  instance_type='ml.g4dn.xlarge',
  instance_count=2,
  role=role,
  transformers_version='4.17',
  pytorch_version='1.10',
  py_version='py38',
  hyperparameters=hyperparameters,
)

philschmid · March 7, 2023, 8:14am

The container comes with the Hugging Face libraries installed, e.g., transformers, datasets, and PyTorch (here transformers 4.17 and PyTorch 1.10). If you need additional libraries or versions, you can define them in the requirements.txt. SageMaker will then install them on job creation.
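Because the notebook kernel and the training container are separate environments, it can help to log what the container actually has installed at the top of train.py. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is hypothetical, and the package list is only an example drawn from this thread.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Printed at the start of train.py, this ends up in the training job logs.
print(installed_versions(["torch", "transformers", "datasets", "albumentations"]))
```

A `None` entry for a package means it was not installed in the environment where the script ran.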

oschan77 · March 7, 2023, 8:27am

I have installed albumentations in the SageMaker notebook via my requirements.txt, as follows:

!pip install -r requirements.txt

requirements.txt

transformers==4.17.0
datasets[s3]==1.18.4
huggingface_hub
timm
evaluate
albumentations
wandb
sagemaker
ipywidgets==7.0

If I run !pip list in the notebook, I can see that albumentations has been installed.

albumentations              1.3.0

However, train.py still cannot use albumentations. Or do you mean I can pass the requirements.txt file in HuggingFace estimator directly? Thanks!

oschan77 · March 9, 2023, 8:06am

@philschmid Any ideas? I wonder why others do not have the same issue. I have set up the whole environment with a new domain and it still does not work. Thanks!

philschmid · March 9, 2023, 8:56am

Is train.py executed in the notebook or through the HuggingFace estimator's .fit? If the latter, you need to provide the requirements.txt in the same directory as train.py; that way SageMaker will install those packages when running the managed job.

oschan77 · March 9, 2023, 9:22am

train.py is executed by the HuggingFace estimator rather than directly in the notebook. Do you mean my requirements.txt should be in the ‘scripts’ folder?

My current file structure:

./notebook.ipynb
./requirements.txt
./scripts/train.py
./scripts/utils.py

philschmid · March 10, 2023, 8:21am

Copy your requirements.txt into your scripts folder; then it should install the dependencies.
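The copy step can also be done from the notebook before launching the job. This is a minimal sketch, assuming the file layout shown earlier in the thread (requirements.txt next to the notebook, entry point under ./scripts); the helper name `stage_requirements` is hypothetical.

```python
import shutil
from pathlib import Path


def stage_requirements(requirements="requirements.txt", source_dir="scripts"):
    """Copy requirements.txt into source_dir so it is uploaded with train.py."""
    src = Path(requirements)
    dst_dir = Path(source_dir)
    if not src.is_file():
        raise FileNotFoundError(f"{src} not found")
    if not dst_dir.is_dir():
        raise NotADirectoryError(f"{dst_dir} is not a directory")
    # Returns the path of the copied file inside source_dir.
    return Path(shutil.copy(src, dst_dir / src.name))
```

Calling `stage_requirements()` before `huggingface_estimator.fit()` would place the file at ./scripts/requirements.txt, matching the `source_dir='./scripts'` setting in the estimator above.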
