Hello everyone,
I am currently working on fine-tuning a DETR object detection model in SageMaker Studio using a Hugging Face Estimator. I have installed the libraries with pip, as shown below:
!pip install transformers datasets huggingface_hub evaluate timm albumentations wandb sagemaker
However, when I tried to check the versions of PyTorch and transformers inside the SageMaker Studio notebook, I got the following errors:
import torch
print(torch.__version__)
>>>ModuleNotFoundError: No module named 'torch'
!transformers-cli
>>>None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
I am new to SageMaker (I usually use Colab) and I am wondering what could be wrong in the environment settings. Do I need to install PyTorch myself, or should I change the settings inside SageMaker? Or should I look into Deep Learning Containers? I would greatly appreciate any guidance or advice on how to resolve this issue. Thank you very much in advance for your help!
It seems that you don’t have PyTorch installed. You can either use a PyTorch kernel or pip install torch.
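To confirm which frameworks the current notebook kernel can actually see, a quick check like the one below may help. This is only a minimal sketch: the helper name find_installed is made up for illustration, and the list of module names is an assumption based on the frameworks named in the transformers-cli message.

```python
import importlib.util


def find_installed(names):
    """Return {module_name: bool} telling whether each top-level module is importable here."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Module names are illustrative; adjust them to whatever you expect in the kernel.
    for name, ok in find_installed(["torch", "tensorflow", "flax", "transformers"]).items():
        print(f"{name}: {'installed' if ok else 'missing'}")
```

If torch shows up as missing, switching to a PyTorch kernel or installing torch in the running kernel should make the import work.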
Thank you so much for your help! Just a follow-up question: do I need to explicitly install packages for train.py? For instance, I installed albumentations in the SageMaker notebook. However, when I import albumentations in train.py, I get a ModuleNotFoundError.
requirements.txt
transformers==4.17.0
datasets[s3]==1.18.4
huggingface_hub
timm
evaluate
albumentations
wandb
sagemaker
ipywidgets==7.0
train.py
import albumentations
>>>ModuleNotFoundError: No module named 'albumentations'
Do you have any idea about this issue? For now, I am using subprocess.check_call to run pip install inside train.py as a workaround. Thanks!
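import subprocess
import sys
# Install with the interpreter that runs train.py so the package lands in the same environment.
subprocess.check_call([sys.executable, "-m", "pip", "install", "albumentations"])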
By train.py, do you mean the script you run with the HuggingFace estimator?
Yes, this is my estimator.
huggingface_estimator = HuggingFace(
    entry_point='train.py',
    source_dir='./scripts',
    instance_type='ml.g4dn.xlarge',
    instance_count=2,
    role=role,
    transformers_version='4.17',
    pytorch_version='1.10',
    py_version='py38',
    hyperparameters=hyperparameters,
)
The container comes with the Hugging Face libraries already installed, e.g., transformers, datasets, and PyTorch (here transformers 4.17 and PyTorch 1.10). If you need additional libraries or versions, you can define them in a requirements.txt. SageMaker will then install them on job creation.
I have installed albumentations in the SageMaker notebook from my requirements.txt, as follows:
!pip install -r requirements.txt
requirements.txt
transformers==4.17.0
datasets[s3]==1.18.4
huggingface_hub
timm
evaluate
albumentations
wandb
sagemaker
ipywidgets==7.0
If I run !pip list in the notebook, I can see that albumentations has been installed.
albumentations 1.3.0
However, train.py still cannot import albumentations. Or do you mean I can pass the requirements.txt file to the HuggingFace estimator directly? Thanks!
@philschmid Any ideas? I wonder why others do not have the same issue. I have set up the whole environment with a new domain and it still does not work. Thanks!
Is train.py executed in the notebook or through the HuggingFace estimator's .fit()? If it runs through the estimator, you need to provide the requirements.txt in the same directory as train.py. That way SageMaker will install those packages when running the managed job.
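Before calling .fit(), it can be worth checking from the notebook that the folder passed as source_dir really holds both files. The snippet below is a rough sketch of such a pre-flight check; missing_job_files is a hypothetical helper name, not part of the SageMaker SDK, and it only inspects the local file system.

```python
from pathlib import Path


def missing_job_files(source_dir, entry_point="train.py"):
    """Return the expected files (entry point, requirements.txt) that are absent from source_dir."""
    root = Path(source_dir)
    expected = [entry_point, "requirements.txt"]
    return [name for name in expected if not (root / name).is_file()]
```

For example, missing_job_files("./scripts") would return an empty list once both train.py and requirements.txt sit in that folder.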
train.py is executed by the HuggingFace estimator, not directly in the notebook. Do you mean my requirements.txt should be in the ‘scripts’ folder?
My current file structure:
./notebook.ipynb
./requirements.txt
./scripts/train.py
./scripts/utils.py
Copy your requirements.txt into your scripts folder, then it should install the dependencies.
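If you prefer to do the copy from the notebook rather than by hand, something along these lines should work. It is a minimal sketch assuming the file layout shown above (requirements.txt next to the notebook, train.py under scripts); copy_requirements is an illustrative helper name.

```python
import shutil
from pathlib import Path


def copy_requirements(project_dir=".", source_dir="scripts"):
    """Copy project_dir/requirements.txt into project_dir/source_dir and return the new path."""
    src = Path(project_dir) / "requirements.txt"
    dst_dir = Path(project_dir) / source_dir
    dst_dir.mkdir(parents=True, exist_ok=True)  # no-op if the folder already exists
    return Path(shutil.copy2(src, dst_dir / "requirements.txt"))
```

Calling copy_requirements() from the notebook's directory would place the file at ./scripts/requirements.txt, where the training job can pick it up.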