Some issues when training a model on SageMaker

Hello world,
I’m running into two issues when fine-tuning my model using this SageMaker notebook.

  1. No GUI login prompt appears when running notebook_login(); instead I’m getting this:


    As a workaround, I’m using a hardcoded token.

  2. I hit a ResourceLimitExceeded error when running huggingface_estimator.fit(…):

For item 2, I have opened a ticket with AWS support to request a limit increase, but I expect a slow reply from them. Is there any other way to get around this while still getting GPU acceleration from SageMaker?

FYI, I’m using my own AWS account (a Free Tier account, but with some credits).

Thanks.

Hello @ivanlau,

thanks for opening the thread.

To 1. Where are you running the SageMaker notebook?

To 2. I think you can go with ml.g4dn.xlarge. It also has 1 GPU and shouldn’t need a limit increase.
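
Switching the instance type is a one-argument change on the estimator. Here is a minimal sketch, assuming the usual `HuggingFace` estimator from the `sagemaker` SDK. The script name, source directory, and version strings are placeholders, not values taken from the notebook, and the helper names are made up for illustration:

```python
def build_estimator_kwargs(role, instance_type="ml.g4dn.xlarge"):
    """Collect the keyword arguments for sagemaker.huggingface.HuggingFace."""
    return {
        "entry_point": "train.py",       # hypothetical training script
        "source_dir": "./scripts",       # hypothetical script directory
        "instance_type": instance_type,  # single-GPU instance suggested above
        "instance_count": 1,
        "role": role,                    # your SageMaker execution role ARN
        "transformers_version": "4.26",  # assumed; use a combination your SDK supports
        "pytorch_version": "1.13",       # assumed
        "py_version": "py39",            # assumed
    }


def build_estimator(role, instance_type="ml.g4dn.xlarge"):
    # Imported lazily so the helper above works without the SDK installed.
    from sagemaker.huggingface import HuggingFace

    return HuggingFace(**build_estimator_kwargs(role, instance_type))
```

Note that this only changes which quota is checked. If the account's quota for that instance type is 0, the training job will likely still be rejected.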

@philschmid

To 1:
I’m running it on an ml.t3.medium instance, opened in the JupyterLab environment (conda_pytorchp36).

To 2:
I changed it to the instance you suggested but still get the same error:

Settings:

To 1. Can you test it in the Jupyter Notebook?

To 2. Okay, then I guess you need to open a support ticket. You can do this from the AWS Console under Service Quotas.
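
The same quotas can also be inspected from code. This is a rough sketch, assuming `boto3` is installed with credentials that allow Service Quotas calls, and assuming the relevant quota names follow the pattern `<instance type> for training job usage`. The helper names are made up for illustration:

```python
def find_training_quotas(quotas, instance_type):
    """Return quotas whose name mentions the instance type and training jobs."""
    wanted = instance_type.lower()
    return [
        q for q in quotas
        if wanted in q["QuotaName"].lower()
        and "training job" in q["QuotaName"].lower()
    ]


def list_sagemaker_quotas(region_name=None):
    """Fetch all SageMaker quotas for the account (needs AWS credentials)."""
    import boto3  # imported lazily so the filter above has no dependencies

    client = boto3.client("service-quotas", region_name=region_name)
    paginator = client.get_paginator("list_service_quotas")
    quotas = []
    for page in paginator.paginate(ServiceCode="sagemaker"):
        quotas.extend(page["Quotas"])
    return quotas
```

`find_training_quotas(list_sagemaker_quotas(), "ml.g4dn.xlarge")` should return entries with a `QuotaCode` and a `Value`. A value of 0 would explain the ResourceLimitExceeded error. The increase itself can then be requested in the console, or with the client's `request_service_quota_increase(ServiceCode=..., QuotaCode=..., DesiredValue=...)`.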

@philschmid
Hi,
I have tested it in Jupyter Notebook, and yes, it’s working fine there. It seems that ipywidgets in JupyterLab is disabled by default or outdated? I’m not sure, but anyway, I can continue working.
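
In case someone else hits the missing widget in JupyterLab: the token workaround doesn’t have to hardcode the token in the notebook. Here is a minimal sketch, assuming the token is stored in an environment variable (the name `HF_TOKEN` is just my choice) and that `huggingface_hub` is installed:

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Read the Hugging Face token from the environment instead of the notebook."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} before calling login_without_widget()")
    return token


def login_without_widget(env_var="HF_TOKEN"):
    # login(token=...) avoids the ipywidgets prompt that notebook_login() relies on.
    from huggingface_hub import login

    login(token=get_hf_token(env_var))
```

Calling `login_without_widget()` in a cell then replaces `notebook_login()`, and the token stays out of the saved notebook.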

For 2, yes, I have already opened a ticket and am following up with them.

Anyway, thanks for the help and for your notebook too.
This is my first time doing ML in the cloud. I learnt a lot.
