Hugging Face / PyTorch versions on SageMaker

@marshmellow77, I’ve followed the GitHub - aws/sagemaker-huggingface-inference-toolkit post and tried a few combinations.

Is it right that the only way for me to use a transformers version >=4.17 with a Trainer object is to:

  • first load the model in Python
  • save it to disk in a directory
  • add a requirements.txt inside that directory (under code/)
  • load the model from that directory in the training script

In code, something like:

import os

from transformers import EncoderDecoderModel

multibert = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-multilingual-uncased", "bert-base-multilingual-uncased"
)

# save_pretrained (not save_to_disk) writes the config and weights to a directory
multibert.save_pretrained("my-model/")

# the code/ subdirectory has to exist before writing the requirements file
os.makedirs("my-model/code", exist_ok=True)
with open("my-model/code/requirements.txt", "w") as fout:
    fout.write("transformers==4.24")

Then, inside my ./scripts/train-v4-24.py, I load the model like this:

# Instead of
#   multibert = EncoderDecoderModel.from_encoder_decoder_pretrained(
#       "bert-base-multilingual-uncased", "bert-base-multilingual-uncased"
#   )
# load the saved model like this

from transformers import EncoderDecoderModel, Trainer


def train():
    # from_pretrained (not load_from_disk) reads the directory
    # written by save_pretrained
    multibert = EncoderDecoderModel.from_pretrained("my-model/")

    # training_args, compute_metrics, train_dataset, test_dataset and
    # tokenizer are set up elsewhere in the script
    trainer = Trainer(
        model=multibert,
        args=training_args,
        compute_metrics=compute_metrics,
        train_dataset=train_dataset,
        eval_dataset=test_dataset,
        tokenizer=tokenizer,
    )

    trainer.train()


if __name__ == "__main__":
    train()
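
One gap in the above that I’m unsure about: the training job runs in its own container, so the local my-model/ directory won’t exist there unless I pass it in, e.g. as an input channel. Assuming the directory was uploaded to S3 as sketched earlier and passed to fit() under a channel I name "model", the script could resolve the path from the environment instead of hard-coding it:

import os

from transformers import EncoderDecoderModel

# SageMaker exposes each input channel under an SM_CHANNEL_<NAME> env var;
# fall back to the local path when running outside of SageMaker
model_dir = os.environ.get("SM_CHANNEL_MODEL", "my-model/")
multibert = EncoderDecoderModel.from_pretrained(model_dir)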
    

Then in SageMaker, I can create the estimator without specifying transformers_version / pytorch_version:

huggingface_estimator = HuggingFace(
    entry_point='train-v4-24.py',
    source_dir='./scripts',
    instance_type='ml.p3.2xlarge',
    instance_count=1,
    role=role,
    hyperparameters=hyperparameters,
)
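
And then launch the job, passing the uploaded directory as the "model" channel (again using the placeholder model_s3_uri from the upload sketch above):

huggingface_estimator.fit({"model": model_s3_uri})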

Is the above the only way to use the Trainer object with transformers versions >=4.17?