Is there any tutorial or example on how to do this?
I have prepared the data according to the guidelines given here.
Here is my basic code for SageMaker:
from sagemaker.huggingface import HuggingFace

distribution = {'smdistributed': {'dataparallel': {'enabled': True}}}
model_name = 'facebook/bart-large-mnli'

# hyperparameters, which are passed into the training job
hyperparameters = {
    # 'epochs': 1,
    # 'train_batch_size': 8,
    'do_train': True,
    'do_eval': True,
    'model_name_or_path': model_name,  # run_glue.py expects model_name_or_path, not model_name
    'task_name': 'mnli',
    # 'output_data_dir': '/opt/ml/output/data/',
    'output_dir': '/opt/ml/model',
    # 'ignore_mismatched_sizes': True,
    'overwrite_output_dir': True,
    # These are TrainingArguments consumed by run_glue.py, so they belong here,
    # not as keyword arguments to the HuggingFace estimator:
    'save_strategy': 'no',
    'save_total_limit': 1,
    # 'load_best_model_at_end': True,  # requires save/eval strategies to match; incompatible with save_strategy 'no'
}

git_config = {'repo': 'https://github.com/huggingface/transformers.git', 'branch': 'v4.26.0'}

# creates the Hugging Face estimator
huggingface_estimator = HuggingFace(
    entry_point='run_glue.py',
    source_dir='./examples/pytorch/text-classification',
    instance_type='ml.p3dn.24xlarge',
    instance_count=1,
    role=role,
    git_config=git_config,
    transformers_version='4.26.0',
    pytorch_version='1.13.1',
    py_version='py39',
    hyperparameters=hyperparameters,
    distribution=distribution,
)

huggingface_estimator.fit({'train': training_input_path, 'test': testing_input_path})
A training example looks like this:
{'label': 2, 'input_ids': [0, 44758, 3457, 13, 5, 1263, 829, 31, 5, 1263, 8401, 4001, 438, 34, 5, 511, 7390, 2, 2, 713, 1246, 16, 4287, 92, 3457, 4, 2], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 'input_sentence': 'Create tests for the response received from the response whihc has the following formatThis example is Add new tests.'}
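As a quick sanity check on examples in this format (a stdlib-only sketch, assuming each record is a dict with `label`, `input_ids`, and `attention_mask` as shown above), the token and mask lengths should match, the label should be one of MNLI's three classes, and BART sequences should start with `<s>` (id 0) and end with `</s>` (id 2):

```python
# One prepared example, copied from the sample above.
example = {
    'label': 2,
    'input_ids': [0, 44758, 3457, 13, 5, 1263, 829, 31, 5, 1263, 8401, 4001,
                  438, 34, 5, 511, 7390, 2, 2, 713, 1246, 16, 4287, 92, 3457, 4, 2],
    'attention_mask': [1] * 27,
}

def check_example(ex):
    # input_ids and attention_mask must be the same length
    assert len(ex['input_ids']) == len(ex['attention_mask'])
    # GLUE MNLI has three labels: 0 (entailment), 1 (neutral), 2 (contradiction)
    assert ex['label'] in (0, 1, 2)
    # BART sequences start with <s> (id 0) and end with </s> (id 2)
    assert ex['input_ids'][0] == 0 and ex['input_ids'][-1] == 2
    return True

print(check_example(example))  # → True
```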
The problem is that after running this, uploading the artifact to S3 takes 5 hours, and the resulting model.tar.gz is 158 GB.