Using custom CSV data with run_summarization.py in SageMaker

@benG Philip is correct: the keys you use in the input dictionary {'key1': 's3://...', ..., 'keyN': 's3://...'} become local folder names on the SageMaker Training instance, respectively

/opt/ml/input/data/key1/
...
/opt/ml/input/data/keyN/

so it seems you only forgot to include those key names (train and test) in the paths when reading the data inside the SM Training instance
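
To make that concrete, here is a rough sketch of how the local paths could be built inside the training script. The file names train.csv and test.csv and the helper names channel_dir / channel_file are made up for illustration; use whatever your files are actually called in S3. The SM_CHANNEL_<NAME> environment variables are set by the SageMaker training toolkit in the framework containers, and the sketch falls back to /opt/ml/input/data/<key>/ when they are missing.

```python
import os
from pathlib import Path

# Default root under which SageMaker creates one folder per input key (channel).
SM_INPUT_ROOT = "/opt/ml/input/data"


def channel_dir(channel: str, root: str = SM_INPUT_ROOT) -> Path:
    """Return the local folder for a channel such as 'train' or 'test'.

    Prefers the SM_CHANNEL_<NAME> environment variable when it is set,
    otherwise falls back to <root>/<channel>.
    """
    env_value = os.environ.get(f"SM_CHANNEL_{channel.upper()}")
    return Path(env_value) if env_value else Path(root) / channel


def channel_file(channel: str, filename: str, root: str = SM_INPUT_ROOT) -> str:
    """Return the full local path of a file inside a channel folder."""
    return str(channel_dir(channel, root) / filename)


# Hypothetical file names: replace with the names of the CSVs you uploaded.
train_file = channel_file("train", "train.csv")
test_file = channel_file("test", "test.csv")

print(train_file)
print(test_file)
```

Those two paths are what you would then hand to run_summarization.py, for example through its train_file and test_file arguments, instead of a path that stops at /opt/ml/input/data/.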

Reference: SageMaker Training documentation, "How Amazon SageMaker Provides Training Information"
