Hi,
I am running a SageMaker batch transform job with a model from the Hugging Face Hub (ProsusAI/finbert). The only change I made to the transformer configuration is to join the ID from the input file with the inference results in the output, but the job fails with a PredictionException. The error displayed is the following:
mms.service.PredictionException: 'str' object has no attribute 'pop' : 400
Has anyone succeeded in running a SageMaker batch transform job that joins the prediction results with an identifier from the input file? Or does anyone know how to handle this error?
The configuration I am using is similar to the one in this notebook: notebooks/sagemaker-notebook.ipynb at master · huggingface/notebooks · GitHub. The only modification is that I pass extra parameters to the transformer and the transform call, following Associate Prediction Results with Input Records - Amazon SageMaker.
The input file looks like this:
{"id":"item#1419453569267240963","inputs":"RT LCID is sexier than TSLA It s a fact"}
{"id":"item#1419453569334341640","inputs":"If you were given 1 million and had to invest it all into a single asset what would it be Gold Silver BTC XRP"}
{"id":"item#1419453570710114308","inputs":"RT FINANCE To celebrate the launch of KingsMenCoin Solana s memecoin we are organizing an AIRDROP To be eligible Like RT"}
{"id":"item#1419453577395937283","inputs":"RT Big movers this week with catalyst 1 bzwr updates per and 2 fern big news 3 kync la"}
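In case the file itself is suspect: I sanity-check that every line is a standalone JSON object with exactly the keys the transform expects (id and inputs). A minimal version of that check, using two of the sample lines above:

```python
import json

# Each line of the JSONL input must parse on its own and carry the two
# keys that the input_filter / output_filter JSONPaths refer to.
sample_lines = [
    '{"id":"item#1419453569267240963","inputs":"RT LCID is sexier than TSLA It s a fact"}',
    '{"id":"item#1419453577395937283","inputs":"RT Big movers this week with catalyst 1 bzwr updates per and 2 fern big news 3 kync la"}',
]

for line in sample_lines:
    record = json.loads(line)
    assert set(record) == {"id", "inputs"}, record

print("all lines OK")
```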
And this is my code:
from sagemaker.huggingface.model import HuggingFaceModel

output_s3_path = s3_path_join("s3://", sagemaker_session_bucket, "batch_transform/output")

hub = {
    'HF_MODEL_ID': 'ProsusAI/finbert',
    'HF_TASK': 'text-classification'
}

huggingface_model = HuggingFaceModel(
    transformers_version='4.6',
    pytorch_version='1.7',
    py_version='py36',
    env=hub,
    role=role,
)

batch_job = huggingface_model.transformer(
    instance_count=1,
    instance_type='ml.p3.2xlarge',
    strategy='SingleRecord',
    output_path=output_s3_path,
    accept='application/json',
    assemble_with='Line',
)

batch_job.transform(
    data='s3://sagemaker-us-east-1-822164694494/batch_transform/input/preprocessed_item_test.jsonl',
    content_type='application/json',
    split_type='Line',
    input_filter="$.inputs",
    join_source="Input",
    output_filter="$['id','SageMakerOutput']"
)
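For context on what I think the container receives: if I understand the data-processing docs correctly, input_filter="$.inputs" replaces each record with just the value of its inputs field, so the payload POSTed to /invocations is a bare JSON string rather than a {"inputs": ...} object. The Hugging Face handler then calls data.pop("inputs", data) on that string, which would explain the AttributeError below. A minimal sketch of my understanding, with the JSONPath filter mocked as a plain dict lookup:

```python
import json

# A record from my input file
record = json.loads(
    '{"id":"item#1419453569267240963","inputs":"RT LCID is sexier than TSLA It s a fact"}'
)

# input_filter="$.inputs" (mocked here as a key lookup) keeps only the
# value of "inputs", so the request body is a JSON string, not an object.
payload = json.dumps(record["inputs"])

# What the handler sees after deserializing the payload
# (handler_service.py line 142 in the traceback below):
data = json.loads(payload)
print(type(data).__name__)  # str, not dict

try:
    data.pop("inputs", data)
except AttributeError as e:
    print(e)  # 'str' object has no attribute 'pop'
```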
And then this is the error in the logs:
2021-11-26 17:59:33,795 [INFO ] W-9000-ProsusAI__finbert com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 4877
2021-11-26 17:59:33,797 [WARN ] W-9000-ProsusAI__finbert com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-ProsusAI__finbert-1
2021-11-26 17:59:33,800 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Prediction error
2021-11-26 17:59:33,800 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Traceback (most recent call last):
2021-11-26 17:59:33,800 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 222, in handle
2021-11-26 17:59:33,800 [INFO ] W-9000-ProsusAI__finbert com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 1
2021-11-26 17:59:33,800 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - response = self.transform_fn(self.model, input_data, content_type, accept)
2021-11-26 17:59:33,800 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 181, in transform_fn
2021-11-26 17:59:33,800 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - predictions = self.predict(processed_data, model)
2021-11-26 17:59:33,801 [INFO ] W-9000-ProsusAI__finbert ACCESS_LOG - /169.254.255.130:35648 "POST /invocations HTTP/1.1" 400 4790
2021-11-26 17:59:33,801 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 142, in predict
2021-11-26 17:59:33,801 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - inputs = data.pop("inputs", data)
2021-11-26 17:59:33,801 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - AttributeError: 'str' object has no attribute 'pop'
2021-11-26 17:59:33,801 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -
2021-11-26 17:59:33,801 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - During handling of the above exception, another exception occurred:
2021-11-26 17:59:33,801 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle -
2021-11-26 17:59:33,802 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Traceback (most recent call last):
2021-11-26 17:59:33,802 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.6/site-packages/mms/service.py", line 108, in predict
2021-11-26 17:59:33,802 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - ret = self._entry_point(input_batch, self.context)
2021-11-26 17:59:33,803 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.6/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 231, in handle
2021-11-26 17:59:33,803 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - raise PredictionException(str(e), 400)
2021-11-26 17:59:33,803 [INFO ] W-ProsusAI__finbert-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - mms.service.PredictionException: 'str' object has no attribute 'pop' : 400
As always, any help is much appreciated!!