Monitoring Metric "Transform Fn"

dzorlu · August 20, 2021, 1:12am

Hi, it’s me again.

I want to get a better sense of the system latency for the speech model I have deployed.
AWS provides endpoint invocation metrics such as ModelLatency, but those are more end-to-end. I am particularly interested in how much time is spent in the forward pass. I think I have two options:

1- The HuggingFace model logs preprocess, predict, and postprocess times here, but there is a bug, as below, where the predict time is not captured correctly.

2- Here, the metric Transform Fn is added to the context using the API described here.

My question is: where does the Transform Fn metric go? I am having a hard time finding it in CloudWatch. I will also submit a PR to fix the predict time logging.

Thanks so much!
Deniz

philschmid · August 20, 2021, 1:47pm

Hey @dzorlu,

Thank you for finding the bug. Please let me know if you do not manage to open a PR; then I will take care of it.

Maybe @dan21c can tell us more about the transform_fn and where it is stored?

context.metrics.add_time("Transform Fn", round((predict_end - predict_start) * 1000, 2))
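
For illustration, here is a minimal sketch of what that line measures: the elapsed time around the predict call only, converted to milliseconds. The StubContext and StubMetrics classes, timed_predict, and fake_forward_pass are hypothetical stand-ins so the snippet runs on its own; they are not the toolkit's or the model server's actual classes, and the sketch says nothing about where the server forwards the metric.

```python
import time


class StubMetrics:
    """Hypothetical stand-in for the model server's metrics store."""

    def __init__(self):
        self.recorded = {}

    def add_time(self, name, value):
        # Keep the latest value (in milliseconds) under the metric name.
        self.recorded[name] = value


class StubContext:
    """Hypothetical stand-in for the request context passed to the handler."""

    def __init__(self):
        self.metrics = StubMetrics()


def timed_predict(context, predict_fn, data):
    # Time only the forward pass, excluding pre- and postprocessing.
    predict_start = time.perf_counter()
    prediction = predict_fn(data)
    predict_end = time.perf_counter()
    context.metrics.add_time("Transform Fn", round((predict_end - predict_start) * 1000, 2))
    return prediction


def fake_forward_pass(data):
    # Placeholder for the model call; sleeps about 20 ms to simulate work.
    time.sleep(0.02)
    return data.upper()


context = StubContext()
result = timed_predict(context, fake_forward_pass, "hello")
print(result, context.metrics.recorded)
```

Taking both timestamps immediately around the predict call is what keeps the value separate from preprocessing and postprocessing time.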

dzorlu · August 20, 2021, 8:02pm

Thanks @philschmid. Here is the PR.
