Monitoring Metric "Transform Fn"

Hi, it’s me again.

I want to get a better sense of the system latency for the speech model I have deployed.
AWS provides Invocation Endpoint Metrics like ModelLatency, but that is more end-to-end. I am particularly interested in how much time is spent in the forward pass. I think I have two options:

1- The HuggingFace model logs preprocess, predict, and postprocess times here, but there is a bug, shown below, where the predict time is not captured correctly.

2- Here, the metric Transform Fn is added to the context using the API described here.

My question is: where does the Transform Fn metric go? I am having a hard time finding it on CloudWatch. I will also submit a PR to fix the predict time logging.

Thanks so much!
Deniz

Hey @dzorlu,

Thank you for finding the bug. Please let me know if you do not manage to open a PR, and I will take care of it.

Maybe @dan21c can tell us more about the Transform Fn metric and where it is stored?

context.metrics.add_time("Transform Fn", round((predict_end - predict_start) * 1000, 2))
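
For anyone who wants to check the timing logic locally, here is a minimal sketch of the pattern in the line above. It assumes only that the metrics object exposes an `add_time(name, value)` call as used there. `StubMetrics` and `timed_predict` are hypothetical names made up for this illustration and are not part of the toolkit; the stub just keeps the records in memory, so it says nothing about where the real metric ends up.

```python
import time


class StubMetrics:
    """Hypothetical in-memory stand-in for the `context.metrics` object."""

    def __init__(self):
        self.records = []

    def add_time(self, name, value, unit="ms"):
        # Store the metric instead of emitting it anywhere.
        self.records.append((name, value, unit))


def timed_predict(predict_fn, data, metrics):
    """Run only the predict step and record its duration in milliseconds."""
    predict_start = time.perf_counter()
    result = predict_fn(data)
    predict_end = time.perf_counter()
    # Same expression as in the handler line quoted above.
    metrics.add_time("Transform Fn", round((predict_end - predict_start) * 1000, 2))
    return result


metrics = StubMetrics()
# A trivial function stands in for the model's forward pass.
output = timed_predict(lambda text: text.upper(), "hello", metrics)
print(output, metrics.records)
```

Keeping the start and end timestamps tightly around the predict call is what separates the forward-pass time from the preprocess and postprocess time.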

Thanks @philschmid. Here is the PR.