Hello Hugging Face community,
I’m currently working with a custom handler for my inference pipeline, and I’m trying to understand how I can query the progress of the inference from an endpoint.
Below is the method I’m currently using:
    def __call__(self, data: Any) -> Dict[str, str]:
        """
        Args:
            data (:obj:):
                includes the input data and the parameters for the inference.
        Return:
            A :obj:`dict` containing the base64-encoded image.
        """
        inputs = data.pop("inputs", data)
        # run inference pipeline
        with autocast(device.type):
            image = self.pipe(inputs, guidance_scale=7.5)["sample"][0]
        # encode image as base 64
        buffered = BytesIO()
        image.save(buffered, format="JPEG")
        img_str = base64.b64encode(buffered.getvalue())
        # postprocess the prediction
        return {"image": img_str.decode()}
While this returns the final result, it gives no insight into how far the inference has progressed while it is running.
My questions are:
- How can I modify the above `__call__` method to provide updates or feedback about the inference progress?
- How do I subsequently query the endpoint to get this progress information?
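To make the questions more concrete, here is a rough sketch of the kind of mechanism I have in mind. It is not working endpoint code. It assumes the pipeline can invoke a callback after each step, and that a second request could reach the same process to read the stored value; I have not verified either assumption for Inference Endpoints. `ProgressTracker`, `fake_pipeline`, and `job_id` are hypothetical names that I made up for illustration.

```python
import threading
from typing import Callable, Dict, Optional


class ProgressTracker:
    """Hypothetical helper: thread-safe map of job_id -> fraction complete."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._progress: Dict[str, float] = {}

    def update(self, job_id: str, step: int, total_steps: int) -> None:
        # Record how far the given job has progressed (0.0 to 1.0).
        with self._lock:
            self._progress[job_id] = step / total_steps

    def get(self, job_id: str) -> Optional[float]:
        # Return the last recorded fraction, or None for an unknown job.
        with self._lock:
            return self._progress.get(job_id)


def fake_pipeline(num_steps: int, on_step: Callable[[int, int], None]) -> str:
    """Stand-in for the real pipeline: calls on_step(step, num_steps) after each step."""
    for step in range(1, num_steps + 1):
        on_step(step, num_steps)
    return "done"


tracker = ProgressTracker()
job_id = "job-1"  # hypothetical identifier the client would send with the request
result = fake_pipeline(4, lambda step, total: tracker.update(job_id, step, total))
print(tracker.get(job_id))
```

In a real handler, the callback would be passed to `self.pipe` (if it supports one) instead of `fake_pipeline`, and some second route or request type would need to return `tracker.get(job_id)`. Whether that second part is possible with a custom handler is exactly what I am unsure about.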
Any help, sample code, or pointers would be greatly appreciated!
Thank you in advance!