I’m working on fine-tuning BLOOM and deploying the model on SageMaker, and I wanted to know if it’s possible to stream the generated output text by directly modifying the inference functions?
I already tried applying the TextIteratorStreamer within the predict_fn function, but this doesn’t seem to be the solution.
Any idea about this would be really appreciated.
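For context on what TextIteratorStreamer does under the hood: it is a queue-backed iterator that the generation thread feeds token by token while the caller iterates on another thread. Below is a minimal stdlib sketch of that producer/consumer pattern (this is my own illustration, not the actual transformers implementation, and it runs independently of SageMaker):

```python
import queue
import threading

class TokenStreamer:
    """Queue-backed iterator: the producer (generation) thread calls put(),
    the consumer iterates to receive tokens as they are produced."""
    _DONE = object()  # sentinel marking the end of generation

    def __init__(self):
        self._queue = queue.Queue()

    def put(self, token: str) -> None:
        self._queue.put(token)

    def end(self) -> None:
        self._queue.put(self._DONE)

    def __iter__(self):
        while True:
            item = self._queue.get()
            if item is self._DONE:
                return
            yield item

def fake_generate(streamer: TokenStreamer) -> None:
    # Stand-in for model.generate(..., streamer=streamer)
    for token in ["Hello", ", ", "world", "!"]:
        streamer.put(token)
    streamer.end()

streamer = TokenStreamer()
thread = threading.Thread(target=fake_generate, args=(streamer,))
thread.start()
pieces = [tok for tok in streamer]  # the consumer sees tokens incrementally
thread.join()
print("".join(pieces))  # → Hello, world!
```

This is why applying the streamer inside predict_fn alone isn’t enough: the tokens are produced incrementally on the server, but the endpoint still has to return one complete HTTP response, so the incremental output never reaches the client.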
SageMaker currently does not support streaming responses.
Ok thank you @philschmid
I’m also trying to stream the output of my LLM in a GPT-like style, using HTTP requests and an iterator streamer. However, it would be great if this were a built-in capability of huggingface/sagemaker.
Thanks for your great work.
Hey @RemiP, thanks for your response. Can you please elaborate on how you are streaming outputs from an LLM deployed as a Hugging Face inference endpoint? Appreciate your help :)
SageMaker real-time inference endpoints now support response streaming. Check this blog post.
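On the client side, streaming endpoints are invoked with boto3’s `invoke_endpoint_with_response_stream`, whose response body is an event stream of `PayloadPart` chunks. A small sketch, assuming a deployed streaming-capable endpoint (the endpoint name and payload format below are hypothetical, and the exact chunk format depends on your serving container):

```python
import json  # used in the live-call sketch below

def stream_response(event_stream):
    """Yield decoded text chunks from a SageMaker response event stream.
    Each event carries a 'PayloadPart' dict with raw bytes of generated text."""
    for event in event_stream:
        part = event.get("PayloadPart")
        if part:
            yield part["Bytes"].decode("utf-8")

# Live call (assumes a deployed endpoint; names/parameters are illustrative):
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint_with_response_stream(
#     EndpointName="my-llm-endpoint",  # hypothetical endpoint name
#     ContentType="application/json",
#     Body=json.dumps({"inputs": "Hello", "parameters": {"max_new_tokens": 32}}),
# )
# for chunk in stream_response(response["Body"]):
#     print(chunk, end="", flush=True)

# Demo with fake events standing in for the live stream:
fake_events = [{"PayloadPart": {"Bytes": b"Hel"}}, {"PayloadPart": {"Bytes": b"lo!"}}]
text = "".join(stream_response(fake_events))
print(text)  # → Hello!
```

Note that payload parts can split the generated text at arbitrary byte boundaries, so real clients often buffer chunks before decoding or parsing.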