Using the ElevenLabs API for text to speech streaming?

Hello how can I use the ElevenLabs API for realtime text-to-speech streaming output with Gradio please?

https://api.elevenlabs.io/docs#/text-to-speech/Text_to_speech_v1_text_to_speech__voice_id__stream_post

Many thanks

Hi @ethantan !

This Gradio demo uses a different streaming API, but I imagine you can adapt it to your use case.

Cc @ysharma

Thanks Freddy - it’s text streaming rather than audio but will give it a go!

Hi @freddyaboulton I can’t figure out how to play a stream of audio - would appreciate any help!

I’m off today - will check back tomorrow!

Was anyone able to figure this out? The ElevenLabs API's play() works in a local environment but produces no audio and no error on Hugging Face.

import gradio as gr
import librosa
import pyaudio
import numpy as np
import requests
import tempfile
import os
import io
from io import BytesIO
import base64
from faster_whisper import WhisperModel
# faster-whisper speech-to-text model used by the transcription callback below.
model_size = "large-v2"
# Run on GPU with FP16
# NOTE(review): this requires a CUDA-capable GPU at import time; on CPU-only
# hosts use device="cpu", compute_type="int8" — confirm deployment target.
model = WhisperModel(model_size, device="cuda", compute_type="float16")

def text_to_speech(text):
    """Fetch MP3 audio for *text* from the ElevenLabs text-to-speech endpoint.

    Returns the raw MP3 bytes on success, or None on any non-200 response.
    """
    # Replace YOUR_VOICE with a real voice id from your ElevenLabs account.
    url = "https://api.elevenlabs.io/v1/text-to-speech/YOUR_VOICE"
    headers = {
        "accept": "audio/mpeg",
        # NOTE(review): never hard-code a real key in source; read it from an
        # environment variable or HF Space secret instead.
        "xi-api-key": "API_KEY",
        "Content-Type": "application/json",
    }
    # An explicit timeout keeps a stalled request from hanging the Gradio
    # callback (and the whole UI) forever.
    response = requests.post(url, headers=headers, json={"text": text}, timeout=30)
    if response.status_code == 200:
        return response.content
    print(f"Error: {response.status_code}")
    return None


def audio_player_html(audio_content):
    """Wrap raw MP3 bytes in an autoplaying base64 data-URI <audio> tag.

    Pure function: bytes in, HTML string out. The bytes are already in
    memory, so they are base64-encoded directly (no BytesIO round-trip).
    """
    audio_base_64 = base64.b64encode(audio_content).decode("utf-8")
    return f'<audio src="data:audio/mpeg;base64,{audio_base_64}" controls autoplay></audio>'


def on_click_play_audio_button(text):
    """Synthesize *text* via ElevenLabs and return an HTML audio player.

    Returns None explicitly when synthesis failed, so the HTML component
    is cleared rather than left with a stale player.
    """
    audio_content = text_to_speech(text)
    if audio_content is None:
        return None
    return audio_player_html(audio_content)


def get_audio(audio):
    """Transcribe a recorded audio file with faster-whisper.

    *audio* is the microphone recording's file path (or None when the
    recording was cleared); None passes through unchanged.
    """
    if audio is None:
        return None
    segments, _ = model.transcribe(audio)
    return " ".join(segment.text for segment in segments)


def main():
    """Build and launch the Gradio app: mic -> Whisper transcript -> ElevenLabs audio."""
    with gr.Blocks(theme=gr.themes.Soft(primary_hue='orange',
                                        secondary_hue='orange', neutral_hue='stone')) as app:
        with gr.Row():
            with gr.Column(scale=3):
                # show_copy_button in the constructor replaces the deprecated
                # gr.Textbox.style(...) call.
                query = gr.Textbox(label='Query', lines=1, show_copy_button=True,
                                   placeholder="Ask a question to the dataset...")
            with gr.Column(scale=.1, min_width=200):
                audio = gr.Audio(source="microphone", type="filepath", label="Audio")

        html = gr.HTML()
        # Chain: transcribe the recording into the textbox, then synthesize
        # that text into an embedded audio player.
        audio.change(get_audio, audio, query).then(
            on_click_play_audio_button, inputs=[query], outputs=[html])

    app.launch(share=True, width=1600, height=800)

# Launch the app only when run as a script, not when imported.
if __name__ == "__main__":
    main()

This code generates text from your speech and converts it back to ElevenLabs audio.
You can modify it slightly so that it works with chatbot conversations!

Change the URL's voice id and the API key, and it should work once you install all dependencies!