Transform any PDF document into an AI-generated audio interview/dialogue

Doc-To-Dialogue

Transform any PDF document (research report, market analysis, presentation, user guide, …) into an AI-generated audio interview/dialogue in which two different AI voices discuss the most relevant aspects of the document. It is an alternative way to absorb a document and make it more engaging.

In this personal experiment I used the Google Gemini 1.5 Flash API for document processing, OpenAI's text-to-speech (TTS) API for voice generation, and Gradio for the user interface. The code is uploaded to a Hugging Face Space.
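To give an idea of how the two-voice step could work, here is a minimal sketch, not the actual code from the Space. It assumes the language model returns a plain-text script with `Host:` and `Guest:` labels (a hypothetical format), splits it into turns, and hands each turn to a text-to-speech function with a different voice per speaker. The names `VOICES`, `parse_dialogue`, `render_dialogue`, and the `synthesize` callback are all made up for illustration; the real TTS call would be plugged in as `synthesize`.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical speaker-to-voice mapping; labels and voice names are
# assumptions for illustration, not taken from the original project.
VOICES: Dict[str, str] = {"Host": "alloy", "Guest": "onyx"}

# Matches lines such as "Host: some text" for the known speakers only.
_SPEAKERS = "|".join(re.escape(name) for name in VOICES)
TURN_RE = re.compile(rf"^\s*({_SPEAKERS})\s*:\s*(.+?)\s*$")


def parse_dialogue(script: str) -> List[Tuple[str, str]]:
    """Split a labelled script into (speaker, text) turns."""
    turns: List[Tuple[str, str]] = []
    for line in script.splitlines():
        match = TURN_RE.match(line)
        if match:
            turns.append((match.group(1), match.group(2)))
        elif line.strip() and turns:
            # Unlabelled non-empty line: treat it as a continuation
            # of the previous speaker's turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, f"{text} {line.strip()}")
    return turns


def render_dialogue(
    script: str, synthesize: Callable[[str, str], bytes]
) -> List[bytes]:
    """Produce one audio clip per turn using synthesize(text, voice)."""
    return [
        synthesize(text, VOICES[speaker])
        for speaker, text in parse_dialogue(script)
    ]
```

Passing the TTS call in as a function keeps the parsing logic testable without network access; the resulting clips would then be concatenated into a single audio file for the Gradio interface to play.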

Any repost or feedback will be much appreciated!


Wow! How great that this can already be done in Spanish.
