Transform any PDF document (research reports, market analyses, presentations, user guides, …) into an #AI-generated audio interview in which two distinct AI voices discuss the most relevant points of the document. An alternative way to absorb a document and engage with it.
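In this personal experiment I used the Google Gemini 1.5 Flash #API for document processing, the OpenAI #TTS API for voice generation (Whisper is OpenAI's speech-to-text model, so the voices come from the separate text-to-speech endpoint), and Gradio for the user interface. The code is available in a Hugging Face Space.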
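For anyone curious about the step between the language model and the voices, here is a minimal sketch of how a generated script could be split into speaker turns and paired with a voice before synthesis. It is not the code from the Space: the `Host`/`Guest` labels, the script format, the voice names and the function names are all assumptions made for illustration, and the calls to the LLM and TTS services are left out.

```python
import re

# Hypothetical speaker labels and voice names; the post does not specify them.
VOICES = {"Host": "alloy", "Guest": "onyx"}

# A turn starts with a speaker label followed by a colon.
TURN = re.compile(r"^(Host|Guest):\s*(.+)$")


def parse_dialogue(script):
    """Split a generated script into (speaker, text) turns."""
    turns = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = TURN.match(line)
        if match:
            turns.append((match.group(1), match.group(2).strip()))
        elif turns:
            # Unlabeled line: treat it as a continuation of the previous turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, text + " " + line)
    return turns


def plan_synthesis(turns):
    """Pair each turn with the voice that should read it."""
    return [{"voice": VOICES[speaker], "text": text} for speaker, text in turns]


# Made-up sample script, standing in for the LLM output.
sample = """Host: Welcome! What is this report about?
Guest: It summarizes the main findings.
It also lists open questions.
Host: Interesting."""

jobs = plan_synthesis(parse_dialogue(sample))
for job in jobs:
    print(job["voice"], "->", job["text"])
```

Each entry in `jobs` would then go to the text-to-speech service one at a time, and the resulting audio clips would be concatenated in order to form the interview.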
Any repost or feedback will be much appreciated!