Don't know where to start. Please help with splitting transcribed audio into paragraphs

Hello,

I’m trying to create a tool that will take transcribed audio (via Whisper) and break up the text into paragraphs.

It’s important that the text and sentences are not altered in any way during this stage of the process. The tool should only insert paragraph breaks based on the context of the text: when there is a shift in ideas, the script should insert \n\n.
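To make the goal concrete, here is a minimal sketch of the behavior I'm after. It is only an illustration under some loud assumptions: the function names (`bag_of_words`, `cosine`, `insert_paragraph_breaks`) and the `threshold` value are made up, sentences are split with a simple regex, and the word-overlap similarity is just a stand-in for whatever a real model (for example, sentence embeddings) would provide. The key property is that sentences are copied through unchanged and only the separator between them varies.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    # Placeholder "representation": lowercase word counts.
    # A real tool would swap this for a model-based sentence embedding.
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a, b):
    # Cosine similarity between two word-count vectors.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def insert_paragraph_breaks(text, threshold=0.1):
    # Split on whitespace that follows sentence-ending punctuation.
    # The sentences themselves are never modified.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    out = sentences[0]
    for prev, cur in zip(sentences, sentences[1:]):
        similarity = cosine(bag_of_words(prev), bag_of_words(cur))
        # Low similarity to the previous sentence is treated as an idea shift.
        separator = "\n\n" if similarity < threshold else " "
        out += separator + cur
    return out


sample = (
    "The cat sat on the mat. The cat likes the mat. "
    "Stocks fell sharply today. Stocks may fall again tomorrow."
)
result = insert_paragraph_breaks(sample)
print(result)

# Sanity check: removing the inserted breaks gives back the original text.
assert result.replace("\n\n", " ") == sample
```

The final assertion is the part I care about most: whatever model ends up deciding where the breaks go, undoing the breaks should reproduce the transcript exactly.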

Can someone please help me?

I created a script with the help of ChatGPT, but it produces the EXACT same output for every model I’ve tried (BERT, DistilBERT, RoBERTa, ALBERT, GPT-2, GPT-Neo, T5-small & T5-large).

After further investigation, I found that the script was using spaCy to break up the text, and it wasn’t doing a great job. It was not actually using the LLM for the task at all, which explains why every model gave identical output.

Can anyone help me get started with this and point me in the right direction, please? Maybe some beginner videos, resources, something?