How to develop an auto-chaptering feature

Hi, I want to develop an auto-chaptering feature like the ones in YouTube or AssemblyAI.
However, I could not find a blog post or article about how to do it.

My naive approach is something like this:

  1. Split the entire text into chunks.
    e.g. split a 2000-word text into 40 chunks of 50 words each.

  2. Use sentence-transformers to get an embedding for each chunk.

  3. If the embedding of chunk n is significantly different from that of chunk n-1, treat it as a topic change.

  4. Summarize each group of chunks to create a chapter summary.
    e.g. the 1st chapter is chunks 0 to 15, the 2nd is chunks 16 to 35, the 3rd is chunks 36 to 50.
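
To make steps 1 to 3 concrete, here is a minimal sketch in plain Python. It is only an illustration under some assumptions: the function names (`split_into_chunks`, `cosine_similarity`, `find_chapters`) are hypothetical, "significantly different" is taken to mean cosine similarity below a threshold, and the threshold of 0.5 is an arbitrary value that would need tuning. The embeddings are passed in as plain lists of floats, so the sketch does not depend on any particular model.

```python
import math


def split_into_chunks(text, chunk_size=50):
    """Step 1: split the text into consecutive chunks of `chunk_size` words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors given as sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def find_chapters(embeddings, threshold=0.5):
    """Step 3: start a new chapter wherever chunk n is dissimilar to chunk n-1.

    Returns a list of (first_chunk, last_chunk) index pairs, both inclusive.
    The threshold is an assumed value, not a recommended one.
    """
    if not embeddings:
        return []
    chapters = []
    start = 0
    for n in range(1, len(embeddings)):
        if cosine_similarity(embeddings[n - 1], embeddings[n]) < threshold:
            chapters.append((start, n - 1))
            start = n
    chapters.append((start, len(embeddings) - 1))
    return chapters


# Toy 2-dimensional "embeddings": the topic shifts between chunk 1 and chunk 2.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.0, 1.0]]
print(find_chapters(toy_embeddings))  # [(0, 1), (2, 4)]
```

For step 2, the real embeddings would come from sentence-transformers, for example `SentenceTransformer("all-MiniLM-L6-v2").encode(chunks)`, where the model name is just one possible choice. Each resulting chunk range could then be joined back into text and passed to a summarizer for step 4.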

I’m pretty sure there is a much more sophisticated way.

Please give me a hint if anyone knows a proper method.

Thanks in advance.
