Which is the optimal model for handling large transcripts with metadata?

I am currently working on a system that splits raw transcripts into meaningful paragraphs. The transcript is currently structured as JSON: an array of paragraphs, where each paragraph contains an array of words, and each word is an object that also holds its start and end times.
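For illustration, here is a minimal sketch of that shape in TypeScript; the field names (`text`, `start`, `end`, `words`) are assumptions, since the actual schema isn't shown:

```typescript
// Sketch of the transcript structure described above.
// Field names are assumptions, not the real schema.
interface Word {
  text: string;
  start: number; // start time in seconds
  end: number;   // end time in seconds
}

interface Paragraph {
  words: Word[];
}

type Transcript = Paragraph[];

// Example instance
const transcript: Transcript = [
  {
    words: [
      { text: "Hello", start: 0.0, end: 0.4 },
      { text: "everyone", start: 0.4, end: 1.1 },
    ],
  },
];
```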

So far, the OpenAI models can handle the task, but the output gets truncated when I provide large transcripts.

Which is the most suitable model for this particular use case?
