Hi all,
I’m Darshan Hiranandani, currently developing a system that processes raw transcripts by splitting them into meaningful paragraphs. Each transcript is structured as JSON: every paragraph is an item in an array, and each paragraph contains its words, stored as objects with start and end timestamps.
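Roughly, the structure looks like this (a minimal sketch; the field names `paragraphs`, `words`, `text`, `start`, and `end` are illustrative, not the exact schema):

```python
# Illustrative transcript structure as described above;
# the real field names in my schema may differ.
transcript = {
    "paragraphs": [
        {
            "words": [
                {"text": "Hello", "start": 0.00, "end": 0.42},
                {"text": "everyone", "start": 0.45, "end": 0.98},
            ]
        },
        {
            "words": [
                {"text": "Today", "start": 1.20, "end": 1.55},
                {"text": "we", "start": 1.58, "end": 1.70},
            ]
        },
    ]
}

# Reconstruct each paragraph's plain text from its word objects.
for i, para in enumerate(transcript["paragraphs"]):
    text = " ".join(w["text"] for w in para["words"])
    print(f"Paragraph {i}: {text}")
```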
OpenAI models have been useful so far for answering questions based on the transcript, but with large transcripts the responses come back truncated.
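One workaround I’m considering is chunking the paragraph array so each request stays under a token budget. Here’s a rough sketch (the 3,000-token budget is an arbitrary number I picked, and I’m estimating token counts with `tiktoken`):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def paragraph_text(para):
    # Flatten a paragraph's word objects back into plain text.
    return " ".join(w["text"] for w in para["words"])

def chunk_paragraphs(paragraphs, max_tokens=3000):
    """Group whole paragraphs into chunks that fit a token budget,
    so each model call receives a bounded slice of the transcript."""
    chunks, current, current_tokens = [], [], 0
    for para in paragraphs:
        n = len(enc.encode(paragraph_text(para)))
        # Start a new chunk once adding this paragraph would exceed the budget.
        if current and current_tokens + n > max_tokens:
            chunks.append(current)
            current, current_tokens = [], 0
        current.append(para)
        current_tokens += n
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be sent as a separate request and the answers merged afterwards, but that loses cross-chunk context, which is why I’d prefer a model whose context window is large enough to take the whole transcript (plus its timestamp metadata) in one go.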
Given this, I’m looking for recommendations on the most suitable model for handling large transcripts with their accompanying metadata. Has anyone dealt with a similar issue, or can you suggest models that might handle this use case more effectively?
Would appreciate any insights!
Regards,
Darshan Hiranandani