This is my first post, so I apologize if I make any mistakes.
I am currently taking university classes. The lecture recordings are shared via Zoom, and I can obtain timestamped transcripts (vtt files) using the subtitle function.
I have successfully implemented a process to remove the timestamps and convert the vtt files into plain text files.
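For reference, a conversion of this kind can be sketched roughly as follows. This is only a minimal illustration, not necessarily the exact script used here: the function name `vtt_to_text` is hypothetical, and it assumes the usual WebVTT layout (a `WEBVTT` header, optional numeric cue identifiers, and `start --> end` timing lines).

```python
import re

# Matches cue timing lines such as "00:00:01.000 --> 00:00:04.000"
# (the hours part is optional in WebVTT).
TIMING = re.compile(
    r"^\s*(\d{2}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(\d{2}:)?\d{2}:\d{2}\.\d{3}"
)


def vtt_to_text(vtt: str) -> str:
    """Strip the header, cue numbers, and timestamps from WebVTT content."""
    kept = []
    for index, raw in enumerate(vtt.lstrip("\ufeff").splitlines()):
        line = raw.strip()
        if not line:
            continue  # blank separator between cues
        if index == 0 and line.startswith("WEBVTT"):
            continue  # file header
        if line.isdigit():
            # Assumption: digit-only lines are cue identifiers, so a caption
            # that consists only of digits would also be dropped.
            continue
        if TIMING.match(line):
            continue  # timestamp line
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    sample = (
        "WEBVTT\n\n"
        "1\n00:00:01.000 --> 00:00:04.000\nTaro: Hello everyone.\n\n"
        "2\n00:00:04.500 --> 00:00:08.000\nToday we cover sorting.\n"
    )
    print(vtt_to_text(sample))
```

Speaker labels such as `Taro:` are kept as part of the text, since they can help a summarizer tell the lecturer apart from students.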
My goal is to summarize these transcripts so that I can understand the lecture content just by reading the summary.
However, when I use various LLMs to create summaries, the results are overly compressed and lack important details.
If you know of any recommended tools, libraries, or other helpful resources, I would appreciate it if you could share them with me.
When summarizing long texts, it helps to treat two things separately: the general know-how for summarizing long texts, and how well a given model summarizes Japanese.
For Japanese summarization, you can find information by searching in Japanese, so you may want to refer to that as well.
As for model performance, a leaderboard is a good starting point for comparison, but in practice each model has its own quirks, so it is best to try out a few that seem promising.
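As one example of the long-text know-how mentioned above, a common approach is to split the transcript into chunks, summarize each chunk separately, and keep the per-chunk summaries instead of compressing everything into one short paragraph. The sketch below is only an illustration under assumptions: `split_into_chunks` and `summarize_long_text` are hypothetical names, the `summarize` argument stands in for whichever model or API you decide to use, and the 2000-character limit is an arbitrary default.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int = 2000) -> List[str]:
    """Group whole lines into chunks of at most max_chars characters.

    A single line longer than max_chars becomes its own oversized chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    size = 0
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        added = len(line) + (1 if current else 0)  # +1 for the joining newline
        if current and size + added > max_chars:
            chunks.append("\n".join(current))
            current, size = [], 0
            added = len(line)
        current.append(line)
        size += added
    if current:
        chunks.append("\n".join(current))
    return chunks


def summarize_long_text(
    text: str,
    summarize: Callable[[str], str],  # placeholder for your model call
    max_chars: int = 2000,
) -> str:
    """Summarize each chunk on its own and join the partial summaries."""
    partial = [summarize(chunk) for chunk in split_into_chunks(text, max_chars)]
    return "\n\n".join(partial)


if __name__ == "__main__":
    # Stand-in summarizer that just keeps the first line of each chunk.
    first_line = lambda chunk: chunk.splitlines()[0]
    print(summarize_long_text("aaaa\nbbbb\ncccc", first_line, max_chars=9))
```

Because every chunk gets its own summary, the result stays roughly proportional to the lecture length, which may help with summaries that come out too compressed. The chunk size is worth tuning for each model you try.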