For smaller models, I think the Llama 3.2 or Qwen 2.5 series are safe bets, but there may be specific benchmarks on the leaderboard worth checking. The URL below is for the long-context version of Qwen.
Thanks for this, I wasn't aware of Qwen's long-context model.
Any thoughts on whether it would be better to use the long context and summarise in one go, compared to chunking the input into intermediate summaries?
Summarizing the whole document in one long context would probably be more accurate, but processing that much context at once needs a lot of VRAM and adds latency, so chunking is usually the smarter trade-off. Short chunks are also easier for a small model to summarize well.
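For what it's worth, here's a rough sketch of the chunked (map-reduce style) approach: summarize each chunk, then summarize the summaries. It assumes a local OpenAI-compatible server (llama.cpp, Ollama, vLLM, etc.); the base_url and model name are placeholders for whatever you actually run.

```python
from openai import OpenAI

# Placeholder endpoint and model name; swap in your local server and model.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
MODEL = "qwen2.5:7b"

def summarize(text: str, prompt: str = "Summarize the following text concisely:") -> str:
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": f"{prompt}\n\n{text}"}],
    )
    return resp.choices[0].message.content

def chunked_summary(document: str, chunk_chars: int = 8000) -> str:
    # Naive fixed-size chunking; splitting on paragraph or section
    # boundaries usually gives better intermediate summaries.
    chunks = [document[i:i + chunk_chars] for i in range(0, len(document), chunk_chars)]
    partials = [summarize(c) for c in chunks]
    # Second pass: merge the intermediate summaries into one.
    return summarize(
        "\n\n".join(partials),
        prompt="Combine these partial summaries into one coherent summary:",
    )
```

One caveat with this pattern: details that span chunk boundaries can get lost in the first pass, so overlapping the chunks a bit or chunking on natural section breaks tends to help.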