Hey @gildesh, extractive summarization is usually framed as a ranking task, where you chunk your document into sentences and then select the top-N sentences that are most similar to the summary.
So for this, you would probably want to take an embedding-based approach, using e.g. sentence-transformers. There’s a nice blog post about this approach here
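To make the ranking idea concrete, here is a minimal sketch (not a reference implementation). It makes a few assumptions of mine: at inference time there is no summary to compare against, so each sentence is scored by cosine similarity to the mean embedding of the whole document as a proxy; the sentence splitter is a naive regex; and `toy_embed` is a bag-of-words stand-in so the snippet runs without downloading a model. The names `split_sentences`, `toy_embed`, `cosine` and `extractive_summary` are all hypothetical helpers.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Sequence


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def toy_embed(sentences: List[str]) -> List[List[float]]:
    """Stand-in embedding (assumption): bag-of-words counts over a shared vocabulary.

    In practice you would replace this with a real sentence encoder.
    """
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([float(counts[word]) for word in vocab])
    return vectors


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extractive_summary(
    text: str,
    top_n: int = 2,
    embed: Callable[[List[str]], Sequence[Sequence[float]]] = toy_embed,
) -> List[str]:
    """Return the top_n sentences closest to the document centroid, in document order."""
    sentences = split_sentences(text)
    if not sentences:
        return []

    # Embed every sentence; converting to float also handles array-like rows.
    vectors = [[float(x) for x in vec] for vec in embed(sentences)]

    # Document representation (assumption): the mean of the sentence embeddings.
    dim = len(vectors[0])
    centroid = [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]

    # Rank sentences by similarity to the centroid and keep the best top_n.
    scores = [cosine(vec, centroid) for vec in vectors]
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:top_n]

    # Restore the original order so the summary reads naturally.
    return [sentences[i] for i in sorted(best)]
```

To use real embeddings, you could pass the `encode` method of a sentence-transformers model as `embed`, e.g. `extractive_summary(text, top_n=3, embed=SentenceTransformer("all-MiniLM-L6-v2").encode)`; that checkpoint is just one possible choice. If you have reference summaries, you can swap the centroid for the summary embedding instead.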
There are more advanced models like HiBERT, but I’m not sure the added complexity is worth it compared to just using an abstractive model like BART.