Hello, is there a way I can perform text segmentation on news articles?
For example, a news article usually has one main topic, but as you read through it there are often several subtopics as well. Is there a way to divide an article into sections along those subtopics, so that it ends up with 2, 3, or more sections depending on what is discussed in that particular article?
In case you are curious about what I need this for: I'm summarizing news articles. Instead of passing the whole article to the model at once, I want to split it into sections based on what is discussed and then summarize each section separately. Basically, I'm trying to imitate what is done at summari.com
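To make the goal concrete, here is a minimal sketch of the kind of segmentation described above, not a known-good solution. It assumes a simple lexical-similarity heuristic (in the spirit of TextTiling): compare the words in the sentences just before and just after each position, and start a new section where the overlap drops. The function names (`split_sentences`, `segment`) and the `window` and `threshold` values are placeholders chosen for illustration, and the sentence splitter is deliberately naive.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bag_of_words(sentences):
    # Lowercased word counts for a group of sentences; words of three
    # letters or fewer are dropped as a crude stand-in for stop-word removal.
    words = re.findall(r"[a-z']+", " ".join(sentences).lower())
    return Counter(w for w in words if len(w) > 3)


def cosine(a, b):
    # Cosine similarity between two word-count vectors.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment(text, window=2, threshold=0.1):
    # Returns a list of sections, each a list of sentences.
    # window and threshold are assumed values that would need tuning.
    sentences = split_sentences(text)
    sections, current = [], []
    for i, sentence in enumerate(sentences):
        if current:
            left = bag_of_words(sentences[max(0, i - window):i])
            right = bag_of_words(sentences[i:i + window])
            # Low overlap between neighbouring windows suggests a topic shift.
            if cosine(left, right) < threshold:
                sections.append(current)
                current = []
        current.append(sentence)
    if current:
        sections.append(current)
    return sections
```

Each returned section could then be joined back into a string and summarized on its own. Word overlap is a weak signal on short or paraphrase-heavy text, so comparing sentence embeddings instead of word counts may work better in practice.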
I'd appreciate hearing from anyone who has done something like this before, or who knows an approach I could try.