Hi!
After some years doing ML using only scikit-learn for text classification, I now have an opportunity to do something a little more advanced: text summarization.
So, I would like to create a small proof of concept using about 4,000 legal texts (already extracted into txt files), divided into:
- 2,000 initial petitions / complaints (*.txt)
- 2,000 summaries, one for each initial petition
PS: all text files are in Brazilian Portuguese.
So how can I use these txt files to train a transformer that can generate new summaries?
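
To make the data layout concrete, here is a minimal sketch of how the files could be paired into (document, summary) examples before any training. It assumes, hypothetically, that each petition and its summary share the same file name in two separate folders; the folder names, the `load_pairs` and `train_validation_split` helpers, and the 10% validation fraction are all illustrative choices, not something fixed by my setup.

```python
from pathlib import Path
import random


def load_pairs(petition_dir, summary_dir):
    """Pair each petition with the summary that has the same file name.

    Assumption: petition_dir/0001.txt matches summary_dir/0001.txt.
    """
    pairs = []
    for petition_path in sorted(Path(petition_dir).glob("*.txt")):
        summary_path = Path(summary_dir) / petition_path.name
        if not summary_path.exists():
            continue  # skip petitions that have no matching summary
        pairs.append({
            "document": petition_path.read_text(encoding="utf-8"),
            "summary": summary_path.read_text(encoding="utf-8"),
        })
    return pairs


def train_validation_split(pairs, validation_fraction=0.1, seed=42):
    """Shuffle a copy of the pairs and hold out a fraction for validation."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]
```

Calling `load_pairs("petitions", "summaries")` and then `train_validation_split(...)` would give two lists of dictionaries with `document` and `summary` keys, which seems like the usual shape for a sequence-to-sequence summarization dataset.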