Summarization on long documents

@ananddeshpande (and anyone else who still needs an answer to this question) - take a look at Unlimiformer (https://github.com/abertsch72/unlimiformer), the public repo for the preprint "Unlimiformer: Long-Range Transformers with Unlimited Length Input".

I am using nltk to tokenize text, with a threshold of 512 tokens per segment. Some input segments still come out longer than 1024 tokens, so I set truncation=True; after that the code runs without any length-limit errors. My concern is: is there data loss because of truncation=True? Another question: how can I make it faster? Will reducing the max_length parameter speed it up, and if not, what would you suggest? I am working on a PDF summarizer project using facebook/bart-large-cnn.
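For reference, here is a minimal sketch of the setup described above, rewritten to chunk instead of truncate: sentences are split with nltk, grouped into chunks of at most 512 BART tokens, and each chunk is summarized separately with facebook/bart-large-cnn, so no text is silently dropped by truncation=True. Variable names and generation parameters are illustrative, and the text is assumed to be already extracted from the PDF.

```python
import nltk
from transformers import pipeline

nltk.download("punkt", quiet=True)

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
tokenizer = summarizer.tokenizer

def chunk_text(text, max_tokens=512):
    """Group sentences into chunks of at most max_tokens BART tokens."""
    chunks, current, current_len = [], [], 0
    for sentence in nltk.sent_tokenize(text):
        n = len(tokenizer.encode(sentence, add_special_tokens=False))
        # Close the current chunk before it would exceed the token budget.
        if current and current_len + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(" ".join(current))
    return chunks

text = "..."  # text extracted from the PDF
summaries = [
    summarizer(chunk, max_length=130, min_length=30, do_sample=False)[0]["summary_text"]
    for chunk in chunk_text(text)
]
final_summary = " ".join(summaries)
```

Since each chunk stays under the model's 1024-token limit, truncation never fires, and the per-chunk summaries can be concatenated (or summarized again) into a final summary.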

How can I use this model if I need to summarize table data?
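BART only consumes plain text, so one possible approach is to linearize the table into sentences before summarizing. A minimal sketch, assuming the table lives in a pandas DataFrame (the column names and values below are purely illustrative):

```python
import pandas as pd
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# Hypothetical example table; replace with your own data.
df = pd.DataFrame({
    "quarter": ["Q1", "Q2", "Q3"],
    "revenue": [1.2, 1.5, 1.1],
})

# Linearize each row into a sentence so the model sees plain text.
rows_as_text = ". ".join(
    ", ".join(f"{col} is {val}" for col, val in row.items())
    for _, row in df.iterrows()
)

print(summarizer(rows_as_text, max_length=60, min_length=10, do_sample=False)[0]["summary_text"])
```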