Do you have any concrete questions though? Where exactly are you stuck?
Regarding the out-of-memory error: have you tried decreasing the batch size or using a smaller model?
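For example, if you're using the Trainer API, something along these lines might help (the values and output path are just placeholders, tune them for your GPU):

```python
from transformers import TrainingArguments

# A smaller per-device batch plus gradient accumulation keeps the
# effective batch size while lowering peak GPU memory.
training_args = TrainingArguments(
    output_dir="./results",          # placeholder path
    per_device_train_batch_size=2,   # try 2 or even 1 if you still OOM
    gradient_accumulation_steps=8,   # 2 * 8 = effective batch size of 16
    fp16=True,                       # mixed precision also saves memory on supported GPUs
)
```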
I wouldn’t say any transformer is inherently “easier” or “harder”. That’s what’s beautiful about Hugging Face: it gives you access to many models through one API. Different kinds of models may have different needs, but a lot of the complexity is abstracted away by the library.
One more thing I’d try: don’t remove the legalese. Usually those are the important parts. Wouldn’t it be awesome if your model produced readable summaries of that stuff?
If you have paired examples of T&Cs and their summaries, you could fine-tune any model designed for summarization, or you could use an EncoderDecoderModel as explained here: Leveraging Pre-trained Language Model Checkpoints for Encoder-Decoder Models.
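If you go the EncoderDecoderModel route, a minimal warm-starting sketch could look like this (bert-base-uncased is just an example checkpoint; the blog post linked above covers the details):

```python
from transformers import AutoTokenizer, EncoderDecoderModel

# Warm-start an encoder-decoder from two pretrained BERT checkpoints.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)

# The decoder needs to know which token starts generation and which one pads.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
```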
If you don’t have any training data, I’d still leave the legalese in and just see what the output of a pretrained model looks like. It might turn out okay.
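A quick zero-shot check with the summarization pipeline might look like this (facebook/bart-large-cnn is only one example checkpoint, not a recommendation):

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

tcs_text = "..."  # paste a chunk of the T&Cs here, legalese included
print(summarizer(tcs_text, max_length=150, min_length=40, truncation=True))
```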
T&Cs are usually long, though. Are you currently just truncating the input? Most models I’ve come across have a max input length of 512 tokens. This is something I’m trying to solve myself right now.
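By truncating I mean something like this (again, the checkpoint and max_length are only examples):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")  # example checkpoint

long_tcs_text = "..."  # the full terms & conditions

# Explicit truncation to the model's limit – everything past max_length is simply dropped.
inputs = tokenizer(long_tcs_text, max_length=512, truncation=True, return_tensors="pt")
```

A crude workaround is to split the document into chunks, summarize each chunk, and then summarize the concatenated summaries; quality varies, but at least nothing is silently lost.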
Disclaimer: I’m fairly new to this myself.