How can I do text summarization using ProphetNet?

ProphetNet is now integrated into Transformers.

Any idea how to use the model for text summarization?

ProphetNet automatically shifts the labels to the right to build the decoder inputs, so you can compute the loss as follows:

prophetnet = ProphetNetForConditionalGeneration.from_pretrained(...)

# labels are shifted right internally to create decoder_input_ids,
# so there is no need to pass decoder_input_ids yourself
loss = prophetnet(input_ids=tokenized_article, labels=tokenized_summary).loss
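
To then generate a summary at inference time, here is a minimal sketch. It assumes the microsoft/prophetnet-large-uncased-cnndm checkpoint (fine-tuned on CNN/DailyMail); the generation parameters are just illustrative:

from transformers import ProphetNetForConditionalGeneration, ProphetNetTokenizer

tokenizer = ProphetNetTokenizer.from_pretrained("microsoft/prophetnet-large-uncased-cnndm")
model = ProphetNetForConditionalGeneration.from_pretrained("microsoft/prophetnet-large-uncased-cnndm")

article = "..."  # the text you want to summarize
inputs = tokenizer(article, max_length=510, truncation=True, return_tensors="pt")

# beam search usually works well for summarization
summary_ids = model.generate(inputs.input_ids, num_beams=4, max_length=100, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))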

You could also make use of this notebook; it should be applicable 1-to-1 once you change the tokenizer accordingly and swap EncoderDecoderModel for ProphetNetForConditionalGeneration:

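In essence, the swap amounts to something like the following sketch (the BERT2BERT lines are my assumption of what the notebook roughly does; the checkpoint names are illustrative):

from transformers import ProphetNetForConditionalGeneration, ProphetNetTokenizer

# instead of the notebook's warm-started encoder-decoder, e.g.:
#   model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "bert-base-uncased")
#   tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = ProphetNetForConditionalGeneration.from_pretrained("microsoft/prophetnet-large-uncased")
tokenizer = ProphetNetTokenizer.from_pretrained("microsoft/prophetnet-large-uncased")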

Hi,
I tried applying ProphetNet to the Seq2Seq Trainer fine-tuning example script.

  • Environment: Colaboratory

    • transformers version: 4.0.0-dev
    • Platform: Linux-4.19.112+-x86_64-with-Ubuntu-18.04-bionic
    • Python version: 3.6.9
    • PyTorch version (GPU?): 1.7.0+cu101 (True)
    • Tensorflow version (GPU?): 2.3.0 (True)
    • GPU : Tesla T4
  • Task: summarization

  • Dataset: about one-tenth of XSum (considering the 12-hour limit of Colab)

  • Script

python finetune_trainer.py \
    --learning_rate=3e-5 \
    --fp16 \
    --do_train --do_eval --do_predict --evaluate_during_training \
    --predict_with_generate \
    --max_source_length 510 \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 2 \
    --n_train 20400 \
    --n_val 100 \
    --model_name_or_path microsoft/prophetnet-large-uncased \
    --data_dir $XSUM_DIR \
    --output_dir prophetnet_large_uncased_xsum \
    --save_steps 10000 \
    --save_total_limit 5 \
    --overwrite_output_dir

It took 6:28:28 for fine-tuning and 2:36:39 for prediction.
Here is the result.

11/17/2020 21:28:35 - INFO - root -   *** Test ***
[INFO|trainer.py:1387] 2020-11-17 21:28:35,447 >> ***** Running Prediction *****
[INFO|trainer.py:1388] 2020-11-17 21:28:35,448 >>   Num examples = 11332
[INFO|trainer.py:1389] 2020-11-17 21:28:35,448 >>   Batch size = 2
100% 5666/5666 [2:36:39<00:00,  1.67s/it]
11/18/2020 00:06:13 - INFO - __main__ -   ***** Test results *****
11/18/2020 00:06:13 - INFO - __main__ -     test_loss = 2.11389422416687
11/18/2020 00:06:13 - INFO - __main__ -     test_rouge1 = 38.5525
11/18/2020 00:06:13 - INFO - __main__ -     test_rouge2 = 15.35
11/18/2020 00:06:13 - INFO - __main__ -     test_rougeL = 30.6488
11/18/2020 00:06:13 - INFO - __main__ -     test_rougeLsum = 30.7002
11/18/2020 00:06:13 - INFO - __main__ -     test_gen_len = 26.0
100% 5666/5666 [2:37:53<00:00,  1.67s/it]

The prediction (generation) results are written to test_generation.txt.
Here are some examples from it.

n - dubz have revealed they are up for four prizes at this year’s mtv europe music awards.
j craig venter, who won the nobel prize for his work in human genomics, is one of the world’s most celebrated scientists.

Considering the time limitations of the execution environment, I used only about one-tenth of the dataset for now, but I think we could get better results using the entire dataset.

For more details, please check the .ipynb at the following URL: https://github.com/forest1988/colaboratory/blob/main/prophetnet_seq2seqtrainer_finetuning_experiment.ipynb

Thank you.

That’s super useful, thank you!

It’s my pleasure!
I’m glad to hear it’s useful for you.