ProphetNet is now integrated. Any idea how to use the model for text summarization?
ProphetNet automatically shifts the tokens, so you can compute the loss as follows:
from transformers import ProphetNetForConditionalGeneration

prophetnet = ProphetNetForConditionalGeneration.from_pretrained(...)
loss = prophetnet(input_ids=tokenized_article, labels=tokenized_summary).loss
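Here, tokenized_article and tokenized_summary are assumed to be tensors of token ids. A minimal sketch of producing them with the ProphetNet tokenizer (the checkpoint name and truncation lengths below are just assumptions for illustration, not prescribed by the library):

from transformers import ProphetNetTokenizer, ProphetNetForConditionalGeneration

tokenizer = ProphetNetTokenizer.from_pretrained("microsoft/prophetnet-large-uncased")
prophetnet = ProphetNetForConditionalGeneration.from_pretrained("microsoft/prophetnet-large-uncased")

# tokenize the source article and the target summary into id tensors
tokenized_article = tokenizer("some long article ...", return_tensors="pt", truncation=True, max_length=512).input_ids
tokenized_summary = tokenizer("its short summary ...", return_tensors="pt", truncation=True, max_length=128).input_ids

loss = prophetnet(input_ids=tokenized_article, labels=tokenized_summary).loss
loss.backward()  # a regular optimizer step would follow in a training loop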
Also, you could make use of this notebook; it should be applicable 1-to-1 once you change the tokenizer accordingly and replace EncoderDecoderModel with ProphetNetForConditionalGeneration.
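If you just want to sanity-check generation, here is a minimal, hedged sketch of summarizing with a ProphetNet checkpoint; the fine-tuned checkpoint path, beam size, and length values are placeholder assumptions, not taken from this thread:

from transformers import ProphetNetTokenizer, ProphetNetForConditionalGeneration

tokenizer = ProphetNetTokenizer.from_pretrained("microsoft/prophetnet-large-uncased")
# "path/to/finetuned-checkpoint" is a hypothetical directory produced by your own fine-tuning
model = ProphetNetForConditionalGeneration.from_pretrained("path/to/finetuned-checkpoint")

input_ids = tokenizer("some long article ...", return_tensors="pt").input_ids
summary_ids = model.generate(input_ids, num_beams=4, max_length=100, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))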
Hi,
I tried applying ProphetNet to the Seq2Seq Trainer fine-tuning example script.
Environment: Colaboratory
transformers version: 4.0.0-dev
Task: summarization
Dataset: about one-tenth of XSum (considering the 12-hour limit of Colab)
Script:
python finetune_trainer.py \
--learning_rate=3e-5 \
--fp16 \
--do_train --do_eval --do_predict --evaluate_during_training \
--predict_with_generate \
--max_source_length 510 \
--per_device_train_batch_size 2 \
--per_device_eval_batch_size 2 \
--n_train 20400 \
--n_val 100 \
--model_name_or_path microsoft/prophetnet-large-uncased \
--data_dir $XSUM_DIR \
--output_dir prophetnet_large_uncased_xsum \
--save_steps 10000 \
--save_total_limit 5 \
--overwrite_output_dir
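As a note for reproduction: the legacy Seq2Seq example scripts read plain-text source/target files from the data directory, so $XSUM_DIR is assumed to look roughly like this (one example per line):

XSUM_DIR/
  train.source   # one article per line
  train.target   # one reference summary per line
  val.source
  val.target
  test.source
  test.target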
It took 6:28:28 for fine-tuning and 2:36:39 for prediction.
Here are the results:
11/17/2020 21:28:35 - INFO - root - *** Test ***
[INFO|trainer.py:1387] 2020-11-17 21:28:35,447 >> ***** Running Prediction *****
[INFO|trainer.py:1388] 2020-11-17 21:28:35,448 >> Num examples = 11332
[INFO|trainer.py:1389] 2020-11-17 21:28:35,448 >> Batch size = 2
100% 5666/5666 [2:36:39<00:00, 1.67s/it]
11/18/2020 00:06:13 - INFO - __main__ - ***** Test results *****
11/18/2020 00:06:13 - INFO - __main__ - test_loss = 2.11389422416687
11/18/2020 00:06:13 - INFO - __main__ - test_rouge1 = 38.5525
11/18/2020 00:06:13 - INFO - __main__ - test_rouge2 = 15.35
11/18/2020 00:06:13 - INFO - __main__ - test_rougeL = 30.6488
11/18/2020 00:06:13 - INFO - __main__ - test_rougeLsum = 30.7002
11/18/2020 00:06:13 - INFO - __main__ - test_gen_len = 26.0
100% 5666/5666 [2:37:53<00:00, 1.67s/it]
The prediction (generation) results are output as test_generation.txt. Here are some examples from it:
n - dubz have revealed they are up for four prizes at this year’s mtv europe music awards.
j craig venter, who won the nobel prize for his work in human genomics, is one of the world’s most celebrated scientists.
Considering the time limits of the execution environment, I used only about one-tenth of the dataset for now, but I think we could get better results with the entire dataset.
For more details, please check the .ipynb at this URL: https://github.com/forest1988/colaboratory/blob/main/prophetnet_seq2seqtrainer_finetuning_experiment.ipynb
Thank you.
That’s super useful, thank you!
It’s my pleasure!
I’m glad to hear that it is useful for you.