Bart Large CNN summarization

On the facebook/bart-large-cnn · Hugging Face page, an article can be pasted into the summarization tool. I am trying to replicate this locally with the same model. Clicking the “Use in Transformers” button shows the following code:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")

model = AutoModel.from_pretrained("facebook/bart-large-cnn")

Looking at the transformers/model_doc/bart documentation, the summarization example at the bottom of the page uses the bart-large model rather than bart-large-cnn. Attempting to combine the two, I came up with:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModel.from_pretrained("facebook/bart-large-cnn")

ARTICLE_TO_SUMMARIZE = "My friends are cool but they eat too many carbs."
inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=1024, return_tensors='tf')
# Generate Summary
summary_ids = model.generate(inputs['input_ids'], num_beams=4, max_length=5, early_stopping=True)
print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])

The following warning and traceback are returned:

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-fb149e69ea96> in <module>
      7 inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=1024, return_tensors='tf')
      8 # Generate Summary
----> 9 summary_ids = model.generate(inputs['input_ids'], num_beams=4, max_length=5, early_stopping=True)
     10 print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])

~/.local/lib/python3.6/site-packages/torch/autograd/grad_mode.py in decorate_no_grad(*args, **kwargs)
     47         def decorate_no_grad(*args, **kwargs):
     48             with self:
---> 49                 return func(*args, **kwargs)
     50         return decorate_no_grad
     51 

~/.local/lib/python3.6/site-packages/transformers/generation_utils.py in generate(self, input_ids, max_length, min_length, do_sample, early_stopping, num_beams, temperature, top_k, top_p, repetition_penalty, bad_words_ids, bos_token_id, pad_token_id, eos_token_id, length_penalty, no_repeat_ngram_size, num_return_sequences, decoder_start_token_id, use_cache, num_beam_groups, diversity_penalty, prefix_allowed_tokens_fn, output_attentions, output_hidden_states, output_scores, return_dict_in_generate, **model_kwargs)
    821             # init `attention_mask` depending on `pad_token_id`
    822             model_kwargs["attention_mask"] = self._prepare_attention_mask_for_generation(
--> 823                 input_ids, pad_token_id, eos_token_id
    824             )
    825 

~/.local/lib/python3.6/site-packages/transformers/generation_utils.py in _prepare_attention_mask_for_generation(self, input_ids, pad_token_id, eos_token_id)
    360         self, input_ids: torch.Tensor, pad_token_id: int, eos_token_id: int
    361     ) -> torch.LongTensor:
--> 362         is_pad_token_in_inputs_ids = (pad_token_id is not None) and (pad_token_id in input_ids)
    363         is_pad_token_not_equal_to_eos_token_id = (eos_token_id is None) or (
    364             (eos_token_id is not None) and (pad_token_id != eos_token_id)

~/.local/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in __bool__(self)
    990 
    991   def __bool__(self):
--> 992     return bool(self._numpy())
    993 
    994   __nonzero__ = __bool__

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I have tried changing various parameters, specifically max_length together with truncation=True, among others.

The end goal is to reproduce the output of this article summary locally.

Essentially, my question boils down to this: what needs to follow the bart-large-cnn model instantiation in order to obtain the article summary from the link above?

from transformers import AutoTokenizer, AutoModel
  
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
  
model = AutoModel.from_pretrained("facebook/bart-large-cnn")

inputs = tokenizer(params?)
model.generate(params?)

transformers-cli env

- `transformers` version: 4.2.2
- Platform: Linux-5.4.0-62-generic-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.6.9
- PyTorch version (GPU?): 1.4.0 (True)
- Tensorflow version (GPU?): 2.4.1 (False)
- Using GPU in script?: tried both
- Using distributed or parallel set-up in script?: no

I have also verified that

python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('we love you'))"

has worked for me in my environment.

It looks like you are returning TensorFlow tensors from the tokenizer, but the model is in PyTorch.

inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=1024, return_tensors='tf')

Changing return_tensors to 'pt' should fix the issue.

Changing just return_tensors='pt' returns a different error:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModel.from_pretrained("facebook/bart-large-cnn")

ARTICLE_TO_SUMMARIZE = "My friends are cool but they eat too many carbs."
inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=1024, return_tensors='pt')
# Generate Summary
summary_ids = model.generate(inputs['input_ids'], num_beams=4, max_length=5, early_stopping=True)
print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])

Error:

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Traceback (most recent call last):
  File "para.py", line 56, in <module>
    summary_ids = model.generate(inputs['input_ids'], num_beams=4, max_length=5, early_stopping=True)
  File "/home/billy/.local/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
    return func(*args, **kwargs)
  File "/home/billy/.local/lib/python3.6/site-packages/transformers/generation_utils.py", line 958, in generate
    **model_kwargs,
  File "/home/billy/.local/lib/python3.6/site-packages/transformers/generation_utils.py", line 1596, in beam_search
    next_token_logits = outputs.logits[:, -1, :]
AttributeError: 'Seq2SeqModelOutput' object has no attribute 'logits'

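To load Seq2Seq (encoder-decoder) models for generation, AutoModelForSeq2SeqLM should be used. Your snippet uses AutoModel, which loads the base model without the language-modeling head, so its output has no logits attribute.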
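
For reference, here is a minimal sketch of the corrected snippet, assuming PyTorch is installed and the checkpoint can be downloaded. The function name summarize, the GENERATION_KWARGS dict, and the max_length/min_length values are my own illustrative choices, not settings confirmed to match the hosted tool, so adjust them as needed.

```python
MODEL_NAME = "facebook/bart-large-cnn"

# Illustrative generation settings (assumed, not taken from the hosted widget).
GENERATION_KWARGS = {
    "num_beams": 4,
    "max_length": 142,
    "min_length": 56,
    "early_stopping": True,
}


def summarize(article: str) -> str:
    """Summarize one article with bart-large-cnn (hypothetical helper)."""
    # Imported here so that defining the function does not load the library.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    # AutoModelForSeq2SeqLM loads the model with its LM head, which generate() needs.
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

    # truncation=True silences the warning; 'pt' matches the PyTorch model.
    inputs = tokenizer(
        [article], max_length=1024, truncation=True, return_tensors="pt"
    )
    summary_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **GENERATION_KWARGS,
    )
    return tokenizer.decode(
        summary_ids[0],
        skip_special_tokens=True,
        clean_up_tokenization_spaces=False,
    )
```

Calling summarize(ARTICLE_TO_SUMMARIZE) downloads the checkpoint on first use and returns the summary as a string.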

Cheers, thank you. I was unable to find any documentation specifying that model import. Should the docs be updated, and is there somewhere I can read more about it?

The Auto model classes are documented here:
https://huggingface.co/transformers/model_doc/auto.html#

Can you not use BartModel and BartTokenizer directly, with truncation=True, to get past this?

If you don’t intend to truncate, then you may be interested in the discussion “Summarization on long documents”.
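
As a rough illustration of one way to avoid truncation (not necessarily what that discussion recommends), the token IDs of a long article can be split into windows that each fit the model, and each window summarized separately. The helper name chunk_token_ids and its parameters are hypothetical. The default window of 1022 assumes a 1024-token input limit with two positions left for the start and end tokens.

```python
from typing import List


def chunk_token_ids(
    token_ids: List[int], window: int = 1022, overlap: int = 0
) -> List[List[int]]:
    """Split token IDs (without special tokens) into windows of at most `window` IDs.

    Consecutive windows share `overlap` IDs so that context is not cut abruptly.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")

    step = window - overlap
    chunks: List[List[int]] = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + window])
        # Stop once the current window already reaches the end of the input.
        if start + window >= len(token_ids):
            break
    return chunks
```

Each chunk could then be decoded back to text and summarized on its own, with the partial summaries joined afterwards. Whether that gives an acceptable overall summary depends on the document.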

Thanks