BART for Portuguese

Hey @sshleifer !

I have a question: I need to summarize Portuguese texts. Is there a pre-trained BART for this language, or a multilingual BART model?

If not, what is the recommended approach? I can think of some possibilities, like using a Portuguese encoder or translating the text to English…


There is no pretrained model for Portuguese summarization; you would need to fine-tune multilingual BART (mBART). Do you have document/summary pairs to use as training data?
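
For anyone collecting such pairs, here is a minimal sketch of one way to clean them and split them into train/validation files before fine-tuning. The JSONL layout, the `document`/`summary` field names, and the helper names `split_pairs` and `write_jsonl` are assumptions for illustration, not a format required by any particular training script.

```python
import json
import random
from pathlib import Path


def split_pairs(pairs, val_fraction=0.1, seed=0):
    """Drop empty pairs, shuffle, and split into (train, validation) lists."""
    # Keep only pairs where both the document and the summary are non-empty.
    cleaned = [(d.strip(), s.strip()) for d, s in pairs if d.strip() and s.strip()]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(cleaned)
    # Hold out at least one example for validation when there is more than one pair.
    n_val = max(1, int(len(cleaned) * val_fraction)) if len(cleaned) > 1 else 0
    return cleaned[n_val:], cleaned[:n_val]


def write_jsonl(pairs, path):
    """Write one JSON object per line; field names are hypothetical."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for doc, summ in pairs:
            # ensure_ascii=False keeps Portuguese accents readable in the file.
            f.write(json.dumps({"document": doc, "summary": summ}, ensure_ascii=False) + "\n")
    return path
```

Usage would look like `train, val = split_pairs(pairs)` followed by `write_jsonl(train, "train.jsonl")` and `write_jsonl(val, "val.jsonl")`.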

OK! I don’t have them yet, but I am preparing them. Should I use BartForConditionalGeneration? If you could share an example notebook for fine-tuning on this task, that would be great. Thanks.

I would use mBART. I will share a command when this PR is merged.
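
Until a command is available, here is a rough sketch of what loading mBART and computing a fine-tuning loss on Portuguese pairs might look like with the `transformers` library. It is not the command referred to above. The checkpoint name `facebook/mbart-large-50` and the `pt_XX` language code are assumptions (neither is named in this thread), the helper names are hypothetical, and the `text_target` argument needs a reasonably recent `transformers` release.

```python
PT = "pt_XX"  # assumption: mBART-50 language code for Portuguese
CHECKPOINT = "facebook/mbart-large-50"  # assumption: checkpoint not named in the thread


def load_mbart(checkpoint=CHECKPOINT):
    """Load tokenizer and model; source and target are both Portuguese."""
    # Imported lazily so this file can be imported without transformers installed.
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    tokenizer = MBart50TokenizerFast.from_pretrained(checkpoint, src_lang=PT, tgt_lang=PT)
    model = MBartForConditionalGeneration.from_pretrained(checkpoint)
    return tokenizer, model


def training_loss(tokenizer, model, documents, summaries, max_source_len=512, max_target_len=128):
    """Return the seq2seq loss for one batch of (document, summary) pairs."""
    inputs = tokenizer(documents, max_length=max_source_len, truncation=True,
                       padding=True, return_tensors="pt")
    labels = tokenizer(text_target=summaries, max_length=max_target_len, truncation=True,
                       padding=True, return_tensors="pt")["input_ids"]
    # Padding positions are set to -100 so they are ignored by the loss.
    labels[labels == tokenizer.pad_token_id] = -100
    return model(**inputs, labels=labels).loss


def summarize(tokenizer, model, documents, max_new_tokens=128):
    """Generate Portuguese summaries with beam search."""
    inputs = tokenizer(documents, truncation=True, padding=True, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        # Force the decoder to start with the Portuguese language token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(PT),
        num_beams=4,
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

In a real run, `training_loss` would sit inside an optimizer loop (or be replaced by a trainer class), and the summaries from `summarize` would only become useful after fine-tuning on actual document/summary pairs.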

@bunoviske Were you able to train the model? If so, do you have a notebook to share? I also need to do summarization with BART in Portuguese, but I didn’t find anything about it.

@thiagocmoreira @bunoviske Rather than training separate models, you may want to coordinate with each other and share the training data and the cost of fine-tuning mBART on Portuguese.

I agree with you @BramVanroy 🙂

Hello @bunoviske and @thiagocmoreira. There is a list of Portuguese datasets in the AI Lab forum. When your dataset for training Portuguese summarization models is ready, could you also share a link to it in the corresponding topic of this forum? Thanks in advance!

Hey! Sorry, I am no longer working on this issue, but it would be great if you kept us updated on your status and progress. Good luck!