Write With Transformer XLNet Broken

I may be wrong, but this issue seems to have begun following the release of Transformers v3.0. I apologize if I am being annoying, considering you are doing an immense favor for us in making the demo available.

Hello @zanderbush! Why do you say it’s broken? Do you think the text generation is off?

The completions trigger correctly:

Thank you for your response! (@lysandre)

I have seen its outputs dramatically reduce in quality. If you notice, none of those completions make much sense in that context.

Ah, I see. Have you tried the run_generation.py code in the transformers examples/text-generation folder with XLNet? If you have, have you noticed the same drop in quality using it?

I did not use the run_generation.py code itself, but I remember using the code below (or something similar to it) and getting undesirable output. However, Write With Transformer was visibly worse. I can’t think of why this would be the case, but I can say with certainty that this loss in quality exists and that it appeared recently.

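from transformers import AutoModelWithLMHead, AutoTokenizer

# The checkpoint name was left blank in the original post; supply an XLNet checkpoint here.
tokenizer = AutoTokenizer.from_pretrained("")
model = AutoModelWithLMHead.from_pretrained("")

# "X" stands in for the prompt text.
sequence = "X"
# Named input_ids rather than input to avoid shadowing the Python builtin.
input_ids = tokenizer.encode(sequence, return_tensors="pt")
# Sample a continuation of up to 500 tokens (prompt included).
generated = model.generate(input_ids, max_length=500, do_sample=True)

# Decode the first (and only) generated sequence back to text.
resulting_string = tokenizer.decode(generated[0].tolist())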

Interesting. Write With Transformer isn’t based on the generate method; it simply does sequential decoding with top-k/top-p filtering. We should update it to take full advantage of generate, which is stronger.
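
For anyone following along, sequential decoding with top-k/top-p filtering can be sketched roughly as below. This is a minimal pure-Python illustration of the general technique, not the actual Write With Transformer code: the names `top_k_top_p_filter`, `sample_sequence`, and the `toy_logits` stand-in model are all made up for the example, and a real model would return logits over its full vocabulary.

```python
import math
import random


def softmax(logits):
    """Turn logits into probabilities; -inf entries get probability 0."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_top_p_filter(logits, top_k=0, top_p=1.0):
    """Return a copy of logits with filtered-out tokens set to -inf."""
    filtered = list(logits)
    if top_k > 0:
        # Keep only the top_k highest-scoring tokens (ties are kept).
        k = min(top_k, len(filtered))
        kth_best = sorted(filtered, reverse=True)[k - 1]
        filtered = [x if x >= kth_best else -math.inf for x in filtered]
    if top_p < 1.0:
        # Nucleus filtering: keep the smallest set of most likely tokens
        # whose cumulative probability exceeds top_p.
        probs = softmax(filtered)
        order = sorted(range(len(filtered)), key=lambda i: probs[i], reverse=True)
        mass_before = 0.0
        for i in order:
            if mass_before > top_p:
                filtered[i] = -math.inf
            mass_before += probs[i]
    return filtered


def sample_sequence(next_token_logits, prompt_ids, num_new_tokens,
                    top_k=0, top_p=1.0, seed=None):
    """Append num_new_tokens sampled tokens to prompt_ids, one at a time."""
    rng = random.Random(seed)
    ids = list(prompt_ids)
    for _ in range(num_new_tokens):
        logits = next_token_logits(ids)  # hypothetical model call
        probs = softmax(top_k_top_p_filter(logits, top_k, top_p))
        next_id = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        ids.append(next_id)
    return ids


def toy_logits(ids, vocab_size=5):
    """Stand-in for a language model: strongly prefers (last token + 1)."""
    logits = [0.0] * vocab_size
    logits[(ids[-1] + 1) % vocab_size] = 5.0
    return logits


# With top_k=1 only the single best token survives, so sampling is greedy.
print(sample_sequence(toy_logits, [0], 3, top_k=1))  # -> [0, 1, 2, 3]
```

The generate method covers the same sampling options (`top_k`, `top_p`) and adds others on top, which is why moving the demo over to it should be an improvement.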

We won’t have the bandwidth to do the upgrade just yet, but I’ll keep you posted here.

Thank you. Please take your time in doing that.

I could be wrong, but I don’t believe that to be the issue. I have regularly used the XLNet demo, and only after the most recent update have I seen an issue with its outputs.

Here is a screenshot that explains my thinking. It suggests that I repeat “the” twice in two of the examples, and “At the were” does not make grammatical sense either. I wish to bring this to your attention as I am sure you do not want to allocate funds towards a demo that does not work.