I am new to Hugging Face and I am trying to use the GPT-Neo model to generate the next sentence in a conversation (basically like a chatbot).
I experimented with GPT-3 before, and there I used “Me:” as a stop sequence to ensure the model would stop generating once it generated the text “Me:” (which indicates that it is my turn to say something).
Is there a similar option for GPT-Neo?
Just hopping in to say I have the exact same question in the hopes it’ll encourage someone to answer.
There is tokenizer.eos_token, which is basically <|endoftext|>.
I’m still a beginner too; I have tried to use it in various ways, but nothing seems to produce good endings and results.
You have a couple of different options, but neither is perfect.
The easiest option is to generate the longest text you can stand. Your stopword (“Me:”) will probably get generated at least once in there, possibly several times. So use re.sub() to remove the first occurrence of your stopword and everything after it in the generated text.
That’s very easy to implement; the downside is that it’s computationally expensive, because you will usually wind up generating a ton of text only to throw it away.
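A minimal sketch of that trimming step (the function name and the default stopword are just illustrative choices):

```python
import re

def trim_at_stopword(text: str, stopword: str = "Me:") -> str:
    # Remove the first occurrence of the stopword and everything after it.
    # re.escape guards against regex metacharacters in the stopword;
    # re.DOTALL makes "." also match newlines, so a multi-line tail is removed.
    return re.sub(re.escape(stopword) + r".*", "", text, flags=re.DOTALL).rstrip()

trim_at_stopword("Bot: Hello, how are you?\nMe: fine\nMe: really")
# -> "Bot: Hello, how are you?"
```

Because `.*` with re.DOTALL matches through to the end of the string, the first match already consumes any later occurrences of the stopword as well.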
If you are using model.generate(), you can also use the stopping_criteria parameter with a callable class that checks whether your stopword has been generated and returns True if it has, which stops further generation. That gets tricky if you are generating multiple sequences (num_return_sequences>1): you’ll have to wait until they have all generated the stopword before ending generation, meaning you’ll still have to trim afterwards. And it’s rare (but not impossible, especially if temperature is low) for all sequences to produce the stopword at the same position.
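A sketch of what that callable’s check could look like. In practice you would subclass transformers.StoppingCriteria and pass StoppingCriteriaList([StopOnStopword(...)]) to model.generate(); plain Python lists stand in for the token-id tensors here so the logic is easy to follow, and the class name is just an illustrative choice:

```python
class StopOnStopword:
    def __init__(self, stop_ids):
        # stop_ids: token ids of the stopword, e.g. tokenizer.encode("Me:")
        self.stop_ids = list(stop_ids)

    def __call__(self, input_ids, scores=None, **kwargs):
        # input_ids: one row of token ids per sequence in the batch.
        # Return True only when EVERY sequence ends with the stopword,
        # mirroring the multiple-sequences caveat described above.
        n = len(self.stop_ids)
        return all(seq[-n:] == self.stop_ids for seq in input_ids)

criteria = StopOnStopword(stop_ids=[5, 6])
criteria([[1, 2, 5, 6], [3, 5, 6]])   # -> True  (both end with the stopword)
criteria([[1, 2, 5, 6], [3, 4, 7]])   # -> False (second sequence hasn't yet)
```

This is also why the trimming step is still needed: by the time the last sequence finishes with the stopword, the others may have generated extra text past it.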
There may also be a better way that I’m not aware of; I’m new at this.