I’m currently developing a chatbot using transformer models (e.g., GPT-2 or BlenderBot). I would like to incorporate a knowledge base to give the chatbot a persona, i.e., a few sentences describing who it is. For example, the knowledge base could contain the sentences “I am an artist”, “I have two children”, “I recently got a cat”, “I love physics”.
Is it possible to build a chatbot leveraging a knowledge base with Hugging Face? And if so, how? I did not find any tutorials or examples about it.
Second, I would like to automatically build such a knowledge base about a (famous) person from information on the internet (e.g., Wikipedia). Is it possible to extract such a knowledge base automatically (e.g., about Brad Pitt)?
You should look into prompt engineering. From experience, it can be difficult to get GPT-2 to follow your prompt correctly, so if you are able, I would go with a bigger model.
(Any article about prompt engineering will tell you this, but make sure the prompt reads like something you would see in a book.)
As for generating that prompt (and this is only a suggestion), you could use a transformer to summarize the Wikipedia article and use that summary as the prompt. I believe HF transformers has a pipeline for that.
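As a rough sketch of that idea, something like the following should work with the transformers `summarization` pipeline (the checkpoint name here is just one common choice, not a recommendation from this thread):

```python
# Sketch: summarize a Wikipedia-style paragraph into a short persona prompt.
# Assumes the "sshleifer/distilbart-cnn-12-6" summarization checkpoint;
# any summarization model on the Hub could be substituted.
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

wikipedia_text = (
    "William Bradley Pitt is an American actor and film producer. "
    "He has received various accolades over a career spanning three decades, "
    "and he is also known for his work as a producer."
)

# do_sample=False makes the summary deterministic.
result = summarizer(wikipedia_text, max_length=40, min_length=5, do_sample=False)
persona = result[0]["summary_text"]
print(persona)
```

The `persona` string could then be prepended to the conversation as the prompt.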
Thank you very much for your answer, I really appreciate it.
What exactly do you mean by a “prompt” (or prompt engineering)? I tried to search for it but I did not understand it.
Second, I only have an 11 GB GPU for inference. What model bigger than GPT-2 would you recommend that fits within 11 GB?
It’s simple: a prompt is the input you give to the transformer, hinting at what the model should do but most of the time not actually stating it.
Prompt engineering is trying different wordings to make sure the model understands what to do.
For example: “The English translation of [PHRASE] is [RESULT]”. Here, replace [PHRASE] with anything and let GPT-2 complete the [RESULT]. This is an example of a prompt.
Some people call this “Prompt learning” as well.
As for what model to use, maybe try GPT-J by EleutherAI; sorry, I’m not sure of its memory requirements.
Also, just so you know: if your project is open source, you can apply for Google Cloud’s TRC (TPU Research Cloud) program, which will give you free TPUs, so you can use bigger models.
I’m now a bit confused. If I understand you correctly, the task is to train GPT-2 for sentence completion. Let’s assume that I have used a transformer to summarize the Wikipedia article about a person. How shall I use this for sentence completion, given that the summary still contains several sentences?
And how does this help the chatbot? For the chatbot, I don’t want sentence completion but a complete sentence as an answer to another sentence.
Hi @Eichhof, I have a chatbot trained from an Indonesian GPT-2 base model with an Indonesian and English persona dataset. As a result, it can talk in both English and Indonesian, and I can set the persona manually in both languages. Here is the link to the demo.
I plan to put it in spaces later.
Thank you very much for your answer. How did you incorporate the persona in the model?
During fine-tuning, we feed the concatenation of the persona and the conversation to the model. You can read the details here: 🦄 How to build a State-of-the-Art Conversational AI with Transfer Learning | by Thomas Wolf | HuggingFace | Medium
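As a rough sketch of what “concatenating the persona and the conversation” can look like, something along these lines builds one flat input sequence per training example (the special-token names here are illustrative, not the exact ones used in the linked article):

```python
# Sketch: flatten persona sentences and dialogue turns into a single
# input sequence, in the spirit of the persona-conditioned fine-tuning
# described in the linked article. Token names <bos>, <eos>,
# <speaker1>, <speaker2> are assumptions for illustration.
def build_input(persona, history, reply):
    """Concatenate persona sentences and dialogue turns into one string."""
    sequence = ["<bos>"] + list(persona)
    # Alternate speaker tokens over the dialogue history plus the reply.
    for i, utterance in enumerate(list(history) + [reply]):
        speaker = "<speaker1>" if i % 2 == 0 else "<speaker2>"
        sequence.append(speaker + " " + utterance)
    sequence.append("<eos>")
    return " ".join(sequence)

persona = ["I am an artist.", "I have two children.", "I love physics."]
history = ["Hi, what do you do for a living?"]
reply = "I am an artist, I paint mostly."

print(build_input(persona, history, reply))
```

The resulting string (or rather its tokenized form) is what the model is fine-tuned on, so the persona is always visible to the model at the start of the context.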
For simplicity, I fine-tuned my GPT-2 model using simpletransformers: Conversational AI Specifics - Simple Transformers