Most question answering systems using BERT that I can find expect a context to be supplied. Does anyone have any information on cases where it was used as a generative language model, answering questions when no context is available?
Hey @EmuK, indeed most expositions of “question answering” really refer to the simpler task of reading comprehension, where the answer is extracted from a supplied passage.
What you’re probably looking for is either:
- open-domain question answering, where only the query is supplied at runtime and a retriever fetches relevant documents (i.e. context) for a reader to extract answers from. You can find a really nice summary of these systems here: How to Build an Open-Domain Question Answering System?
- closed-book question answering, where large language models like T5 or GPT-3 have memorised some facts during pre-training and can generate an answer without explicit context (the “closed-book” part is an analogy with humans taking exams, where we’ve learnt something in advance and have to use our memory to answer questions). There’s a brief discussion of these models in the above blog post, but this T5 paper is well worth reading in its own right: [2002.08910] How Much Knowledge Can You Pack Into the Parameters of a Language Model?
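For the closed-book option, a minimal sketch with the Transformers library might look like the following. It assumes the `google/t5-small-ssm-nq` checkpoint from the Hugging Face Hub (picked here only because it is small) and that `torch` and `sentencepiece` are installed. The helper names `load_closed_book_model` and `answer_closed_book` are made up for illustration and are not part of any library.

```python
# Assumed checkpoint: a small T5 fine-tuned for closed-book QA.
MODEL_NAME = "google/t5-small-ssm-nq"


def load_closed_book_model(model_name=MODEL_NAME):
    """Download (or load from cache) the tokenizer and seq2seq model."""
    # Imported lazily so the module can be imported without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    return tokenizer, model


def answer_closed_book(question, tokenizer, model, max_new_tokens=32):
    """Generate an answer from the question alone; no context is passed in."""
    # Only the question is encoded: the model has to rely on whatever
    # it memorised in its parameters during training.
    input_ids = tokenizer(question, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)[0]
    return tokenizer.decode(output_ids, skip_special_tokens=True)


# Usage (needs network access on first run to fetch the checkpoint):
#   tokenizer, model = load_closed_book_model()
#   print(answer_closed_book("When was Franklin D. Roosevelt born?", tokenizer, model))
```

A checkpoint this small will get plenty of questions wrong, so treat it as a way to try the idea rather than as a usable QA system.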
There’s also a nifty library called Haystack that brings a lot of these ideas together in a unified API: https://haystack.deepset.ai/
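To make the retriever/reader split in the open-domain option concrete, here is a toy sketch in plain Python with no models involved. The TF-IDF retriever and the sentence-overlap "reader" are stand-ins chosen for illustration: in a real system the reader would be something like a BERT model fine-tuned for extractive QA. All class and function names below are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TfidfRetriever:
    """Ranks documents by a simple TF-IDF overlap with the query."""

    def __init__(self, documents):
        self.documents = list(documents)
        self.doc_counts = [Counter(tokenize(d)) for d in self.documents]
        n_docs = len(self.documents)
        doc_freq = Counter()
        for counts in self.doc_counts:
            doc_freq.update(counts.keys())
        # Smoothed inverse document frequency.
        self.idf = {
            term: math.log((1 + n_docs) / (1 + df)) + 1.0
            for term, df in doc_freq.items()
        }

    def _score(self, query_terms, counts):
        return sum(counts[t] * self.idf.get(t, 0.0) for t in query_terms)

    def retrieve(self, query, top_k=1):
        query_terms = set(tokenize(query))
        ranked = sorted(
            range(len(self.documents)),
            key=lambda i: self._score(query_terms, self.doc_counts[i]),
            reverse=True,
        )
        return [self.documents[i] for i in ranked[:top_k]]


def read(question, context):
    """Toy reader: return the context sentence sharing most words with the question."""
    question_terms = set(tokenize(question))
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    return max(
        sentences,
        key=lambda s: len(question_terms & set(tokenize(s))),
        default="",
    )


def answer_open_domain(question, retriever):
    """Only the question is supplied; the retriever provides the context."""
    contexts = retriever.retrieve(question, top_k=1)
    return read(question, contexts[0]) if contexts else ""
```

The point of the split is that the caller only ever passes a question: the context that a reading comprehension model needs is fetched by the retriever behind the scenes.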
Ok got it. Thanks for the references!
Hi, you should try RAG (retrieval-augmented generation) and RAG-end2end in the Transformers library.