Llama 2 support for AutoModelForQuestionAnswering

Hello everyone!

I was wondering if there is any way to use Llama 2 type models with AutoModelForQuestionAnswering? As far as I am aware, Llama models currently cannot be used with AutoModelForQuestionAnswering. Are there any plans to integrate them?

If not, can anyone recommend a workaround?


There’s no Llama2ForQuestionAnswering model yet in the Transformers library. The reason is that Llama-2 is a decoder-only Transformer, mainly useful for generative tasks. xxxForQuestionAnswering models are trained to extract the appropriate answer from a given piece of context, so one typically uses encoder-only Transformers like BERT, RoBERTa, etc. for this. They are also a lot smaller than models like Llama-2.
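To illustrate what an xxxForQuestionAnswering model does: the head is just a linear layer on top of the per-token hidden states that scores each token as a possible answer start or end, and the predicted answer is the span between them. Here is a minimal sketch of that mechanism (NumPy only, with made-up toy sizes; this is not the actual Transformers implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, hidden_size = 8, 16  # toy sizes for illustration
hidden_states = rng.normal(size=(seq_len, hidden_size))  # per-token encoder output
qa_head = rng.normal(size=(hidden_size, 2))              # linear layer -> (start, end) logits

logits = hidden_states @ qa_head                         # shape (seq_len, 2)
start_logits, end_logits = logits[:, 0], logits[:, 1]

start = int(np.argmax(start_logits))  # most likely answer-start token
end = int(np.argmax(end_logits))      # most likely answer-end token
print("predicted span:", (start, end))  # answer = tokens[start : end + 1]
```

In the real models the linear layer is trained so that the argmax start/end indices delimit the answer span inside the context.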

See this guide for more info: notebooks/examples/question_answering.ipynb at main · huggingface/notebooks · GitHub.

Thanks for the answer, but I don’t think it is the case that only encoder-only Transformers (BERT-style models) are supported for this pipeline.

According to the documentation, all of these models can be used with the question-answering pipeline:

ALBERT, BART, BERT, BigBird, BigBird-Pegasus, BLOOM, CamemBERT, CANINE, ConvBERT, Data2VecText, DeBERTa, DeBERTa-v2, DistilBERT, ELECTRA, ERNIE, ErnieM, Falcon, FlauBERT, FNet, Funnel Transformer, OpenAI GPT-2, GPT Neo, GPT NeoX, GPT-J, I-BERT, LayoutLMv2, LayoutLMv3, LED, LiLT, Longformer, LUKE, LXMERT, MarkupLM, mBART, MEGA, Megatron-BERT, MobileBERT, MPNet, MPT, MRA, MT5, MVP, Nezha, Nyströmformer, OPT, QDQBert, Reformer, RemBERT, RoBERTa, RoBERTa-PreLayerNorm, RoCBert, RoFormer, Splinter, SqueezeBERT, T5, UMT5, XLM, XLM-RoBERTa, XLM-RoBERTa-XL, XLNet, X-MOD, YOSO

As you can see, many of them are decoder-only Transformers (GPT-2, GPT-J, etc.).

Hence, my question is: are there any plans to include Llama 2 models in this list?

Yes, it’s true that you can also do that with decoder-only Transformers (I just don’t recommend it).

For Llama, could you open an issue on the Transformers library? I’ll then assign it as a “good first issue”, since it’s a nice opportunity for a first contribution. The implementation can be copied from GPTJForQuestionAnswering.
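For reference, the GPT-J question-answering model simply puts a two-output linear layer (`qa_outputs`) on top of the decoder’s hidden states, and a Llama version would follow the same recipe. Below is a rough sketch of that pattern in plain PyTorch, with a toy stand-in backbone instead of the real Llama decoder; class and attribute names here are illustrative, not the actual Transformers code:

```python
import torch
import torch.nn as nn


class ToyDecoderBackbone(nn.Module):
    """Stand-in for the Llama decoder stack: returns per-token hidden states."""

    def __init__(self, vocab_size=100, hidden_size=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.layer = nn.Linear(hidden_size, hidden_size)

    def forward(self, input_ids):
        return torch.tanh(self.layer(self.embed(input_ids)))


class ToyLlamaForQuestionAnswering(nn.Module):
    """Follows the GPTJForQuestionAnswering recipe: backbone + Linear(hidden, 2)."""

    def __init__(self, hidden_size=32):
        super().__init__()
        self.model = ToyDecoderBackbone(hidden_size=hidden_size)
        self.qa_outputs = nn.Linear(hidden_size, 2)  # start/end logits

    def forward(self, input_ids, start_positions=None, end_positions=None):
        hidden_states = self.model(input_ids)        # (batch, seq, hidden)
        logits = self.qa_outputs(hidden_states)      # (batch, seq, 2)
        start_logits, end_logits = logits.split(1, dim=-1)
        start_logits = start_logits.squeeze(-1)      # (batch, seq)
        end_logits = end_logits.squeeze(-1)          # (batch, seq)

        loss = None
        if start_positions is not None and end_positions is not None:
            # Standard extractive-QA loss: cross-entropy over token positions
            loss_fct = nn.CrossEntropyLoss()
            loss = (loss_fct(start_logits, start_positions)
                    + loss_fct(end_logits, end_positions)) / 2
        return loss, start_logits, end_logits


model = ToyLlamaForQuestionAnswering()
input_ids = torch.randint(0, 100, (2, 10))           # batch of 2, sequence length 10
loss, start_logits, end_logits = model(
    input_ids,
    start_positions=torch.tensor([1, 3]),
    end_positions=torch.tensor([4, 6]),
)
print(start_logits.shape, end_logits.shape)
```

A real contribution would subclass `LlamaPreTrainedModel`, reuse the existing `LlamaModel` as the backbone, and return a `QuestionAnsweringModelOutput`, but the head and loss are essentially what is shown here.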


Thank you for the suggestion. I will open this feature request.

The motivation behind it is that AutoModelForQuestionAnswering, regardless of whether encoder-only Transformers are superior in closed Q&A settings, facilitates the evaluation of generative Transformers on benchmarks like SQuAD.

The feature request is now open.