Llama 2 support for AutoModelForQuestionAnswering

Hello everyone!

I was wondering if there is any way to use Llama 2 type models with AutoModelForQuestionAnswering? Currently, as far as I am aware, Llama models cannot be used in an AutoModelForQuestionAnswering pipeline. Is there any plan to integrate them?

If not, can anyone recommend a workaround?

Hi,

There’s no Llama2ForQuestionAnswering model yet in the Transformers library. The reason is that Llama-2 is a decoder-only Transformer, mainly useful for generative tasks. xxxForQuestionAnswering models are trained to extract the answer span from a given piece of context, hence one typically uses encoder-only Transformers like BERT, RoBERTa, etc. for this task. They are also a lot smaller than models like Llama-2.
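For example, extractive QA with an encoder-only model is a one-liner with the pipeline API. A minimal sketch (the checkpoint deepset/roberta-base-squad2 is just one commonly used example, not the only option):

```python
from transformers import pipeline

# Extractive QA: the model predicts a start/end span inside the context.
# "deepset/roberta-base-squad2" is one commonly used fine-tuned checkpoint.
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

result = qa(
    question="When was the Eiffel Tower completed?",
    context="The Eiffel Tower was completed in 1889 in Paris.",
)
print(result["answer"])  # -> "1889"
```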

See this guide for more info: notebooks/examples/question_answering.ipynb at main · huggingface/notebooks · GitHub.

Thanks for the answer, but I don’t think that’s the case (i.e., that only encoder-only Transformers like BERT-style models are supported by this pipeline).

According to the documentation, all of these models can be used with the question-answering pipeline:

ALBERT, BART, BERT, BigBird, BigBird-Pegasus, BLOOM, CamemBERT, CANINE, ConvBERT, Data2VecText, DeBERTa, DeBERTa-v2, DistilBERT, ELECTRA, ERNIE, ErnieM, Falcon, FlauBERT, FNet, Funnel Transformer, OpenAI GPT-2, GPT Neo, GPT NeoX, GPT-J, I-BERT, LayoutLMv2, LayoutLMv3, LED, LiLT, Longformer, LUKE, LXMERT, MarkupLM, mBART, MEGA, Megatron-BERT, MobileBERT, MPNet, MPT, MRA, MT5, MVP, Nezha, Nyströmformer, OPT, QDQBert, Reformer, RemBERT, RoBERTa, RoBERTa-PreLayerNorm, RoCBert, RoFormer, Splinter, SqueezeBERT, T5, UMT5, XLM, XLM-RoBERTa, XLM-RoBERTa-XL, XLNet, X-MOD, YOSO

As you can see, many of them are decoder-only Transformers (GPT-2, GPT-J, etc.).
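For instance, loading one of those decoder-only models with AutoModelForQuestionAnswering already works today. A minimal sketch (note that a base checkpoint like gpt2 gets a randomly initialized QA head and would need fine-tuning, e.g. on SQuAD, before being useful):

```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

# GPT-2 is decoder-only, yet AutoModelForQuestionAnswering accepts it
# because GPT2ForQuestionAnswering exists in the library. The span head
# (qa_outputs) is randomly initialized here, so fine-tune before use.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForQuestionAnswering.from_pretrained("gpt2")
```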

Hence my question: is there any plan for Llama 2 models to be included in this list?

Yes, it’s true that you can also do that using decoder-only Transformers (I just don’t recommend it).

For LLaMa, could you open an issue on the Transformers library? Then I’ll label it as “good first issue”, since it’s a nice opportunity for a first contribution by someone. The implementation can be copied from GPTJForQuestionAnswering.
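For anyone picking this up, a rough sketch of what the port might look like, following the GPTJForQuestionAnswering pattern (the class and attribute names below are my assumption of the eventual API, not the final implementation):

```python
from torch import nn
from transformers.modeling_outputs import QuestionAnsweringModelOutput
from transformers.models.llama.modeling_llama import (
    LlamaModel,
    LlamaPreTrainedModel,
)


class LlamaForQuestionAnswering(LlamaPreTrainedModel):
    """Sketch: a span-prediction head on top of the Llama decoder."""

    def __init__(self, config):
        super().__init__(config)
        self.model = LlamaModel(config)  # the base decoder
        self.qa_outputs = nn.Linear(config.hidden_size, 2)  # start/end logits
        self.post_init()

    def forward(self, input_ids=None, attention_mask=None,
                start_positions=None, end_positions=None, **kwargs):
        outputs = self.model(input_ids, attention_mask=attention_mask, **kwargs)
        sequence_output = outputs[0]

        logits = self.qa_outputs(sequence_output)
        start_logits, end_logits = logits.split(1, dim=-1)
        start_logits = start_logits.squeeze(-1).contiguous()
        end_logits = end_logits.squeeze(-1).contiguous()

        total_loss = None
        if start_positions is not None and end_positions is not None:
            # Clamp out-of-range positions and average the span losses,
            # as in the other xxxForQuestionAnswering implementations.
            ignored_index = start_logits.size(1)
            start_positions = start_positions.clamp(0, ignored_index)
            end_positions = end_positions.clamp(0, ignored_index)
            loss_fct = nn.CrossEntropyLoss(ignore_index=ignored_index)
            start_loss = loss_fct(start_logits, start_positions)
            end_loss = loss_fct(end_logits, end_positions)
            total_loss = (start_loss + end_loss) / 2

        return QuestionAnsweringModelOutput(
            loss=total_loss,
            start_logits=start_logits,
            end_logits=end_logits,
        )
```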


Thank you for the suggestion. I will open this feature request.

The motivation behind it is that AutoModelForQuestionAnswering, regardless of whether encoder-only Transformers are superior in extractive Q&A settings, makes it easier to evaluate generative Transformers on benchmarks like SQuAD.
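In the meantime, a purely generative workaround is also possible: prompt a Llama-2 chat checkpoint with the context and question and parse the answer from the generated text. A sketch (the checkpoint requires gated access, and the prompt format below is only illustrative):

```python
from transformers import pipeline

# Generative workaround: no span head, just text generation.
# "meta-llama/Llama-2-7b-chat-hf" is a gated checkpoint; any instruct
# model would do. The prompt format below is illustrative, not official.
generator = pipeline("text-generation", model="meta-llama/Llama-2-7b-chat-hf")

context = "The Eiffel Tower was completed in 1889 in Paris."
question = "When was the Eiffel Tower completed?"
prompt = (
    "Answer the question using only the context.\n"
    f"Context: {context}\nQuestion: {question}\nAnswer:"
)

result = generator(prompt, max_new_tokens=20, do_sample=False)
print(result[0]["generated_text"])
```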

The feature request is now open.