I have a problem training the GPT-2 Large model (774M) with Transformers.

Here is the code:

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
from transformers import squad_convert_examples_to_features
from transformers.data.processors.squad import SquadV2Processor
import torch

# Load the SQuAD dataset
processor = SquadV2Processor()
train_examples = processor.get_train_examples('train/')
dev_examples = processor.get_dev_examples('train/')

# Load the GPT-2 model and tokenizer
model_name = 'gpt2'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name).to('cuda')

# Convert examples to features
train_features = squad_convert_examples_to_features(
    examples=train_examples,
    tokenizer=tokenizer,
    max_seq_length=384,
    doc_stride=128,
    max_query_length=64,
    is_training=True,
    return_dataset='pt',
    threads=6
)

dev_features = squad_convert_examples_to_features(
    examples=dev_examples,
    tokenizer=tokenizer,
    max_seq_length=384,
    doc_stride=128,
    max_query_length=64,
    is_training=False,
    return_dataset='pt',
    threads=6
)

# Fine-tune the GPT-2 model on the SQuAD dataset
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir='./results',
    evaluation_strategy='epoch',
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir='./logs',
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_features,
    eval_dataset=dev_features,
).to('cuda')

trainer.train()
```

And here is the full error message:

```
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 442/442 [00:34<00:00, 12.78it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 35/35 [00:03<00:00, 10.88it/s]
Traceback (most recent call last):
  File "c:\Users\moroz\Desktop\train_ai\import torch.py", line 14, in <module>
    model = AutoModelForQuestionAnswering.from_pretrained(model_name).to('cuda')
  File "C:\Users\moroz\AppData\Local\Programs\Python\Python310\lib\site-packages\transformers\models\auto\auto_factory.py", line 467, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.gpt2.configuration_gpt2.GPT2Config'> for this kind of AutoModel: AutoModelForQuestionAnswering.
Model type should be one of AlbertConfig, BartConfig, BertConfig, BigBirdConfig, BigBirdPegasusConfig, BloomConfig, CamembertConfig, CanineConfig, ConvBertConfig, Data2VecTextConfig, DebertaConfig, DebertaV2Config, DistilBertConfig, ElectraConfig, ErnieConfig, FlaubertConfig, FNetConfig, FunnelConfig, GPTJConfig, IBertConfig, LayoutLMv2Config, LayoutLMv3Config, LEDConfig, LiltConfig, LongformerConfig, LukeConfig, LxmertConfig, MarkupLMConfig, MBartConfig, MegatronBertConfig, MobileBertConfig, MPNetConfig, MvpConfig, NezhaConfig, NystromformerConfig, OPTConfig, QDQBertConfig, ReformerConfig, RemBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, SplinterConfig, SqueezeBertConfig, XLMConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, YosoConfig.
```

Can someone help me fix this problem, or tell me how to train GPT-2 (locally on my PC) for Q&A with a ready-to-use dataset?

Hey @StupidAiCat :wave:

You’re getting that exception because our GPT2 implementation doesn’t have a Question Answering head.

Solutions include:

  1. Use another model, as suggested in the exception (see the sketch after this list)
  2. Open a PR with GPT2+QA head :slight_smile:
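
For option 1, here is a minimal sketch of the fix, assuming you swap in a checkpoint whose architecture appears in the error's supported list — `distilbert-base-uncased` is just one small example, and the rest of your script can stay unchanged:

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

# Any architecture from the error's supported list works;
# DistilBERT is a small, commonly used choice for SQuAD fine-tuning.
model_name = 'distilbert-base-uncased'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name).to('cuda')
```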

Thanks, I created another post in the "models" category with code that should work in principle, along with some remaining problems. For now I've switched to GPT2DoubleHeadsModel from the Transformers library.
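
For reference, a minimal sketch of GPT2DoubleHeadsModel usage, adapted from the Transformers documentation. Note that this model pairs GPT-2's language-modeling head with a multiple-choice classification head (not an extractive span-prediction head), so it scores candidate answers rather than extracting spans; the strings below are illustrative only:

```python
import torch
from transformers import GPT2Tokenizer, GPT2DoubleHeadsModel

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2DoubleHeadsModel.from_pretrained('gpt2')

# GPT-2 has no [CLS] token, so add one and resize the embeddings to match.
tokenizer.add_special_tokens({'cls_token': '[CLS]'})
model.resize_token_embeddings(len(tokenizer))

# Two candidate continuations; the multiple-choice head scores each one
# based on the hidden state at its [CLS] position.
choices = ["Hello, my dog is cute [CLS]", "Hello, my cat is cute [CLS]"]
encoded = [tokenizer.encode(c) for c in choices]
input_ids = torch.tensor(encoded).unsqueeze(0)  # shape: (batch=1, n_choices=2, seq_len)
mc_token_ids = torch.tensor([[len(e) - 1 for e in encoded]])  # index of [CLS] in each choice

outputs = model(input_ids, mc_token_ids=mc_token_ids)
lm_logits = outputs.logits      # next-token prediction logits
mc_logits = outputs.mc_logits   # one score per candidate choice
```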