Mistral Unrecognized configuration class

I’m using AutoTrain to fine-tune an already fine-tuned question-answering model, ‘BramVanroy/GEITje-7B-ULTRA’, which is based on Mistral 7B and further pretrained on Dutch data.

I have some context-specific question-answer pairs I’d like to further fine-tune the model with. However, when using a simple AutoTrain space with the following settings (see picture), I get the following error:

autotrain.trainers.common:wrapper:121 - Unrecognized configuration class <class 'transformers.models.mistral.configuration_mistral.MistralConfig'> for this kind of AutoModel: AutoModelForSeq2SeqLM.
Model type should be one of BartConfig, BigBirdPegasusConfig, BlenderbotConfig, BlenderbotSmallConfig, EncoderDecoderConfig, FSMTConfig, GPTSanJapaneseConfig, LEDConfig, LongT5Config, M2M100Config, MarianConfig, MBartConfig, MT5Config, MvpConfig, NllbMoeConfig, PegasusConfig, PegasusXConfig, PLBartConfig, ProphetNetConfig, SeamlessM4TConfig, SeamlessM4Tv2Config, SwitchTransformersConfig, T5Config, UMT5Config, XLMProphetNetConfig…

What could I be doing wrong?

I figured out the solution: the model BramVanroy/GEITje-7B-ULTRA uses the Mistral architecture, and the error message shows that this architecture isn’t supported for ‘Sequence to Sequence’ tasks. So I needed to choose the LLM SFT (Supervised Fine-Tuning) task instead. This approach refines a causal language model on labeled data; in my case, question-answer pairs from a specific domain, where each question is the input and its corresponding answer is the output. After switching to LLM SFT, the error disappeared and I could fine-tune the model successfully.
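For anyone preparing data for LLM SFT: each question-answer pair gets flattened into a single text sequence that the causal LM learns to continue. A minimal sketch of that preprocessing step, assuming a `text` column and a prompt template of my own choosing (not AutoTrain’s exact internal format):

```python
def to_sft_text(pairs):
    """Flatten (question, answer) pairs into single training strings.

    The '### Question / ### Answer' template here is a hypothetical
    example; any consistent template works, as long as the same one is
    used at inference time. AutoTrain's LLM SFT trainer consumes one
    text column per example.
    """
    return [
        {"text": f"### Question:\n{q}\n\n### Answer:\n{a}"}
        for q, a in pairs
    ]

# Example: one Dutch domain-specific pair flattened into a single string.
examples = to_sft_text(
    [("Wat is de hoofdstad van Nederland?", "Amsterdam")]
)
print(examples[0]["text"])
```

This also makes clear why the seq2seq task failed: there is no separate encoder input and decoder target here, just one continuous sequence for a decoder-only model.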
