The performance of the Hugging Face QA model depends on the order in which models are loaded

  • transformers version: 4.4.2
  • Python version: 3.7

I am implementing a paper based on the Hugging Face question answering script “run_qa.py”.
I added a few layers to ELECTRA, and I trained and saved only the parameters of the added layers.
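
For reference, this is roughly how I save only the added layers. It is a minimal sketch: the 'cda_' prefix is a placeholder, not the real names of my added modules.

import torch

def save_added_layers(model, path, prefix='cda_'):
    # Keep only the parameters whose names start with the prefix of the
    # added modules ('cda_' is a placeholder, not the real module name).
    added_only = {name: param for name, param in model.state_dict().items()
                  if name.startswith(prefix)}
    torch.save(added_only, path)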

When I evaluate, I load those saved parameters, and the rest are initialized from the parameters of the pre-trained ELECTRA model:

def load_cda_qa_model(args, phase, checkpoint=None):
    # assert phase == 'train' or phase == 'eval'
    config = CONFIG_CLASSES[args.model_type].from_pretrained(args.model_name_or_path)

    # My model with the added layers, restored from the saved checkpoint.
    model = MODEL_FOR_QUESTION_ANSWERING[args.model_type].from_pretrained(checkpoint)
    # A vanilla pre-trained ELECTRA, used only as a source of backbone weights.
    tmp_electra = MODEL_FOR_QUESTION_ANSWERING['electra'].from_pretrained(args.model_name_or_path, config=config)

    electra_state_dict = tmp_electra.state_dict()
    model_state_dict = model.state_dict()

    # Overwrite every key shared with ELECTRA using the pre-trained weights;
    # keys that exist only in my model (the added layers) keep the checkpoint values.
    for electra_key, electra_value in electra_state_dict.items():
        model_state_dict[electra_key] = electra_value

    model.load_state_dict(model_state_dict)

    return model

The results of the two cases differ. (The screenshot comparing them is omitted here.)

What I want to ask here is: why do the results change when the order of the red and yellow parts is swapped, when there seems to be no difference in the code flow?
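
Since the red and yellow highlighting from my screenshot is not visible here, here is a sketch of the two cases, assuming they differ only in the order of the two from_pretrained calls inside load_cda_qa_model:

# Case 1 (original order): load my checkpoint first, then the pre-trained ELECTRA.
# (Assumed correspondence to the red/yellow highlighting, which is not shown here.)
model = MODEL_FOR_QUESTION_ANSWERING[args.model_type].from_pretrained(checkpoint)
tmp_electra = MODEL_FOR_QUESTION_ANSWERING['electra'].from_pretrained(args.model_name_or_path, config=config)

# Case 2 (swapped order): the same two lines, reversed.
tmp_electra = MODEL_FOR_QUESTION_ANSWERING['electra'].from_pretrained(args.model_name_or_path, config=config)
model = MODEL_FOR_QUESTION_ANSWERING[args.model_type].from_pretrained(checkpoint)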