I am evaluating a fine-tuned BART model (pretrained base = bart-large-cnn). I am able to instantiate the Seq2SeqTrainer class without a problem. When I call evaluate() on the trainer, the evaluation loop appears to run fine; however, as soon as the progress bar finishes, additional logs are printed showing that it is now attempting to download roberta-large and load it into RobertaForMaskedLM. It is not at all clear to me where in the evaluate function this is happening, nor what role the roberta-large model plays in the evaluation process. My concern is that the evaluation results I receive are not reliable, as I can't be certain which model is actually being used.
The trainer is instantiated with a model loaded like so:
from transformers import BartConfig, BartForConditionalGeneration

# Load the config and weights of the pretrained base model.
config = BartConfig.from_pretrained(
    "facebook/bart-large-cnn",
    cache_dir=model_args.cache_dir,
    revision=model_args.model_revision,
)
model = BartForConditionalGeneration.from_pretrained(
    "facebook/bart-large-cnn",
    # Load from a TensorFlow checkpoint if one was passed in.
    from_tf=bool(".ckpt" in model_args.model_name_or_path),
    config=config,
    cache_dir=model_args.cache_dir,
    revision=model_args.model_revision,
)
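For completeness, the trainer itself is set up roughly as follows. This is a minimal sketch, not my exact script: training_args, eval_dataset, and compute_metrics stand in for the actual objects I use.

from transformers import BartTokenizer, Seq2SeqTrainer, Seq2SeqTrainingArguments

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")

training_args = Seq2SeqTrainingArguments(
    output_dir="./eval_output",        # placeholder path
    per_device_eval_batch_size=8,
    predict_with_generate=True,        # generate sequences during evaluation
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    eval_dataset=eval_dataset,         # my tokenized evaluation split
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,   # my metric function
)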
When I call evaluate like so:
trainer.evaluate(max_length=100, num_beams=5, metric_key_prefix="eval")
I receive the following output:
Does anyone have any idea why a model that I have not specified anywhere in my code is being downloaded while evaluate() is running?
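As a sanity check, this is how I can at least confirm which model object the trainer is holding (a minimal check; it only prints, it does not change any behavior):

# Confirm the class and source checkpoint of the model attached to the trainer.
print(type(trainer.model).__name__)        # expect: BartForConditionalGeneration
print(trainer.model.config.name_or_path)   # expect: facebook/bart-large-cnn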
Dependencies
python 3.6
datasets==1.10.2
transformers==4.9.1
torch==1.6.0
tensorflow==2.5.0