I am evaluating a fine-tuned BART model (pretrained base = bart-large-cnn). I am able to instantiate the Seq2SeqTrainer class without a problem. When I call evaluate() on the trainer, the evaluation loop appears to run fine; however, as soon as the progress bar finishes, additional logs are printed showing that it is now attempting to download roberta-large and load it into RobertaForMaskedLM. It is not at all clear to me where in the evaluate function this is happening, nor what role the roberta-large model plays in the evaluation process. My concern is that the evaluation results I receive are not reliable, as I can't be certain which model is actually being used.
The trainer is instantiated with a model loaded like so:
from transformers import BartConfig, BartForConditionalGeneration

# Load the config and weights of the pretrained base model.
config = BartConfig.from_pretrained(
    "facebook/bart-large-cnn",
    cache_dir=model_args.cache_dir,
    revision=model_args.model_revision,
)
model = BartForConditionalGeneration.from_pretrained(
    "facebook/bart-large-cnn",
    # Load from a TensorFlow checkpoint if one was passed in.
    from_tf=bool(".ckpt" in model_args.model_name_or_path),
    config=config,
    cache_dir=model_args.cache_dir,
    revision=model_args.model_revision,
)
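For completeness, the trainer itself is set up roughly as follows. This is a minimal sketch, not my exact script: training_args, eval_dataset, and compute_metrics stand in for the actual objects I use.

from transformers import BartTokenizer, Seq2SeqTrainer, Seq2SeqTrainingArguments

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")

training_args = Seq2SeqTrainingArguments(
    output_dir="./eval_output",        # placeholder path
    per_device_eval_batch_size=8,
    predict_with_generate=True,        # generate sequences during evaluation
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    eval_dataset=eval_dataset,         # my tokenized evaluation split
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,   # my metric function
)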
When I call evaluate like so:
trainer.evaluate(max_length=100, num_beams=5, metric_key_prefix="eval")
I receive the following output:
Does anyone have any idea why a model that I have not specified anywhere in my code is being downloaded while evaluate() is running?
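As a sanity check, this is how I can at least confirm which model object the trainer is holding (a minimal check; it only prints, it does not change any behavior):

# Confirm the class and source checkpoint of the model attached to the trainer.
print(type(trainer.model).__name__)        # expect: BartForConditionalGeneration
print(trainer.model.config.name_or_path)   # expect: facebook/bart-large-cnn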
Dependencies
python 3.6
datasets==1.10.2
transformers==4.9.1
torch==1.6.0
tensorflow==2.5.0