How to evaluate BoolQ models

Hi,

How can I evaluate an existing model trained on the BoolQ dataset, WITHOUT retraining it?
I’m trying the question-answering evaluator from the Hugging Face “evaluate” package, but I get an error.

Here’s my main code:

from transformers import AutoModelWithHeads
import torch

from datasets import load_dataset
import evaluate
from evaluate import evaluator

model = AutoModelWithHeads.from_pretrained("roberta-base")
adapter_name = model.load_adapter("AdapterHub/roberta-base-pf-boolq", source="hf")
model.active_adapters = adapter_name

from transformers import RobertaTokenizer, RobertaModel
tokenizer = RobertaTokenizer.from_pretrained('roberta-base')

task_evaluator = evaluator("question-answering")
results = task_evaluator.compute(model_or_pipeline=model, data="boolq", metric="accuracy",
                                 question_column="question", context_column="passage",
                                 id_column=None, label_column="answer")

It gave me this error:

/opt/anaconda3/envs/hugging-face/lib/python3.7/site-packages/evaluate/evaluator/question_answering.py in compute(self, model_or_pipeline, data, subset, split, metric, tokenizer, strategy, confidence_level, n_resamples, device, random_state, question_column, context_column, id_column, label_column, squad_v2_format)
    189             context_column=context_column,
    190             id_column=id_column,
--> 191             label_column=label_column,
    192         )
    193 

/opt/anaconda3/envs/hugging-face/lib/python3.7/site-packages/evaluate/evaluator/question_answering.py in prepare_data(self, data, question_column, context_column, id_column, label_column)
    104                 "context_column": context_column,
    105                 "id_column": id_column,
--> 106                 "label_column": label_column,
    107             },
    108         )

/opt/anaconda3/envs/hugging-face/lib/python3.7/site-packages/evaluate/evaluator/base.py in check_required_columns(self, data, columns_names)
    301             if column_name not in data.column_names:
    302                 raise ValueError(
--> 303                     f"Invalid `{input_name}` {column_name} specified. The dataset contains the following columns: {data.column_names}."
    304                 )
    305 

ValueError: Invalid `id_column` None specified. The dataset contains the following columns: ['question', 'answer', 'passage'].
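
As far as I can tell from the traceback, the evaluator verifies that every column name I pass exists in the dataset, so `id_column=None` fails simply because `None` is not a column. The sketch below is my own simplified reconstruction of that check, not the library's actual code; the helper name `check_columns` and the plain list of column names are assumptions made for illustration.

```python
def check_columns(dataset_columns, requested):
    """Raise ValueError if any requested column is missing from the dataset.

    Simplified reconstruction of the check seen in the traceback;
    `check_columns` is a hypothetical helper, not part of `evaluate`.
    """
    for input_name, column_name in requested.items():
        if column_name not in dataset_columns:
            raise ValueError(
                f"Invalid `{input_name}` {column_name} specified. "
                f"The dataset contains the following columns: {dataset_columns}."
            )


# Column names as reported in the error message above.
boolq_columns = ["question", "answer", "passage"]
requested = {
    "question_column": "question",
    "context_column": "passage",
    "id_column": None,  # there is no id column, so this entry fails the check
    "label_column": "answer",
}

try:
    check_columns(boolq_columns, requested)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```

If this reading is right, the question-answering evaluator expects a SQuAD-style dataset with an id column and answer spans, which BoolQ does not have.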

I’m not sure I’m on the right track for evaluating a BoolQ model.
Please advise. Thanks a lot!
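
For reference, what I am ultimately after is plain accuracy over yes/no predictions, since the `answer` column in BoolQ holds booleans. A minimal sketch of that computation follows; `predict_fn` is a hypothetical stand-in for whatever wraps the model and tokenizer, and the two rows are made-up placeholders rather than real BoolQ examples.

```python
from typing import Callable, Dict, Iterable


def boolq_accuracy(
    examples: Iterable[Dict], predict_fn: Callable[[str, str], bool]
) -> float:
    """Fraction of examples where predict_fn(question, passage) equals the label."""
    correct = 0
    total = 0
    for example in examples:
        prediction = predict_fn(example["question"], example["passage"])
        correct += int(prediction == example["answer"])
        total += 1
    if total == 0:
        raise ValueError("No examples to evaluate.")
    return correct / total


# Made-up placeholder rows using the same column names as BoolQ.
toy_examples = [
    {"question": "is water wet", "passage": "Water is a liquid.", "answer": True},
    {"question": "is fire cold", "passage": "Fire is hot.", "answer": False},
]


def always_yes(question: str, passage: str) -> bool:
    # Hypothetical dummy predictor; a real one would run the model.
    return True


toy_accuracy = boolq_accuracy(toy_examples, always_yes)
print(toy_accuracy)
```

With a real predictor in place of `always_yes`, the same loop could run over the validation split without any retraining.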