Export BigBird model to ONNX

When I tried to export a BigBird model to ONNX using the script below,

from collections import OrderedDict
from pathlib import Path
from typing import Mapping

from transformers import AutoConfig, BigBirdForMaskedLM, BigBirdTokenizerFast
from transformers.onnx import OnnxConfig, export


class BigBirdForMaskedLMOnnxConfig(OnnxConfig):
    @property
    def inputs(self) -> Mapping[str, Mapping[int, str]]:
        # Declare the model inputs with dynamic batch and sequence axes.
        return OrderedDict(
            [
                ("input_ids", {0: "batch", 1: "sequence"}),
                ("attention_mask", {0: "batch", 1: "sequence"}),
            ]
        )


model_ckpt = "./model_save_point"
config = AutoConfig.from_pretrained(model_ckpt)
onnx_config = BigBirdForMaskedLMOnnxConfig(config, task="masked-lm")
onnx_path = Path("./model_save_point/model.onnx")
base_model = BigBirdForMaskedLM.from_pretrained(model_ckpt)
tokenizer = BigBirdTokenizerFast.from_pretrained(model_ckpt)
onnx_inputs, onnx_outputs = export(tokenizer, base_model, onnx_config, onnx_config.default_onnx_opset, onnx_path)

I got the following warning.

/home/sypark/anaconda3/lib/python3.8/site-packages/transformers/models/big_bird/modeling_big_bird.py:2063: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if self.attention_type == "block_sparse" and seq_length <= max_tokens_to_attend:
Attention type 'block_sparse' is not possible if sequence_length: 8 <= num global tokens: 2 * config.block_size + min. num sliding tokens: 3 * config.block_size + config.num_random_blocks * config.block_size + additional buffer: config.num_random_blocks * config.block_size = 704 with config.block_size = 64, config.num_random_blocks = 3. Changing attention type to 'original_full'...
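
As far as I understand the warning, the dummy input used for tracing is only 8 tokens long, which is below the 704-token threshold (2 * 64 + 3 * 64 + 3 * 64 + 3 * 64 with config.block_size = 64 and config.num_random_blocks = 3), so BigBird switches to original_full attention while the graph is traced. One workaround I am considering is to pin the attention type before exporting; this is only a sketch, assuming the attention_type override in the config is honored at export time, and it gives up the sparse-attention speed-up:

# Sketch: force full attention so the traced graph does not depend on the
# block_sparse / original_full switch (assumption: the config override alone is enough).
config = AutoConfig.from_pretrained(model_ckpt, attention_type="original_full")
onnx_config = BigBirdForMaskedLMOnnxConfig(config, task="masked-lm")
base_model = BigBirdForMaskedLM.from_pretrained(model_ckpt, config=config)
onnx_inputs, onnx_outputs = export(tokenizer, base_model, onnx_config, onnx_config.default_onnx_opset, onnx_path)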

Also, the ONNX output values were quite different from those of the PyTorch model.
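
To quantify the mismatch, this is roughly how I compare the two models (a minimal sketch; it assumes onnxruntime is installed and uses a placeholder input string rather than my real data):

import numpy as np
import onnxruntime as ort
import torch

# Placeholder input; the real comparison uses my own data.
inputs = tokenizer("hello world", return_tensors="pt")

with torch.no_grad():
    pt_logits = base_model(**inputs).logits.numpy()

session = ort.InferenceSession(str(onnx_path))
# Only feed the inputs that the exported graph actually declares.
ort_names = {i.name for i in session.get_inputs()}
ort_inputs = {k: v.numpy() for k, v in inputs.items() if k in ort_names}
ort_logits = session.run(None, ort_inputs)[0]

print("max abs diff:", np.abs(pt_logits - ort_logits).max())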

How can I solve this problem?

Thanks in advance for your help.