Get UnicodeEncodeError while using pipeline for question answering

Hello everyone,
I need some help. I fine-tuned a question answering model with PyTorch.
First, I prepared the SQuAD dataset using the following code:

import os
import json

squad_dir = './data/squad'
with open(os.path.join(squad_dir, 'train-v2.0.json'), 'rb') as f:
    squad = json.load(f)

# initialize list where we will place all of our data
new_squad = []

# we need to loop through groups → paragraphs → qa_pairs
for group in squad['data']:
    for paragraph in group['paragraphs']:
        # we pull out the context from here
        context = paragraph['context']
        for qa_pair in paragraph['qas']:
            # we pull out the question
            question = qa_pair['question']
            # now the logic to check if we have 'answers' or 'plausible_answers'
            match qa_pair:
                case {'answers': [{'text': answer}]}:
                    pass  # the pattern itself binds `answer`
                case {'plausible_answers': [{'text': answer}]}:
                    pass  # the pattern itself binds `answer`
                case _:
                    answer = None
            # append the data to our list
            new_squad.append({
                'question': question,
                'answer': answer,
                'context': context
            })

with open('new_train4.json', 'w', encoding='utf-8') as f:
    json.dump(new_squad, f)

Next, I tried to use the model with the following code:

import json

with open('new_train4.json', 'r', encoding='utf-8') as f:
    squad = json.load(f)

from transformers import BertTokenizer, BertForQuestionAnswering

modelname = 'deepset/bert-base-cased-squad2'

#modelname = 'dbmdz/bert-large-cased-finetuned-conll03-english'

tokenizer = BertTokenizer.from_pretrained(modelname)
model = BertForQuestionAnswering.from_pretrained(modelname)

from transformers import pipeline

qa = pipeline('question-answering', model=model, tokenizer=tokenizer)

qa({'question': squad[0]['question'], 'context': squad[0]['context']})

But the code does not work correctly, and I get an error at the qa call:
UnicodeEncodeError: 'charmap' codec can't encode characters in position return codecs.charmap_encode(input,self.errors,encoding_table)[0]
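
In case it helps narrow things down, here is a minimal sketch of how this kind of message can arise. It is based on an assumption I have not confirmed for my setup: that some text containing a character outside a Windows code page is being encoded with that code page. The sample string and the `try_encode` helper are hypothetical, and `cp1252` is only an example of a charmap-based codec.

```python
# Hypothetical sample text; U+2192 (right arrow) has no mapping in cp1252.
sample = "groups \u2192 paragraphs"


def try_encode(text, encoding):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as err:
        return str(err)


# cp1252 is a charmap-based codec, so the failure mentions 'charmap'.
cp1252_result = try_encode(sample, "cp1252")
# UTF-8 can represent every Unicode character, so this succeeds.
utf8_result = try_encode(sample, "utf-8")

print(cp1252_result)
print(utf8_result)
```

If this is what is happening, the failing step would be wherever text is written or printed without an explicit UTF-8 encoding, but I do not know which step that is in my case.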

Is there a way to solve this error? Thanks.