Hi,
I have created and trained a tokenizer, which I then loaded using
tokenizer = PreTrainedTokenizerFast.from_pretrained('bert-base-he')
I then created a model, starting from
model = BertForPreTraining.from_pretrained('bert-base-uncased')
I trained the model with both the MLM and the NSP objectives, and everything worked fine, so I saved the model to my Google Drive.
I now want to use the model, but I get the following error: KeyError: 'logits'
This is the code I use:
# Initialize MLM pipeline
import torch
from transformers import pipeline, PreTrainedTokenizerFast
mlm = pipeline(
    "fill-mask",
    model=torch.load('BHSA01.pt', map_location=torch.device('cpu')),  # it seems it wants just the model name in .json format
    tokenizer=PreTrainedTokenizerFast.from_pretrained('bert-base-he'),
)
# Get mask token
mask = mlm.tokenizer.mask_token
# Get results for a particular masked phrase
phrase = f'{mask} בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַ שָּׁמַ֖יִם וְ אֵ֥ת הָ'
result = mlm(phrase)
# Print result
print(result)
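A possible explanation, offered as a sketch and not a confirmed diagnosis of the saved checkpoint: BertForPreTraining returns its MLM scores under "prediction_logits" (plus "seq_relationship_logits" for NSP), while the fill-mask pipeline appears to look up a key called "logits", which is what BertForMaskedLM returns. The snippet below uses a tiny, randomly initialised config (the sizes and input ids are hypothetical, chosen only so that nothing needs to be downloaded) to compare the output keys of the two classes, and then copies the shared weights into an MLM-only model.

```python
import torch
from transformers import BertConfig, BertForMaskedLM, BertForPreTraining

# Tiny random-weight config (hypothetical sizes); stands in for the real trained model.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)

pretraining_model = BertForPreTraining(config).eval()
input_ids = torch.tensor([[1, 5, 7, 2]])  # arbitrary token ids for illustration

with torch.no_grad():
    pretraining_out = pretraining_model(input_ids=input_ids)

# The pre-training head exposes "prediction_logits" and
# "seq_relationship_logits"; there is no "logits" entry.
pretraining_keys = list(pretraining_out.keys())
print(pretraining_keys)

# Copy the weights into an MLM-only model. strict=False lets the load
# skip parameters the MLM model does not have (e.g. the NSP head).
mlm_model = BertForMaskedLM(config).eval()
mlm_model.load_state_dict(pretraining_model.state_dict(), strict=False)

with torch.no_grad():
    mlm_out = mlm_model(input_ids=input_ids)

# The MLM-only model returns its scores under "logits".
mlm_keys = list(mlm_out.keys())
print(mlm_keys)
```

If this is indeed the cause, passing a BertForMaskedLM built this way from the trained weights to pipeline("fill-mask", ...) in place of the BertForPreTraining object might avoid the KeyError. I have not verified this against BHSA01.pt.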
Just one last note: Biblical Hebrew is written right to left. Can that create a problem?
Thanks a lot for your help!
Elia