Loading model from repository does not return expected result

I have trained a masked language model using RoBERTa on clinical data. The model is hosted on the Hugging Face Hub at tdobrxl/ClinicBERT.

Locally, the trained model works well for predicting a masked word with the following code:

from transformers import pipeline

# "ClinicBERT" is the local directory containing the trained model and tokenizer
fill_mask = pipeline("fill-mask", model="ClinicBERT", tokenizer="ClinicBERT")

For example:

text = "Decitabine Combined With Oxaliplatin in <mask> With Advanced Renal Cell Carcinoma"

fill_mask(text)
[{'score': 0.9417486190795898, 'token': 593, 'token_str': ' Patients', 'sequence': 'Decitabine Combined With Oxaliplatin in Patients With Advanced Renal Cell Carcinoma'}, {'score': 0.024623002856969833, 'token': 943, 'token_str': ' Subjects', 'sequence': 'Decitabine Combined With Oxaliplatin in Subjects With Advanced Renal Cell Carcinoma'}, {'score': 0.0076624322682619095, 'token': 2488, 'token_str': ' Participants', 'sequence': 'Decitabine Combined With Oxaliplatin in Participants With Advanced Renal Cell Carcinoma'}, {'score': 0.0044851102866232395, 'token': 3380, 'token_str': ' Children', 'sequence': 'Decitabine Combined With Oxaliplatin in Children With Advanced Renal Cell Carcinoma'}, {'score': 0.003453735029324889, 'token': 4756, 'token_str': ' Adults', 'sequence': 'Decitabine Combined With Oxaliplatin in Adults With Advanced Renal Cell Carcinoma'}]

However, when I load the model from my repository (i.e., tdobrxl/ClinicBERT), the predictions are essentially random:

fill_mask = pipeline("fill-mask", model="tdobrxl/ClinicBERT", tokenizer="tdobrxl/ClinicBERT")
text = "Decitabine Combined With Oxaliplatin in <mask> With Advanced Renal Cell Carcinoma"
fill_mask(text)

[{'score': 0.00024121406022459269, 'token': 13994, 'token_str': 'oproxil', 'sequence': 'Decitabine Combined With Oxaliplatin inoproxil With Advanced Renal Cell Carcinoma'}, {'score': 0.00023664682521484792, 'token': 15167, 'token_str': 'enecid', 'sequence': 'Decitabine Combined With Oxaliplatin inenecid With Advanced Renal Cell Carcinoma'}, {'score': 0.000197140165255405, 'token': 18398, 'token_str': ' enlarged', 'sequence': 'Decitabine Combined With Oxaliplatin in enlarged With Advanced Renal Cell Carcinoma'}, {'score': 0.0001816125150071457, 'token': 4308, 'token_str': ' edema', 'sequence': 'Decitabine Combined With Oxaliplatin in edema With Advanced Renal Cell Carcinoma'}, {'score': 0.0001676036190474406, 'token': 23309, 'token_str': 'ucal', 'sequence': 'Decitabine Combined With Oxaliplatin inucal With Advanced Renal Cell Carcinoma'}]

I noticed that loading the model from the repo prints this warning:

You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

How can I load the model from the repo and get the same predictions as locally?

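It turned out that when pushing to the Hub, I had loaded the model with RobertaModel instead of RobertaForMaskedLM. RobertaModel is the bare encoder without the language-modeling head, so the uploaded checkpoint did not contain the head weights. When the fill-mask pipeline loaded it, the head was randomly initialized, which explains both the warning and the random predictions.
Pushing the model with RobertaForMaskedLM solved the issue.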
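For anyone hitting the same problem, here is a minimal sketch of why the class matters. It builds a tiny, randomly initialized RoBERTa (the config sizes are arbitrary assumptions chosen so nothing is downloaded) and compares the saved weight names of the two classes. The commented-out lines at the end show roughly how the re-upload could look; the local path and repo name are taken from the question, and the exact call may differ depending on your setup.

```python
from transformers import RobertaConfig, RobertaForMaskedLM, RobertaModel

# Tiny config with arbitrary sizes (an assumption for illustration only),
# so the models are created locally without downloading anything.
config = RobertaConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=64,
)

base = RobertaModel(config)        # bare encoder, no masked-LM head
mlm = RobertaForMaskedLM(config)   # encoder plus masked-LM head

# Collect the weight names that belong to the masked-LM head in each model.
base_head_keys = [k for k in base.state_dict() if k.startswith("lm_head.")]
mlm_head_keys = [k for k in mlm.state_dict() if k.startswith("lm_head.")]

print("lm_head weights in RobertaModel:      ", len(base_head_keys))
print("lm_head weights in RobertaForMaskedLM:", len(mlm_head_keys))

# Sketch of the fix (not executed here; requires being logged in to the Hub):
# model = RobertaForMaskedLM.from_pretrained("ClinicBERT")
# model.push_to_hub("tdobrxl/ClinicBERT")
```

A checkpoint saved from RobertaModel has no lm_head weights at all, so anything that later needs the head has to initialize it from scratch.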