Named Entity Recognition in medical notes

Hi all,

I am really new to NLP and I am starting a project to extract features from medical notes for specific types of cancer. I would like to use a model such as BioBERT or ClinicalBERT and fine-tune it to extract medical information. However, I do not know how to prepare a dataset for Named Entity Recognition. Could you give me some advice?