We are trying to fine-tune the T5-small model from the transformers library to output alphanumeric ICD-10 codes from raw-text cause-of-death descriptions, as in the example below.
“cardiac arrest metabolic acidosis end stage renal disease type 2 diabetes” → “E117 E111 I469 N185”
We have previously fine-tuned a BERT model to predict a single ICD-10 code per record (multi-class classification) from the same kind of raw cause-of-death text.
The main issue we are having is getting the data in the right shape for training. Any advice would be greatly appreciated!
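To make the question concrete, here is a minimal sketch of the shape we understand each sequence-to-sequence training example should take: padded `input_ids` with an `attention_mask` for the source text, and `labels` for the target codes with padded positions set to -100 so the loss skips them. The whitespace tokenizer, the helper names (`build_vocab`, `encode`, `make_example`), and the maximum lengths are all hypothetical stand-ins chosen for illustration; they are not the T5 tokenizer. As far as we can tell, in transformers the same result would come from calling the tokenizer with `text_target=` and batching with `DataCollatorForSeq2Seq`, but we have not confirmed that this is the recommended route.

```python
# Toy stand-in for a real subword tokenizer (hypothetical; NOT the T5 tokenizer).
PAD_ID = 0
EOS_ID = 1
IGNORE_INDEX = -100  # label value that PyTorch cross-entropy ignores by default


def build_vocab(texts):
    """Assign an integer id to every whitespace-separated token."""
    vocab = {"<pad>": PAD_ID, "</s>": EOS_ID}
    for text in texts:
        for token in text.split():
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(text, vocab, max_length):
    """Map tokens to ids, append EOS, truncate, and pad to max_length."""
    ids = [vocab[t] for t in text.split()][: max_length - 1] + [EOS_ID]
    mask = [1] * len(ids)
    pad = max_length - len(ids)
    return ids + [PAD_ID] * pad, mask + [0] * pad


def make_example(source, target, vocab, max_source_len=16, max_target_len=8):
    """Build one training example; padded label positions become -100."""
    input_ids, attention_mask = encode(source, vocab, max_source_len)
    target_ids, target_mask = encode(target, vocab, max_target_len)
    labels = [
        tid if keep == 1 else IGNORE_INDEX
        for tid, keep in zip(target_ids, target_mask)
    ]
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "labels": labels,
    }


# The single pair from the example above; a real dataset would have many rows.
pairs = [
    (
        "cardiac arrest metabolic acidosis end stage renal disease type 2 diabetes",
        "E117 E111 I469 N185",
    ),
]
vocab = build_vocab([text for pair in pairs for text in pair])
example = make_example(*pairs[0], vocab)
```

Each row is a dictionary of equal-length integer lists, so a batch can be stacked directly into tensors. Is this the right target layout for T5, with the space-separated codes treated as one output string?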