Hello,
I’m currently working on a research project whose main aim is to generate synthetic clinical text, using the MIMIC-III dataset as a reference. I’m particularly interested in the NOTEEVENTS table, which contains free text such as discharge summaries, radiology reports, and more.
I thought it would be worthwhile to investigate how an encoder-decoder model such as FLAN-T5 could be used to generate this text. However, I’m not sure whether the task is feasible, as I’m still a beginner in this area. I noticed that FLAN-T5 is typically used for text summarization and translation, which differs from what I’m trying to do.
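To make the idea concrete, here is a minimal sketch of how I imagine framing the notes as input/target pairs for a sequence-to-sequence model. It assumes each note is a dictionary with the NOTEEVENTS `CATEGORY` and `TEXT` columns. The function name `build_pairs`, the prompt wording, and the `max_chars` cutoff are placeholders I made up, and the sample rows are invented rather than real MIMIC data. I’m not sure this is the right way to set it up.

```python
from typing import Dict, List


def build_pairs(notes: List[Dict[str, str]], max_chars: int = 2000) -> List[Dict[str, str]]:
    """Turn note rows into (input_text, target_text) pairs for seq2seq fine-tuning.

    The input is a short prompt naming the note category; the target is the
    note text itself, truncated to max_chars characters (an arbitrary cutoff).
    """
    pairs = []
    for note in notes:
        category = note["CATEGORY"].strip()
        text = note["TEXT"].strip()
        if not text:
            # Skip rows with no usable free text.
            continue
        pairs.append({
            "input_text": f"Generate a synthetic clinical note. Category: {category}",
            "target_text": text[:max_chars],
        })
    return pairs


# Invented placeholder rows, NOT real MIMIC-III data.
sample_notes = [
    {"CATEGORY": "Discharge summary", "TEXT": "Placeholder discharge summary text."},
    {"CATEGORY": "Radiology", "TEXT": "   "},
    {"CATEGORY": "Radiology", "TEXT": "Placeholder radiology report text."},
]

pairs = build_pairs(sample_notes)
for pair in pairs:
    print(pair["input_text"], "->", pair["target_text"])
```

The pairs would then presumably be tokenized and passed to a standard sequence-to-sequence fine-tuning loop, with the category prompt as the encoder input and the note as the decoder target.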
So my question is: can a FLAN-T5 model be fine-tuned on MIMIC clinical text to generate synthetic clinical text? If so, how difficult would that be, and are there any tutorials that could get me started?
Thank you in advance.