BERT for medical information extraction

naccib · November 3, 2022, 8:35pm

Hello, HuggingFace community. I have unstructured lab reports that contain the value of each test result. For example, this is a report containing the test results for magnesium (MAGNESIO, 2,0), potassium (POTASSIO, 4,9) and sodium (SODIO, 137).

MAGNESIO\nMaterial de Coleta: SORO\nMétodo: Clorofosfonazo IlI\nReferência\nResultado:\n2,0\nmg/dL\n1 1,7 a 2,5 mg/dL\nPOTASSIO\nMaterial de Coleta: SORO\nMétodo: Eletrodo Seletivo\nReferência\nResultado\n4,9\nmEq/L\n1 3,5 a 5,1 mEq/L\nSODIO\nMaterial de Coleta: SORO\nMétodo: Eletrodo Seletivo\nReferência\nResultado\n137\nmEq/L\n/ 135 a 145 mEq/L\n
(Test name and result annotated for ease of reading)

I would like to use a BERT-like model to extract this information into a structure similar to:

{
   "magnesium": "2,0",
   "potassium": "4,9",
   "sodium": "137"
}

Since my inputs are in Portuguese, I figured BERTimbau would be a good base model. Is BERT an appropriate way to solve this problem? How would I go about annotating my training data and setting up the model for training?
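
To make the annotation question concrete, here is a minimal sketch of one possible framing: treating the task as token classification (NER-style) with BIO labels. Everything named here is an assumption for illustration only: the label set (`B-TEST`, `B-VALUE`, `O`), the `TEST_NAMES` mapping, and the helper functions `annotate` and `to_record` are hypothetical, and the labels are produced by a simple rule (a number on the line right after "Resultado") rather than by hand. It uses plain Python and whitespace tokens; a real BERT setup would also need to align these labels to subword tokens.

```python
import re

# The sample report from the post, with "\n" written as real newlines.
REPORT = (
    "MAGNESIO\nMaterial de Coleta: SORO\nMétodo: Clorofosfonazo IlI\n"
    "Referência\nResultado:\n2,0\nmg/dL\n1 1,7 a 2,5 mg/dL\n"
    "POTASSIO\nMaterial de Coleta: SORO\nMétodo: Eletrodo Seletivo\n"
    "Referência\nResultado\n4,9\nmEq/L\n1 3,5 a 5,1 mEq/L\n"
    "SODIO\nMaterial de Coleta: SORO\nMétodo: Eletrodo Seletivo\n"
    "Referência\nResultado\n137\nmEq/L\n/ 135 a 145 mEq/L\n"
)

# Hypothetical mapping from the test names in the report to output keys.
TEST_NAMES = {"MAGNESIO": "magnesium", "POTASSIO": "potassium", "SODIO": "sodium"}


def annotate(report):
    """Return (token, label) pairs using an assumed B-TEST / B-VALUE / O label set."""
    pairs = []
    previous_line = ""
    for line in report.splitlines():
        stripped = line.strip()
        for token in stripped.split():
            if stripped in TEST_NAMES:
                label = "B-TEST"
            elif (previous_line.rstrip(":") == "Resultado"
                  and re.fullmatch(r"\d+(,\d+)?", stripped)):
                # A bare number on the line right after "Resultado" is the value.
                label = "B-VALUE"
            else:
                label = "O"
            pairs.append((token, label))
        previous_line = stripped
    return pairs


def to_record(pairs):
    """Group labeled tokens into {test: value}, pairing each value with the last test seen."""
    record = {}
    current = None
    for token, label in pairs:
        if label == "B-TEST":
            current = TEST_NAMES[token]
        elif label == "B-VALUE" and current is not None:
            record[current] = token
    return record


if __name__ == "__main__":
    print(to_record(annotate(REPORT)))
```

Pairs in this `(token, label)` shape are roughly what a token-classification fine-tuning dataset would contain, and `to_record` stands in for the post-processing that would turn model predictions back into the target structure. The rule-based labeler is only a way to bootstrap or sanity-check annotations, not a substitute for reviewed labels.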
