Topic classification: is zero-shot the way?

Neuroinformatica · August 12, 2021, 9:35am

Hello there

I need to extract information from Italian medical records. Since these documents are fairly long (2-3k words), my idea is to split them into subsections and then classify the topic of each subsection. This would make me more confident in the extraction step (i.e. that I am extracting a specific piece of information from the correct subsection).

I was wondering whether zero-shot classification is the best tool for this. Moreover, since the content of the records is very specific (exam results, medical reports and so on), I think I will need some kind of fine-tuning.
If this is correct, how could I do that? And approximately how much data would I need?
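
For concreteness, the split-then-classify idea could be sketched roughly as below. This is only an illustration under assumptions, not a tested solution: the topic labels, the Italian hypothesis template, the 200-word chunk size and the helper names (`split_into_subsections`, `build_hf_classifier`, `classify_subsections`) are all made up for the example, and the model name is just one example of a multilingual NLI checkpoint whose quality on Italian clinical text I have not checked. The only library call used is the `zero-shot-classification` pipeline from `transformers`.

```python
import re
from typing import Callable, Dict, List

# Hypothetical topic labels: replace with the sections your records really contain.
CANDIDATE_LABELS = ["esami di laboratorio", "referto medico", "terapia farmacologica"]


def split_into_subsections(record: str, max_words: int = 200) -> List[str]:
    """Split a record on blank lines, then cap every chunk at max_words words."""
    chunks: List[str] = []
    for paragraph in re.split(r"\n\s*\n", record.strip()):
        words = paragraph.split()
        for start in range(0, len(words), max_words):
            chunks.append(" ".join(words[start:start + max_words]))
    return chunks


def build_hf_classifier(model_name: str = "joeddav/xlm-roberta-large-xnli"):
    """Build a zero-shot pipeline. The model name is an assumption, not a recommendation."""
    from transformers import pipeline  # imported lazily so the rest runs without it
    return pipeline("zero-shot-classification", model=model_name)


def classify_subsections(
    chunks: List[str],
    classifier: Callable[..., Dict],
    labels: List[str] = CANDIDATE_LABELS,
    hypothesis_template: str = "Questo testo riguarda {}.",  # assumed Italian template
) -> List[Dict]:
    """Keep the top-scoring label for each chunk (the pipeline sorts labels by score)."""
    results = []
    for chunk in chunks:
        out = classifier(
            chunk,
            candidate_labels=labels,
            hypothesis_template=hypothesis_template,
        )
        results.append(
            {"text": chunk, "label": out["labels"][0], "score": out["scores"][0]}
        )
    return results
```

With `transformers` installed, this would be called as `classify_subsections(split_into_subsections(record_text), build_hf_classifier())`. Low top scores on the resulting subsections would be one sign that zero-shot alone is not enough and that fine-tuning on labelled subsections is needed.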

Thanks a lot
