[Help Needed] Suicide Risk Detection from Long Clinical Notes (Few-shot + ClinicalBERT approaches struggling)

Hello HF community,

I’m a master’s student working on a clinical NLP project involving suicide risk classification from psychiatric patient records. I’d really appreciate any guidance on how to improve performance on this task.

Overview of the task:

• 114 records, each including:
  • Free-text doctor and nurse notes
  • Hospital name
  • Binary label: whether the patient later died by suicide (yes/no)
• Only 29 "yes" examples → highly imbalanced
• Notes are unstructured, long (up to 32k characters), and rich in psychiatric language

What I’ve tried:

• Concatenating the doctor and nurse texts
• Sliding-window chunking + aggregation (majority voting)
• Few-shot prompting with GPT-4
• Fine-tuning ClinicalBERT on the dataset
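For concreteness, my chunking + aggregation step looks roughly like the sketch below. It's a minimal toy version: the whitespace tokens stand in for the subword IDs a real tokenizer (e.g. ClinicalBERT's) would produce, and the window/stride values (512/256) are just the ones I've been using, not anything principled.

```python
from collections import Counter

def chunk_tokens(tokens, window=512, stride=256):
    """Split a token list into overlapping windows (sliding-window chunking)."""
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # last window reached the end of the note
        start += stride
    return chunks

def majority_vote(chunk_labels):
    """Aggregate per-chunk predictions into one record-level label."""
    return Counter(chunk_labels).most_common(1)[0][0]

# Toy example: 1200 placeholder tokens; in practice these would be
# tokenizer-produced subword IDs fed to the classifier chunk by chunk.
chunks = chunk_tokens(["tok"] * 1200, window=512, stride=256)
print(len(chunks))                          # 4 overlapping windows
print(majority_vote(["no", "yes", "no"]))   # "no"
```

One thing I suspect is hurting recall: majority voting lets many "no" chunks outvote a single "yes" chunk, so record-level positives get washed out. An any-positive rule (flag the record if any chunk is predicted "yes") or max-probability aggregation might trade precision for the recall I'm missing.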

Despite these efforts, recall on the "yes" class is consistently low. The models seem to struggle to recognize subtle suicidal patterns in long, complex, domain-specific text — especially under token limits.

I’d love input on:

• Handling long clinical texts with LLMs
• Boosting performance on the minority ("yes") class
• Experiences with BERT-style models or few-shot prompting in sensitive medical contexts
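On the imbalance point: one thing I'm considering is a class-weighted loss for the ClinicalBERT fine-tuning. Below is a minimal sketch of inverse-frequency weighting, using my actual label distribution (29 yes / 85 no); the helper name is just illustrative, and I'm not sure this is the best weighting scheme.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so the rare 'yes' class contributes more to the training loss."""
    counts = Counter(labels)
    n = len(labels)
    k = len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

# Label distribution from the dataset described above: 29 yes / 85 no.
labels = ["yes"] * 29 + ["no"] * 85
weights = inverse_frequency_weights(labels)
print(weights)  # "yes" gets ~1.97, "no" gets ~0.67

# In PyTorch these would then be passed to the loss, e.g.
#   torch.nn.CrossEntropyLoss(weight=torch.tensor([w_no, w_yes]))
# with the order matching the label-to-index mapping.
```

Would weighting like this be enough, or do people find oversampling / focal loss more effective at this dataset size?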

Happy to share sample data, code, or results if it helps. Thanks a lot!
