Fine-tuning Transformer on CLEF dataset

lewtun · September 1, 2021, 8:59am

Hey @rasel277144, if I understand correctly you’d like to classify whether a user is “depressed” based on their posts?

In this case, you could concatenate all of a user’s posts into a single text and treat it as a standard text classification problem. Sylvain has created a nice tutorial for this task here.
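As a rough sketch of the concatenation step (not taken from the tutorial), assuming your data can be read as `(user_id, text)` pairs plus a `user_id -> label` mapping: the function names and the plain-space separator are just illustrative choices, so adapt them to however the CLEF files are actually laid out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_user_documents(
    posts: Iterable[Tuple[str, str]], sep: str = " "
) -> Dict[str, str]:
    """Join every post of a user into one string, keeping the original order.

    `posts` is assumed to yield (user_id, text) pairs; empty posts are skipped.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for user_id, text in posts:
        text = text.strip()
        if text:
            grouped[user_id].append(text)
    return {user_id: sep.join(texts) for user_id, texts in grouped.items()}


def to_examples(
    docs: Dict[str, str], labels: Dict[str, int]
) -> List[Tuple[str, int]]:
    """Pair each user's concatenated text with its label (hypothetical mapping)."""
    return [(docs[user_id], labels[user_id]) for user_id in docs if user_id in labels]
```

Each `(text, label)` pair can then be tokenized and fed to a sequence classification model like any other single-document example. Anything beyond the model's maximum length will be cut off when you tokenize with truncation.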

Having said that, you will probably run into limitations with the maximum context size of models like BERT (512 tokens, so typically just a few paragraphs). You might want to see if models like BigBird or Longformer can help, as their context size is 8x that of BERT (4,096 tokens). If that’s still not sufficient, you might want to adapt some of the suggestions in this thread to text classification (e.g. you could create an embedding for each user post, average the embeddings per user, and then use the averaged embedding as the input to a simple logistic regression classifier).
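To make the embedding-averaging idea concrete, here is a minimal, dependency-free sketch. `embed_post` is a hypothetical stand-in for whatever encoder you choose (anything that maps a string to a fixed-size list of floats), and the tiny gradient-descent logistic regression is only there to keep the example self-contained. In practice you would likely swap it for an off-the-shelf implementation such as scikit-learn's.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a user's per-post embeddings into one fixed-size vector."""
    if not vectors:
        raise ValueError("need at least one post embedding")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def user_features(
    posts_by_user: Dict[str, List[str]],
    embed_post: Callable[[str], Vector],  # hypothetical encoder, supplied by you
) -> Dict[str, Vector]:
    """Embed every post separately, then mean-pool per user."""
    return {
        user: mean_pool([embed_post(post) for post in posts])
        for user, posts in posts_by_user.items()
    }


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w: Vector, b: float, x: Vector) -> float:
    """Probability of the positive class for one user vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_logreg(
    X: Sequence[Vector], y: Sequence[int], lr: float = 0.5, epochs: int = 500
) -> Tuple[Vector, float]:
    """Binary logistic regression fitted with plain batch gradient descent."""
    dim, n = len(X[0]), len(X)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, target in zip(X, y):
            err = predict_proba(w, b, x) - target
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b
```

Because each post is embedded on its own, no single input has to fit a whole user history into the model's context window. The trade-off is that averaging throws away the order of the posts.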

PS I put “depressed” in quotes because I assume this is not a phenomenon we can hope to capture accurately from written text alone. I also suggest treading very carefully in this domain, as there are plenty of public examples where using NLP to diagnose patient well-being has led to bad outcomes.
