Hello everyone,
I am very new to the topic, so sorry if this question is obvious.
I’d like to start working on this task (Chapter 5 - Time to slice and dice):
- Use the techniques from Chapter 3 to train a classifier that can predict the patient condition based on the drug review.
Since the label (patient condition) is also a string (I think there are 819 unique conditions), what would be the best approach? I was thinking about tokenizing this field and then using a seq2seq model, or maybe assigning a number to each unique condition.
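To make the second option (one number per unique condition) concrete, here is a minimal sketch of what I have in mind. The sample values and the names `conditions`, `label2id`, and `id2label` are just placeholders I made up for illustration, not taken from the course or the dataset:

```python
# Hypothetical sample of condition strings; in practice these would come
# from the condition column of the drug review dataset.
conditions = ["Depression", "Birth Control", "Depression", "Acne", "Birth Control"]

# Sort the unique values so the same condition always gets the same id.
unique_conditions = sorted(set(conditions))

# Map each condition string to an integer id, and keep the reverse mapping
# so predictions can be turned back into readable names.
label2id = {name: idx for idx, name in enumerate(unique_conditions)}
id2label = {idx: name for name, idx in label2id.items()}

# Integer labels, one per review, usable as classification targets.
labels = [label2id[c] for c in conditions]

print(label2id)
print(labels)
```

If I understand correctly, the number of classes (`len(label2id)`) would then be passed as `num_labels` when loading a sequence classification model, as in Chapter 3, but I am not sure whether this scales well to around 819 classes.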
Thanks for the great course!