Need guidance on selecting a model and the required approach

Hi, I am following the Hugging Face course. In Chapter 5, The Datasets Library, at the end of the section “Time to slice and dice”, the following exercise is given:

1. Use the techniques from [Chapter 3](https://huggingface.co/course/chapter3) to train a classifier that can predict the patient condition based on the drug review.

My queries:

  1. The ‘patient condition’ feature has over 1000 unique labels in the dataset. Given this, to predict ‘patient condition’ from the ‘reviews’ feature, I think I should first convert these two features into tensors. Is that correct?

  2. I see this as a zero-shot classification problem. Is that correct?

  3. Further, which checkpoint/model should I select as the pre-trained model to train on the dataset? How do I pass the two tensors to the model?
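
To make query 1 concrete, here is a minimal sketch of what I have in mind for the label side: mapping each condition string to an integer id before anything becomes a tensor. The toy rows and the helper name `build_label_maps` are made up for illustration, and the column names `review` and `condition` are my assumption about how the dataset names these features.

```python
def build_label_maps(conditions):
    """Map each distinct condition string to an integer id, and back."""
    unique = sorted(set(conditions))  # sorted so the ids are reproducible
    label2id = {name: idx for idx, name in enumerate(unique)}
    id2label = {idx: name for name, idx in label2id.items()}
    return label2id, id2label


# Toy rows standing in for the real dataset (hypothetical examples).
rows = [
    {"review": "Helped with my headaches.", "condition": "migraine"},
    {"review": "Mood improved after two weeks.", "condition": "depression"},
    {"review": "Fewer attacks than before.", "condition": "migraine"},
]

label2id, id2label = build_label_maps(row["condition"] for row in rows)

# One integer label per review; this is what I would turn into a tensor.
labels = [label2id[row["condition"]] for row in rows]
```

If this is the right direction, my guess is that the reviews would go through the tokenizer and `len(label2id)` would be passed as `num_labels` to `AutoModelForSequenceClassification.from_pretrained`, but I would like confirmation.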

I would appreciate any inputs.