Text data labeling

Amit4454 · June 12, 2024, 7:14pm


Task: Classify customer emails into relevant categories based on their content.

Data: A DataFrame of customer emails. Each email describes a specific issue the customer encountered. My goal is to categorize these emails automatically. For example, an email about problems with food quality would be categorized as “Food Quality Issue,” while an email about payment difficulties would be categorized as “Payment Issue.”

The challenge is that the number of categories is not known in advance. Customers might contact us about a wide range of issues.

My plan is to address this challenge in two steps:

1. Data Labeling: I need to label a representative sample of emails from the DataFrame, assigning each email a category that accurately reflects the customer’s concern.
2. Classifier Model Training: Once I have a labeled dataset, I can use it to train a machine learning model. The model can then categorize new, unseen emails based on the patterns it learns from the labeled data.

I need an answer for the first part: how can I label the data?

My approach was to generate embeddings of the emails and then apply clustering, but the results were not good. Please help me solve this problem. Is there anything I should apply before generating the embeddings?
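
On the last question, one preprocessing step that may help before embedding is stripping greetings, sign-offs, signatures, and quoted replies, so that the embedding reflects only the complaint itself rather than boilerplate shared by every email. Below is a minimal sketch using only the standard library. The regex patterns, the placeholder tokens `<URL>` and `<EMAIL>`, and the function name `clean_email` are all assumptions for illustration and are not tuned to this dataset.

```python
import re

# All patterns below are illustrative assumptions; adjust them to the real emails.
SIGNATURE_START = re.compile(
    r"^(--|sent from my\b.*|(best |kind )?regards,?|thanks,?|thank you,?|cheers,?)$",
    re.IGNORECASE,
)
REPLY_HEADER = re.compile(r"^on .+ wrote:$", re.IGNORECASE)
GREETING = re.compile(r"^(hi|hello|dear)\b[^.?]{0,30}[,!:]$", re.IGNORECASE)
URL = re.compile(r"https?://\S+")
EMAIL_ADDR = re.compile(r"\S+@\S+\.\S+")


def clean_email(text: str) -> str:
    """Return the email body with common boilerplate removed."""
    kept = []
    for line in text.replace("\r\n", "\n").split("\n"):
        stripped = line.strip()
        if not stripped:
            continue
        # Treat a sign-off line as the start of the signature and stop there.
        if SIGNATURE_START.match(stripped):
            break
        # Skip quoted text from earlier messages in the thread.
        if stripped.startswith(">") or REPLY_HEADER.match(stripped):
            continue
        # Skip a short greeting if it is the first line of content.
        if not kept and GREETING.match(stripped):
            continue
        kept.append(stripped)

    body = " ".join(kept)
    # Replace URLs and addresses with placeholders so they do not dominate similarity.
    body = URL.sub("<URL>", body)
    body = EMAIL_ADDR.sub("<EMAIL>", body)
    # Collapse any remaining runs of whitespace.
    return re.sub(r"\s+", " ", body).strip()
```

Assuming the emails live in a hypothetical column named `body`, this could be applied with `df["clean"] = df["body"].map(clean_email)`, and empty or duplicate results dropped before generating embeddings and clustering.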
