Using transformers to flag specific complaints

Our company recently launched a new tactical solution: we emailed customers who left us within 6 months, asking them to return their pedometers to our stores. Looking at a random sample of 50 complaints, I can see that some of these customers had already returned their pedometers and we missed that. Customers don’t use any specific wording to say they returned the pedometers. I ran the 50 samples through ChatGPT, and it correctly inferred whether each customer was claiming to have returned the pedometer (15 complaints were from customers who had returned their pedometers; the rest were about other problems or products). I want to do the same with transformers but can’t get it working.

I used pipeline to classify the complaints. It didn’t work properly and flagged only one of the 15 relevant complaints. I also tried T5 large, but it detected only 5 of the 15 (I’m a beginner and used the default parameters). I don’t want to jump to fine-tuning before trying existing models, since fine-tuning would mean building a labelled dataset for this case. I’d appreciate it if you could share your thoughts on this.
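
For reference, here is a minimal sketch of one zero-shot setup that could be tried before fine-tuning. It is not the code behind the numbers above: the post does not say which pipeline task or checkpoint was used, so the `zero-shot-classification` task, the `facebook/bart-large-mnli` checkpoint, the label wording, the hypothesis template, and the 0.5 threshold are all assumptions, and the helper names are hypothetical. The idea is to phrase each label as a full statement, score it against the complaint, and measure recall against the 15 complaints already identified.

```python
from typing import Callable, List

# Hypothetical label wording (an assumption, not from the original post).
# NLI-based zero-shot models are often sensitive to how labels are phrased.
RETURNED = "the customer says they already returned the pedometer"
OTHER = "the complaint is about something else"


def load_zero_shot_classifier(model_name: str = "facebook/bart-large-mnli"):
    """Load a zero-shot pipeline; the checkpoint name is an assumed choice."""
    # Imported lazily so the helpers below can be used without transformers.
    from transformers import pipeline
    return pipeline("zero-shot-classification", model=model_name)


def flag_returned(complaints: List[str], classifier: Callable,
                  threshold: float = 0.5) -> List[bool]:
    """Flag complaints whose 'returned' label scores at or above the threshold."""
    flags = []
    for text in complaints:
        result = classifier(
            text,
            candidate_labels=[RETURNED, OTHER],
            hypothesis_template="In this complaint, {}.",
        )
        # The pipeline returns labels sorted by score, so look the score up by name.
        scores = dict(zip(result["labels"], result["scores"]))
        flags.append(scores[RETURNED] >= threshold)
    return flags


def recall(flags: List[bool], truth: List[bool]) -> float:
    """Fraction of truly relevant complaints that were flagged."""
    positives = sum(truth)
    if positives == 0:
        return 0.0
    hits = sum(1 for f, t in zip(flags, truth) if f and t)
    return hits / positives
```

Calling `flag_returned(complaints, load_zero_shot_classifier())` on the 50 samples and passing the result to `recall` together with the ChatGPT-derived labels would give a number directly comparable to the 1/15 and 5/15 reported above. Varying the label wording and the threshold is cheap to try and needs no labelled training data.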