Sales transcript training

We just got started with AI and LLMs, and would like to get some feedback/directions.

We have thousands of text transcripts between our salespeople and customers. Once in a while we go through some of these manually and grade them on a few criteria, e.g. did the salesperson greet the customer properly, did the salesperson offer any additional products, etc. We have about a thousand transcripts graded this way, each with a detailed explanation for the grade.

We would like to use AI for this purpose and use the graded transcripts for training initially.

I assume this is a common use case for AI.

What is a cost-effective approach for this?

Thanks

Hi!
You might not need a full LLM for this task; there are more efficient and cost-effective approaches. If your goal is structured evaluation, TF-IDF with a classifier like Random Forest or Logistic Regression can be a simple and scalable solution. TF-IDF converts text into numerical vectors based on word frequency, so it works well when keyword presence is a strong indicator of the grade, but it doesn’t capture meaning beyond word presence and struggles with nuanced language understanding.
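
As a rough illustration of the TF-IDF route, here is a minimal sketch using scikit-learn. The toy transcripts, the `offered_upsell` labels, and the `train_criterion_classifier` helper are all made up for this example; in practice you would train one such model per grading criterion on your graded transcripts and hold some of them out for evaluation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: one 0/1 grade per transcript for a single criterion
# ("did the salesperson offer additional products?").
transcripts = [
    "Hello, welcome! Would you also like to add a warranty to your order?",
    "Hi there. We also have a matching case, would you like one?",
    "Good morning! Can I interest you in our premium plan as well?",
    "Thanks for calling. I can also offer you an extended warranty.",
    "Hello. Your order is confirmed. Goodbye.",
    "Hi. The item ships tomorrow. Bye.",
    "Good morning. Your refund was processed. Have a nice day.",
    "Thanks for calling. The invoice was sent to your email.",
]
offered_upsell = [1, 1, 1, 1, 0, 0, 0, 0]


def train_criterion_classifier(texts, labels):
    """Fit a TF-IDF + Logistic Regression pipeline for one grading criterion."""
    model = make_pipeline(
        # Unigrams and bigrams, so short phrases like "would you" count too.
        TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),
        LogisticRegression(max_iter=1000),
    )
    model.fit(texts, labels)
    return model


model = train_criterion_classifier(transcripts, offered_upsell)

new_transcript = "Hello! Would you like to add an extended warranty as well?"
# Probability that the criterion is met (class 1) for the new transcript.
probability = model.predict_proba([new_transcript])[0, 1]
print(f"P(offered additional products) = {probability:.2f}")
```

With about a thousand graded transcripts this trains in seconds on a laptop, which makes it a cheap baseline to compare anything fancier against.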

For a deeper understanding, Word Embeddings (Word2Vec, FastText, GloVe) represent words as dense numerical vectors, helping classification models detect patterns beyond exact word matches. If context and sentence relationships matter, Sentence Embeddings (SBERT, OpenAI Embeddings) provide a more nuanced representation.
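
To show why embeddings can match text that shares no exact words, here is a small self-contained sketch. The 3-dimensional vectors in `TOY_VECTORS` are invented purely for illustration; real vectors would come from a pretrained model (Word2Vec, FastText, GloVe, or a sentence-embedding model) and have hundreds of dimensions. Averaging word vectors is only one simple way to get a transcript-level vector.

```python
import math

# Made-up vectors for illustration only, not taken from any real model.
TOY_VECTORS = {
    "warranty": [0.9, 0.1, 0.0],
    "guarantee": [0.8, 0.2, 0.1],
    "hello": [0.0, 0.9, 0.1],
    "hi": [0.2, 0.7, 0.1],
    "refund": [0.1, 0.0, 0.9],
}


def embed(text):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


a = embed("hello warranty")
b = embed("hi guarantee")  # no words in common with a
c = embed("refund")

print("similar meaning, different words:", round(cosine(a, b), 3))
print("different meaning:", round(cosine(a, c), 3))
```

The resulting vectors can be fed to the same kind of classifier you would use with TF-IDF features.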

If you’re also interested in uncovering patterns beyond predefined criteria, Topic Modeling (LDA, BERTopic) can automatically identify recurring themes in sales conversations, offering valuable insights you might not have explicitly defined. A balanced approach could combine classification for structured grading with topic modeling for broader insights.

Hope this helps!

Thanks, very helpful. I think some of the criteria will require language understanding beyond keywords. Not sure if we need to go beyond word embeddings yet, but I will test all those.
