I have been tasked with tackling the following problem, and I would like to hear different ideas on how best to approach it.
I want to estimate, during a chat conversation, how likely it is that the transaction will be finalised. For example: the buyer asks “are there any scratches on the table?” and gets the response “no, there are no scratches, the table is brand new”; at that point the probability of finalising the transaction would be, say, 89%.
Chat data is available for the last month, all in Polish, with a flag indicating whether the transaction was completed. This label was collected by sending a custom closed yes/no question to both the buyer and the seller 48 hours after the conversation ended.
My plan is to preprocess the whole dialogue as a single text (stopword removal, lemmatisation) and turn it into TF-IDF features, including n-grams. The TF-IDF weights then indicate how strongly each word or n-gram is associated with completed versus non-completed transactions, and a classifier (Naive Bayes) fitted on these features outputs the probability of a transaction. One question is still open: should I use the whole dialogue up to a given point, or only the last 2, 4, … messages exchanged between the buyer and the seller?
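To make the plan concrete, here is a minimal, dependency-free sketch of it, under several assumptions. The names `window`, `ngrams` and `TfidfNaiveBayes` are hypothetical, the four toy dialogues are invented English stand-ins for the Polish data, and stopword removal and lemmatisation are left out. In practice a library such as scikit-learn (`TfidfVectorizer` plus `MultinomialNB`) would likely replace the hand-rolled class.

```python
import math
from collections import Counter, defaultdict


def window(messages, k=None):
    """Join the last k messages (or all of them if k is None) into one text."""
    selected = messages if k is None else messages[-k:]
    return " ".join(selected)


def ngrams(text, n_max=2):
    """Lowercased word n-grams for n = 1 .. n_max (whitespace tokenisation)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


class TfidfNaiveBayes:
    """Multinomial Naive Bayes over TF-IDF-weighted n-gram counts."""

    def fit(self, texts, labels, alpha=1.0):
        docs = [Counter(ngrams(t)) for t in texts]
        n_docs = len(docs)
        # Document frequency: in how many dialogues each n-gram occurs.
        df = Counter(term for d in docs for term in d)
        # Smoothed inverse document frequency.
        self.idf = {t: math.log((1 + n_docs) / (1 + c)) + 1
                    for t, c in df.items()}
        self.classes = sorted(set(labels))
        # Sum of TF-IDF weights per class and n-gram.
        weight = {c: defaultdict(float) for c in self.classes}
        for d, y in zip(docs, labels):
            for t, tf in d.items():
                weight[y][t] += tf * self.idf[t]
        self.log_prior = {c: math.log(labels.count(c) / n_docs)
                          for c in self.classes}
        self.log_lik = {}
        for c in self.classes:
            # Laplace smoothing with strength alpha over the vocabulary.
            total = sum(weight[c].values()) + alpha * len(self.idf)
            self.log_lik[c] = {t: math.log((weight[c][t] + alpha) / total)
                               for t in self.idf}
        return self

    def predict_proba(self, text):
        counts = Counter(ngrams(text))
        scores = {}
        for c in self.classes:
            s = self.log_prior[c]
            for t, tf in counts.items():
                if t in self.idf:  # n-grams unseen in training are ignored
                    s += tf * self.idf[t] * self.log_lik[c][t]
            scores[c] = s
        # Normalise the log scores into probabilities (softmax).
        top = max(scores.values())
        exp = {c: math.exp(s - top) for c, s in scores.items()}
        z = sum(exp.values())
        return {c: v / z for c, v in exp.items()}


# Invented toy data: (messages in order, 1 = transaction completed).
dialogues = [
    (["is the table still available", "yes it is", "great i will take it"], 1),
    (["can i pick it up tomorrow", "sure see you tomorrow"], 1),
    (["what is the lowest price", "price is fixed", "too expensive thanks"], 0),
    (["is it still available", "sorry already sold"], 0),
]

K = 2  # try None (whole dialogue), 2, 4, ... and compare on held-out data
texts = [window(msgs, k=K) for msgs, _ in dialogues]
labels = [y for _, y in dialogues]
model = TfidfNaiveBayes().fit(texts, labels)

proba = model.predict_proba(window(["ok", "great i will take it"], k=K))
print(f"P(transaction) = {proba[1]:.2f}")
```

The window size `K` is the knob for the open question: fitting one model per setting and comparing them on held-out conversations is one way to choose between the full dialogue and the last few messages. Naive Bayes probabilities also tend to be pushed towards 0 or 1, so they may need calibration before being read as real transaction probabilities.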
Looking forward to your thoughts on the topic. Thanks a lot in advance for your help.