Hello everyone!
I am a product manager working on matching algorithms for a marketplace in the health sector, and one question I often find myself asking is: are we collecting the right data points to automate parts of our matching process?
I recently started using the AutoTrain feature to answer this question. The idea is that if I can train a classifier that performs well on a given dataset, we are collecting the right data points. If I cannot, we need to evolve our product.
This approach has given very good results, but I would now like to go one step further and understand the generated models. Typical questions I have are:
- assuming the best AutoTrain model is a decision tree, how can I read its decision rules?
- assuming it's an XGBoost classifier, how can I get the feature importances that the model learned?
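To make these two questions concrete, here is a rough sketch of the kind of inspection I have in mind. It uses plain scikit-learn on a toy dataset, not an actual AutoTrain model: I am assuming (without having verified it) that the trained model can be loaded as a scikit-learn-style estimator. `GradientBoostingClassifier` is only a stand-in for XGBoost here; as far as I know, `XGBClassifier` exposes a `feature_importances_` attribute in the same way.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data standing in for the real matching dataset (assumption).
data = load_iris()
X, y = data.data, data.target
feature_names = list(data.feature_names)

# 1) Decision tree: dump the learned rules as readable if/else text.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
rules = export_text(tree, feature_names=feature_names)
print(rules)

# 2) Boosted trees (stand-in for XGBoost): rank features by learned importance.
gbm = GradientBoostingClassifier(random_state=0).fit(X, y)
importances = sorted(
    zip(feature_names, gbm.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in importances:
    print(f"{name}: {score:.3f}")
```

What I do not know is how to get from an AutoTrain result to an object I can inspect like this.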
Any guidance on how I can answer these questions would be much appreciated.
Thank you in advance!