Autotrain: explain generated models

Hello everyone!

I am a product manager working on matching algorithms for a marketplace in the health sector, and one question I often ask myself is: do we collect the right data points to automate parts of our matching process?

I recently started using the Autotrain feature to answer this question. The idea is that if I can train a classifier that performs well on a given dataset, we are collecting the right data points; if not, we need to evolve our product.

This approach has given very good results, but I would now like to go one step further and understand the generated models. Typical questions I have are:

  • assuming the best autotrained model is a decision tree, how can I understand its decision rules?
  • assuming it’s an XGBoost classifier, how can I see the feature importances the model learned?
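
To make these two questions concrete, here is roughly what I would like to be able to do, sketched with scikit-learn on synthetic data. This is only an illustration: I am assuming I could get hold of the trained estimator as a Python object, which I have not confirmed Autotrain allows, and the feature names are made up. As far as I know, XGBoost's scikit-learn wrapper exposes a similar `feature_importances_` attribute.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for the marketplace data; the feature names are hypothetical.
feature_names = ["distance_km", "specialty_match", "availability", "price"]
X, y = make_classification(
    n_samples=500, n_features=4, n_informative=2, n_redundant=0, random_state=0
)

# A shallow tree keeps the printed rules short enough to read.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Question 1: human-readable if/else rules learned by the tree.
rules = export_text(tree, feature_names=feature_names)
print(rules)

# Question 2: impurity-based importance per feature (the values sum to 1).
importances = dict(zip(feature_names, tree.feature_importances_))
for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

Something equivalent for whichever model Autotrain selects is what I am after.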

Any guidance on how I can answer these questions would be much appreciated :pray:

Thank you in advance!