I have been exploring scikit-learn, making decision trees with both entropy and gini splitting criteria, and exploring the differences.
My question is: how can I "open the hood" and find out exactly which attributes the trees are splitting on at each level, along with their associated information values, so I can see where the two criteria make different choices?
So far, I have explored the 9 methods outlined in the documentation. They don't appear to allow access to this information. But surely this information is accessible? I'm envisioning a list or dict that has entries for node and gain.
Thanks for your help and my apologies if I've missed something completely obvious.
Best Answer
Directly from the documentation ( http://scikit-learn.org/0.12/modules/tree.html ):
There is also the tree_ attribute in your decision tree object, which allows direct access to the whole structure, and you can simply read it.
For more details, look at the source code of the export method.
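As a rough illustration of reading tree_ directly, here is a minimal sketch that lists each split with its feature, threshold, impurity and gain, which is roughly the "list or dict" the question asks for. It assumes a recent scikit-learn release, where tree_ exposes the arrays children_left, children_right, feature, threshold, impurity and weighted_n_node_samples (the 0.12 release linked above may name things differently). The helper describe_splits and the use of the iris dataset are my own choices for illustration, not something from the documentation.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def describe_splits(clf, feature_names):
    """Hypothetical helper: one dict per internal (splitting) node."""
    t = clf.tree_
    n = t.weighted_n_node_samples
    rows = []
    for node in range(t.node_count):
        left, right = t.children_left[node], t.children_right[node]
        if left == right:  # leaves have both children set to -1
            continue
        # Gain = parent impurity minus the sample-weighted child impurities.
        children = (n[left] * t.impurity[left] + n[right] * t.impurity[right]) / n[node]
        rows.append({
            "node": node,
            "feature": feature_names[t.feature[node]],
            "threshold": float(t.threshold[node]),
            "impurity": float(t.impurity[node]),
            "gain": float(t.impurity[node] - children),
        })
    return rows


iris = load_iris()
splits = {}
for criterion in ("gini", "entropy"):
    clf = DecisionTreeClassifier(criterion=criterion, max_depth=3, random_state=0)
    clf.fit(iris.data, iris.target)
    splits[criterion] = describe_splits(clf, iris.feature_names)
    print(criterion)
    for row in splits[criterion]:
        print("  ", row)
```

Comparing splits["gini"] with splits["entropy"] node by node should show where the two criteria pick different features or thresholds. Note that the impurity values are on different scales (Gini index versus entropy), so the gains are not directly comparable between the two trees.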
In general, you can use the inspect module to get all of the object's elements.
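For example, something along these lines should list what the fitted tree object exposes. This is only a sketch: it assumes a recent scikit-learn release, and the filter (public, non-callable members) is my own choice to keep the output short.

```python
import inspect

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Collect the public, non-callable members of the low-level tree object.
public_data = {
    name: value
    for name, value in inspect.getmembers(clf.tree_)
    if not name.startswith("_") and not callable(value)
}
for name, value in public_data.items():
    print(name, type(value).__name__)
```

The same call on clf itself (instead of clf.tree_) lists the estimator's own attributes and methods.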