Decision Tree


Decision trees are tree-like graphs that model a decision.

Decision trees are commonly learned by recursively splitting the set of training instances into subsets based on the instances' values for the explanatory variables.
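The splitting step can be sketched in plain Python: for each explanatory variable, try each candidate threshold and keep the split that most reduces the weighted Gini impurity of the resulting subsets. This is a minimal illustration of the idea, not scikit-learn's implementation; the function names are made up for this sketch.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a collection of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(instances, labels):
    """Return the (feature index, threshold) pair that most reduces
    weighted Gini impurity, or None if no split improves on the parent.
    instances is a list of feature tuples; labels the class labels."""
    best = None
    best_impurity = gini(labels)
    n = len(labels)
    for f in range(len(instances[0])):
        for threshold in sorted({x[f] for x in instances}):
            left = [y for x, y in zip(instances, labels) if x[f] <= threshold]
            right = [y for x, y in zip(instances, labels) if x[f] > threshold]
            if not left or not right:
                continue  # degenerate split: one side is empty
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
            if weighted < best_impurity:
                best_impurity = weighted
                best = (f, threshold)
    return best

# Toy training set: feature 0 separates the classes perfectly at <= 1.
X = [(0, 5), (1, 3), (2, 4), (3, 1)]
y = ['a', 'a', 'b', 'b']
print(best_split(X, y))  # → (0, 1)
```

A full learner would apply `best_split` recursively to each subset until the subsets are pure or some stopping criterion is met.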

Decision trees are easy to use. Unlike many learning algorithms, decision trees do not require the data to have zero mean and unit variance. While decision trees can tolerate missing values for explanatory variables, scikit-learn's current implementation cannot. Decision trees can even learn to ignore explanatory variables that are not relevant to the task.
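Both properties are easy to demonstrate with scikit-learn's `DecisionTreeClassifier`. In this sketch (the synthetic data is made up for illustration), one feature determines the class and the other is irrelevant noise on a wildly different scale; the tree fits the raw, unscaled data and assigns essentially all of its feature importance to the informative variable.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Feature 0 determines the class; feature 1 is irrelevant noise on a
# much larger scale. No centering or scaling is applied.
X = np.column_stack([
    rng.integers(0, 2, size=200),        # informative, values in {0, 1}
    rng.normal(0.0, 1000.0, size=200),   # irrelevant, huge variance
])
y = X[:, 0].astype(int)

clf = DecisionTreeClassifier(random_state=0).fit(X, y)
print(clf.feature_importances_)  # nearly all importance on feature 0
```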

Decision trees are more prone to Over-Fitting than many of the models we have discussed, as their learning algorithms can produce large, complicated trees that perfectly model every training instance but fail to generalize to unseen instances. Several techniques can mitigate Over-Fitting in decision trees. Pruning is a common strategy; it removes some of the deepest nodes and leaves of the tree.
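scikit-learn supports pruning through its `ccp_alpha` parameter (minimal cost-complexity pruning): larger values collapse more subtrees into leaves. A small sketch, using a synthetic dataset chosen purely for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=10, n_informative=3,
                           random_state=0)

# Unpruned tree: grows until every leaf is pure.
full = DecisionTreeClassifier(random_state=0).fit(X, y)

# Cost-complexity pruning: a larger ccp_alpha yields a smaller tree.
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.01).fit(X, y)

print(full.tree_.node_count, pruned.tree_.node_count)
```

`cost_complexity_pruning_path` can be used to enumerate the effective alphas for a fitted tree and pick `ccp_alpha` by cross-validation.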

META

Status:: #wiki/notes/mature
Plantations:: Machine Learning
References:: Mastering Machine Learning with scikit-learn