Decision Tree!

SagarDhandare
3 min read · Jan 23, 2022


A Decision Tree is one of the most fundamental supervised machine learning algorithms, used for both classification and regression problems. It works well even on complex datasets, yet it remains very easy to understand. As the name suggests, the algorithm divides the whole dataset into a tree-like structure based on a series of conditions and then makes predictions by following those conditions.

Two types of decision trees

  • Decision Tree for Categorical Variables

In this case, we have a categorical target or output variable, for example Yes/No or 0/1.

from sklearn.tree import DecisionTreeClassifier
  • Decision Tree for Continuous Variables

In this case, we have a continuous target or output variable. For example, predicting a person's income based on occupation, education, age, and so on. (A usage sketch for both estimators follows below.)

from sklearn.tree import DecisionTreeRegressor
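As a minimal usage sketch of both estimators (the iris and diabetes toy datasets here are my own choice of example, not from the original post):

from sklearn.datasets import load_diabetes, load_iris
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification: categorical target (iris species 0/1/2)
X_clf, y_clf = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=42).fit(X_clf, y_clf)
print(clf.predict(X_clf[:3]))

# Regression: continuous target (disease progression score)
X_reg, y_reg = load_diabetes(return_X_y=True)
reg = DecisionTreeRegressor(random_state=42).fit(X_reg, y_reg)
print(reg.predict(X_reg[:3]))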

To split a node of a decision tree, we have to select the best attribute for that node. Entropy and Gini impurity help us select the best attribute at each node of the tree.

  • Entropy: Entropy measures the impurity of a set of samples; for a binary split it lies between 0 and 1. For a node with class proportions p and q = 1 - p, it is given by:
E = -p*log2(p) - q*log2(q)
  • Gini Impurity: According to Wikipedia, Gini impurity is a measure of how often a randomly chosen element from the set would be incorrectly labeled if it were labeled randomly according to the distribution of labels in the subset. For a binary split it lies between 0 and 0.5. It is given by:
Gini = 1 - sum(pi**2)
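As a quick worked check of the two formulas (a minimal sketch; these binary-node helper functions are my own, not part of scikit-learn):

from math import log2

def entropy(p):
    # Impurity of a binary node with class proportions p and 1 - p.
    # By convention 0 * log2(0) = 0, so zero proportions are skipped.
    return -sum(x * log2(x) for x in (p, 1 - p) if x > 0)

def gini(p):
    # Gini impurity of the same binary node: 1 - (p^2 + q^2).
    q = 1 - p
    return 1 - (p**2 + q**2)

print(entropy(0.5), gini(0.5))  # maximally impure node: 1.0 and 0.5
print(entropy(0.9), gini(0.9))  # purer node: roughly 0.47 and 0.18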

Entropy vs. Gini

Entropy and Gini impurity are both ways of choosing the attribute to split on, and in practice the choice rarely affects the result much. However, Gini is cheaper to compute than entropy, since entropy involves a logarithm. That is why the CART algorithm uses Gini as its default criterion.

Different Algorithms for Decision Trees

  • ID3 (Iterative Dichotomiser 3): One of the earliest algorithms for constructing a classification tree. It uses information gain as the criterion for choosing the attribute to split on, and it only accepts categorical attributes.
  • C4.5: An extension of ID3 that improves on it by handling both continuous and discrete attributes.
  • CART (Classification and Regression Trees): The most popular algorithm for constructing a decision tree. It uses Gini impurity as the default criterion for choosing splits, though entropy can be used as well, as shown in the sketch below. It works on both regression and classification problems.
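scikit-learn's DecisionTreeClassifier implements an optimized CART-style algorithm, so the split criterion can be switched between the two impurity measures (a minimal sketch; the iris dataset is again my own choice):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# "gini" is the default criterion; "entropy" uses information gain instead
gini_tree = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, y)
entropy_tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

print(gini_tree.score(X, y), entropy_tree.score(X, y))  # usually very close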

Advantages of decision trees

  • Trees can be visualized (see the sketch after this list).
  • Simple and easy to understand and interpret.
  • Can be used for both regression and classification problems.
  • Can handle both continuous and categorical variables.
  • Scaling and normalization (feature scaling) are not needed.
  • Some implementations (e.g., C4.5) can handle missing values.
  • Robust to outliers.
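For example, a fitted tree can be printed as text with export_text or drawn with plot_tree (a small sketch; the dataset and max_depth are chosen only for illustration):

import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text, plot_tree

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

print(export_text(tree))       # text view of the split rules
plot_tree(tree, filled=True)   # graphical view of the same tree
plt.show()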

Disadvantages of decision trees

  • The risk of overfitting is very high for decision trees. (Low bias, high variance.)
  • A small change in the data can destabilize the model, because the tree is built with a greedy approach.
  • Not well suited to large datasets; a Random Forest is usually the better choice (see the sketch below).
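A common way to tame the variance is to prune the single tree (for example by capping max_depth) or to average many trees with a Random Forest; a hedged sketch, with the dataset and hyperparameters chosen purely for illustration:

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# A pruned single tree: limiting depth reduces variance
pruned = DecisionTreeClassifier(max_depth=3, random_state=0)

# A forest: averaging many decorrelated trees also reduces variance
forest = RandomForestClassifier(n_estimators=100, random_state=0)

print(cross_val_score(pruned, X, y, cv=5).mean())
print(cross_val_score(forest, X, y, cv=5).mean())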

Summary

In this article, we saw what a decision tree is, the types of decision trees, the different algorithms for building them, and the advantages and disadvantages of decision trees.

Thanks for reading. Please let me know in the comments below, or ping me on LinkedIn, if you have any doubts.

find me on LinkedIn | GitHub | Email

Happy Learning!!! ^_^
