Decision trees are the building blocks of many machine learning models, including ensembles such as random forests and gradient boosting, so questions about them come up frequently in data science interviews. In this post, I'll cover some questions that are asked in interviews but often catch people by surprise.
Are decision trees parametric or non-parametric models?
Decision trees are non-parametric models: their structure (the number of splits and leaves) is not fixed in advance and can grow with the training data. Linear regression and logistic regression are examples of parametric models, since they fit a fixed number of coefficients regardless of how much data they see.
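A minimal sketch of what "non-parametric" means in practice (the splitting rule and function names here are illustrative, not from any library): a tree grown to purity on alternating labels needs one leaf per training point, so its size grows with the data, while a logistic regression on the same one-dimensional feature always has exactly two parameters (one weight plus an intercept).

```python
def grow_to_purity(points):
    """Count the leaves of a tree that splits 1-D data until each leaf is pure."""
    labels = {y for _, y in points}
    if len(labels) <= 1:
        return 1  # pure leaf, stop splitting
    xs = sorted(x for x, _ in points)
    # split at the midpoint between the two middle x values (toy rule)
    mid = (xs[len(xs) // 2 - 1] + xs[len(xs) // 2]) / 2
    left = [(x, y) for x, y in points if x <= mid]
    right = [(x, y) for x, y in points if x > mid]
    return grow_to_purity(left) + grow_to_purity(right)

for n in (8, 64):
    data = [(i, i % 2) for i in range(n)]  # alternating 0/1 labels
    print(n, grow_to_purity(data))  # leaf count equals n: model size tracks data size
```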
Why do machine learning libraries prefer the Gini index over entropy for growing decision trees?
The Gini index is computationally more efficient than entropy: it needs only squares and sums, while entropy requires a logarithm for every class. Because impurity is evaluated for every candidate split at every node, this cost adds up, which is why libraries such as scikit-learn use Gini as the default criterion.
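To make the difference concrete, here is a minimal sketch of both impurity measures over a vector of class probabilities (function names are mine, not from any library):

```python
import math

def gini(p):
    """Gini impurity: 1 - sum(p_k^2). Only multiplications and sums -- cheap."""
    return 1.0 - sum(pk * pk for pk in p)

def entropy(p):
    """Shannon entropy: -sum(p_k * log2(p_k)). Needs a log call per class."""
    return -sum(pk * math.log2(pk) for pk in p if pk > 0)

print(gini([0.5, 0.5]))     # 0.5 -- maximally impure binary node
print(entropy([0.5, 0.5]))  # 1.0
print(gini([1.0]))          # 0.0 -- a pure node, same verdict as entropy
```

Both measures peak at a uniform class mix and hit zero on a pure node, so they usually pick very similar splits; Gini just gets there without the logarithm.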
How are continuous variables handled as predictor variables in decision trees?
Continuous or numerical predictors are handled with threshold splits: the values are sorted, candidate thresholds (typically midpoints between adjacent distinct values) are evaluated, and the node is split on the threshold that gives the best impurity reduction, sending samples with x <= t left and x > t right. Some implementations, such as histogram-based gradient boosting (e.g. LightGBM), first bin the values to speed up this search.
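The threshold search can be sketched in a few lines of pure Python (an exhaustive version for illustration; real libraries do this incrementally over sorted values):

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_threshold(xs, ys):
    """Try the midpoint between each pair of adjacent sorted values and
    keep the threshold with the lowest size-weighted Gini impurity."""
    pairs = sorted(zip(xs, ys))
    best = (float("inf"), None)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values: no threshold fits between them
        t = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        best = min(best, (score, t))
    return best[1]

xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [0, 0, 0, 1, 1, 1]
print(best_threshold(xs, ys))  # 6.5 -- separates the two clusters perfectly
```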
What is optimised when the target is a continuous variable, i.e. when the task is regression?
Variance reduction is used to choose the best split when the target is continuous: the tree picks the split that most reduces the size-weighted variance of the target in the child nodes, which is equivalent to minimising the mean squared error of the leaf predictions.
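A worked example of the variance-reduction calculation for one candidate split (the helper names are mine, for illustration):

```python
def variance(ys):
    """Population variance of a list of target values."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys) / len(ys)

def variance_reduction(parent, left, right):
    """Parent variance minus the size-weighted variances of the children."""
    n = len(parent)
    return (variance(parent)
            - (len(left) / n) * variance(left)
            - (len(right) / n) * variance(right))

parent = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
left, right = parent[:3], parent[3:]  # split between the two clusters
print(variance_reduction(parent, left, right))  # 20.25 -- a large reduction
```

A split that separates the low and high target values shrinks the within-child variance dramatically, which is exactly what the tree is rewarded for.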
How do decision trees handle multiple classes, i.e. multi-class classification?
Splits are chosen by impurity reduction using Gini or entropy, exactly as in the binary case; both measures extend naturally to any number of classes. At a leaf where no further split is possible, the class with the highest probability (the majority class among the training samples that reached that leaf) is the predicted class. You can also return the full class-probability distribution instead of a single label.
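The leaf-level prediction described above can be sketched like this (a toy helper, not any library's API; scikit-learn exposes the same idea via predict and predict_proba):

```python
from collections import Counter

def leaf_prediction(leaf_labels):
    """Predicted class = majority label in the leaf; probabilities = class
    frequencies among the training samples that landed in the leaf."""
    counts = Counter(leaf_labels)
    n = len(leaf_labels)
    probs = {c: k / n for c, k in counts.items()}
    return counts.most_common(1)[0][0], probs

pred, probs = leaf_prediction(["cat", "cat", "dog", "bird"])
print(pred)   # 'cat'
print(probs)  # {'cat': 0.5, 'dog': 0.25, 'bird': 0.25}
```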