Decision Tree Regression
Decision Tree Regression is a supervised machine learning algorithm used to predict:
Continuous numerical values
Unlike Decision Tree Classification:
Which predicts categories
Decision Tree Regression predicts:
-
House prices
-
Salary values
-
Sales forecasting
-
Temperature prediction
Main Idea of Decision Tree Regression
Decision Tree Regression works by:
Splitting data into smaller regions
and predicting:
Average value inside each region
Real-Life Example
Suppose we want to predict:
Salary based on years of experience
The algorithm may create rules like:
Experience < 3 years → Salary = 25,000
Experience between 3 and 5 years → Salary = 45,000
Experience > 5 years → Salary = 80,000
How Decision Tree Regression Works
The algorithm:
-
Finds the best split point
-
Divides data into regions
-
Calculates average output in each region
-
Repeats splitting recursively
Decision Tree Regression does NOT create:
A straight line
like Linear Regression.
Instead:
It creates step-like predictions
Example Dataset
| Experience | Salary |
|---|---|
| 1 | 15 |
| 2 | 20 |
| 3 | 28 |
| 4 | 40 |
| 5 | 60 |
| 6 | 80 |
Simple Tree Idea
The model may split like:
Experience < 3.5
Then:
Left side:
Average salary = 21
Right side:
Average salary = 60
Why Splitting Happens
The goal is:
Reduce prediction error
The model tries to group similar salary values together.
Important Terms
| Term | Meaning |
|---|---|
| Root Node | First split |
| Leaf Node | Final prediction |
| Split | Decision condition |
| Depth | Levels in tree |
How Best Split is Selected
Decision Tree Regression commonly uses:
Mean Squared Error (MSE)
to select the best split.
MSE Formula
MSE = mean((Actual - Predicted)²)
Lower MSE:
Better split
Mathematical Example
Suppose:
| Experience | Salary |
|---|---|
| 1 | 20 |
| 2 | 25 |
| 3 | 30 |
| 4 | 80 |
Suppose the tree splits at:
Experience < 3.5
Left Region
Values:
20, 25, 30
Average:
(20 + 25 + 30) / 3
75 / 3
25
Prediction for left region:
25
Right Region
Values:
80
Prediction:
80
Prediction Logic
If:
Experience = 2
Prediction becomes:
25
because it falls in the left region.
Why Decision Tree Regression is Powerful
Decision Tree Regression:
-
Handles nonlinear patterns
-
Works without feature scaling
-
Captures complex relationships
Practical Example Using Python
Step 1: Import Libraries
import pandas as pd
from sklearn.tree import DecisionTreeRegressor
Step 2: Create Dataset
data = {
"Experience": [1, 2, 3, 4, 5, 6],
"Salary": [15, 20, 28, 40, 60, 80]
}
df = pd.DataFrame(data)
print(df)
Step 3: Define Features and Target
X = df[["Experience"]]
y = df["Salary"]
Step 4: Create Model
model = DecisionTreeRegressor()
Step 5: Train Model
model.fit(X, y)
Step 6: Predict Salary
Predict for:
Experience = 4.5
prediction = model.predict([[4.5]])
print(prediction)
Example Output
[40.]
Why Output Looks Like Steps
Decision Tree Regression predicts:
Region averages
So predictions change in:
Steps instead of smooth curves
Advantages
-
Handles nonlinear data
-
Easy to understand
-
No feature scaling required
-
Captures complex patterns
Limitations
-
Can overfit easily
-
Sensitive to small data changes
-
Predictions may become unstable
Real-World Applications
| Industry | Usage |
|---|---|
| Real Estate | House price prediction |
| Finance | Revenue prediction |
| Healthcare | Medical forecasting |
| Sales | Demand forecasting |
Important Points
1. Decision Tree Regression predicts continuous values.
2. Uses recursive splitting.
3. MSE helps find best split.
4. Predictions are region averages.
5. Decision Trees can overfit easily.
Summary
Decision Tree Regression is a supervised learning algorithm that predicts continuous numerical values by recursively splitting data into smaller regions. Instead of fitting a straight line, it creates step-like predictions based on average values inside each region.
Keywords
Decision Tree Regression, Decision Tree Regressor, Regression Tree, Decision Tree Algorithm, Supervised Learning Regression, Nonlinear Regression, Tree-Based Regression, Decision Tree in Machine Learning, Regression using Decision Tree, Recursive Splitting, MSE in Decision Tree, Regression Prediction, Stepwise Prediction, Machine Learning Regression, DecisionTreeRegressor using Python