Multiple Linear Regression
Multiple Linear Regression is an extension of Simple Linear Regression where multiple input features are used to predict a single output value.
In Simple Linear Regression:
One input feature → One output
In Multiple Linear Regression:
Multiple input features → One output
Real-Life Example
Suppose we want to predict house price using:
- House area
- Number of bedrooms
- Age of the house
Here:
- Several input variables affect the output
- A single feature is rarely enough for an accurate prediction
Example Dataset
| Area | Bedrooms | Age | Price |
|---|---|---|---|
| 1000 | 2 | 5 | 50 |
| 1200 | 3 | 4 | 60 |
| 1500 | 3 | 3 | 75 |
| 1800 | 4 | 2 | 90 |
Here, Area, Bedrooms, and Age are the input features, and Price is the target variable.
Why Multiple Linear Regression is Needed
Real-world problems usually depend on multiple factors.
For example:
- Salary depends on experience, education, and skills
- House price depends on location, area, and amenities
- Sales depend on marketing, price, and season
Using several relevant features usually improves prediction accuracy, because the model can account for more of the factors that drive the output.
Multiple Linear Regression Equation
y = b0 + b1x1 + b2x2 + b3x3 + ...
Where:
- y → Predicted output
- b0 → Intercept
- b1, b2, b3 → Coefficients
- x1, x2, x3 → Input features
Suppose:
Price = 10 + 0.05(Area) + 5(Bedrooms) - 2(Age)
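This means that, with the other features held fixed:
- Each additional unit of area increases the price by 0.05
- Each additional bedroom increases the price by 5
- Each additional year of age reduces the price by 2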
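The example equation can also be written as a small function. This is a minimal sketch: the function name `predict_price` and the sample house (area 1400, 3 bedrooms, 3 years old) are illustrative assumptions and are not part of the original example.

```python
import numpy as np

# Coefficients taken from the example equation:
# Price = 10 + 0.05(Area) + 5(Bedrooms) - 2(Age)
INTERCEPT = 10.0
COEFFICIENTS = np.array([0.05, 5.0, -2.0])  # order: Area, Bedrooms, Age


def predict_price(area, bedrooms, age):
    """Return b0 + b1*x1 + b2*x2 + b3*x3 for one house."""
    features = np.array([area, bedrooms, age], dtype=float)
    return float(INTERCEPT + COEFFICIENTS @ features)


# Hypothetical house, chosen only for illustration.
price = predict_price(area=1400, bedrooms=3, age=3)
print(round(price, 2))  # 10 + 70 + 15 - 6 = 89

# Adding one bedroom while keeping area and age fixed changes the price by b2.
extra_bedroom = predict_price(1400, 4, 3) - predict_price(1400, 3, 3)
print(round(extra_bedroom, 2))
```

The second print illustrates how a coefficient is read: it is the change in the prediction when only that feature moves by one unit.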
Mathematical Example
Suppose we have:
| Area | Bedrooms | Price |
|---|---|---|
| 1000 | 2 | 50 |
| 1200 | 3 | 60 |
| 1500 | 3 | 75 |
Assume the model learned:
Price = 5 + 0.04(Area) + 3(Bedrooms)
Prediction Example
Predict house price for:
Area = 1400
Bedrooms = 3
Substitute values:
Price = 5 + 0.04(1400) + 3(3)
Calculate:
0.04 * 1400 = 56
3 * 3 = 9
Final prediction:
Price = 5 + 56 + 9
Price = 70
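The hand calculation can be reproduced in Python, and the same assumed equation can be applied to the three table rows to see how closely it matches them. This is a minimal sketch; the variable names are illustrative.

```python
import numpy as np

# Rows from the small table: Area, Bedrooms
X = np.array([[1000, 2], [1200, 3], [1500, 3]], dtype=float)
actual = np.array([50, 60, 75], dtype=float)

# Assumed model from the text: Price = 5 + 0.04(Area) + 3(Bedrooms)
intercept = 5.0
coefficients = np.array([0.04, 3.0])

# Prediction for the new house (Area = 1400, Bedrooms = 3)
new_price = float(intercept + coefficients @ np.array([1400.0, 3.0]))
print(round(new_price, 2))  # 5 + 56 + 9 = 70

# Predictions and errors for the table rows
fitted = intercept + X @ coefficients
errors = actual - fitted
print(np.round(fitted, 2))  # 51, 62, 74
print(np.round(errors, 2))  # -1, -2, 1
```

The assumed coefficients are illustrative, so the predictions land close to the table prices without matching them exactly.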
Practical Example Using Python
Step 1: Import Libraries
import pandas as pd
from sklearn.linear_model import LinearRegression
Step 2: Create Dataset
data = {
    "Area": [1000, 1200, 1500, 1800],
    "Bedrooms": [2, 3, 3, 4],
    "Age": [5, 4, 3, 2],
    "Price": [50, 60, 75, 90]
}
df = pd.DataFrame(data)
print(df)
Step 3: Define Features and Target
X = df[["Area", "Bedrooms", "Age"]]
y = df["Price"]
Step 4: Train Model
model = LinearRegression()
model.fit(X, y)
Step 5: Predict New Value
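# Use the same column names as the training data to avoid a feature-name warning
new_house = pd.DataFrame([[1400, 3, 3]], columns=["Area", "Bedrooms", "Age"])
prediction = model.predict(new_house)
print(prediction)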
Expected Output
[70.]
The exact digits printed may differ slightly because of floating-point rounding.
Understanding Model Coefficients
print(model.coef_)
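This gives one coefficient per feature, in the same order as the columns of X. Each coefficient is the change in the predicted price for a one-unit increase in that feature, with the other features held fixed.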
Intercept
print(model.intercept_)
This gives the b0 value, which is the predicted price when every feature is 0.
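To make the coefficients easier to read, each one can be paired with its feature name. The sketch below also cross-checks the fitted values by solving the same least-squares problem directly with `numpy.linalg.lstsq`. The names `features`, `design`, and `solution` are illustrative, and the cross-check is an added demonstration, not a step from the original walkthrough.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "Area": [1000, 1200, 1500, 1800],
    "Bedrooms": [2, 3, 3, 4],
    "Age": [5, 4, 3, 2],
    "Price": [50, 60, 75, 90],
})
features = ["Area", "Bedrooms", "Age"]
X = df[features]
y = df["Price"]

model = LinearRegression().fit(X, y)

# Pair each coefficient with the feature it belongs to.
for name, coef in zip(features, model.coef_):
    print(f"{name}: {coef:.4f}")
print(f"Intercept: {model.intercept_:.4f}")

# Cross-check: solve the same least-squares problem directly.
# A column of ones is prepended so the first solved value is b0.
design = np.column_stack([np.ones(len(df)), X.to_numpy(dtype=float)])
solution, *_ = np.linalg.lstsq(design, y.to_numpy(dtype=float), rcond=None)
print(np.round(solution, 4))  # order: b0, b1 (Area), b2 (Bedrooms), b3 (Age)
```

In this toy dataset every price equals 0.05 × Area, so the fitted weights for Bedrooms and Age, and the intercept, come out at approximately zero. Real data would rarely be this clean.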
Feature Importance
In Multiple Linear Regression:
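- Each feature contributes differently
- A coefficient with a larger magnitude means a stronger impact per unit of that feature; coefficients are directly comparable only when the features are on the same scale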
Example:
| Feature | Coefficient |
|---|---|
| Area | 0.04 |
| Bedrooms | 3 |
| Age | -2 |
This means:
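- Each additional bedroom increases the price by 3
- Each additional year of age decreases the price by 2
- Area has the smallest coefficient, but it is measured in much larger numbers, so its total contribution to the price can still be the largest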
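Coefficient size alone can mislead when features use different scales. The sketch below multiplies each coefficient from the table by the feature value of one house to show how much each feature actually contributes. The house (area 1400, 3 bedrooms, 3 years old) is a hypothetical example chosen for illustration.

```python
# Coefficients from the table above.
coefficients = {"Area": 0.04, "Bedrooms": 3.0, "Age": -2.0}

# Hypothetical house, chosen only for illustration.
house = {"Area": 1400, "Bedrooms": 3, "Age": 3}

# Contribution of each feature = coefficient * feature value.
contributions = {name: coefficients[name] * house[name] for name in coefficients}

for name, value in contributions.items():
    print(f"{name}: {value:+.2f}")

# Feature with the largest absolute contribution for this house.
largest = max(contributions, key=lambda name: abs(contributions[name]))
print("Largest contribution:", largest)
```

For this house, Area contributes about 56, Bedrooms 9, and Age -6, so Area dominates even though its coefficient is the smallest.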
Advantages
- Uses multiple features
- Often more accurate than a single-feature model when the extra features are relevant
- Models real-world problems effectively
- Easy to interpret
Limitations
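- Sensitive to outliers
- Multicollinearity (strongly correlated input features) may occur, which makes coefficients unstable and harder to interpret
- Assumes a linear relationship between the features and the target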
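One simple way to look for multicollinearity is to inspect the pairwise correlations between the input features. The sketch below does this for the example dataset. The 0.9 threshold is an illustrative assumption, not a universal rule, and a correlation matrix only detects pairwise relationships.

```python
import pandas as pd

df = pd.DataFrame({
    "Area": [1000, 1200, 1500, 1800],
    "Bedrooms": [2, 3, 3, 4],
    "Age": [5, 4, 3, 2],
})

# Pairwise Pearson correlations between the input features.
corr = df.corr()
print(corr.round(3))

# Flag feature pairs whose absolute correlation exceeds a chosen threshold.
THRESHOLD = 0.9  # illustrative cut-off, not a universal rule
names = list(df.columns)
high_pairs = [
    (a, b)
    for i, a in enumerate(names)
    for b in names[i + 1:]
    if abs(corr.loc[a, b]) > THRESHOLD
]
print(high_pairs)
```

In the example dataset all three feature pairs exceed the threshold: larger houses have more bedrooms and are newer. With features this strongly related, the individual coefficients should be interpreted with caution.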
Real-World Applications
| Application | Prediction |
|---|---|
| Real Estate | House prices |
| Business | Sales forecasting |
| Finance | Profit prediction |
| Healthcare | Medical cost prediction |
Important Points
1. Multiple Linear Regression uses multiple input features.
2. It predicts continuous numerical values.
3. Each feature has its own coefficient.
4. The regression equation contains multiple variables.
5. It is an extension of Simple Linear Regression.
Summary
Multiple Linear Regression is a supervised learning algorithm used to predict continuous numerical values using multiple input features. It helps model real-world problems more accurately by considering the combined effect of several variables on the target output.