Gradient Descent in LR

Gradient Descent is an optimization algorithm used to find the best values of slope (m) and intercept (b) in Linear Regression (LR). It minimizes prediction error by updating the model parameters a small step at a time.

Instead of calculating the best-fit line directly using formulas, Gradient Descent gradually learns the optimal line through iterations.

Why Gradient Descent is Needed

Suppose a regression model predicts values poorly.

Example:

Actual Marks    Predicted Marks
50              30
60              35
70              40

Every prediction is 20 to 30 marks below the actual value, so the prediction error is high.

Gradient Descent helps:

  • Reduce prediction error

  • Improve model accuracy

  • Find optimal values of m and b

Main Idea of Gradient Descent

Gradient Descent works like this:

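1. Start with initial values of m and b (random or zero)
2. Calculate the prediction error (the cost)
3. Compute the gradient of the cost with respect to m and b, then move both a small step in the opposite direction
4. Repeat until the error stops decreasing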

Linear Regression Equation

y = mx + b

Where:

  • y → Predicted output

  • x → Input feature

  • m → Slope

  • b → Intercept

Cost Function

Gradient Descent minimizes the Cost Function.

The most common cost function is:

Mean Squared Error (MSE)

MSE Formula

MSE = (1/n) * Σ (actual_y - predicted_y)^2

where n is the number of data points.

Goal:

Minimize MSE
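
As a quick check of the formula, the sketch below computes the MSE for the marks table from the earlier example. The function name mean_squared_error is a hypothetical helper chosen here for illustration; it is not taken from any library.

```python
import numpy as np

def mean_squared_error(actual_y, predicted_y):
    """Average of the squared differences between actual and predicted values."""
    actual_y = np.asarray(actual_y, dtype=float)
    predicted_y = np.asarray(predicted_y, dtype=float)
    return np.mean((actual_y - predicted_y) ** 2)

# Marks from the example table
actual = [50, 60, 70]
predicted = [30, 35, 40]

mse = mean_squared_error(actual, predicted)
print("MSE:", round(mse, 2))  # (400 + 625 + 900) / 3 = 641.67
```

The errors are squared before averaging, so a few large misses raise the MSE much more than many small ones.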

Real-Life Analogy

Imagine you are standing on a mountain and want to reach the lowest point.

You:

  • Take small steps downward

  • Check direction continuously

  • Eventually reach the bottom

Gradient Descent works similarly:

  • It moves step by step toward minimum error.

Important Terms

1. Learning Rate

Learning Rate controls:

How big each step should be

Small Learning Rate

Slow learning
More iterations

Large Learning Rate

May skip minimum point
Unstable learning
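
The effect of the learning rate can be seen with a small experiment. This is a minimal sketch that assumes the dataset from the Mathematical Example (X = 1, 2, 3 and Y = 2, 4, 6); the helper mse_after and the three rates 0.001, 0.01 and 0.5 are illustrative choices, not recommended settings.

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0])
Y = np.array([2.0, 4.0, 6.0])

def mse_after(learning_rate, steps=50):
    """Run `steps` gradient descent updates from m = b = 0 and return the final MSE."""
    m, b = 0.0, 0.0
    n = len(X)
    for _ in range(steps):
        error = Y - (m * X + b)
        # Step against the MSE gradient for m and b
        m += learning_rate * (2 / n) * np.sum(X * error)
        b += learning_rate * (2 / n) * np.sum(error)
    return np.mean((Y - (m * X + b)) ** 2)

results = {rate: mse_after(rate) for rate in (0.001, 0.01, 0.5)}
for rate, mse in results.items():
    print(f"learning rate {rate}: MSE after 50 steps = {mse:.4g}")
```

On this dataset the smallest rate still leaves a large error after 50 steps, the middle rate reduces it much further, and the largest rate overshoots the minimum on every step, so the error grows instead of shrinking.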

2. Iterations

Iterations represent:

How many times parameters are updated

More iterations usually reduce the error, but only until the parameters converge. After that, further iterations change almost nothing.

Mathematical Example

Dataset

X Y
1 2
2 4
3 6

Step 1: Initial Values

Suppose:

m = 0
b = 0
Learning Rate = 0.01

Step 2: Prediction Formula

predicted_y = mx + b

Predictions

For X = 1:

predicted_y = (0 * 1) + 0 = 0

For X = 2:

predicted_y = (0 * 2) + 0 = 0

For X = 3:

predicted_y = (0 * 3) + 0 = 0

Step 3: Calculate Error

Actual Y    Predicted Y    Error (Actual - Predicted)
2           0              2
4           0              4
6           0              6

Large errors exist.
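
The same numbers can be reproduced in a few lines. This sketch assumes the dataset and the initial values m = 0, b = 0 given above, and also reports the MSE, which the table does not show.

```python
import numpy as np

X = np.array([1, 2, 3])
Y = np.array([2, 4, 6])
m, b = 0, 0

predicted = m * X + b          # every prediction is 0
errors = Y - predicted         # actual minus predicted: 2, 4, 6
mse = np.mean(errors ** 2)     # (4 + 16 + 36) / 3

print("Predicted:", predicted)
print("Errors:", errors)
print("MSE:", round(mse, 2))   # 18.67
```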

Step 4: Update Parameters

Gradient Descent updates:

  • m

  • b

to reduce error.

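After one update (using the MSE gradients and Learning Rate = 0.01):

Gradient for m = (-2/3) * (1*2 + 2*4 + 3*6) = -18.67
Gradient for b = (-2/3) * (2 + 4 + 6) = -8

m = 0 - 0.01 * (-18.67) ≈ 0.187
b = 0 - 0.01 * (-8) = 0.08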

Predictions improve.
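
A single update can be reproduced in a few lines. The sketch assumes the MSE gradients with the 2/n factor, the same ones used in the Python example later in this section. With those gradients, one step from m = 0, b = 0 gives m ≈ 0.187 and b = 0.08. The names m_new, b_new, mse_before and mse_after are introduced here only for illustration.

```python
import numpy as np

X = np.array([1, 2, 3])
Y = np.array([2, 4, 6])
m, b, L = 0.0, 0.0, 0.01
n = len(X)

Y_pred = m * X + b
dm = (-2 / n) * np.sum(X * (Y - Y_pred))   # (-2/3) * 28 = -18.67
db = (-2 / n) * np.sum(Y - Y_pred)         # (-2/3) * 12 = -8

# Move against the gradient
m_new = m - L * dm
b_new = b - L * db

mse_before = np.mean((Y - Y_pred) ** 2)
mse_after = np.mean((Y - (m_new * X + b_new)) ** 2)

print("m:", round(m_new, 3), "b:", round(b_new, 3))
print("MSE before:", round(mse_before, 2), "MSE after:", round(mse_after, 2))
```

The MSE after the step is lower than before it, which is what "predictions improve" means in practice.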

Step 5: Repeat Process

Gradient Descent keeps updating values repeatedly until:

Error becomes minimum

Visualization of Learning

Iteration 1 → High Error
Iteration 10 → Lower Error
Iteration 100 → Much Lower Error
Iteration 1000 → Error Close to Minimum
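
To see this trend as numbers, the sketch below records the MSE at a few iterations. It assumes the dataset, starting values and learning rate from the Mathematical Example; the history dictionary and the chosen checkpoints are illustrative.

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0])
Y = np.array([2.0, 4.0, 6.0])
m, b, L = 0.0, 0.0, 0.01
n = len(X)

history = {}  # iteration number -> MSE measured after that iteration's update
for i in range(1, 1001):
    Y_pred = m * X + b
    m -= L * (-2 / n) * np.sum(X * (Y - Y_pred))
    b -= L * (-2 / n) * np.sum(Y - Y_pred)
    if i in (1, 10, 100, 1000):
        history[i] = np.mean((Y - (m * X + b)) ** 2)

for i, mse in history.items():
    print(f"Iteration {i}: MSE = {mse:.4f}")
```

Each recorded MSE is smaller than the one before it, and the improvement per iteration shrinks as the parameters approach the minimum.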

Python Example — Gradient Descent

import numpy as np

# Dataset
X = np.array([1, 2, 3])
Y = np.array([2, 4, 6])

# Initial values
m = 0
b = 0

# Learning rate
L = 0.01

# Iterations
epochs = 1000

n = len(X)

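# Gradient Descent
for i in range(epochs):
    # Predictions with the current slope and intercept
    Y_pred = m * X + b

    # Derivatives of MSE with respect to m and b
    dm = (-2 / n) * np.sum(X * (Y - Y_pred))
    db = (-2 / n) * np.sum(Y - Y_pred)

    # Update values (step against the gradient)
    m = m - L * dm
    b = b - L * db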

print("Slope:", m)
print("Intercept:", b)

Expected Output

Slope ≈ 1.97
Intercept ≈ 0.07

With more iterations, the values move closer to Slope = 2 and Intercept = 0.

Final Equation

y = 2x
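
As a cross-check, the same line can be computed directly with a least-squares formula instead of iterating. This sketch uses NumPy's np.polyfit with degree 1, which returns the coefficients with the highest power first (slope, then intercept). It is shown only to confirm the result, not as part of Gradient Descent.

```python
import numpy as np

X = np.array([1, 2, 3])
Y = np.array([2, 4, 6])

# Degree-1 least-squares fit: returns [slope, intercept]
slope, intercept = np.polyfit(X, Y, 1)

# True when the direct solution matches y = 2x up to floating-point error
print(np.allclose([slope, intercept], [2, 0]))
```

For a dataset this small the direct solution is simpler. Gradient Descent becomes more useful when the data or the model is too large for a direct formula.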

What Gradient Descent Learned

The algorithm learned:

When X increases,
Y increases proportionally.

Types of Gradient Descent

Type                           Description
Batch Gradient Descent         Uses the entire dataset for each update
Stochastic Gradient Descent    Uses one sample for each update
Mini-Batch Gradient Descent    Uses a small batch of samples for each update
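
The three types differ only in how many samples are used for each update, so one function can cover all of them. This is a minimal sketch under stated assumptions: the function gradient_descent, its batch_size parameter, the per-epoch shuffling, the fixed seed and the value epochs=2000 are all illustrative choices, not a standard implementation.

```python
import numpy as np

def gradient_descent(X, Y, batch_size, lr=0.01, epochs=2000, seed=0):
    """Fit y = m*x + b, updating on `batch_size` samples at a time."""
    rng = np.random.default_rng(seed)
    m, b = 0.0, 0.0
    n = len(X)
    for _ in range(epochs):
        order = rng.permutation(n)              # reshuffle the samples each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            error = Y[idx] - (m * X[idx] + b)
            # MSE gradient step computed on the current batch only
            m += lr * (2 / len(idx)) * np.sum(X[idx] * error)
            b += lr * (2 / len(idx)) * np.sum(error)
    return m, b

X = np.array([1.0, 2.0, 3.0])
Y = np.array([2.0, 4.0, 6.0])

fits = {
    "batch": gradient_descent(X, Y, batch_size=len(X)),
    "stochastic": gradient_descent(X, Y, batch_size=1),
    "mini-batch": gradient_descent(X, Y, batch_size=2),
}
for name, (m, b) in fits.items():
    print(f"{name}: m = {m:.3f}, b = {b:.3f}")
```

On this noise-free dataset all three variants end up close to y = 2x. On real data, the stochastic and mini-batch variants take noisier steps but need less computation for each update.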

Advantages of Gradient Descent

  • Works for large datasets

  • Efficient optimization

  • Widely used in Deep Learning

  • Helps minimize prediction error

Limitations

  • Requires proper learning rate

  • Can be slow for complex problems

  • May get stuck in local minima on non-convex problems (the MSE cost of Linear Regression is convex, so it has a single minimum)

Important Points

1. Gradient Descent minimizes the cost function.

2. Learning Rate controls step size.

3. Gradient Descent updates slope and intercept iteratively.

4. MSE is commonly used as the cost function.

5. Gradient Descent is widely used in Machine Learning and Deep Learning.

Summary

Gradient Descent is an optimization algorithm used in Linear Regression to minimize prediction error by continuously updating slope and intercept values. It helps models learn the best-fit line step by step through iterative optimization.

