Ridge and Lasso Regression

Ridge Regression and Lasso Regression are regularized variants of Linear Regression. They are used to reduce overfitting and improve how well a model generalizes to new data.

Both are regularization techniques.

Why Ridge and Lasso Regression are Needed

In plain Linear Regression:

  • The model may fit the training data too closely, including its noise

  • This is called overfitting

  • An overfitted model performs poorly on new data

Ridge and Lasso Regression help control model complexity and improve prediction performance on unseen data.

What is Overfitting?

Overfitting happens when the model memorizes the training data, including its noise, instead of learning the underlying patterns.

Symptoms:

  • Very low error on the training data

  • Much higher error on the test data
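The symptoms above can be reproduced with a small sketch. The data here is synthetic and hypothetical (a straight line plus noise), and the degree-9 polynomial is chosen only because it has enough freedom to pass through all 10 training points.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Hypothetical data: a straight-line trend plus noise
x_train = np.linspace(-1, 1, 10).reshape(-1, 1)
y_train = 2 * x_train.ravel() + rng.normal(scale=0.3, size=10)
x_test = rng.uniform(-1, 1, size=(50, 1))
y_test = 2 * x_test.ravel() + rng.normal(scale=0.3, size=50)

# Degree 9 with 10 points: the model can pass through every training point
poly = PolynomialFeatures(degree=9, include_bias=False)
model = LinearRegression().fit(poly.fit_transform(x_train), y_train)

train_mse = mean_squared_error(y_train, model.predict(poly.transform(x_train)))
test_mse = mean_squared_error(y_test, model.predict(poly.transform(x_test)))

print("Train MSE:", train_mse)
print("Test MSE:", test_mse)
```

The training error should come out close to zero while the test error stays clearly larger, which is the gap that regularization tries to narrow.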

Main Idea of Regularization

Regularization adds a penalty on large coefficient values to the cost function.

This helps:

  • Reduce complexity

  • Prevent overfitting

  • Improve generalization

Linear Regression Equation

y = b0 + b1x1 + b2x2 + ...

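Sometimes the fitted coefficients (b1, b2, ...) become very large, which makes predictions sensitive to small changes in the inputs.

Regularization tries to shrink the coefficient values toward zero.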

Ridge Regression

Ridge Regression adds an L2 penalty to the cost function.

Ridge Formula

Cost Function = RSS + λ(Σb²)

Where:

  • RSS → Residual Sum of Squares

  • λ (lambda) → Regularization parameter (λ ≥ 0)

  • Σb² → Sum of squared coefficients (the intercept b0 is usually not penalized)
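For readers who want to see the Ridge cost being minimized, here is a minimal sketch. It assumes a model with no intercept, in which case the minimizer has the well-known closed form b = (XᵀX + λI)⁻¹Xᵀy. The data and the names `lam`, `b_closed`, and `b_sklearn` are illustrative, not part of the original example.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Hypothetical data: 50 samples, 3 features
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)
lam = 1.0  # the λ in the Ridge formula

# Closed-form minimizer of RSS + λ(Σb²) when there is no intercept
b_closed = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# scikit-learn's Ridge minimizes the same cost when fit_intercept=False
b_sklearn = Ridge(alpha=lam, fit_intercept=False).fit(X, y).coef_

print("Closed form: ", b_closed)
print("scikit-learn:", b_sklearn)
```

The two coefficient vectors should agree up to numerical precision.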

Main Idea of Ridge Regression

Ridge Regression:

  • Reduces coefficient size

  • Keeps all features

  • Prevents coefficients from becoming too large

Important Point

Ridge Regression does NOT make coefficients exactly zero. It only reduces their values.

Lasso Regression

Lasso Regression adds an L1 penalty to the cost function.

Lasso Formula

Cost Function = RSS + λ(Σ|b|)

Where:

  • Σ|b| → Sum of absolute coefficient values
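The Ridge and Lasso formulas can be written directly as code. This is a sketch with hypothetical helper names (`rss`, `ridge_cost`, `lasso_cost`) and toy inputs picked so that the fit is perfect and only the penalty term remains.

```python
import numpy as np

def rss(b0, b, X, y):
    """Residual Sum of Squares for predictions b0 + X·b."""
    residuals = y - (b0 + X @ b)
    return float(residuals @ residuals)

def ridge_cost(b0, b, X, y, lam):
    """RSS + λ(Σb²)"""
    return rss(b0, b, X, y) + lam * float(np.sum(b ** 2))

def lasso_cost(b0, b, X, y, lam):
    """RSS + λ(Σ|b|)"""
    return rss(b0, b, X, y) + lam * float(np.sum(np.abs(b)))

# Toy inputs: y is generated from b exactly, so RSS = 0
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([3.0, -4.0])
y = X @ b

print(ridge_cost(0.0, b, X, y, lam=2.0))  # 2 * (9 + 16) = 50.0
print(lasso_cost(0.0, b, X, y, lam=2.0))  # 2 * (3 + 4)  = 14.0
```

Note how the L2 penalty punishes the larger coefficient (-4) much more heavily than the L1 penalty does.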

Main Idea of Lasso Regression

Lasso Regression:

  • Shrinks coefficients

  • Can make some coefficients exactly zero

This means Lasso performs automatic feature selection: a feature whose coefficient is zero has no effect on the prediction.
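The difference between shrinking and zeroing can be seen in a small experiment. The data below is synthetic and hypothetical: five features, of which only the first two actually influence y. The alpha values are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)

# Hypothetical data: 5 features, but only the first two influence y
X = rng.normal(size=(100, 5))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.5).fit(X, y)

print("Ridge:", np.round(ridge.coef_, 3))
print("Lasso:", np.round(lasso.coef_, 3))
print("Exact zeros in Ridge:", int(np.sum(ridge.coef_ == 0)))
print("Exact zeros in Lasso:", int(np.sum(lasso.coef_ == 0)))
```

Ridge should keep small but nonzero values for the three irrelevant features, while Lasso should set them to exactly zero and keep (shrunken) values for the two relevant ones.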

Ridge vs Lasso

Ridge Regression             | Lasso Regression
Uses L2 penalty              | Uses L1 penalty
Reduces coefficients         | Can remove coefficients
Keeps all features           | Performs feature selection
Better for multicollinearity | Better for sparse models

Understanding Lambda (λ)

Lambda controls the strength of regularization.

Small Lambda

Weak regularization. The model behaves like plain Linear Regression (at λ = 0 the two are identical).

Large Lambda

Strong regularization. Coefficients shrink heavily, and a λ that is too large causes underfitting.
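The effect of the regularization strength can be checked by fitting Ridge at several values and watching the coefficients. This is a sketch on hypothetical synthetic data; in scikit-learn the strength is passed as the `alpha` parameter.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)

# Hypothetical data: 60 samples, 4 features
X = rng.normal(size=(60, 4))
y = X @ np.array([4.0, -3.0, 2.0, 1.0]) + rng.normal(scale=0.5, size=60)

ols = LinearRegression().fit(X, y)
print("Linear Regression:", np.round(ols.coef_, 3))

alphas = [1e-6, 1.0, 100.0, 10000.0]
norms = []
for alpha in alphas:
    ridge = Ridge(alpha=alpha).fit(X, y)
    norms.append(float(np.linalg.norm(ridge.coef_)))  # overall coefficient size
    print(f"alpha={alpha:>8}: {np.round(ridge.coef_, 3)}")
```

At the smallest value the Ridge coefficients should be nearly identical to plain Linear Regression, and the overall coefficient size should fall steadily as the strength grows.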

Example Dataset

Area | Bedrooms | Price
1000 | 2        | 50
1200 | 3        | 60
1500 | 3        | 75
1800 | 4        | 90

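Suppose, purely for illustration, that Linear Regression produces:

b1 = 12
b2 = 9

(These are made-up numbers, not the actual fit to the table above, where Price is exactly Area / 20.) Unusually large coefficients like these are often a sign of overfitting.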

Ridge Regression Effect

After Ridge (illustrative values):

b1 = 5
b2 = 4

Both coefficients become smaller, but neither is zero.

Lasso Regression Effect

After Lasso (illustrative values):

b1 = 5
b2 = 0

Lasso removed one feature completely by setting its coefficient to zero.

Practical Example Using Python

Step 1: Import Libraries

import pandas as pd

from sklearn.linear_model import Ridge
from sklearn.linear_model import Lasso

Step 2: Create Dataset

data = {
"Area": [1000, 1200, 1500, 1800],
"Bedrooms": [2, 3, 3, 4],
"Price": [50, 60, 75, 90]
}

df = pd.DataFrame(data)

print(df)

Step 3: Define Features and Target

X = df[["Area", "Bedrooms"]]

y = df["Price"]

Step 4: Ridge Regression

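# alpha is the regularization strength (λ in the Ridge formula)
ridge = Ridge(alpha=1.0)

# Fit on the raw, unscaled features
ridge.fit(X, y)

print("Ridge Coefficients:")
print(ridge.coef_)  # order matches the columns of X: [Area, Bedrooms]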

Step 5: Lasso Regression

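# alpha is the regularization strength of the L1 penalty
lasso = Lasso(alpha=1.0)

# Fit on the raw, unscaled features
lasso.fit(X, y)

print("Lasso Coefficients:")
print(lasso.coef_)  # order matches the columns of X: [Area, Bedrooms]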
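As a baseline for the Ridge and Lasso coefficients, it can help to fit plain Linear Regression on the same table. This sketch rebuilds the dataset so it runs on its own. Since Price equals Area / 20 in every row, the unregularized fit is expected to put essentially all the weight on Area.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "Area": [1000, 1200, 1500, 1800],
    "Bedrooms": [2, 3, 3, 4],
    "Price": [50, 60, 75, 90],
})
X = df[["Area", "Bedrooms"]]
y = df["Price"]

# Unregularized baseline for comparison with the Ridge and Lasso fits
ols = LinearRegression().fit(X, y)

print("Linear Regression Coefficients:")
print(ols.coef_)       # expected to be close to [0.05, 0]
print(ols.intercept_)  # expected to be close to 0
```

Because the Area values are in the thousands, its coefficient is tiny, which is one reason the penalty barely affects it on unscaled data.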

Understanding alpha

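In scikit-learn, the alpha parameter plays the role of lambda (λ).

For Ridge, alpha is exactly the λ in the Ridge formula. For Lasso, scikit-learn divides the RSS term by 2n (n = number of samples), so alpha corresponds to λ / (2n) rather than to λ itself.

Higher alpha:

  • Stronger regularization

  • Smaller coefficients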

Why Feature Scaling is Important

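Ridge and Lasso are sensitive to feature scales. The penalty depends on the size of each coefficient, and the size of a coefficient depends on the units of its feature, so a feature measured in large units (and therefore given a small coefficient) is penalized less than a feature measured in small units.

So feature scaling, for example standardization, is recommended before applying these algorithms.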
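One way to see the scale sensitivity is to change the units of a single feature and refit. The sketch below uses the example dataset as NumPy arrays and assumes standardization with scikit-learn's `StandardScaler` inside a pipeline; other scalers would work similarly.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# The example dataset: columns are Area, Bedrooms
X = np.array([[1000, 2], [1200, 3], [1500, 3], [1800, 4]], dtype=float)
y = np.array([50, 60, 75, 90], dtype=float)

# Same data with Area expressed in thousands (a pure change of units)
X_rescaled = X.copy()
X_rescaled[:, 0] /= 1000.0

# Without scaling, the change of units changes the fitted model
raw_a = Ridge(alpha=1.0).fit(X, y).predict(X)
raw_b = Ridge(alpha=1.0).fit(X_rescaled, y).predict(X_rescaled)

# With standardization in a pipeline, the units no longer matter
pipe_a = make_pipeline(StandardScaler(), Ridge(alpha=1.0)).fit(X, y)
pipe_b = make_pipeline(StandardScaler(), Ridge(alpha=1.0)).fit(X_rescaled, y)
scaled_a = pipe_a.predict(X)
scaled_b = pipe_b.predict(X_rescaled)

print("Raw, original units: ", np.round(raw_a, 2))
print("Raw, rescaled units: ", np.round(raw_b, 2))
print("Scaled, original:    ", np.round(scaled_a, 2))
print("Scaled, rescaled:    ", np.round(scaled_b, 2))
```

The two raw fits should disagree noticeably, while the two pipeline fits should give the same predictions. Putting the scaler inside the pipeline also means it is fitted only on the data passed to `fit`.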

Advantages of Ridge Regression

  • Reduces overfitting

  • Handles multicollinearity

  • Keeps all features

Advantages of Lasso Regression

  • Performs feature selection

  • Reduces unnecessary features

  • Creates simpler models

Limitations

  • λ must be chosen carefully, usually by cross-validation

  • Too much regularization may cause underfitting

  • Feature scaling is usually required
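The first and third limitations are commonly handled together. Here is a minimal sketch, assuming scikit-learn's `RidgeCV` and `LassoCV` estimators and a hypothetical grid of candidate values. The data is synthetic.

```python
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: 6 features, only the first two matter
X = rng.normal(size=(200, 6))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=200)

# Candidate regularization strengths on a log scale (an arbitrary grid)
alphas = np.logspace(-3, 3, 13)

# Scale first, then let cross-validation pick alpha from the grid
ridge_cv = make_pipeline(StandardScaler(), RidgeCV(alphas=alphas)).fit(X, y)
lasso_cv = make_pipeline(StandardScaler(), LassoCV(alphas=alphas, cv=5)).fit(X, y)

print("Ridge alpha:", ridge_cv[-1].alpha_)
print("Lasso alpha:", lasso_cv[-1].alpha_)
```

The selected values depend on the data and the grid, so they should be treated as a starting point rather than a fixed answer.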

Real-World Applications

Application | Usage
Finance     | Stock prediction
Healthcare  | Disease prediction
Marketing   | Sales forecasting
Real Estate | House price prediction

Important Points

1. Ridge and Lasso are regularization techniques.

2. Ridge uses L2 penalty.

3. Lasso uses L1 penalty.

4. Lasso can perform feature selection.

5. Lambda controls regularization strength.

Summary

Ridge and Lasso Regression are regularized versions of Linear Regression used to reduce overfitting and improve performance on new data. Ridge Regression shrinks coefficient values without eliminating any, while Lasso Regression can set the coefficients of less important features exactly to zero, which removes them from the model.

