Logistic Regression

Logistic Regression is a supervised machine learning algorithm used for classification problems.

Unlike Linear Regression:

Which predicts continuous values

Logistic Regression predicts:

Categories or classes

Real-Life Examples

Application Prediction
Email Spam Detection Spam / Not Spam
Disease Prediction Disease / No Disease
Loan Approval Approved / Rejected
Student Result Pass / Fail

Why Logistic Regression is Needed

Suppose we want to predict:

Whether a student pass or fail

Possible outputs are:

0 → Fail
1 → Pass

Linear Regression is not suitable because:

  • It predicts continuous values

  • Output may become less than 0 or greater than 1

Logistic Regression solves this problem.

Output of Logistic Regression

Logistic Regression predicts:

Probability values

between:

0 and 1

Example:

Probability Prediction
0.90 Pass
0.20 Fail

Logistic Function (Sigmoid Function)

Logistic Regression uses the Sigmoid Function.

The sigmoid function converts values into probabilities.

Sigmoid Function Formula

P(y) = 1 / (1 + e^-z)

Where:

z = b0 + b1x

Understanding Sigmoid Curve

The sigmoid function produces an S-shaped curve.

Output always remains:

Between 0 and 1

This makes it suitable for classification problems.

Classification Decision

Usually:

If probability >= 0.5 → Class 1
If probability < 0.5 → Class 0

Example Dataset

Study Hours Pass
1 0
2 0
3 0
4 1
5 1
6 1

Where:

  • 0 = Fail

  • 1 = Pass

How Logistic Regression Learns

The algorithm:

  • Learns patterns from data

  • Finds decision boundary

  • Predicts probabilities

  • Classifies outputs

Decision Boundary

A decision boundary separates classes.

Example:

Study Hours < 4 → Fail
Study Hours >= 4 → Pass

Suppose:

z = -4 + 1.5x

Predict for:

x = 4

Step 1: Calculate z

z = -4 + 1.5(4)
z = -4 + 6
z = 2

Step 2: Apply Sigmoid Function

Formula:

P(y) = 1 / (1 + e^-z)

Substitute:

P(y) = 1 / (1 + e^-2)

Approximate value:

P(y) ≈ 0.88

Final Prediction

Since:

0.88 > 0.5

Prediction becomes:

Class = 1 (Pass)

Practical Example Using Python

Step 1: Import Libraries

import pandas as pd
from sklearn.linear_model import LogisticRegression

Step 2: Create Dataset

data = {
"Hours": [1, 2, 3, 4, 5, 6],
"Pass": [0, 0, 0, 1, 1, 1]
}

df = pd.DataFrame(data)

print(df)

Step 3: Define Features and Target

X = df[["Hours"]]

y = df["Pass"]

Step 4: Train Model

model = LogisticRegression()

model.fit(X, y)

Step 5: Predict Class

Predict for:

Hours = 4
prediction = model.predict([[4]])

print(prediction)

Expected Output

[1]

Step 6: Predict Probability

probability = model.predict_proba([[4]])

print(probability)

Example Output

[[0.12 0.88]]

Meaning:

  • 12% probability of Fail

  • 88% probability of Pass

Understanding predict_proba()

Output format:

[Probability of Class 0, Probability of Class 1]

Advantages

  • Simple and fast

  • Works well for binary classification

  • Produces probability output

  • Easy to interpret

Limitations

  • Works best for linear decision boundaries

  • Sensitive to outliers

  • Not suitable for highly complex data

Real-World Applications

Industry Usage
Healthcare Disease Prediction
Banking Fraud Detection
Education Pass/Fail Prediction
Marketing Customer Purchase Prediction

Important Points

1. Logistic Regression is used for classification problems.

2. Output is probability between 0 and 1.

3. Sigmoid function converts values into probabilities.

4. Decision boundary separates classes.

5. Logistic Regression is widely used for binary classification.

Summary

Logistic Regression is a supervised learning classification algorithm used to predict categorical outcomes. It uses the sigmoid function to convert predictions into probabilities and is commonly used for binary classification problems such as spam detection, disease prediction, and pass/fail prediction.

Keywords

Logistic Regression, Logistic Regression in Machine Learning, Classification Algorithm, Binary Classification, Sigmoid Function, Logistic Function, Probability Prediction, Binary Classifier, Supervised Learning Classification, Classification using Python, Logistic Regression using Scikit Learn, Machine Learning Classification, Decision Boundary, Predict Probability, Binary Outcome Prediction

Previous Topic Example: MLR Next Topic Example: LR