Simple Linear Regression

Simple Linear Regression is a supervised machine learning algorithm used to predict a continuous numerical value using one independent variable (input feature).

It tries to find the best straight-line relationship between:

  • Input Variable (X)
  • Output Variable (Y)

Real-Life Example

Suppose we want to predict:

  • Student marks based on study hours
  • House price based on area
  • Salary based on years of experience

Example Dataset

Study Hours   Marks
1             10
2             20
3             30
4             40
5             50

The relationship is:

More study hours → Higher marks

Goal of Simple Linear Regression

The goal is to find the straight line that fits the data best, so that its predictions are as close as possible to the actual output values.

Equation of Simple Linear Regression

y = mx + b

Where:

  • y → Predicted output
  • x → Input feature
  • m → Slope of line
  • b → Intercept

Understanding the Equation

Slope (m)

Slope shows:

How much y changes when x increases by one unit

Intercept (b)

Intercept is:

The value of y when x = 0
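
The two definitions above can be checked with a few lines of Python. This is only a small sketch: the function name `predict` and the values m = 10 and b = 5 are assumptions made up for illustration, not values from the dataset.

```python
def predict(x, m, b):
    """Predicted y for input x on the line y = m*x + b."""
    return m * x + b

m, b = 10, 5  # hypothetical slope and intercept, for illustration only

# At x = 0 the prediction is exactly the intercept
print(predict(0, m, b))                     # 5

# Moving x up by one unit changes the prediction by the slope
print(predict(4, m, b) - predict(3, m, b))  # 10
```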

Best Fit Line

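Simple Linear Regression fits the straight line that minimizes the sum of squared differences between the actual and predicted values (ordinary least squares). The line does not have to pass through every data point; it only has to be as close to all of them as possible.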
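
For one input feature, the least-squares slope and intercept can be computed by hand. The sketch below applies the standard closed-form formulas to the five-row example dataset from earlier; the variable names are my own, and this is meant as an illustration of the idea rather than a description of how any particular library computes it.

```python
hours = [1, 2, 3, 4, 5]
marks = [10, 20, 30, 40, 50]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(marks) / n

# Slope: sum of cross-deviations divided by sum of squared x-deviations
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, marks))
s_xx = sum((x - mean_x) ** 2 for x in hours)
m = s_xy / s_xx

# Intercept: the fitted line passes through the point (mean_x, mean_y)
b = mean_y - m * mean_x

print(m, b)  # 10.0 0.0
```

Because this dataset lies exactly on a line, the fitted line reproduces every point. With real data the line would only pass near the points.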

Visualization

Data Points → Actual values
Line → Predicted relationship

Practical Example Using Python

Problem Statement

Predict student marks based on study hours.

Step 1: Import Libraries

import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

Step 2: Create Dataset

data = {
    "Hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "Marks": [10, 20, 30, 40, 50, 60, 70, 80]
}

df = pd.DataFrame(data)

print(df)

Output

   Hours  Marks
0      1     10
1      2     20
2      3     30
...

Step 3: Visualize Dataset

plt.scatter(df["Hours"], df["Marks"])

plt.xlabel("Study Hours")
plt.ylabel("Marks")

plt.title("Study Hours vs Marks")

plt.show()

Observation

The points follow a straight-line pattern.

Step 4: Define Features and Target

X = df[["Hours"]]
y = df["Marks"]

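Where:

  • X → Input feature (double brackets keep it as a two-dimensional DataFrame, which scikit-learn expects for features)
  • y → Target variable (a one-dimensional Series)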

Step 5: Split Dataset

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,
    random_state=42
)
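
To see what an 80/20 split means for this dataset, here is a rough sketch using only the standard library. `simple_split` is a hypothetical helper written for this illustration. It does not reproduce scikit-learn's shuffling order, so the rows it picks will generally differ from those chosen by `train_test_split`.

```python
import math
import random

def simple_split(rows, test_size=0.2, seed=42):
    """Shuffle rows and hold out a fraction for testing (hypothetical helper)."""
    shuffled = rows[:]                     # copy so the input list is not changed
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_test = math.ceil(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(zip([1, 2, 3, 4, 5, 6, 7, 8],
                [10, 20, 30, 40, 50, 60, 70, 80]))

train, test = simple_split(rows)
print(len(train), len(test))  # 6 2
```

With 8 rows and test_size=0.2, two rows are held out for testing and six are used for training.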

Step 6: Train the Model

model = LinearRegression()

model.fit(X_train, y_train)

Step 7: Predict Test Values

predictions = model.predict(X_test)

print(predictions)
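
One common way to judge test predictions is to compare them with the held-out marks, for example with the mean squared error. The sketch below is hand-written for illustration: `mse` is a hypothetical helper, and the `actual` and `predicted` values are invented numbers, not output from the model above.

```python
def mse(actual, predicted):
    """Mean squared error: average of the squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

actual = [20, 60]         # hypothetical held-out marks
predicted = [22.0, 57.0]  # hypothetical model predictions

print(mse(actual, predicted))  # 6.5
```

A value of 0 would mean every prediction matched the actual mark exactly; larger values mean larger errors.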

Step 8: Predict New Value

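# Pass a DataFrame with the same column name used in training;
# this avoids scikit-learn's feature-name warning
new_prediction = model.predict(pd.DataFrame({"Hours": [9]}))

print("Predicted Marks:", new_prediction[0])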

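Expected Output (approximately)

Predicted Marks: 90.0

The result is a float, so the last decimal places may differ slightly because of floating-point rounding.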

Step 9: Visualize Regression Line

plt.scatter(df["Hours"], df["Marks"])

plt.plot(df["Hours"],
         model.predict(X),
         color="red")

plt.xlabel("Study Hours")
plt.ylabel("Marks")

plt.title("Simple Linear Regression")

plt.show()

What the Model Learned

The model learned:

Study Hours ↑ → Marks ↑

This relationship is represented using a straight line.

Model Coefficients

Slope

print(model.coef_)  # array with one slope per input feature

Intercept

print(model.intercept_)  # a single number

Understanding Predictions

Suppose:

  • Slope = 10
  • Intercept = 0

Then:

Marks = 10 × Hours + 0

If Hours = 9:

Marks = 10 × 9 + 0 = 90

Assumptions of Linear Regression

1. The relationship between X and Y is linear
2. The data has few extreme outliers
3. The errors (residuals) are normally distributed
4. The observations are independent of each other
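
An informal way to look at these assumptions is to inspect the residuals (actual value minus predicted value). The sketch below is an illustration only: `fit_line` is a hypothetical helper that uses the closed-form least-squares formulas, and the noisy marks are invented for the example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one input feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    m = s_xy / s_xx
    return m, mean_y - m * mean_x

# Hypothetical noisy marks, invented for illustration
hours = [1, 2, 3, 4, 5, 6, 7, 8]
marks = [12, 19, 31, 38, 52, 59, 71, 78]

m, b = fit_line(hours, marks)

# Residual = actual value minus the value predicted by the fitted line
residuals = [y - (m * x + b) for x, y in zip(hours, marks)]

for x, r in zip(hours, residuals):
    print(x, round(r, 2))
```

If the linear model is reasonable, the residuals should scatter around zero with no obvious trend or curve. A clear pattern, or one residual far larger than the rest, suggests that an assumption is violated.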

Real-World Applications

Application    Prediction
Education      Student marks
Real Estate    House prices
Business       Sales forecasting
Finance        Profit prediction

Advantages

  • Simple and easy to understand
  • Fast training
  • Easy visualization
  • Works well for linear data

Limitations

  • Cannot model complex nonlinear patterns
  • Sensitive to outliers
  • Assumes linear relationship
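
The sensitivity to outliers can be shown with a small experiment. In the sketch below, `fit_slope` is a hypothetical helper using the closed-form least-squares slope, and the outlier (a student who studied 8 hours but scored 10) is invented for illustration.

```python
def fit_slope(xs, ys):
    """Least-squares slope for one input feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx

hours = [1, 2, 3, 4, 5, 6, 7, 8]
clean_marks = [10, 20, 30, 40, 50, 60, 70, 80]
outlier_marks = [10, 20, 30, 40, 50, 60, 70, 10]  # last value is a made-up outlier

slope_clean = fit_slope(hours, clean_marks)
slope_outlier = fit_slope(hours, outlier_marks)

print(slope_clean)               # 10.0
print(round(slope_outlier, 2))   # about 4.17
```

A single unusual point pulls the slope from 10 down to roughly 4, which is why it is worth checking the data for outliers before fitting.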

Important Interview Points

1. Simple Linear Regression uses one independent variable.

2. It predicts continuous numerical values.

3. The regression line is represented by: y = mx + b

4. Slope represents rate of change.

5. Linear Regression works best when data has a linear relationship.

Summary

Simple Linear Regression is a supervised learning algorithm used to predict continuous numerical values using one input feature. It models the relationship between input and output variables using a straight-line equation and is widely used for prediction and forecasting tasks.

