Project 2: Sales Data Visualization using Python
Objective
In this project, we will analyze monthly sales data and visualize business performance using Python.
We will learn how companies use charts to understand:
- Monthly sales trends
- Best sales months
- Low sales months
- Category-wise sales
- Business insights from data
Cell 1: Import Required Libraries
import pandas as pd
import matplotlib.pyplot as plt
Here we are importing the required Python libraries.
pandas is used to create and manage data in table format.
matplotlib.pyplot is used to create different types of charts and graphs.
Cell 2: Create Monthly Sales Dataset
data = {
    "Month": [
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    ],
    "Sales": [
        12000, 15000, 18000, 17000, 22000, 25000,
        30000, 28000, 35000, 40000, 42000, 50000
    ]
}
Here we are creating a simple monthly sales dataset manually.
The dataset contains two columns:
- Month: month name
- Sales: sales amount for that month
We are not using any external CSV file. This makes the project easy for beginners.
Cell 3: Convert Data into DataFrame
df = pd.DataFrame(data)
print(df)
Here we convert the dictionary into a Pandas DataFrame.
A DataFrame is like a table.
In real-world business projects, sales data is usually stored in tables.
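In practice, the same table often comes from a file instead of a hand-typed dictionary. The sketch below is only an illustration: it reads a small piece of CSV text with pd.read_csv, using io.StringIO to stand in for a file so the example runs on its own. The names csv_text and df_from_csv are assumptions made for this example and are not part of the project.

```python
import io

import pandas as pd

# Hypothetical CSV content; a real project would pass a file path instead.
csv_text = """Month,Sales
Jan,12000
Feb,15000
Mar,18000
"""

# read_csv accepts a file path or any file-like object.
df_from_csv = pd.read_csv(io.StringIO(csv_text))
print(df_from_csv)
```

The result is a DataFrame with the same Month and Sales columns, so the later cells would work on it unchanged.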
Cell 4: Display First Few Records
df.head()
head() displays the first five records of the dataset by default. We can pass a number, such as head(3), to show a different count.
This helps us quickly check whether the data is created properly.
Cell 5: Check Dataset Information
df.info()
info() shows details about the dataset.
It tells us:
- column names
- number of records
- data types
- the non-null count of each column, which reveals missing values
This is useful before doing analysis.
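If we want the number of missing values directly, instead of reading it from the info() report, one option is isna().sum(). Here is a minimal sketch on the same monthly data; the name missing_per_column is chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
    "Sales": [12000, 15000, 18000, 17000, 22000, 25000,
              30000, 28000, 35000, 40000, 42000, 50000],
})

# isna() marks each missing cell as True; sum() counts them per column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

Because this dataset was typed in by hand with no gaps, both columns report 0 missing values.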
Cell 6: Statistical Summary
df.describe()
describe() gives a statistical summary of the numerical columns.
For sales data, it shows:
- count of records
- average sales
- minimum sales
- maximum sales
- standard deviation
- quartiles (25%, 50%, 75%)
This helps us understand business performance.
Cell 7: Total Sales
total_sales = df["Sales"].sum()
print("Total Sales:", total_sales)
Here we calculate total yearly sales.
sum() adds all monthly sales values.
This tells us the total business sales for the full year, which is 334000 for this dataset.
Cell 8: Average Monthly Sales
average_sales = df["Sales"].mean()
print("Average Monthly Sales:", average_sales)
Here we calculate average monthly sales.
mean() gives the average value.
This helps us understand typical monthly business performance. For this dataset, the average is about 27833.33.
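The average becomes more useful when we compare each month against it. The sketch below, which is an addition and not one of the project cells, filters the months whose sales are above the average. The name above_average is an assumption made for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
    "Sales": [12000, 15000, 18000, 17000, 22000, 25000,
              30000, 28000, 35000, 40000, 42000, 50000],
})

average_sales = df["Sales"].mean()

# Keep only the rows where sales are higher than the yearly average.
above_average = df[df["Sales"] > average_sales]
print(above_average["Month"].tolist())
# ['Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
```

All above-average months fall in the second half of the year, which matches the upward trend seen later in the charts.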
Cell 9: Highest Sales Month
highest_sales = df[df["Sales"] == df["Sales"].max()]
highest_sales
Here we find the month with the highest sales.
max() gives the maximum sales value.
Then we filter the dataset to display the month where sales were highest.
Cell 10: Lowest Sales Month
lowest_sales = df[df["Sales"] == df["Sales"].min()]
lowest_sales
Here we find the month with the lowest sales.
min() gives the minimum sales value.
This helps business teams identify weak sales months.
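As an alternative to filtering with max() and min(), pandas also provides idxmax() and idxmin(), which return the index label of the highest and lowest value. The following is a small sketch of that approach; the names best_row and worst_row are ours.

```python
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
    "Sales": [12000, 15000, 18000, 17000, 22000, 25000,
              30000, 28000, 35000, 40000, 42000, 50000],
})

# idxmax()/idxmin() give the row label; loc then fetches that single row.
best_row = df.loc[df["Sales"].idxmax()]
worst_row = df.loc[df["Sales"].idxmin()]

print("Highest:", best_row["Month"], best_row["Sales"])
print("Lowest:", worst_row["Month"], worst_row["Sales"])
```

One difference to keep in mind: if two months share the same highest value, the filter used in the cells returns both rows, while idxmax() returns only the first one.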
Cell 11: Monthly Sales Line Chart
plt.plot(df["Month"], df["Sales"], marker="o")
plt.xlabel("Month")
plt.ylabel("Sales")
plt.title("Monthly Sales Trend")
plt.show()
This line chart shows how sales changed month by month.
A line chart is useful for trend analysis.
From this chart, we can easily observe whether sales are increasing, decreasing, or fluctuating. In this dataset, sales rise overall from January to December, with small dips in April and August.
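What the line chart shows visually can also be checked with numbers. A minimal sketch, assuming we add a helper column named Change (not part of the original cells), uses diff() to compute the month-over-month difference and then lists the months where sales fell.

```python
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
    "Sales": [12000, 15000, 18000, 17000, 22000, 25000,
              30000, 28000, 35000, 40000, 42000, 50000],
})

# diff() subtracts the previous month's sales; January has no previous
# month, so its value is NaN.
df["Change"] = df["Sales"].diff()

# Months where sales dropped compared with the month before.
declines = df[df["Change"] < 0]
print(declines[["Month", "Change"]])
```

Only April and August show a negative change, so the overall direction of the year is upward.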
Cell 12: Monthly Sales Bar Chart
plt.bar(df["Month"], df["Sales"])
plt.xlabel("Month")
plt.ylabel("Sales")
plt.title("Monthly Sales Comparison")
plt.show()
This bar chart compares sales across all months.
Each bar represents one month.
Bar charts are useful when we want to compare values clearly.
Cell 13: Add Product Category Sales Data
category_data = {
    "Category": ["Electronics", "Clothing", "Grocery", "Furniture", "Books"],
    "Sales": [45000, 30000, 25000, 20000, 10000]
}
category_df = pd.DataFrame(category_data)
category_df
Now we are creating another dataset for product category sales.
This helps us understand which product category generated more sales.
The dataset contains:
- Category
- Sales
Cell 14: Category-wise Bar Chart
plt.bar(category_df["Category"], category_df["Sales"])
plt.xlabel("Product Category")
plt.ylabel("Sales")
plt.title("Category-wise Sales")
plt.show()
This chart compares sales across product categories.
It helps businesses identify best-performing product categories.
For example, if Electronics has the highest sales, the company can focus more on electronics.
Cell 15: Category-wise Pie Chart
plt.pie(
    category_df["Sales"],
    labels=category_df["Category"],
    autopct="%1.1f%%"
)
plt.title("Sales Distribution by Category")
plt.show()
This pie chart shows the percentage contribution of each category.
Pie charts are useful to understand share or proportion.
For example, Electronics contributes the highest share of total sales: 45000 out of 130000, which is about 34.6%.
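The percentages drawn on the pie chart can be reproduced as a table. The sketch below divides each category's sales by the total and rounds to one decimal place; the column name Share (%) is an assumption made for this example.

```python
import pandas as pd

category_df = pd.DataFrame({
    "Category": ["Electronics", "Clothing", "Grocery", "Furniture", "Books"],
    "Sales": [45000, 30000, 25000, 20000, 10000],
})

total = category_df["Sales"].sum()

# Each category's sales as a percentage of the total.
category_df["Share (%)"] = (category_df["Sales"] / total * 100).round(1)
print(category_df)
```

Having the shares in a column makes it easy to sort, filter, or export them, which a chart alone does not allow.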
Cell 16: Add Profit Column
df["Profit"] = [
    2000, 2500, 3000, 2800, 4000, 4500,
    5500, 5000, 7000, 8000, 8500, 10000
]
df
Here we are adding one more column called Profit.
In real business, sales alone are not enough.
We also need to analyze profit.
Sometimes sales may be high, but profit may be low.
Cell 17: Sales vs Profit Line Chart
plt.plot(df["Month"], df["Sales"], marker="o", label="Sales")
plt.plot(df["Month"], df["Profit"], marker="o", label="Profit")
plt.xlabel("Month")
plt.ylabel("Amount")
plt.title("Sales vs Profit Trend")
plt.legend()
plt.show()
This chart compares sales and profit month by month.
legend() shows labels for each line.
This helps us understand whether profit is growing along with sales.
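Because sales and profit are on very different scales, the chart alone does not show clearly whether profit keeps pace with sales. One way to check is the profit margin, meaning profit as a percentage of sales. This is a sketch under that definition; the column name Margin (%) is ours.

```python
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
    "Sales": [12000, 15000, 18000, 17000, 22000, 25000,
              30000, 28000, 35000, 40000, 42000, 50000],
    "Profit": [2000, 2500, 3000, 2800, 4000, 4500,
               5500, 5000, 7000, 8000, 8500, 10000],
})

# Profit margin: the share of each month's sales that is kept as profit.
df["Margin (%)"] = (df["Profit"] / df["Sales"] * 100).round(1)
print(df[["Month", "Margin (%)"]])
```

In this dataset the margin moves from about 16.7% in January to 20.0% in December, so profit grows slightly faster than sales.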
Cell 18: Find Business Insights
print("Business Insights:")
print("1. Total yearly sales are:", total_sales)
print("2. Average monthly sales are:", average_sales)
print("3. Highest sales happened in:", highest_sales["Month"].values[0])
print("4. Lowest sales happened in:", lowest_sales["Month"].values[0])
print("5. Sales show an increasing trend towards the end of the year.")
Here we summarize our findings.
This is very important in data analysis.
The goal is not just to create charts, but to understand business meaning from the data.
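Insight 5 above is written by hand. As one possible way to support it with numbers, the sketch below compares total sales in the first six months with the last six months. The names first_half, second_half, and growth_pct are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
    "Sales": [12000, 15000, 18000, 17000, 22000, 25000,
              30000, 28000, 35000, 40000, 42000, 50000],
})

# Split the year into two halves by row position.
first_half = int(df["Sales"].iloc[:6].sum())
second_half = int(df["Sales"].iloc[6:].sum())

# Percentage increase from the first half to the second half.
growth_pct = round((second_half - first_half) / first_half * 100, 1)

print("First half sales:", first_half)
print("Second half sales:", second_half)
print("Growth (%):", growth_pct)
```

Second-half sales (225000) are a little more than double first-half sales (109000), which backs up the statement about an increasing trend.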
Complete Code in One Place
import pandas as pd
import matplotlib.pyplot as plt
data = {
    "Month": [
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    ],
    "Sales": [
        12000, 15000, 18000, 17000, 22000, 25000,
        30000, 28000, 35000, 40000, 42000, 50000
    ]
}
df = pd.DataFrame(data)
print("Monthly Sales Dataset:")
print(df)
print("\nFirst Five Records:")
print(df.head())
print("\nDataset Information:")
df.info()  # info() prints its own report and returns None, so no print() is needed
print("\nStatistical Summary:")
print(df.describe())
total_sales = df["Sales"].sum()
print("\nTotal Sales:", total_sales)
average_sales = df["Sales"].mean()
print("Average Monthly Sales:", average_sales)
highest_sales = df[df["Sales"] == df["Sales"].max()]
print("\nHighest Sales Month:")
print(highest_sales)
lowest_sales = df[df["Sales"] == df["Sales"].min()]
print("\nLowest Sales Month:")
print(lowest_sales)
plt.plot(df["Month"], df["Sales"], marker="o")
plt.xlabel("Month")
plt.ylabel("Sales")
plt.title("Monthly Sales Trend")
plt.show()
plt.bar(df["Month"], df["Sales"])
plt.xlabel("Month")
plt.ylabel("Sales")
plt.title("Monthly Sales Comparison")
plt.show()
category_data = {
    "Category": ["Electronics", "Clothing", "Grocery", "Furniture", "Books"],
    "Sales": [45000, 30000, 25000, 20000, 10000]
}
category_df = pd.DataFrame(category_data)
print("\nCategory Sales Dataset:")
print(category_df)
plt.bar(category_df["Category"], category_df["Sales"])
plt.xlabel("Product Category")
plt.ylabel("Sales")
plt.title("Category-wise Sales")
plt.show()
plt.pie(
    category_df["Sales"],
    labels=category_df["Category"],
    autopct="%1.1f%%"
)
plt.title("Sales Distribution by Category")
plt.show()
df["Profit"] = [
    2000, 2500, 3000, 2800, 4000, 4500,
    5500, 5000, 7000, 8000, 8500, 10000
]
print("\nSales with Profit:")
print(df)
plt.plot(df["Month"], df["Sales"], marker="o", label="Sales")
plt.plot(df["Month"], df["Profit"], marker="o", label="Profit")
plt.xlabel("Month")
plt.ylabel("Amount")
plt.title("Sales vs Profit Trend")
plt.legend()
plt.show()
print("\nBusiness Insights:")
print("1. Total yearly sales are:", total_sales)
print("2. Average monthly sales are:", average_sales)
print("3. Highest sales happened in:", highest_sales["Month"].values[0])
print("4. Lowest sales happened in:", lowest_sales["Month"].values[0])
print("5. Sales show an increasing trend towards the end of the year.")
Summary
From this project, we can observe that sales increase towards the end of the year, rising from 12000 in January to 50000 in December.
The business can use this information to:
- plan inventory
- improve marketing
- identify strong months
- identify weak months
- focus on profitable categories
- make better business decisions