Pandas: Data Visualization
Plotting Basic Charts
Pandas makes data visualization ridiculously easy—so easy that you’ll wonder why you ever suffered through Excel charts. Here’s how to get started:
- Line Plots using .plot()
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({"Year": [2018, 2019, 2020, 2021], "Sales": [200, 250, 400, 600]})
df.plot(x="Year", y="Sales", kind="line")
plt.show()
A line plot is perfect for showing trends over time, like your caffeine intake while debugging code.
- Histograms using .hist()
df["Sales"].hist(bins=5)
plt.show()
Histograms help visualize distribution, so you can quickly spot if your data is normal… or totally skewed.
- Box Plots using .boxplot()
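If you want a number to back up what the histogram suggests, Pandas can compute sample skewness directly. The snippet below is a minimal sketch that reuses the Sales column from above; the variable name `sales_skew` is just for illustration. A positive value points to a longer right tail, a negative one to a longer left tail.

```python
import pandas as pd

df = pd.DataFrame({"Year": [2018, 2019, 2020, 2021], "Sales": [200, 250, 400, 600]})

# Series.skew() returns the sample skewness of the column.
sales_skew = df["Sales"].skew()

# Positive: longer right tail. Negative: longer left tail.
direction = "right-skewed" if sales_skew > 0 else "left-skewed or symmetric"
print(f"Skewness: {sales_skew:.2f} ({direction})")
```

With only four rows this is a toy check; on real data, treat it as a companion to the histogram rather than a replacement for it.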
df.boxplot(column=["Sales"])
plt.show()
Box plots reveal outliers, because there’s always that one absurd data point ruining everything.
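It helps to know what a box plot counts as an outlier. By default, Matplotlib draws whiskers out to 1.5 times the interquartile range (IQR) beyond the quartiles and marks anything further away as an individual point. The sketch below reproduces that rule by hand; the sales figures, including the extreme 5000, are made up for illustration.

```python
import pandas as pd

# Hypothetical sales figures with one deliberately extreme value.
sales = pd.Series([200, 250, 400, 600, 5000], name="Sales")

q1 = sales.quantile(0.25)  # first quartile
q3 = sales.quantile(0.75)  # third quartile
iqr = q3 - q1              # interquartile range

# Points outside these fences are what the box plot draws as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = sales[(sales < lower_fence) | (sales > upper_fence)]
print(outliers.tolist())  # [5000]
```

Here the quartiles are 250 and 600, so the upper fence sits at 1125 and only the 5000 entry falls outside it.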
- Bar and Scatter Plots
df.plot(kind="bar", x="Year", y="Sales")
df.plot(kind="scatter", x="Year", y="Sales")
plt.show()
Bar plots are great for categorical data, while scatter plots are perfect for spotting relationships.
Using Pandas with Matplotlib and Seaborn
Matplotlib is the backbone of data visualization in Python, and Seaborn is the stylish best friend that makes everything look better.
- Introduction to Matplotlib and Seaborn
import seaborn as sns
sns.set_style("darkgrid")
This instantly makes your plots less ugly.
- Creating Pandas Visualizations with Matplotlib Backend
df.plot(x="Year", y="Sales", kind="line")
plt.title("Sales Over Time")
plt.show()
Pandas uses Matplotlib under the hood, so you can mix and match.
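One practical consequence: .plot() hands back the Matplotlib Axes object it drew on, so you can keep customizing it with ordinary Matplotlib methods instead of relying on whichever figure plt considers current. A minimal sketch, reusing the sales DataFrame from earlier:

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"Year": [2018, 2019, 2020, 2021], "Sales": [200, 250, 400, 600]})

# DataFrame.plot returns the Matplotlib Axes it drew on.
ax = df.plot(x="Year", y="Sales", kind="line", marker="o")

# From here on it is plain Matplotlib.
ax.set_title("Sales Over Time")
ax.set_xlabel("Year")
ax.set_ylabel("Sales")
ax.set_xticks(df["Year"].tolist())  # one tick per year, no fractional years

fig = ax.get_figure()  # the parent Figure, handy for saving or resizing
```

Call plt.show() afterwards to display the figure, or fig.savefig(...) to write it to disk.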
- Enhancing Pandas Plots with Seaborn Styles
sns.lineplot(data=df, x="Year", y="Sales")
plt.show()
Seaborn brings cleaner aesthetics and default settings that don’t look like they were made in the ‘90s.
Customizing Plots and Adding Labels
Your plot is useless if no one knows what it means. Add some context:
- Adding Titles, Labels, and Legends
df.plot(x="Year", y="Sales", kind="line")
plt.title("Annual Sales Trend")
plt.xlabel("Year")
plt.ylabel("Sales")
plt.legend(["Revenue"])
plt.show()
This prevents people from guessing what your data means.
- Adjusting Figure Size and Color Schemes
df.plot(kind="bar", x="Year", y="Sales", figsize=(10, 5), color="#FF5733")
plt.show()
Pass figsize straight to .plot(), since Pandas opens its own figure, and pick colors that keep the chart readable.
- Using Subplots for Multiple Visualizations
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(12, 5))
df.plot(ax=axes[0], kind="bar", x="Year", y="Sales")
df.plot(ax=axes[1], kind="line", x="Year", y="Sales")
plt.show()
Multiple plots? No problem. Subplots let you visualize different angles in one go.
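Each subplot is its own Axes, so titles and labels are set per panel rather than through plt. The following is a sketch of one way to label a two-panel figure; the panel titles and the overall title are placeholders, not anything prescribed by Pandas.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"Year": [2018, 2019, 2020, 2021], "Sales": [200, 250, 400, 600]})

fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(12, 5))

# Same data, two views, each drawn on its own Axes.
df.plot(ax=axes[0], kind="bar", x="Year", y="Sales", legend=False)
df.plot(ax=axes[1], kind="line", x="Year", y="Sales", marker="o", legend=False)

axes[0].set_title("Sales by Year")
axes[1].set_title("Sales Trend")
for ax in axes:
    ax.set_xlabel("Year")
    ax.set_ylabel("Sales")

fig.suptitle("Annual Sales")  # one title for the whole figure
fig.tight_layout()            # keep labels from overlapping
```

Finish with plt.show() to display both panels side by side.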
Advanced Visualization Techniques
Let’s take things up a notch.
- Creating Multi-Faceted Plots with sns.FacetGrid
g = sns.FacetGrid(df, col="Year")
g.map_dataframe(sns.histplot, x="Sales")
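plt.show()
Great for comparing distributions across different categories. Faceting pays off when each category has many rows; in this four-row DataFrame every Year panel holds a single value.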
- Visualizing Correlations with Heatmaps
corr_matrix = df.corr()
sns.heatmap(corr_matrix, annot=True, cmap="coolwarm")
plt.show()
Heatmaps show correlations at a glance, which is useful for spotting relationships in large datasets.
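Real DataFrames usually carry text columns too, and correlation only makes sense for numeric ones. One way to handle that, sketched below with a made-up Region column, is to select the numeric columns before calling .corr(). The resulting matrix is what you would hand to sns.heatmap.

```python
import pandas as pd

df = pd.DataFrame({
    "Year": [2018, 2019, 2020, 2021],
    "Sales": [200, 250, 400, 600],
    "Region": ["North", "South", "North", "South"],  # hypothetical text column
})

# Keep only numeric columns so .corr() has nothing to choke on.
numeric_df = df.select_dtypes(include="number")
corr_matrix = numeric_df.corr()  # Pearson correlation by default

year_sales_corr = corr_matrix.loc["Year", "Sales"]
print(round(year_sales_corr, 2))  # 0.97
```

A value this close to 1 just says sales rose almost in step with the year in this toy data.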
- Combining Multiple Plots for Storytelling
fig, ax = plt.subplots()
sns.lineplot(data=df, x="Year", y="Sales", ax=ax, label="Sales Trend")
sns.scatterplot(data=df, x="Year", y="Sales", color="red", ax=ax)
plt.show()
Layer different plots together to build a compelling narrative.
Hands-On Exercise
- Create Basic Plots: Generate line, histogram, and box plots using Pandas.
- Enhance Visualizations with Matplotlib & Seaborn: Customize plots by adding labels, legends, and styling.
- Use Advanced Techniques: Create heatmaps and multi-faceted plots using Seaborn.
- Combine Plots for Insights: Develop a visualization dashboard using multiple Pandas plots.
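For the last exercise, here is one possible starting point rather than a model answer: a 2×2 grid that puts four Pandas plots of the same sales data on a single figure. The layout and panel titles are arbitrary choices; swap in your own data and chart types.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"Year": [2018, 2019, 2020, 2021], "Sales": [200, 250, 400, 600]})

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(12, 8))

# Top row: how sales change from year to year.
df.plot(ax=axes[0, 0], x="Year", y="Sales", kind="line", title="Trend", legend=False)
df.plot(ax=axes[0, 1], x="Year", y="Sales", kind="bar", title="By Year", legend=False)

# Bottom row: how the sales values are distributed.
df["Sales"].plot(ax=axes[1, 0], kind="hist", bins=5, title="Distribution")
df["Sales"].plot(ax=axes[1, 1], kind="box", title="Spread")

fig.tight_layout()  # avoid overlapping titles and tick labels
```

Add plt.show() at the end to display the dashboard.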