Data Visualization with Matplotlib
Overview
Graphs and charts make data easier to understand. In this lesson, you'll create simple line plots, bar charts, scatter plots, and histograms with matplotlib, and customize colors, labels, and titles so your visualizations are clear and informative.
What You Will Learn in This Lesson
By the end of this lesson, you will know:
- Why visualize data: Making data easier to understand.
- matplotlib basics: Creating simple plots and charts.
- Plot types: Line plots, bar charts, scatter plots, histograms.
- Customization: Adding labels, titles, and colors.
- Displaying plots: Showing and saving visualizations.
Why Visualize Data?
Visualizations make data easier to understand:
Benefits of Visualization
- Makes patterns and trends visible
- Easier to understand than raw numbers
- Helps communicate findings
- Enables data-driven decisions
- Identifies outliers and anomalies
Creating Your First Plot
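The basic steps are: prepare the data, create the plot, add labels and a title, then display it. (If matplotlib is not installed yet, see Installing matplotlib below.)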
import matplotlib.pyplot as plt
# Prepare data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
# Create plot
plt.plot(x, y)
# Add labels and title
plt.xlabel('X Axis')
plt.ylabel('Y Axis')
plt.title('My First Plot')
# Display plot
plt.show()
Installing matplotlib
First, install matplotlib using pip:
# In your terminal:
pip install matplotlib
# Or, on systems where pip refers to Python 2:
pip3 install matplotlib
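To confirm the installation worked, one quick check is to import the package and print its version. This is a minimal sketch; the variable name version is only illustrative.

```python
import matplotlib

# An ImportError on the line above means the install did not succeed
version = matplotlib.__version__
print("matplotlib version:", version)
```

If this prints a version number, matplotlib is ready to use.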
Import Convention
By convention, most Python programmers import matplotlib.pyplot under the short alias plt:
import matplotlib.pyplot as plt
Common Plot Types
Different kinds of data call for different plots. The examples below show a line plot, a bar chart, a scatter plot, and a histogram:
Line Plot
import matplotlib.pyplot as plt
# Data
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May']
sales = [100, 120, 140, 110, 150]
# Create line plot
plt.plot(months, sales, marker='o', color='blue', linewidth=2)
plt.xlabel('Month')
plt.ylabel('Sales')
plt.title('Monthly Sales Trend')
plt.grid(True)
plt.show()
Bar Chart
import matplotlib.pyplot as plt
# Data
categories = ['A', 'B', 'C', 'D']
values = [23, 45, 56, 78]
# Create bar chart
plt.bar(categories, values, color=['red', 'blue', 'green', 'orange'])
plt.xlabel('Category')
plt.ylabel('Value')
plt.title('Category Comparison')
plt.show()
Scatter Plot
import matplotlib.pyplot as plt
# Data
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 4, 5, 7, 8, 9, 11, 12, 13, 15]
# Create scatter plot
plt.scatter(x, y, s=100, alpha=0.6, color='purple')
plt.xlabel('X Values')
plt.ylabel('Y Values')
plt.title('Relationship Between X and Y')
plt.show()
Histogram
import matplotlib.pyplot as plt
# Data
scores = [65, 70, 75, 80, 85, 90, 95, 100, 85, 90, 75, 80, 85]
# Create histogram
plt.hist(scores, bins=10, color='skyblue', edgecolor='black')
plt.xlabel('Score')
plt.ylabel('Frequency')
plt.title('Score Distribution')
plt.show()
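To see what the bins argument does, it can help to look at the values plt.hist returns: the count in each bin, the bin edges, and the drawn bars. The sketch below reuses the scores list from the histogram example; the choice of bins=7 and the Agg backend line are assumptions made so the script runs without opening a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a window
import matplotlib.pyplot as plt

scores = [65, 70, 75, 80, 85, 90, 95, 100, 85, 90, 75, 80, 85]

plt.figure()
# plt.hist returns the bar heights, the bin edges, and the bar objects
counts, bin_edges, patches = plt.hist(scores, bins=7,
                                      color='skyblue', edgecolor='black')

print("Counts per bin:", counts)
print("Bin edges:", bin_edges)
```

There is always one more bin edge than there are bins, and the counts add up to the number of scores.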
Choosing the Right Plot
- Line Plot: Use for trends over time or continuous data
- Bar Chart: Use for comparing discrete categories
- Scatter Plot: Use to show relationships between two variables
- Histogram: Use to show the distribution of a single variable
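One way to compare the four plot types is to draw them side by side in a single figure. The following is a sketch only: it reuses data from the examples in this lesson, and the 2x2 layout, titles, and Agg backend line are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a window
import matplotlib.pyplot as plt

months = ['Jan', 'Feb', 'Mar', 'Apr', 'May']
sales = [100, 120, 140, 110, 150]
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 4, 5, 7, 8, 9, 11, 12, 13, 15]
scores = [65, 70, 75, 80, 85, 90, 95, 100, 85, 90, 75, 80, 85]

# A 2x2 grid of axes, one per plot type
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

axes[0, 0].plot(months, sales, marker='o')   # trend over time
axes[0, 0].set_title('Line Plot')

axes[0, 1].bar(months, sales)                # compare categories
axes[0, 1].set_title('Bar Chart')

axes[1, 0].scatter(x, y)                     # relationship between two variables
axes[1, 0].set_title('Scatter Plot')

axes[1, 1].hist(scores, bins=7)              # distribution of one variable
axes[1, 1].set_title('Histogram')

fig.tight_layout()
```

To view the result, remove the Agg line and call plt.show(), or save it with fig.savefig().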
Customizing Plots
Make your plots more informative and visually appealing:
import matplotlib.pyplot as plt
# Data
x = [1, 2, 3, 4, 5]
y1 = [2, 4, 6, 8, 10]
y2 = [1, 3, 5, 7, 9]
# Create figure with custom size
plt.figure(figsize=(10, 6))
# Plot multiple lines
plt.plot(x, y1, label='Series 1', marker='o', linewidth=2)
plt.plot(x, y2, label='Series 2', marker='s', linewidth=2)
# Customize
plt.xlabel('X Axis Label', fontsize=12)
plt.ylabel('Y Axis Label', fontsize=12)
plt.title('Customized Plot', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
Common Customizations
- Colors: Use color='red' or hex codes like color='#FF5733'
- Markers: Use marker='o' (circle), 's' (square), '^' (triangle)
- Line width: Use linewidth=2 for thicker lines
- Transparency: Use alpha=0.5 for semi-transparent plots
- Figure size: Use plt.figure(figsize=(10, 6)) for custom dimensions
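The options in the list above can be combined in a single call. The sketch below applies a hex color, a triangle marker, a thicker line, and transparency to one line, then reads the settings back from the line object; the Agg backend line is an assumption so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a window
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

fig = plt.figure(figsize=(10, 6))

# plt.plot returns a list of line objects; this call draws one line
lines = plt.plot(x, y, color='#FF5733', marker='^', linewidth=2, alpha=0.5)
line = lines[0]

# Read the settings back from the line object
print(line.get_color(), line.get_marker(),
      line.get_linewidth(), line.get_alpha())
```

Try changing one argument at a time to see its effect on the plot.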
Practice: Creating Plots
Try It Yourself
Try creating a simple plot with your own data. With matplotlib installed, calling plt.plot(x, y) followed by plt.show() displays the plot.
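As a possible starting point for this practice, here is a small sketch. The days and temperatures values are made up for illustration, as are the labels and title; replace them with your own data.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a window
import matplotlib.pyplot as plt

# Made-up example data; replace with your own
days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri']
temperatures = [18, 21, 19, 23, 22]

plt.figure()
plt.plot(days, temperatures, marker='o', color='green')
plt.xlabel('Day')
plt.ylabel('Temperature')
plt.title('Temperature This Week')

ax = plt.gca()  # the axes that were just drawn on
print("Title:", ax.get_title())
```

Once this runs, remove the Agg line and add plt.show() at the end to see the plot in a window.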
Advanced Customization
matplotlib can also place several plots in one figure and save figures to files:
Multiple Plots in One Figure (Subplots)
import matplotlib.pyplot as plt
# Create figure with subplots
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# First subplot: Line plot
x1 = [1, 2, 3, 4, 5]
y1 = [2, 4, 6, 8, 10]
ax1.plot(x1, y1, marker='o')
ax1.set_title('Line Plot')
ax1.set_xlabel('X')
ax1.set_ylabel('Y')
# Second subplot: Bar chart
categories = ['A', 'B', 'C']
values = [10, 20, 15]
ax2.bar(categories, values, color=['red', 'blue', 'green'])
ax2.set_title('Bar Chart')
ax2.set_xlabel('Category')
ax2.set_ylabel('Value')
plt.tight_layout() # Adjust spacing
plt.show()
Saving a Plot to a File
import matplotlib.pyplot as plt
# Create a plot
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
plt.plot(x, y, marker='o')
plt.xlabel('X Axis')
plt.ylabel('Y Axis')
plt.title('My Plot')
# Save to file (instead of showing)
plt.savefig('my_plot.png', dpi=300, bbox_inches='tight')
# The file extension selects the format: PNG, PDF, SVG, JPG
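The same figure can be saved in several formats by changing only the file extension. The sketch below writes PNG, PDF, and SVG copies into a temporary folder; the folder, the loop, and the Agg backend line are assumptions for illustration, not part of the lesson's example.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a window
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

plt.figure()
plt.plot(x, y, marker='o')
plt.title('My Plot')

# Hypothetical output folder; use any folder you can write to
out_dir = tempfile.mkdtemp()
saved = []
for ext in ('png', 'pdf', 'svg'):
    path = os.path.join(out_dir, 'my_plot.' + ext)
    # savefig picks the output format from the file extension
    plt.savefig(path, dpi=300, bbox_inches='tight')
    saved.append(path)

print("Saved files:", saved)
```

If you also want to display the figure, call plt.savefig() before plt.show(); with an interactive window, the figure may already be closed by the time plt.show() returns, and the saved image can come out blank.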
Common Customization Options
- Colors: Use color names ('red', 'blue') or hex codes ('#FF5733')
- Line styles: Use linestyle='--' for dashed, '-.' for dash-dot
- Markers: Use marker='o' for circles, 's' for squares
- Figure size: Use figsize=(width, height) in inches
- Legends: Use plt.legend() to show what each line/bar represents
- Grid: Use plt.grid(True) to add grid lines
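Line styles, markers, a legend, and a grid can be sketched together as follows. The data reuses the two series from the Customizing Plots example; the labels and the Agg backend line are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; remove to use a window
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y1 = [2, 4, 6, 8, 10]
y2 = [1, 3, 5, 7, 9]

plt.figure(figsize=(8, 5))

# The trailing comma unpacks the single line object from the returned list
dashed, = plt.plot(x, y1, linestyle='--', marker='o', label='Dashed')
dash_dot, = plt.plot(x, y2, linestyle='-.', marker='s', label='Dash-dot')

legend = plt.legend()   # built from the label= arguments above
plt.grid(True)

print([text.get_text() for text in legend.get_texts()])
```

The legend only lists lines that were given a label, so set label= on each plot call you want to appear in it.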
Summary
In this lesson, you learned:
- Visualization: Makes data easier to understand and patterns easier to spot
- matplotlib: Python library for creating plots and charts
- Plot types: Line plots, bar charts, scatter plots, histograms
- Customization: Colors, labels, titles, markers, grid lines
- Display: Use plt.show() to display plots or plt.savefig() to save them
- Subplots: Create multiple plots in one figure
Remember
Good visualizations tell a story! Choose the right plot type for your data, always add clear labels and titles, and use colors and styles that make your data easy to understand.
End-of-Lesson Exercises
Think about these questions to reinforce what you've learned:
Exercise 1: Data Visualization
Why is data visualization important? What types of plots would you use for different situations?
Exercise 2: matplotlib Basics
What are the basic steps to create a plot with matplotlib?