Time Series Data Visualization
Matplotlib is a powerful plotting library for Python that can visualize data in many ways, including time series. Time series data is a sequence of data points recorded at successive points in time, usually at regular intervals. Examples range from historical stock prices to the average monthly temperature at a location. In this chapter, we will explore how to create compelling visuals for time series data with Matplotlib, using popularity ratings of movie characters as the example data set.
Before we start plotting, we need to import the libraries we will use: Matplotlib, pandas, and NumPy. Matplotlib's pyplot module provides a state-based plotting interface modeled on MATLAB's, and its dates module provides helpers for formatting dates on an axis. Pandas will handle and manipulate our data, and NumPy will generate the random sample values. Install them using pip if you haven't already, and then import them into your Python script:
import matplotlib.pyplot as plt
import matplotlib.dates as mpl_dates
import numpy as np
import pandas as pd
To work with dates efficiently, Python provides a built-in module called datetime. This module supplies classes for manipulating dates and times in both simple and complex ways. So, we'll import this module as well:
import datetime
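As a small illustration of what the module offers, the sketch below builds two dates, measures the gap between them, and steps forward by a week. The specific dates are chosen here only to match a December time span and are not taken from any real data set.

```python
import datetime

# Illustrative dates only, picked to match a December time span
start = datetime.date(2021, 12, 1)
end = datetime.date(2021, 12, 31)

# Subtracting one date from another gives a timedelta
span = end - start
print(span.days)  # 30

# Adding a timedelta moves a date forward
one_week_later = start + datetime.timedelta(days=7)
print(one_week_later.isoformat())  # 2021-12-08

# strftime formats a date with directives such as %b (month name) and %d (day);
# the month abbreviation depends on the active locale
print(start.strftime('%b %d'))
```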
First, let's create a hypothetical data set. Suppose we track the daily popularity of three major movie characters (Luke Skywalker, Darth Vader, and Yoda) over the month of December.
movie_characters = ['Luke Skywalker', 'Darth Vader', 'Yoda']
dates = pd.date_range(start='2021-12-01', end='2021-12-31')
data = pd.DataFrame(index=dates)
for character in movie_characters:
    # randint's upper bound is exclusive, so this draws integers from 1 to 5
    data[character] = pd.Series(np.random.randint(1, 6, len(dates)), index=dates)
In the above example, we begin by specifying the movie characters we're interested in. We then create a range of dates for the month of December using the pandas date_range function, and create a new data frame with these dates as the index. For each character, we generate a random integer popularity rating between 1 and 5 inclusive for each day in the date range.
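Because the ratings are drawn at random, the data and any plot made from it will differ on every run. If repeatable output is wanted, one option is a seeded random generator. The sketch below is an alternative construction rather than the code used in this chapter; the seed value 42 and the use of NumPy's default_rng are assumptions introduced here.

```python
import numpy as np
import pandas as pd

movie_characters = ['Luke Skywalker', 'Darth Vader', 'Yoda']
dates = pd.date_range(start='2021-12-01', end='2021-12-31')

# Seeded generator (the seed 42 is arbitrary) so reruns give identical ratings
rng = np.random.default_rng(42)

# One row per day, one column per character.
# integers() excludes the upper bound, so values fall in 1..5.
ratings = rng.integers(1, 6, size=(len(dates), len(movie_characters)))
data = pd.DataFrame(ratings, index=dates, columns=movie_characters)

print(data.shape)  # (31, 3)
print(data.head())
```

Building the whole array in one call also avoids the explicit loop over characters, at the cost of being a little less step-by-step.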
Now that we have our data, we can create a time series plot:
fig, ax = plt.subplots()
# Draw one line per character
for character in movie_characters:
    ax.plot(data.index, data[character], label=character)
# Set title and labels for the plot
ax.set_title('Popularity rating over December')
ax.set_xlabel('Date')
ax.set_ylabel('Popularity Rating')
ax.legend()
# Customize the date format (DateFormatter lives in matplotlib.dates)
date_format = mpl_dates.DateFormatter('%b, %d')
ax.xaxis.set_major_formatter(date_format)
# Show the plot
plt.show()
In the plot code, we first call plt.subplots() to create a figure and a single set of axes, ax. We iterate through the characters in our data and plot each one's ratings using the ax.plot() method. Next, we set the title of the plot and label the x and y axes using the set_title(), set_xlabel(), and set_ylabel() methods on ax. We then call the legend() method on ax, which builds a legend from the label given to each line.
To format dates on the x-axis, we use the DateFormatter class from the matplotlib.dates module and set it as the major formatter of the x-axis. Finally, we use plt.show() to display the plot. The result is a plot where the x-axis represents the date and the y-axis represents the popularity rating of each character in our list.
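With a tick for many of the 31 days, date labels can crowd each other. One possible refinement of the date formatting described above is to control where ticks appear as well as how they are labeled. The sketch below is a minimal example under stated assumptions: the single placeholder series, the seed, and the 7-day tick interval are choices made here for illustration, not part of the original example.

```python
import matplotlib.dates as mpl_dates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates = pd.date_range(start='2021-12-01', end='2021-12-31')
rng = np.random.default_rng(0)  # arbitrary seed for a repeatable placeholder series
ratings = pd.Series(rng.integers(1, 6, len(dates)), index=dates)

fig, ax = plt.subplots()
ax.plot(ratings.index, ratings.values, label='Placeholder character')
ax.legend()

# Place a major tick every 7 days instead of letting Matplotlib choose
ax.xaxis.set_major_locator(mpl_dates.DayLocator(interval=7))
# Label each tick with an abbreviated month name and the day of the month
ax.xaxis.set_major_formatter(mpl_dates.DateFormatter('%b %d'))

# Rotate and right-align the date labels so they do not overlap
fig.autofmt_xdate()

# plt.show() would display the figure, as in the example above
```

The locator decides where ticks go and the formatter decides what text they carry, so the two can be changed independently.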
In addition to line plots, Matplotlib can also create bar plots. Suppose you want to compare the total popularity rating of each character over December. Here is how:
fig, ax = plt.subplots()
ax.bar(movie_characters, data.sum())
ax.set_title('Total Popularity rating in December')
ax.set_xlabel('Character')
ax.set_ylabel('Total Popularity Rating')
plt.show()
In this bar plot, we once again create the figure and axes using subplots(). Here, ax.bar() is used to create the bar chart. The first argument is the list of categories (our movie characters), and the second is an aggregated version of our data set: data.sum() adds up each character's ratings across the entire time span. The remainder of the code follows the same pattern as above for labeling the chart and displaying it.
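To make the aggregation step concrete, the sketch below applies sum() to a tiny hand-written frame so the totals can be checked by eye. The three days of ratings are made up for illustration, and the sorting step is an optional extra that the example above does not use.

```python
import pandas as pd

# Tiny hand-written frame (values are made up) standing in for the ratings data
dates = pd.date_range(start='2021-12-01', periods=3)
data = pd.DataFrame(
    {
        'Luke Skywalker': [1, 2, 3],
        'Darth Vader': [5, 5, 4],
        'Yoda': [2, 2, 3],
    },
    index=dates,
)

# sum() adds down each column, collapsing the date index into
# a Series with one total per character
totals = data.sum()
print(totals['Luke Skywalker'], totals['Darth Vader'], totals['Yoda'])  # 6 14 7

# Optional: sort so the bars would appear from highest to lowest total
ranked = totals.sort_values(ascending=False)
print(list(ranked.index))  # ['Darth Vader', 'Yoda', 'Luke Skywalker']
```

Since the result is indexed by character name, passing ranked.index and ranked.values to ax.bar() keeps each label paired with its own total.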
This tutorial should provide you with a solid starting point for visualizing time series data using Matplotlib. There are many additional customizations and plot types available in Matplotlib, allowing you to create visuals that best suit your specific data and analysis needs. It's now time for you to start exploring what else you can do with your data using this powerful tool. Happy plotting!