Visualizing the trend of data

Now that we have imported the two datasets, we can move on to visualizing them. Let's begin by plotting the world population trend from 1950 to 2017. To select rows based on the value of a column, we can use the following syntax: df[df.variable_name == "target"] or df[df['variable_name'] == "target"], where df is the dataframe object. Other comparison operators, such as greater than > or less than <, are also supported. Multiple conditions can be combined using the "and" operator & or the "or" operator |. Each condition must be wrapped in its own parentheses, because & and | take precedence over comparison operators in Python.
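As a minimal sketch of these selection rules, the snippet below builds a tiny made-up dataframe and filters it with single and combined conditions. The column names mirror those used in this section, but the rows and numbers are invented purely for illustration and are not real population figures.

```python
import pandas as pd

# Toy data for illustration only; the values are invented.
df = pd.DataFrame({
    "Location": ["WORLD", "WORLD", "Asia", "Asia"],
    "Sex": ["Both", "Male", "Both", "Both"],
    "Time": [1950, 1950, 1950, 2020],
    "Value": [100.0, 51.0, 55.0, 90.0],
})

# Single condition: attribute access and bracket access select the same rows.
world_attr = df[df.Location == "WORLD"]
world_item = df[df["Location"] == "WORLD"]

# "and": rows for Asia up to and including 2017.
# Each condition sits in its own parentheses.
asia_to_2017 = df[(df.Location == "Asia") & (df.Time <= 2017)]

# "or": rows that are either male-only or later than 2017.
male_or_late = df[(df.Sex == "Male") | (df.Time > 2017)]

print(asia_to_2017)
print(male_or_late)
```

Bracket access (df["Location"]) is the safer choice when a column name contains spaces or clashes with an existing dataframe attribute.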

To aggregate the population across all age groups within a year, we are going to rely on df.groupby().sum(), as shown in the following example:

import matplotlib.pyplot as plt


# Select the world population data for both sexes combined,
# from 1950 to 2017.
selected_data = data[(data.Location == 'WORLD') & (data.Sex == 'Both') & (data.Time <= 2017)]

# Sum the population across all age groups for each year.
# Set as_index=False to keep Time as a regular column rather than the index.
# Only the Value column is summed, so text columns are left out of the result.
grouped_data = selected_data.groupby('Time', as_index=False)['Value'].sum()

# Generate a simple line plot of population vs time
fig = plt.figure()
plt.plot(grouped_data.Time, grouped_data.Value)

# Label the axes
plt.xlabel('Year')
plt.ylabel('Population (thousands)')

plt.show()
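
To see what as_index=False changes, here is a small self-contained sketch on invented data. The AgeGrp column name and all of the values are assumptions made for this illustration. They are not taken from the dataset used above.

```python
import pandas as pd

# Toy data: two hypothetical age groups per year, with invented values.
toy = pd.DataFrame({
    "Time": [1950, 1950, 1951, 1951],
    "AgeGrp": ["0-4", "5-9", "0-4", "5-9"],
    "Value": [10.0, 8.0, 11.0, 9.0],
})

# as_index=False: the result is a dataframe in which Time is an ordinary column.
flat = toy.groupby("Time", as_index=False)["Value"].sum()

# Default behavior: Time becomes the index of the resulting series.
indexed = toy.groupby("Time")["Value"].sum()

print(flat)
print(indexed)
```

Keeping Time as a regular column lets us pass grouped_data.Time straight to plt.plot(); with the default behavior, we would read the years from the index instead.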