Exploring Global Population Trends

A Python analysis task that uses a population dataset to answer business-style questions through filtering, aggregation and visualisation.

Context

This project was completed as part of the Generation UK Data Analyst programme, using a global population dataset. The objective was to explore demographic data across countries and continents and answer structured analytical questions using Python.

Aim

The aim of this project was to use Python to uncover regional population trends, identify outliers, and measure growth patterns over time to support demographic comparisons.

Approach

🧹Data Preparation

Imported CSV data into Pandas DataFrames
Inspected dataset structure and column types
Filtered data by year, continent, and population thresholds
Identified missing or zero-value population records
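The preparation steps above can be sketched roughly as follows. This is a minimal illustration rather than the exact project code: the inline CSV text and its values are made up, and only the column names (country name, continent, year, population) come from the snippets further down this page.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV file; the values are made up.
csv_text = """country name,continent,year,population
Alpha,Africa,2000,0
Alpha,Africa,2010,120
Beta,Africa,2000,300
Beta,Africa,2010,
Gamma,Europe,2000,80
Gamma,Europe,2010,75
"""

# pd.read_csv accepts a file path or any file-like object
population = pd.read_csv(io.StringIO(csv_text))

# Inspect the dataset structure and column types
print(population.shape)
print(population.dtypes)

# Filter by year, continent and a population threshold (100 is an arbitrary example)
africa_2010_large = population[
    (population['year'] == 2010)
    & (population['continent'] == 'Africa')
    & (population['population'] > 100)
]
print(africa_2010_large)

# Flag records that are either missing (NaN) or recorded as zero
missing_or_zero = population[
    population['population'].isna() | (population['population'] == 0)
]
print(missing_or_zero)
```

Checking for NaN as well as zero matters because an empty cell in a CSV is read as NaN, which a plain `== 0` comparison would not catch.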

📊Data Analysis

Calculated total population values using aggregate functions
Computed average population values for regional comparisons
Created metrics such as population growth between years
Used conditional logic to classify countries above or below regional averages
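As an illustration of the analysis steps above, the sketch below totals and averages population by continent and then labels each country relative to its regional average. The data is hypothetical, and using `groupby(...).transform` with `np.where` is one possible way to express the conditional logic, not necessarily the approach taken in the project.

```python
import numpy as np
import pandas as pd

# Hypothetical values for illustration only
population = pd.DataFrame({
    'country name': ['A', 'B', 'C', 'A', 'B', 'C'],
    'continent': ['South America'] * 6,
    'year': [2000, 2000, 2000, 2010, 2010, 2010],
    'population': [100, 200, 600, 110, 190, 900],
})

# Work on a copy so that adding a column does not touch the original frame
pop_2010 = population[population['year'] == 2010].copy()

# Total and average population per continent
summary = pop_2010.groupby('continent')['population'].agg(['sum', 'mean'])
print(summary)

# Broadcast each continent's mean back onto its own rows
regional_mean = pop_2010.groupby('continent')['population'].transform('mean')

# Label each country; values equal to the mean fall into 'below' here
pop_2010['vs_average'] = np.where(
    pop_2010['population'] > regional_mean, 'above', 'below'
)
print(pop_2010[['country name', 'population', 'vs_average']])
```

`transform('mean')` returns one value per row instead of one per group, which lets the comparison run without a separate merge step.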

📈Visualisation

Created horizontal bar charts to compare population distribution across many countries
Used scatter plots to highlight population outliers
Selected visual formats based on dataset size and comparison goals
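One way to highlight outliers on a scatter plot is to split the data at a threshold and draw the two groups in different colours. The sketch below assumes made-up 2007 values and reuses the threshold of 1000 mentioned in Figure 1; the labels, colours and variable names are illustrative choices.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical 2007 values for illustration only
pop_2007 = pd.DataFrame({
    'country name': ['A', 'B', 'C', 'D', 'E'],
    'population': [120, 45, 1300, 300, 2500],
})

threshold = 1000
is_outlier = pop_2007['population'] > threshold
outliers = pop_2007[is_outlier]
typical = pop_2007[~is_outlier]

fig, ax = plt.subplots()

# Draw the two groups separately so each gets its own colour and legend entry
ax.scatter(typical.index, typical['population'], label='At or below threshold')
ax.scatter(outliers.index, outliers['population'], color='red', label='Above threshold')

# Label only the outliers to keep the chart readable
for idx, row in outliers.iterrows():
    ax.annotate(row['country name'], (idx, row['population']))

# Dashed reference line at the threshold
ax.axhline(threshold, linestyle='--', linewidth=1)
ax.set_xlabel('Country (row index)')
ax.set_ylabel('Population')
ax.set_title('Country population in 2007')
ax.legend()
```

In a notebook the figure displays automatically; in a script, `fig.savefig(...)` writes it to a file, and removing the `matplotlib.use('Agg')` line allows `plt.show()` to open a window.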

Visualisations

The following visualisations were created to support key analytical questions and highlight patterns and outliers within the population data.

Figure 1: Country population in 2007, highlighting countries with populations above 1000.

Figure 2: Population distribution across Africa in 2010, highlighting substantial variation in population size.

Key Insights

Several countries recorded population values of zero in 2000, suggesting missing or incomplete data
Africa's population in 2010 was unevenly distributed, with a small number of countries accounting for a large share
Population levels across South America varied, with countries falling both above and below the regional average
Only a small number of countries exceeded a population of 1000 in 2007
Europe experienced an overall decline in population growth between 2000 and 2010

Syntax Examples

Selected Python code snippets demonstrating data filtering, aggregation and time-based analysis.

Python Syntax Q1

Filtering and data quality check

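# Keep only the rows for the year 2000 (`population` is the DataFrame loaded from the CSV)
pop_2000 = population[population['year'] == 2000]

# A population of zero is treated as a missing or incomplete record
no_data = pop_2000[pop_2000['population'] == 0]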

countries_no_data = no_data[['country name', 'continent']].drop_duplicates()

print(countries_no_data)

Python Syntax Q2

Aggregation and visualisation

import matplotlib.pyplot as plt

# Filter once to African countries in 2010; reused for the total and the chart
africa_2010 = population[(population['year'] == 2010) & (population['continent'] == 'Africa')]

total_pop_2010 = africa_2010['population'].sum()

print(total_pop_2010)

# Keep the ten most populous countries for the chart
top_10_africa = africa_2010.sort_values('population', ascending=False).head(10)

countries = top_10_africa['country name']

populations = top_10_africa['population']

plt.barh(countries, populations)

plt.xlabel('Population')

plt.ylabel('Country')

plt.title('Population across Africa in 2010')

plt.show()
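To put a number on how unevenly a continent's population is distributed, each country's share of the total can be computed alongside a running cumulative share. The sketch below uses made-up figures, and the column names `share_pct` and `cumulative_share_pct` are hypothetical additions that do not appear in the original analysis.

```python
import pandas as pd

# Hypothetical 2010 values for illustration only
africa_2010 = pd.DataFrame({
    'country name': ['A', 'B', 'C', 'D', 'E'],
    'population': [500, 300, 100, 60, 40],
})

total = africa_2010['population'].sum()

# Rank countries from largest to smallest population
ranked = africa_2010.sort_values('population', ascending=False).reset_index(drop=True)

# Each country's percentage share of the continental total
ranked['share_pct'] = ranked['population'] / total * 100

# Running total of the shares, largest country first
ranked['cumulative_share_pct'] = ranked['share_pct'].cumsum()
print(ranked)

# Combined share held by the largest two countries (two is an arbitrary choice)
top_n = 2
top_share = ranked['share_pct'].head(top_n).sum()
print(top_share)
```

A cumulative share that climbs steeply over the first few rows indicates that a small number of countries account for most of the total.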

Python Syntax Q5

Time-based analysis using pivot tables

europe_pop = population[population['continent'] == 'Europe']

europe_pop_2000 = europe_pop[europe_pop['year'] == 2000]['population'].sum()

europe_pop_2010 = europe_pop[europe_pop['year'] == 2010]['population'].sum()

growth = europe_pop_2010 - europe_pop_2000

growth_percent = growth/europe_pop_2000 * 100

print(growth_percent)

pivot = europe_pop.pivot(index = 'country name', columns = 'year', values = 'population')

pivot['growth'] = pivot[2010] - pivot[2000]

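# Ascending sort: the first five rows are the countries with the smallest (most negative) growth
lowest_growth = pivot.sort_values(by='growth').head(5)

print(lowest_growth)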
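The same pivot can be extended to percentage growth per country. Because some countries recorded a population of zero in 2000, dividing by that baseline needs care. The sketch below is one possible approach on made-up data: it swaps `pivot` for `pivot_table`, and it treats a zero baseline as missing so the result is NaN rather than infinity. The `growth_percent` column name is a hypothetical addition.

```python
import numpy as np
import pandas as pd

# Hypothetical values for illustration only
europe_pop = pd.DataFrame({
    'country name': ['A', 'A', 'B', 'B', 'C', 'C'],
    'year': [2000, 2010, 2000, 2010, 2000, 2010],
    'population': [100, 90, 0, 50, 200, 220],
})

# pivot_table aggregates duplicate country/year pairs instead of raising an error
pivot = europe_pop.pivot_table(
    index='country name', columns='year', values='population', aggfunc='sum'
)

# Treat a zero 2000 baseline as missing so the division yields NaN, not inf
baseline = pivot[2000].replace(0, np.nan)
pivot['growth_percent'] = (pivot[2010] - pivot[2000]) / baseline * 100

# Ascending sort puts the largest declines first; NaN rows are placed last
largest_declines = pivot.sort_values('growth_percent').head(2)
print(largest_declines)
```

`DataFrame.pivot` raises a `ValueError` when a country and year pair appears more than once, so `pivot_table` with an explicit `aggfunc` is the more forgiving choice when the data has not been deduplicated.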

Reflection

This project strengthened my ability to use Python, Pandas, and Matplotlib to explore large datasets. While I am still learning Python, the task showed that I can translate analytical questions into code and communicate findings clearly through structured analysis and visuals.
