Global Population Trends
Answering analytical questions using SQL queries on a relational population database.
Answering analytical questions using SQL queries on a relational population database.
This project was completed early in my data analyst training as part of the Generation UK programme. A relational database was provided, and the task involved writing SQL queries to answer a series of structured analytical questions within a limited timeframe.
To develop confidence using SQL to retrieve, combine, and summarise data from multiple tables in order to answer defined business style questions.
Explored the provided tables to understand their structure and relationships
Identified key fields needed for analysis, such as country, continent, year, and population
Wrote SQL queries using:
SELECT, WHERE, ORDER BY
JOIN clauses to combine country and population tables
Aggregations such as COUNT(), SUM() AND AVG()
GROUP BY to summarise data by region or country
Conditional logic using CASE for year comparisons
Different regions showed large differences in both the number of countries and total population.
Excluding missing or zero values was necessary to avoid misleading conclusions.
Combining geographical and demographic tables allowed for more accurate insights.
Comparing multiple years showed that growth patterns were not consistent across all regions.
Techniques demonstrated:
SQL aggregation (COUNT, SUM, AVG, JOINs)
Filtering
Grouping
Calculated metrics
Here are the SQL queries that were used in this project:
SELECT
COUNT(continent) AS Africa_Count
FROM countries
WHERE continent = 'Africa';
SELECT
SUM(population) AS Africa_Population_2010
FROM population_years AS p
LEFT JOIN countries AS c
ON p.country_id = c.id
WHERE c.continent = 'Africa'
AND p.year = '2010';
SELECT
AVG(p.population) AS Avg_Population
FROM population_years AS p
LEFT JOIN countries AS c
ON p.country_id = c.id
WHERE c.continent = 'South America'
AND p.year = '2000'
GROUP BY c.continent;
SELECT
c.name as Country,
p.population
FROM population_years AS p
LEFT JOIN countries AS c
ON p.country_id = c.id
WHERE population > '0'
AND p.year = '2007'
ORDER BY population ASC;
SELECT
SUM(CASE WHEN year = 2000 THEN population ELSE 0 END) AS sum_population_2000,
SUM(CASE WHEN year = 2010 THEN population ELSE 0 END) AS sum_population_2010
FROM population_years AS p
LEFT JOIN countries AS c
ON p.country_id = c.id
WHERE c.continent = 'Europe';
SELECT
ROUND(((714-706)/714.0) * 100,2) AS percentage_decrease
This project strengthened my foundational SQL skills and improved my ability to extract and summarise information from a relational database. I gained practical experience working through structured analytical questions using joins and aggregate functions.
As my first SQL project, this task was a valuable learning experience. It improved my understanding of how relational tables connect and reinforced the importance of validating data before concluding.