Leland Data Consulting
  • Home
  • About
  • Services
  • Portfolio
  • Contact

WHO Mortality Dashboard

Tableau Showcase: From Fixed Expressions to Dynamic Titles

image

Project description: With Tableau being all the rage in data informatics these days, I can’t hide behind my mastery of ggplot and matplotlib. As wonderful as those tools are for bespoke data visualization, 9/10 times in industry you just need some nice charts for a meeting with surface level user interaction. As such, I’ve joined the masses and, in addition to the arcane skills of bespoke script based graphics, I’m now proficient in Tableau. I found a fascinating data set from the World Health Organization (WHO) that consists of all the major causes of death across the globe for six different years between 2000 and 2021 and managed to make a dashboard that integrates some really cool functionalities of Tableau without being too overwhelming.

Link to the Tableau Dashboard

The Data

image

All the data used in this project can be found here. The WHO provides six excel workbooks of tracked causes of death for each country they monitor. Each workbook corresponds to a different year and the years available are 2000, 2010, 2015, 2019, 2020, 2021. Each workbook is divided into 9 sheets with the first being a notes page, the second a collection of all causes of deaths across all ages, and the rest are causes of death broken down by age categories. As I was interested in the top causes of death per country I used the second page of each workbook as my data source.

Each second page was a massive, 661x192, and formatted to be human readable with a lot of white space bullet point style organization. Additionally, these pages used conditional formatting to highlight countries with differing levels of statistical verification of their reported numbers. Green was high completeness/quality and transitioned through yellow to orange then finally red as the data became less complete/verified. As such, I chose to focus on the verified countries as any interesting visual trends seen in the final Tableau dashboard would have more statistical weight and meaning behind them.

R Based Data Prep

I use Tableau Public which doesn’t come with Tableau Prep, the GUI data manipulation tool. As such, all of my pre-processing was done in R. This required importing each excel workbook into R, grabbing the second sheet with the data we want, extracting only the relevant data, and doing a bunch of formatting to change the human readable spreadsheet into a CSV that Tableau accepted.

library(tidyverse)  # For data manipulation functions (dplyr, tidyr, etc.)
library(readxl)     # For reading Excel files

# Function to read the "All ages" sheet from the given Excel file and drop unnecessary rows
read_sheet <- function(file_name) {
  sheet <- read_excel(file_name, sheet = "All ages")[-1:-5,]
  return(sheet)
}

# Function to clean the "All ages" sheet data
clean_sheet <- function(sheet) {
  
  all_ages_sheet <- sheet
  
  # Filter out rows where the first column is either "Persons", NA, or "Sex"
  # Eliminates summary, sex divided, and empty rows
  all_ages_sheet <- all_ages_sheet %>%
    filter(all_ages_sheet[[1]] == "Persons" | is.na(all_ages_sheet[[1]]) | all_ages_sheet[[1]] == "Sex")
  
  # Remove the first 5 bullet point formatted info columns
  all_ages_sheet <- all_ages_sheet[,-1:-5]
  
  # Grab non-sub divided cause of death info and add in placeholder data 
  # This avoids dropping the column in a later step
  causes <- all_ages_sheet[,1]
  causes[1:7,] <- "Placeholder"
  
  # Keep columns that represent verified country data based on specific criteria
  # Hidden conditional formatting numbers in the row under 3 letter country codes
  verified_country_data <- all_ages_sheet %>% select(where(~ !is.na(.[3]) && .[3] < 2))
  
  # Combine the causes of death with the verified country data
  verified_N_causes <- cbind(causes, verified_country_data)
  
  # Remove rows that contain patterns like lowercase letters followed by a period
  # Also remove rows with NA in the 6th column
  df_initial <- verified_N_causes[!grepl("^[a-z]\\.$", verified_N_causes$...6) & !is.na(verified_N_causes$...6), ]
  
  # Remove the first 4 rows
  df_cleaned <- df_initial[-c(1:4), ]
  
  # Set the first row as column names and remove the first 3 rows of data
  colnames(df_cleaned) <- c("Cause of Death", as.character(df_initial[1, -1]))
  df_cleaned <- df_cleaned[-1:-3,]
  
  return(df_cleaned)  # Return the cleaned dataset
}

# Function to build a dataset for a specific year
build_year_df <- function(df_cleaned, Year) {
  
  # Extract the list of countries from the cleaned data, skipping the "Cause of Death" column
  countries <- colnames(df_cleaned)[-1:-2]  # Ignore the first two columns
  
  # Initialize an empty list and dataframe to store the top causes of death for each country
  top_causes_per_country <- list()
  test <- data.frame()
  
  # For each country, find the top 5 causes of death
  for (country in countries) {
    # Sort the data by the death numbers for the country in descending order
    df_sorted <- df_cleaned %>%
      arrange(desc(as.numeric(.data[[country]]))) %>%
      select("Cause of Death", all_of(country)) %>%
      head(5)  # Select the top 5 causes of death for the country
    
    # Add the country name and year to the sorted data
    df_sorted$Country <- country
    df_sorted$Year <- Year
    
    # Rename the column containing the death numbers to "Numbers"
    names(df_sorted)[names(df_sorted) == country] <- "Numbers"
    
    # Append the sorted data to the result dataframe
    test <- rbind(test, df_sorted)
  }
  
  # Convert the results to a dataframe and return it
  top_causes_df <- as.data.frame(test)
}

# Function to extract the second year from a filename string (if it contains multiple years)
get_second_year <- function(string) {
  # Find all four-digit numbers (representing years) in the filename
  matches <- regmatches(string, gregexpr("\\d{4}", string))
  
  # Return the second year if it exists, otherwise return NA
  if (length(matches[[1]]) >= 2) {
    return(matches[[1]][2])
  } else {
    return(NA)
  }
}

# Set the working directory to the folder containing the Excel files
path <- "D:/Data_Projects/Tableau/WHO_Death/Data/Raw_Year_xl"
setwd(path)

# Get a list of all files in the directory
rlist <- list.files(".")

# Apply the get_second_year function to each file to extract the year information
Years <- sapply(rlist, get_second_year)

# Initialize an empty dataframe to store all the data from multiple years
all_data <- data.frame()

# Initialize a counter to track the current file
count <- 1

# Loop through each year and process the corresponding Excel file
for (year in Years) {
  # Read, clean, and build the dataset for the current year
  df_year <- build_year_df(clean_sheet(read_sheet(rlist[count])), year)
  
  # Append the cleaned and processed data to the main dataframe
  all_data <- rbind(all_data, df_year)
  
  # Increment the counter to move to the next file
  count <- count + 1
}

# Write the combined data for all years and countries to a CSV file
write.csv(all_data, file = "Top_5_All_Years.csv")

The final CSV looked like this:

image

Tableau Time

As we have data separated by space, time, and category, two types of charts immediately come to mind. Firstly, a world map to cover the spatial dimension and secondly a stacked bar chart of causes of death per year which takes care of time and category. I wanted the user to be able to select multiple countries and be able to visually understand trends in major causes of death as well as being able to select causes of death to be highlighted across the data. This posed the perfect challenge as it requires some tinkering across multiple chart and data types as well as actions, expressions, filters, parameters, and calculated fields. Thus, a relatively simple dashboard can have some interesting behaviors that users might not often get to see.

The Map

image

Tableau is awesome for auto-generating simple map based charts. Simply butting the country column from the CSV data source into the details section created a monocolor map. I wanted to add color in a meaningful way without overwhelming the dashboard user. To do so I decided that instead of using some kind of multi color scheme to represent multiple causes of death per country over the six years, I would highlight each country via its top cause of death for a year selected by the user. Making a Year Selected parameter that the user could change was easy but finding the top cause of death for each year for each country was a little tricky. Tableau uses its bespoke Tableau Calculation Language (TCL) for internal data wrangling and it definitely has some quirks. As such I usually put in a couple layers of redundancy to catch errors before they happen. First, I created a calculated field that grabbed the cause of death numbers for a chosen value of the Selected Year parameter:

image

Next I used the following expression:

image

The first line iterates, grabbing unique combinations of Country, COD, and Selected Year and returns their associated number of death values. The second line is basically the same thing but finds the maximum total number of deaths for each country across all causes of death. It does this by first calculating the sum of deaths for each combination of Country and Cause of Death (like the first expression) and then taking the maximum of those sums for each country. By comparing these two expressions we return true when a unique combination of Country, COD, and Selected Year has a sum equal to the maximum sum for that country. This gives us the top cause of death per country for a given year.

From there, I used this boolean to drive map coloring and tooltips, so clicking a country both filters the bar chart and highlights that country’s top COD.

The Stacked Bar Chart

The second major view is a stacked bar chart of causes of death by year. It uses the same Top_5_All_Years.csv data source, filtered down to the top causes per country:

  • Bars encode the total number of deaths.
  • Colors encode the cause of death.
  • Filters let the user focus on selected countries and causes.

With parameter actions and highlight actions, clicking a bar segment highlights that cause across all selected countries and years, giving a quick sense of how a particular COD behaves globally.

UX and Parameter‑Driven Titles

A small but important UX touch is the dynamic guidance in chart titles. When nothing is selected, titles instruct the user to click a country or bar. Once a country is selected, the titles update to show the selection. Clearing the selection restores the helper text instead of leaving blank or confusing views.

To do this, I used a Country Selected parameter and a worksheet action that resets the parameter when the selection is cleared. The dashboard action looks like:

image

It’s a small detail, but it makes the dashboard much more approachable. Tableau makes “good enough” visuals easy; polishing the UX requires a bit more effort and an understanding of how its calculation language and actions work together.

For me, this project was as much about sharpening my Tableau skills (parameters, actions, and TCL quirks) as it was about the mortality data itself.

Link to the Tableau Dashboard

 
  • © 2025 LELAND DATA CONSULTING. ALL RIGHTS RESERVED.