This data was obtained from the Office for National Statistics (ONS) dataset Alcohol-specific deaths in the UK. The dataset contains several tables; I chose Table 2, which gives age-standardised alcohol-specific death rates per 100,000 population by sex in England from 2001 to 2022. The data is collected from information supplied when deaths are certified and registered as part of civil registration. In the table, “Persons: Number of deaths” is the total of male and female deaths. The rows for the ten most recent years are shown below.
Area.code…note.2. | Area.of.usual.residence…note.2. | Year.of.death.registration…note.3. | Persons..Number.of.deaths | Persons..Rate.per.100.000…note.4. | Persons..LCL…note.6. | Persons..UCL…note.6. | Males..Number.of.deaths | Males..Rate.per.100.000…note.4. | Males..LCL…note.6. | Males..UCL…note.6. | Females..Number.of.deaths | Females..Rate.per.100.000…note.4. | Females..LCL…note.6. | Females..UCL…note.6. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
E92000001 | England | 2022 | 7,912 | 14.5 | 14.1 | 14.8 | 5,184 | 19.5 | 19.0 | 20.0 | 2,728 | 9.7 | 9.3 | 10.1 |
E92000001 | England | 2021 | 7,558 | 13.9 | 13.6 | 14.2 | 4,949 | 18.7 | 18.2 | 19.2 | 2,609 | 9.3 | 9.0 | 9.7 |
E92000001 | England | 2020 | 6,984 | 13.0 | 12.7 | 13.3 | 4,594 | 17.5 | 17.0 | 18.0 | 2,390 | 8.6 | 8.3 | 9.0 |
E92000001 | England | 2019 | 5,820 | 10.8 | 10.6 | 11.1 | 3,904 | 15.0 | 14.5 | 15.4 | 1,916 | 7.0 | 6.7 | 7.3 |
E92000001 | England | 2018 | 5,698 | 10.7 | 10.4 | 11.0 | 3,830 | 14.8 | 14.3 | 15.2 | 1,868 | 6.9 | 6.6 | 7.2 |
E92000001 | England | 2017 | 5,843 | 11.1 | 10.8 | 11.4 | 3,853 | 15.0 | 14.5 | 15.5 | 1,990 | 7.4 | 7.0 | 7.7 |
E92000001 | England | 2016 | 5,507 | 10.5 | 10.2 | 10.8 | 3,687 | 14.4 | 14.0 | 14.9 | 1,820 | 6.8 | 6.5 | 7.1 |
E92000001 | England | 2015 | 5,306 | 10.3 | 10.0 | 10.5 | 3,510 | 13.9 | 13.5 | 14.4 | 1,796 | 6.8 | 6.5 | 7.1 |
E92000001 | England | 2014 | 5,386 | 10.5 | 10.2 | 10.8 | 3,583 | 14.3 | 13.9 | 14.8 | 1,803 | 6.9 | 6.6 | 7.2 |
E92000001 | England | 2013 | 5,184 | 10.2 | 9.9 | 10.5 | 3,485 | 14.1 | 13.6 | 14.5 | 1,699 | 6.6 | 6.2 | 6.9 |
My visualisation focuses on whether alcohol-specific deaths in England have increased, and on whether any increase coincides with the COVID-19 pandemic.
To prepare the data, I cleaned it and removed the columns I was not interested in, keeping only the year and the number of deaths for persons, males and females.
Year | PersonsDeaths | MaleDeaths | FemaleDeaths |
---|---|---|---|
2022 | 7912 | 5184 | 2728 |
2021 | 7558 | 4949 | 2609 |
2020 | 6984 | 4594 | 2390 |
2019 | 5820 | 3904 | 1916 |
2018 | 5698 | 3830 | 1868 |
2017 | 5843 | 3853 | 1990 |
2016 | 5507 | 3687 | 1820 |
2015 | 5306 | 3510 | 1796 |
2014 | 5386 | 3583 | 1803 |
2013 | 5184 | 3485 | 1699 |
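The cleaning step described above is not shown in the report, so the following is only a minimal sketch of how it could be done in base R. The data frame `raw`, its simplified column names, and the helper `to_count()` are assumptions made for illustration; the real ONS column names carry extra note suffixes, and only three of the rows shown above are reproduced here.

```r
# Hypothetical stand-in for the raw ONS table (three of the rows shown above).
# Column names are simplified; the real ones include "...note.N." suffixes.
raw <- data.frame(
  Year.of.death.registration = c(2022, 2021, 2020),
  Persons..Number.of.deaths  = c("7,912", "7,558", "6,984"),
  Males..Number.of.deaths    = c("5,184", "4,949", "4,594"),
  Females..Number.of.deaths  = c("2,728", "2,609", "2,390"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: strip thousands separators and convert to integer
to_count <- function(x) as.integer(gsub(",", "", x, fixed = TRUE))

# Keep only the year and the three death counts, with shorter names
data <- data.frame(
  Year          = as.integer(raw$Year.of.death.registration),
  PersonsDeaths = to_count(raw$Persons..Number.of.deaths),
  MaleDeaths    = to_count(raw$Males..Number.of.deaths),
  FemaleDeaths  = to_count(raw$Females..Number.of.deaths)
)

print(data)
```

Converting the counts to numbers matters because values such as "7,912" are read as text, and text columns cannot be placed on a continuous plot axis.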
For the first visualisation, I wanted to show clearly how the number of alcohol-specific deaths changed over the years 2002-2022, using a line graph.
# Load ggplot2 for plotting
library(ggplot2)

# Create the line graph
ggplot(data, aes(x = Year)) +
  # Line for the trend of deaths with a label in the legend
  # (linewidth replaces size for lines in ggplot2 3.4.0 and later)
  geom_line(aes(y = PersonsDeaths, color = "Trend Line"), linewidth = 1.5) +
  # Points for individual data values with a label in the legend
  geom_point(aes(y = PersonsDeaths, color = "Data Points"), size = 4) +
  # Custom labels for the axes and title (the data covers England only)
  labs(
    title = "Alcohol-Specific Deaths in England (2002-2022)",
    x = "Year",
    y = "Number of Deaths",
    color = "Legend" # Title of the legend
  ) +
  # Show every year on the x-axis
  scale_x_continuous(breaks = seq(2002, 2022, by = 1)) +
  # Y-axis breaks in increments of 500; values outside the limits are dropped
  scale_y_continuous(
    limits = c(4000, 8000),
    breaks = seq(4000, 8000, by = 500)
  ) +
  # Colours for the line and the points
  scale_color_manual(
    values = c("Trend Line" = "darkgreen", "Data Points" = "pink")
  ) +
  theme_minimal() +
  theme(
    panel.grid.major = element_line(color = "grey", linetype = "dashed"), # Dashed grey gridlines
    panel.grid.minor = element_blank(), # Remove minor gridlines
    plot.title = element_text(size = 18, face = "bold", hjust = 0.5, color = "darkgreen"), # Dark green title
    axis.title.x = element_text(size = 14, face = "bold", color = "darkgreen"), # Dark green x-axis label
    axis.title.y = element_text(size = 14, face = "bold", color = "darkgreen"), # Dark green y-axis label
    axis.text.x = element_text(size = 12, color = "darkgreen"), # Dark green x-axis ticks
    axis.text.y = element_text(size = 12, color = "darkgreen"), # Dark green y-axis ticks
    legend.position = "bottom", # Move the legend to the bottom
    legend.title = element_text(size = 12, face = "bold"), # Style the legend title
    legend.text = element_text(size = 10) # Style the legend text
  )
This visualisation shows a gradual increase over the past 20 years, using a trend line together with a point for each year's value to aid clarity.
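To put numbers on the trend in the line graph, the year-on-year change can be computed directly from the counts. This is a minimal sketch using only the ten years shown in the prepared table above (2013-2022); the data frame names `counts` and `changes` are chosen here for illustration and are not part of the original analysis.

```r
# Total deaths (persons) for 2013-2022, copied from the prepared table above
counts <- data.frame(
  Year          = 2013:2022,
  PersonsDeaths = c(5184, 5386, 5306, 5507, 5843, 5698, 5820, 6984, 7558, 7912)
)

# Change compared with the previous year, in deaths and as a percentage
previous <- head(counts$PersonsDeaths, -1)
changes <- data.frame(
  Year      = counts$Year[-1],
  Change    = diff(counts$PersonsDeaths),
  PctChange = round(100 * diff(counts$PersonsDeaths) / previous, 1)
)

print(changes)

# Year with the largest single-year rise
largest <- changes[which.max(changes$Change), ]
print(largest)
```

Within these ten years, the largest single-year rise is the one registered in 2020, which is the jump visible at the right-hand end of the line graph.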
To show the difference even more starkly, I decided to create a bar chart comparing the number of alcohol-specific deaths in 2002 and in 2022.
# Simple bar chart to show the difference in total deaths from alcohol in 2002 and 2022
library(ggplot2)

# Filter the data for the years 2002 and 2022
bar_data <- data[data$Year %in% c(2002, 2022), ]

# Create the bar chart (geom_col() draws bars at the heights given by y)
ggplot(bar_data, aes(x = factor(Year), y = PersonsDeaths, fill = factor(Year))) +
  geom_col(width = 0.6, show.legend = FALSE) +
  labs(
    title = "Comparison of Alcohol-Specific Deaths (2002 vs. 2022)",
    x = "Year",
    y = "Number of Deaths"
  ) +
  # Two shades of purple, one per year
  scale_fill_manual(values = c("#9B59B6", "#C8A2C8")) +
  theme_minimal() +
  theme(
    panel.grid.major = element_blank(), # Remove major gridlines
    panel.grid.minor = element_blank(), # Remove minor gridlines
    plot.title = element_text(size = 18, face = "bold", hjust = 0.5, color = "#8E44AD"), # Purple title
    axis.title.x = element_text(size = 14, face = "bold", color = "#36013F"), # Dark purple x-axis label
    axis.title.y = element_text(size = 14, face = "bold", color = "#36013F"), # Dark purple y-axis label
    axis.text.x = element_text(size = 12, color = "#36013F"), # Dark purple x-axis ticks
    axis.text.y = element_text(size = 12, color = "#36013F") # Dark purple y-axis ticks
  )
As the previous visualisations show, there has been a major increase, and a sharp rise coincides with the pandemic and the introduction of lockdowns: deaths rose from 5,820 in 2019 to 6,984 in 2020. I decided to compare the mean for 2020-2022 with the mean for 2017-2019, to show visually how much alcohol-specific deaths changed over a short period of time.
# Load dplyr for the pipe and the data-manipulation verbs
library(dplyr)

# Summarise the data for the specified year ranges
summary_data <- data %>%
  mutate(Period = case_when(
    Year %in% 2017:2019 ~ "2017-2019",
    Year %in% 2020:2022 ~ "2020-2022",
    TRUE ~ NA_character_ # Mark other years as NA so they can be dropped
  )) %>%
  filter(!is.na(Period)) %>% # Exclude years outside the two periods
  group_by(Period) %>%
  summarise(MeanDeaths = mean(PersonsDeaths))
# Create the bar chart of the two period means
library(ggplot2)

ggplot(summary_data, aes(x = Period, y = MeanDeaths, fill = Period)) +
  geom_col(width = 0.6, show.legend = FALSE) +
  # Print the mean above each bar, rounded to one decimal place
  geom_text(aes(label = round(MeanDeaths, 1)), vjust = -0.5, size = 5, color = "black") +
  labs(
    title = "Comparison of Mean Alcohol-Specific Deaths: 2017-2019 vs. 2020-2022",
    x = "Time Period", y = "Mean Deaths"
  ) +
  # Light pink for the earlier period, darker pink for the later one
  scale_fill_manual(values = c("2017-2019" = "#FFC0CB", "2020-2022" = "#FF69B4")) +
  theme_minimal() +
  theme(
    panel.grid.major = element_blank(), # Remove major gridlines
    panel.grid.minor = element_blank(), # Remove minor gridlines
    panel.border = element_rect(color = "black", fill = NA), # Add black border
    plot.title = element_text(size = 18, face = "bold", hjust = 0.5, color = "black"), # Black title
    axis.title.x = element_text(size = 14, face = "bold", color = "black"), # Black x-axis label
    axis.title.y = element_text(size = 14, face = "bold", color = "black"), # Black y-axis label
    axis.text.x = element_text(size = 12, color = "black"), # Black x-axis ticks
    axis.text.y = element_text(size = 12, color = "black") # Black y-axis ticks
  )
This shows a clear increase in alcohol-specific deaths during the COVID-19 pandemic.
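The size of that increase can also be expressed as a percentage. The sketch below recomputes the two period means in base R from the yearly totals in the prepared table and derives the relative change; the variable names (`deaths`, `mean_before`, `mean_during`, `pct_change`) are chosen here for illustration and are not part of the original analysis.

```r
# Yearly totals (persons) for England, copied from the prepared table above
deaths <- c("2017" = 5843, "2018" = 5698, "2019" = 5820,
            "2020" = 6984, "2021" = 7558, "2022" = 7912)

# Mean number of deaths in each three-year period
mean_before <- mean(deaths[c("2017", "2018", "2019")])
mean_during <- mean(deaths[c("2020", "2021", "2022")])

# Relative change between the two period means, in percent
pct_change <- (mean_during - mean_before) / mean_before * 100

cat(sprintf("Mean 2017-2019: %.1f\n", mean_before))
cat(sprintf("Mean 2020-2022: %.1f\n", mean_during))
cat(sprintf("Change: %.1f%%\n", pct_change))
```

On these figures the 2020-2022 mean is roughly 29% higher than the 2017-2019 mean. This describes a change in registered deaths only; on its own it does not establish that the pandemic caused the increase.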
In 2022, alcohol-specific deaths were at their highest level in the past 20 years. An interesting area to investigate next is alcohol consumption during these more recent years and its correlation with the months spent in lockdown.