Improving Data Storytelling: Labels and Annotations

class: center, middle, inverse, title-slide

.title[
# Improving Data Storytelling: Labels and Annotations
]
.author[
### Nicholas Sim
]
.date[
### 08 August 2025
]

---

class: center, middle, inverse
# Introduction

---
### Topics

* Label aesthetic and repelling labels with `geom_text_repel()`

* Annotation with areas and text with `annotate()`

---
### Required Libraries

``` r
library(tidyverse)
library(socviz)
library(ggrepel)

theme_set(theme_bw()) # Set the default ggplot2 theme to black and white
```

---
### Introduction

In this discussion, we explore the use of labels and annotations to improve the clarity and context of data visualizations. Labels are aesthetics, i.e. text values mapped from a data frame variable, that help in the identification of points in a scatter plot or values represented by bars in a bar chart.

Annotations, on the other hand, act as explanatory elements within a data visualization to provide descriptions of specific features in a plot. Unlike labels, annotations are not derived from the data frame and therefore are not an aesthetic. Instead, they are specified as text to explain certain features in a plot or as shaded areas that highlight particular relationships or events of significance.

The focus in this discussion is to use various elements to improve data storytelling. We will consider how labels and annotations can be used and how we may clean up the plot for more effective data storytelling.

---
class: center, middle, inverse
# The Label Aesthetic

---
### Adding Labels

Using the `label` aesthetic, labels can be added to visualizations such as scatter plots and bar charts. One way to draw them is with `geom_label()`, which comes standard with the `ggplot2` package. However, `geom_label()` places each label exactly at its data point, so the labels overlap when the points are crowded, and this makes them hard to read.

To avoid overcrowding, a useful package for adding labels that are automatically positioned nicely is the `ggrepel` package. The function `geom_text_repel()` from the package adds labels to the data points by calculating the best positions to place these labels in the figure.
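
As a rough illustration of the difference, the sketch below draws the same scatter plot twice, once with `geom_label()` and once with `geom_text_repel()`. It uses the built-in `mtcars` data rather than the lecture dataset, and the object names (`cars`, `base`, `p_overlap`, `p_repel`) are placeholders chosen for this example.

``` r
library(ggplot2)
library(ggrepel)

# Built-in data; the car names are stored as row names, so copy them to a column
cars <- data.frame(model = rownames(mtcars), wt = mtcars$wt, mpg = mtcars$mpg)

base <- ggplot(cars, aes(x = wt, y = mpg, label = model)) + geom_point()

# Labels are drawn exactly at each point, so nearby labels cover each other
p_overlap <- base + geom_label()

# Labels are pushed away from each other and from the points
p_repel <- base + geom_text_repel()
```

Printing `p_overlap` and then `p_repel` shows the two versions one after the other.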

---
### Example: How Were The US Presidential Elections Won?

As an example, let's use the `elections_historic` dataset from the `socviz` library. We will plot `ec_pct` against `popular_pct` and label the scatter points using `winner_label` as the label aesthetic. Let's explore the dataset here.

.panelset[
.panel[.panel-name[R Code]

``` r
glimpse(elections_historic) 
```

```
## Rows: 49
## Columns: 19
## $ election       <int> 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,…
## $ year           <int> 1824, 1828, 1832, 1836, 1840, 1844, 1848, 1852, 1856, 1…
## $ winner         <chr> "John Quincy Adams", "Andrew Jackson", "Andrew Jackson"…
## $ win_party      <chr> "D.-R.", "Dem.", "Dem.", "Dem.", "Whig", "Dem.", "Whig"…
## $ ec_pct         <dbl> 0.3218, 0.6820, 0.7657, 0.5782, 0.7959, 0.6182, 0.5621,…
## $ popular_pct    <dbl> 0.3092, 0.5593, 0.5474, 0.5079, 0.5287, 0.4954, 0.4728,…
## $ popular_margin <dbl> -0.1044, 0.1225, 0.1781, 0.1420, 0.0605, 0.0145, 0.0479…
## $ votes          <int> 113142, 642806, 702735, 763291, 1275583, 1339570, 13602…
## $ margin         <int> -38221, 140839, 228628, 213384, 145938, 39413, 137882, …
## $ runner_up      <chr> "Andrew Jackson", "John Quincy Adams", "Henry Clay", "W…
## $ ru_part        <chr> "D.-R.", "N. R.", "N. R.", "Whig", "Dem.", "Whig", "Dem…
## $ turnout_pct    <dbl> 0.269, 0.573, 0.570, 0.565, 0.803, 0.792, 0.728, 0.695,…
## $ winner_lname   <chr> "Adams", "Jackson", "Jackson", "Buren", "Harrison", "Po…
## $ winner_label   <chr> "Adams 1824", "Jackson 1828", "Jackson 1832", "Buren 18…
## $ ru_lname       <chr> "Jackson", "Adams", "Clay", "Harrison", "Buren", "Clay"…
## $ ru_label       <chr> "Jackson 1824", "Adams 1828", "Clay 1832", "Harrison 18…
## $ two_term       <lgl> FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, F…
## $ ec_votes       <dbl> 84, 178, 219, 170, 234, 170, 163, 254, 174, 180, 212, 2…
## $ ec_denom       <dbl> 261, 261, 286, 294, 294, 275, 290, 296, 296, 303, 233, …
```
]

.panel[.panel-name[Data]
Interactive table of `elections_historic` (49 rows, 19 columns).
]
]

---
### Basic Plot

Let's declare the titles and labels in advance.

``` r
p_title <- "Presidential Elections: Popular & Electoral College Margins"  
p_subtitle <- "1824-2016"  
p_caption <- "Data for 2016 are provisional." 
x_label <- "Winner's share of Popular Vote"  
y_label <- "Winner's share of Electoral College Votes" 
```

---
### Basic Plot

A scatter plot of `ec_pct` against `popular_pct` is shown here.

.panelset[
.panel[.panel-name[R Code]

``` r
p <- ggplot(elections_historic, 
            aes(x = popular_pct, y = ec_pct))

p + geom_point() +
  labs(title = p_title, subtitle = p_subtitle, caption = p_caption, x = x_label, y = y_label) 
```
]

.panel[.panel-name[Data]
<img src="LabelsAndAnnotations_files/figure-html/elections.2-out-1.png" style="display: block; margin: auto;" />
]
]

---
### Adding Labels

Without labels on the scatter points, it is impossible to tell who won the election represented by each point.

To do so, we pass in name labels as a `label` aesthetic into the function `geom_text_repel()` (from the `ggrepel` package).

In the `elections_historic` dataset, the name labels are contained in the variable `winner_label`.

---
### Adding Labels

.panelset[
.panel[.panel-name[R Code]

``` r
p <- ggplot(elections_historic, 
            aes(x = popular_pct, y = ec_pct,  label = winner_label))

p +  geom_point() +  geom_text_repel() +
  labs(title = p_title, 
       subtitle = p_subtitle, 
       caption = p_caption, 
       x = x_label, y = y_label)
```
]

.panel[.panel-name[Data]
<img src="LabelsAndAnnotations_files/figure-html/elections.2a-out-1.png" style="display: block; margin: auto;" />
]
]

---
### Cleaning Up the Plot: Displaying Reference Lines

Let's add a 50% reference line on both the x- and y-axes to see whether a point has crossed the 50% mark. To draw a *horizontal* line, we use `geom_hline()` and specify the value of `yintercept` (i.e. where the horizontal line will intercept the y-axis). To draw a *vertical* line, we use `geom_vline()` and specify the value of `xintercept` (i.e. where the vertical line will intercept the x-axis).
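
The two reference lines split the plot into four quadrants. As a minimal sketch of what they let us read off, the code below counts how many elections fall on each side of the two 50% marks. The names `quadrants`, `popular_majority`, and `ec_majority` are made up for this example.

``` r
library(dplyr)
library(socviz)

# Flag which side of each 50% line every election falls on, then count
quadrants <- elections_historic %>%
  mutate(popular_majority = popular_pct > 0.5,
         ec_majority = ec_pct > 0.5) %>%
  count(popular_majority, ec_majority)

quadrants
```

Each row of `quadrants` corresponds to one quadrant of the plot that actually contains points.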

---
### Cleaning Up the Plot: Displaying Reference Lines

.panelset[
.panel[.panel-name[R Code]

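``` r
p <- ggplot(elections_historic, 
            aes(x = popular_pct, y = ec_pct,  label = winner_label))

# Add light-gray 50% reference lines on both axes
p + geom_point() + geom_text_repel() +
  geom_hline(yintercept = 0.5, size = 1.1, color = "gray80", alpha = 0.4) +
  geom_vline(xintercept = 0.5, size = 1.1, color = "gray80", alpha = 0.4) +
  labs(title = p_title, subtitle = p_subtitle, caption = p_caption, x = x_label, y = y_label)
```
]
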
.panel[.panel-name[Data]
<img src="LabelsAndAnnotations_files/figure-html/elections.3-out-1.png" style="display: block; margin: auto;" />
]
]

---
### Cleaning Up the Plot: Adjusting the Axes Labels

The axes represent the share of votes won. These shares are currently shown as proportions between 0 and 1, so we display them as percentages (say, 65% rather than 0.65) by passing `labels = scales::percent` into `scale_x_continuous()` and `scale_y_continuous()`.
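
To see what the labelling function does on its own, here is a small sketch that applies `percent()` to a few proportions outside of any plot. The vector `shares` and the name `formatted` are illustrative only.

``` r
library(scales)

# A few vote shares expressed as proportions
shares <- c(0.5, 0.65, 0.9)

# percent() multiplies by 100 and appends the % sign
formatted <- percent(shares)
formatted
```

When `scales::percent` is passed as `labels`, ggplot2 applies it to the axis breaks in the same way.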

---
### Cleaning Up the Plot: Adjusting the Axes Labels

.panelset[
.panel[.panel-name[R Code]

``` r
p <- ggplot(elections_historic, 
            aes(x = popular_pct, y = ec_pct,  label = winner_label))

p +  geom_point() +  
  geom_text_repel() + 
  geom_hline(yintercept = 0.5, size = 1.1, color = "gray80", alpha=0.4) +  
  geom_vline(xintercept = 0.5, size = 1.1, color = "gray80", alpha=0.4) +   
  scale_x_continuous(labels = scales::percent) + scale_y_continuous(labels = scales::percent) + 
  labs(x = x_label, y = y_label, 
       title = p_title, subtitle = p_subtitle,  
       caption = p_caption) 
```
]

.panel[.panel-name[Data]
<img src="LabelsAndAnnotations_files/figure-html/elections.4-out-1.png" style="display: block; margin: auto;" />
]
]

---
### Cleaning Up the Plot: Customizing the Axis Labels

Rather than using `labels = scales::percent`, we may customize our labels using `labels = scales::unit_format(scale = 100, unit = "%", sep = "")`. Here, we multiply (or scale up) the original variable by 100 and use the percentage sign (%) to convey the unit of measurement.

We also use `sep = ""` to ensure that there is no space between the value label and the % symbol (otherwise, we will have something like 65 % instead of 65%).

**Note**: `unit_format()` has been superseded in recent versions of `scales`. The recommended replacement is `label_number()`, for example `label_number(scale = 100, suffix = "%")`; for plain percentages, `label_percent()` does the same job. Please see the `scales` documentation for details.
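
As a hedged sketch of a non-superseded alternative, `label_number()` can be given the same scaling factor and a `%` suffix. The name `pct_label` is a placeholder for this example.

``` r
library(scales)

# Build a labelling function: scale by 100 and append "%" with no space
pct_label <- label_number(scale = 100, suffix = "%")

# Apply it to a couple of proportions to check the formatting
pct_label(c(0.5, 0.65))
```

The resulting function can be passed directly to a scale, e.g. `scale_x_continuous(labels = pct_label)`.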

---
### Cleaning Up the Plot: Customizing the Axis Labels

.panelset[
.panel[.panel-name[R Code]

``` r
p <- ggplot(elections_historic, aes(x = popular_pct, y = ec_pct,  label = winner_label))

p +  geom_point() +  
  geom_text_repel() + 
  geom_hline(yintercept = 0.5, size = 1.1, color = "gray80", alpha=0.4) +
  geom_vline(xintercept = 0.5, size = 1.1, color = "gray80", alpha=0.4) +
  scale_x_continuous(labels = scales::unit_format(scale = 100, unit="%", sep="")) +  
  scale_y_continuous(labels = scales::unit_format(scale = 100, unit = "%", sep = "")) + 
  labs(x = x_label, y = y_label, 
       title = p_title, subtitle = p_subtitle, 
       caption = p_caption) 
```
]

.panel[.panel-name[Data]
<img src="LabelsAndAnnotations_files/figure-html/elections.4a-out-1.png" style="display: block; margin: auto;" />
]
]

---
### Cleaning Up the Plot: Adjusting the Color Scales

We may also adjust the scales of other aesthetics, such as color or fill. For example, to adjust the color scale, we use a function of the form `scale_color_<kind>()`, such as `scale_color_manual()`. See Chapter 5 of KH for more details.

To illustrate, let's display the party colors using the `color` aesthetic. In `elections_historic`, the `win_party` variable shows the party affiliations of the election winners.

---
### Cleaning Up the Plot: Adjusting the Color Scales

.panelset[
.panel[.panel-name[R Code]

``` r
p <- ggplot(elections_historic, 
            aes(x = popular_pct, y = ec_pct,  label = winner_label, color = win_party))

p +  geom_point() +  geom_text_repel() + 
  geom_hline(yintercept = 0.5, size = 1.1, color = "gray80", alpha=0.4) +  geom_vline(xintercept = 0.5, size = 1.1, color = "gray80", alpha=0.4) +
  scale_x_continuous(labels = scales::percent) +  scale_y_continuous(labels = scales::percent) + 
  labs(x = x_label, y = y_label, 
       title = p_title, subtitle = p_subtitle,  
       caption = p_caption) 
```
]

.panel[.panel-name[Data]
<img src="LabelsAndAnnotations_files/figure-html/elections.5-out-1.png" style="display: block; margin: auto;" />
]
]

---
### Cleaning Up the Plot: Adjusting the Color Scales

Let's differentiate the scatter points using the party colors. From the legend, notice that the Democratic party is the 2nd entry and the Republican party is the 3rd.

Therefore, we set the color codes for the Democratic and Republican parties as the 2nd and 3rd elements of the vector `party_colors`.

``` r
party_colors <- c("#000000", "#2E74C0", "#CB454A", "#000000")
```

Let's pass `party_colors` to `scale_color_manual()` so that the points and labels are colored by party.
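
Relying on the order of the legend entries can be fragile. One possible alternative, sketched below, is to name each color by the party label it belongs to, since `scale_color_manual()` matches named values to the data values. The names `party_colors_named` and `p_named` are invented for this example; the hex codes are the ones used above.

``` r
library(ggplot2)
library(socviz)

# Naming each color by its party ties the mapping to the label, not the position
party_colors_named <- c("D.-R." = "#000000", "Dem." = "#2E74C0",
                        "Rep." = "#CB454A", "Whig" = "#000000")

p_named <- ggplot(elections_historic,
                  aes(x = popular_pct, y = ec_pct, color = win_party)) +
  geom_point() +
  scale_color_manual(values = party_colors_named)
```

With this version, the colors stay attached to the right parties even if the ordering of the legend changes.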

---
### Cleaning Up the Plot: Adjusting the Color Scales

.panelset[
.panel[.panel-name[R Code]

``` r
p <- ggplot(elections_historic, aes(x = popular_pct, y = ec_pct,  label = winner_label, color = win_party))

p +  geom_point() +  
  geom_text_repel() + 
  geom_hline(yintercept = 0.5, size = 1.1, color = "gray80", alpha=0.4) + 
  geom_vline(xintercept = 0.5, size = 1.1, color = "gray80", alpha=0.4) +   
  scale_x_continuous(labels = scales::percent) +  scale_y_continuous(labels = scales::percent) + 
  scale_color_manual(values = party_colors) + ## Adjust the color scales
  labs(x = x_label, y = y_label, 
       title = p_title, subtitle = p_subtitle, 
       caption = p_caption) 
```
]

.panel[.panel-name[Data]
<img src="LabelsAndAnnotations_files/figure-html/elections.7-out-1.png" style="display: block; margin: auto;" />
]
]

---
class: center, middle, inverse
# Annotation

---
### Adding Text As Annotation

The `annotate()` function can be used to highlight important information on the plot itself. For example, we may add a text label next to a data point by passing the setting `geom = "text"` into `annotate()` and specifying the `x`, `y`, and `label` settings.

The `x` and `y` settings indicate the position of the text label, and the `label` setting provides the annotated text itself. We may also use settings such as `size` and `color` to style the text, and `hjust` and `vjust` to adjust its position. To introduce a line break, we use `\n`.

.pull-left[

``` r
p <- ggplot(data = organdata, 
            mapping = aes(x = roads, y = donors))  
# \n inserts a line break in the label.
# hjust = 0 left-aligns the text at x, so the text extends to the right.

p + geom_point() +  
  annotate(geom = "text", x = 157, y = 33, hjust = 0,   
           label = "A surprisingly high \nrecovery rate.") 
```
]

.pull-right[
<img src="LabelsAndAnnotations_files/figure-html/annotate.1-out-1.png" style="display: block; margin: auto;" />
]

---
### Adding a Shaded Area as Annotation

To shade an area in the plot, we pass `geom = "rect"` into `annotate()`. As with `geom_rect()`, we must specify `xmin`, `xmax`, `ymin`, and `ymax` to set the size and position of the shaded rectangle.

For shading, `annotate()` is preferable to `geom_rect()`. Because `geom_rect()` inherits the plot's data, it draws one rectangle per row, and the stacked semi-transparent rectangles end up darker and more opaque than the `alpha` value suggests.
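
The following sketch illustrates why the two approaches can differ. It uses a small made-up data frame (`toy`) and compares how many rectangles each layer ends up drawing, using `layer_data()` to inspect the second layer of each plot. All object names are hypothetical.

``` r
library(ggplot2)

# Toy data with 50 rows (hypothetical, for illustration only)
toy <- data.frame(x = 1:50, y = (1:50) %% 7)
base <- ggplot(toy, aes(x = x, y = y)) + geom_point()

# geom_rect() inherits the plot data, so the same rectangle is repeated per row
p_geom <- base +
  geom_rect(aes(xmin = 10, xmax = 20, ymin = 2, ymax = 4),
            fill = "red", alpha = 0.2)

# annotate() builds its own one-row data, so the rectangle appears once
p_annot <- base +
  annotate(geom = "rect", xmin = 10, xmax = 20, ymin = 2, ymax = 4,
           fill = "red", alpha = 0.2)

# Number of rectangles in the second layer of each plot
n_geom <- nrow(layer_data(p_geom, 2))
n_annot <- nrow(layer_data(p_annot, 2))
```

Comparing `n_geom` with `n_annot` shows how many copies of the rectangle are stacked in each case.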

.pull-left[

``` r
p <- ggplot(data = organdata, 
            mapping = aes(x = roads, y = donors))

p + geom_point() + 
  annotate(geom = "text", x = 157, y = 33, hjust = 0,
           label = "A surprisingly high \nrecovery rate.") +
  annotate(geom = "rect", xmin = 125, xmax = 155, ymin = 30, ymax = 35,
           fill = "red", alpha = 0.2) 
```
]

.pull-right[
<img src="LabelsAndAnnotations_files/figure-html/annotate.2-out-1.png" style="display: block; margin: auto;" />
]

---
### Exercise: Movie Ratings

For this exercise, use the dataset `MovieRatings` and save it into `df`. These ratings are taken from IMDB. Use RMarkdown to generate a short report.

1. Summarize what you observe about the dataset in 60 words. Rename the variables to something more manageable using `colnames()`.

2. Plot the audience ratings against the ratings from Rotten Tomatoes. Fit an OLS regression line without standard errors. Use budget as a size aesthetic and genre as a color aesthetic. Use `theme_minimal()`.

3. Using `geom_text_repel()`, identify the films with a budget of more than 200 million (i.e. label the data points with the name of the film). Use `nudge_y = 10, nudge_x = 1, segment.size = 1` as settings in `geom_text_repel()`. Hint: For the data argument, pass `filter(df, Budget.Million > 200)` into `geom_text_repel()`. Use `Film` as a label aesthetic. Discuss the results in about 60 words.
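
The idea in the hint, passing a filtered data frame to the label layer only, can be sketched on the elections data from earlier rather than on `MovieRatings`. In this example only the winners who lost the popular vote are labelled; the names `lost_popular` and `p_subset` are placeholders.

``` r
library(ggplot2)
library(ggrepel)
library(dplyr)
library(socviz)

# Keep only the elections where the winner trailed in the popular vote
lost_popular <- filter(elections_historic, popular_margin < 0)

# All points are drawn, but only the filtered rows receive a label
p_subset <- ggplot(elections_historic, aes(x = popular_pct, y = ec_pct)) +
  geom_point() +
  geom_text_repel(data = lost_popular, mapping = aes(label = winner_label))
```

Because the `label` aesthetic is set only inside `geom_text_repel()`, the point layer still uses the full dataset.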