class: center, middle, inverse, title-slide

.title[
# Aesthetics and Settings
]
.author[
### Nicholas Sim
]
.date[
### 26 March 2024
]

---
class: center, middle, inverse

# Introduction

---

### Topics

* Aesthetics such as `color`, `fill`, `shape`, and `size`
* Setting vs Aesthetic
* Passing a setting into `aes()` (common mistake)
* Set colors, size, etc.

---

### Required Libraries

```r
library(tidyverse)
library(gapminder)
```

---
class: center, middle, inverse

# Aesthetic

---

### What is an Aesthetic?

Ref: Chapter 3 KH

An aesthetic is a mapping of one or more variables from a data frame to some elements in a visualisation.

Some aesthetics are mandatory. For instance, the variable to be shown on an axis (e.g. `x` for a histogram, or `x` and `y` for a scatter plot) must be specified as an aesthetic.

Some aesthetics are optional, but are used to help create a visualisation that tells a sharper data story. Optional aesthetics include `color`, `fill`, `shape`, and `size`.

For this presentation, we will declare these optional aesthetics **locally** (inside a geom) rather than globally (inside `ggplot()`), as we may not want them to be applied throughout our plot.

---

### The Color Aesthetic

The **color** aesthetic controls the color of the border, such as the borders of the bars in a barplot.

For a scatter plot, the color aesthetic can be used to change the color of the scatter points because, by default, these points have a solid (non-hollow) interior and thus no separate border.

The `color` aesthetic is useful for distinguishing the categories of a qualitative variable. For instance, we may use the color aesthetic to differentiate the scatter points for Asia, Europe, Sub-Saharan Africa, etc., or to differentiate males versus females, marital status, and so on, in a barplot.

---

### Example: The Color Aesthetic

Using the `mtcars` data frame, let's plot miles per gallon against displacement (i.e. engine size), and use the number of cylinders as a color aesthetic.
```r
# Convert cyl to a factor so that it is treated as a categorical variable
mtcars$cyl <- as.factor(mtcars$cyl)

ggplot(data = mtcars, mapping = aes(x = disp, y = mpg)) +
  geom_point(mapping = aes(color = cyl))
```

<img src="AestheticsSettings_files/figure-html/color.1-1.png" style="display: block; margin: auto;" />

---

### The Fill Aesthetic

In terms of applications, the **fill** aesthetic serves the same purpose as the `color` aesthetic.

The main difference is that the `fill` aesthetic fills a <span style="color:red">hollow interior</span> with colors (representing a variable from the data frame), while the `color` aesthetic controls the color of the outline/border/exterior.

For example, the bars in a barplot have a hollow interior (an area that can be filled). To differentiate the bars with colors, we should use the `fill` aesthetic and not the `color` aesthetic.

---

### The Fill Aesthetic

The `fill` aesthetic is usually used to differentiate colors in plots such as the histogram, bar chart, column chart, etc., where there are hollow interiors.

By contrast, scatter points by default have a solid (i.e. non-hollow) interior. As such, the `color` aesthetic, not the `fill` aesthetic, should be used to differentiate the color of the scatter points.

That being said, some point shapes do have a hollow interior (shapes 21 to 25). If such a shape is used, the interior colors should be differentiated by using the `fill` aesthetic.

---

### Example: The Fill Aesthetic

Let's map the `fill` aesthetic to the number of cylinders. Note that with the default solid point shape, `fill` has no visible effect on the points themselves.

```r
ggplot(data = mtcars, mapping = aes(x = disp, y = mpg)) +
  geom_point(mapping = aes(fill = cyl))
```

<img src="AestheticsSettings_files/figure-html/fill.1-1.png" style="display: block; margin: auto;" />

---

### The Shape and Size Aesthetics

For a qualitative (categorical) variable, we may differentiate its values by using a `shape` aesthetic. The `shape` aesthetic represents each value of the variable with a different shape.
For a quantitative (numerical) variable, we may differentiate its values by using a `size` aesthetic. The `size` aesthetic represents larger values with larger-sized displays.

---

### Example: The Shape Aesthetic

Let's map the `shape` aesthetic to the number of cylinders.

```r
ggplot(data = mtcars, mapping = aes(x = disp, y = mpg)) +
  geom_point(mapping = aes(shape = cyl))
```

<img src="AestheticsSettings_files/figure-html/shape.1-1.png" style="display: block; margin: auto;" />

---

### Example: The Size Aesthetic

Let's map the `size` aesthetic to the number of cylinders. (Since `cyl` was converted to a factor earlier, *ggplot* will warn that using size for a discrete variable is not advised.)

```r
ggplot(data = mtcars, mapping = aes(x = disp, y = mpg)) +
  geom_point(mapping = aes(size = cyl))
```

<img src="AestheticsSettings_files/figure-html/size.1-1.png" style="display: block; margin: auto;" />

---

### Recap

The `color`, `fill`, and `shape` aesthetics can be used to display information about classes (non-ordered categories).

By contrast, the `size` aesthetic is suited to displaying information about quantity.

---

### Exercise (Try on your own)

Using the `iris` dataset, construct a scatter plot that plots sepal length against petal length. Declare species as a *local* color aesthetic. Add an OLS regression line (without confidence bands) and use `theme_classic()`. You should observe the plot below.

Question: What happens if you declare species as a *global* color aesthetic?

```
## [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"
```

<img src="AestheticsSettings_files/figure-html/unnamed-chunk-3-1.png" style="display: block; margin: auto;" />

---
class: center, middle, inverse

# Setting

---

### Setting versus Aesthetic

Ref: Chapter 3 KH

**Aesthetics** are characteristics of a plot (e.g. `color`, `fill`, `shape`, and `size`) that are determined by variables in the data frame.

**Settings** are parameters that are not mapped to variables but are chosen directly by the user.
For example, instead of differentiating the colors of the scatter points based on a variable in the data frame, we may set the color of these points to red, blue, etc.

---

### The `iris` Dataset

To illustrate this difference, let's use the `iris` dataset. Here is a quick look at the dataset.
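
The original preview output is not reproduced here. As a minimal sketch, one way to take a quick look is with base R's `head()` and `str()`; the choice of these two functions is an assumption and may differ from what the slides originally ran.

```r
# Assumption: head() and str() stand in for the original (unshown) preview.
# iris ships with base R (datasets package), so no extra libraries are needed.
head(iris)   # first six rows of the data frame
str(iris)    # variable names and their types
```

`Species` is a factor, which is why it suits the `color` and `shape` aesthetics, while the four measurements are numeric.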
---

### Setting a Color

Let's consider a scatter plot of sepal length against petal length.

```r
ggplot(iris, aes(x = Petal.Length, y = Sepal.Length)) +
  geom_point()
```

<img src="AestheticsSettings_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />

---

### Setting a Color

Let's set the color of the points to "red". Notice that `color = "red"` is not passed into the `aes()` function. By doing so, the color is set to red rather than being determined by some variable in the iris dataset.

```r
# Don't pass color through aes()
ggplot(iris, aes(x = Petal.Length, y = Sepal.Length)) +
  geom_point(color = "red")
```

<img src="AestheticsSettings_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />

---

### Setting the Size

We may set the size of the points. Again, notice that `size = 2` is not passed into the `aes()` function.

```r
# Don't pass color, size through aes()
ggplot(iris, aes(x = Petal.Length, y = Sepal.Length)) +
  geom_point(color = "red", size = 2)
```

<img src="AestheticsSettings_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />

---

### Setting the Shape

We may also set the shape of the points.

```r
# Don't pass color, size, shape through aes()
ggplot(iris, aes(x = Petal.Length, y = Sepal.Length)) +
  geom_point(color = "red", size = 2, shape = 2)
```

<img src="AestheticsSettings_files/figure-html/unnamed-chunk-8-1.png" style="display: block; margin: auto;" />

---
class: center, middle, inverse

# Confusing a Setting with an Aesthetic

---

### Passing a Setting into `aes()`

What will happen if we mistakenly pass the option, `color = "red"`, into `aes()`?

Recall that `aes()` maps the data to features in the visualization. If we pass (say) the option, `color = Region`, into `aes()`, *ggplot* will attempt to use the `Region` variable as a `color` aesthetic.
By analogy, if we pass `color = "red"` into `aes()`, *ggplot* will treat `"red"` as data to be mapped to the `color` aesthetic, as if it were a variable. However, there is no such variable in our data frame!

---

### Passing a Setting into `aes()`

Recall the **recycling** property, which was discussed in Seminar 1 part 3.

If we combine a shorter variable (i.e. one with fewer elements) with a longer variable, R will recycle the elements in the shorter vector to match the length of the longer vector.

For example, if we try to column-combine a vector with 5 observations with a data frame with 10 rows, R will recycle the values in the shorter vector to form a vector with 10 observations, before adding it as a new column in the data frame.

---

### Passing a Setting into `aes()`

Using the concept of recycling, let's understand what happens if we mistakenly pass `color = "red"` into `aes()` (red should be a color setting, not an aesthetic).

Since there is no variable "red", R will effectively "construct" a new variable whose single value, the string "red", is recycled across all rows.

As this variable has only one distinct value, passing `color = "red"` into `aes()` will result in only one color being displayed, but the color is not red. Instead, it is the first color of *ggplot*'s default color palette.

---

### Example: Passing in a Setting as an Aesthetic

To illustrate, let's pass the setting `color = "red"` into `aes()`. Notice that the scatter points have a different color now. Although there is only one color, it is not red.

```r
# "red" is treated as a variable with the single value "red",
# not as the color red.
ggplot(iris, aes(x = Petal.Length, y = Sepal.Length)) +
  geom_point(aes(color = "red"))
```

<img src="AestheticsSettings_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" />

---

### Exercise

Let's explore whether greater income inequality is associated with higher economic growth.

For this task, load `tidyverse` and `gapminder`. Using the `read_csv()` function, import `WDI_Data.csv` into R and save it into a data frame named `df`.
Remove all the `NA`s by passing the data frame `df` into `na.omit()` and saving the output.

.pull-left[
Construct a scatter plot with income inequality as the `y` variable and economic growth as the `x` variable. In the dataset, income inequality is captured by `Gini` and growth is captured by `GDP.Growth`. Use `Region` as a `color` aesthetic and add a trendline based on OLS regression.
]

.pull-right[
<img src="AestheticsSettings_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" />
]
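
---

### Follow-up: Inspecting the Mistaken Mapping

Returning to the earlier point about passing `color = "red"` into `aes()`, the difference between a setting and an aesthetic can also be checked numerically. The sketch below is an illustration, not part of the original slides: it uses `layer_data()` from *ggplot2* to read back the colors that each version of the plot actually draws. The object names (`base_plot`, `p_setting`, `p_mapped`) are hypothetical.

```r
library(ggplot2)

# Shared base plot; the object names here are illustrative only.
base_plot <- ggplot(iris, aes(x = Petal.Length, y = Sepal.Length))

# Setting: color is given outside aes(), so every point is drawn in red.
p_setting <- base_plot + geom_point(color = "red")

# Mistake: "red" inside aes() is treated as a one-value variable and
# mapped through the default color scale.
p_mapped <- base_plot + geom_point(aes(color = "red"))

# layer_data() returns the computed data for a layer, including the
# colour that will actually be drawn for each point.
setting_colors <- unique(layer_data(p_setting)$colour)
mapped_colors  <- unique(layer_data(p_mapped)$colour)

setting_colors  # the literal color "red"
mapped_colors   # a single default-palette color, which is not "red"
```

Both plots use a single color, but only the setting version draws the points in the color that was asked for.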