class: center, middle, inverse, title-slide

.title[
# Further Approaches for Geo-Spatial Visualization
]
.author[
### Nicholas Sim
]
.date[
### 18 September 2024
]

---
class: center, middle, inverse

# Introduction

---

### Topics

* Constructing maps and boundaries using `geojson` and `shp` files
* Obtaining base maps using the `ggmap` package
* Using `ggmap` functions, e.g. obtaining coordinates using `geocode()` and appending coordinates to a data frame of addresses using `mutate_geocode()`
* Constructing interactive maps with the `leaflet` package

---

### Readings

David Kahle and Hadley Wickham (2013) "ggmap: Spatial Visualization with ggplot2", The R Journal 5:1, pages 144-161.

https://journal.r-project.org/archive/2013/RJ-2013-014/RJ-2013-014.pdf

---

### Required Libraries

```r
library(tidyverse)
library(socviz)
library(ggthemes)     # To use the map theme
library(ggmap)        # To use ggmap
library(sf)           # To read a shape file as an sf data frame
library(leaflet)      # To use leaflet
library(geojsonsf)    # To read a geojson file as sf
library(tidygeocoder) # To geocode for free
```

---

### Further Approaches

We will consider further approaches for spatial visualization:

* Constructing maps and boundaries using shape files, i.e. `geojson` and `shp` files
* Using the `ggmap` package to extract the base map
* Using the `leaflet` package to construct dynamic visualizations

---
class: center, middle, inverse

# Shape Files

---

### Types of Shape Files

Map data are rarely available as data frames. Instead, they come as "shape" files stored in various formats, e.g. `geojson` (Geographic JavaScript Object Notation), `shp` (shapefile) and `kml` (Keyhole Markup Language).

For example, data.gov.sg lists its shape files here: https://data.gov.sg/search?res_format=SHP

The link to Singapore's map data in the geojson format is https://data.gov.sg/dataset/national-map-polygon

We will learn to work with `geojson` and `shp` files to create maps.
The `geojson` file alone contains all the information you need to construct a map.

The `shp` file is usually accompanied by other dependent files. To use the `shp` file, these accompanying files must be stored in the same directory as the shape file.

---

### Applications

To illustrate, we will explore where the <span style="color:blue">dengue hotspots</span> are in Singapore.

The Singapore map/polygon is contained in a `shp` file. The dengue polygon is contained in a `geojson` file.

Both files can be converted into an `sf` (simple features) data frame by reading them with `read_sf()` from the `sf` package.

---

### Using the sf Package to read a Shape File

We will use a `geojson` file (dengue polygon) and a `shp` file (Singapore polygon) to visualize where the dengue clusters are located in Singapore. These files are:

* `dengue-clusters-geojson.geojson`, which contains polygons (i.e. shapes) of the dengue clusters
* `poly.shp`, which contains polygons of Singapore.

We will read these files using `read_sf()` by passing each file to the `dsn` (data source name) argument. This will create an `sf` data frame for each of these shape files.

Polygons stored in `sf` data frames are easy to plot with `geom_sf()`.

---

### Reading a Shape File as a Simple Features Data Frame

Let's take a look at `poly.shp` and `dengue-clusters-geojson.geojson`. Take note of the variables `locname` and `name` in `poly.shp`, which we will use later (e.g. `locname` as a fill aesthetic).

Also note the use of `st_zm()`, which strips away the Z-axis data so that we are left with data for the XY-axes (i.e. the longitude and latitude).
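To see what `st_zm()` does in isolation, here is a minimal sketch on a toy polygon. The coordinates, the Z value and the names `ring`, `toy.xyz` and `toy.xy` are made up for illustration and are not taken from the dengue data.

```r
library(sf)

# A toy closed ring: hypothetical longitude, latitude and Z (elevation) values.
# The first and last rows are identical so that the ring is closed.
ring <- rbind(c(103.80, 1.30, 10),
              c(103.90, 1.30, 10),
              c(103.90, 1.40, 10),
              c(103.80, 1.30, 10))

toy.xyz <- st_polygon(list(ring))  # a polygon with X, Y and Z coordinates
toy.xy  <- st_zm(toy.xyz)          # the same polygon with the Z dimension dropped

class(toy.xyz)[1]  # dimension label before dropping Z
class(toy.xy)[1]   # dimension label after dropping Z
```

The same call works on a whole `sf` data frame, which is how it is used on the dengue file.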
.panelset[
.panel[.panel-name[R Code]

```r
# Singapore shape file
# dsn: data source name
SG.sf <- read_sf(dsn = "./Igismap/poly.shp") # SG.sf: SG shapefile
```

```r
# Dengue shape (geojson) file
SG.dengue <- read_sf(dsn = 'dengue-clusters-geojson.geojson') %>%
  st_zm() # Strip away the z axis
```
]

.panel[.panel-name[Data: SG.sf]

```
## Rows: 6
## Columns: 16
## $ id         <int> 6956978, 3831712, 3831713, 3831714, 3831715, 3831716
## $ country    <chr> "SGP", "SGP", "SGP", "SGP", "SGP", "SGP"
## $ name       <chr> "Pedra Branca", "Central", "Northeast", "Northwest", "South…
## $ enname     <chr> NA, NA, NA, NA, NA, NA
## $ locname    <chr> "Pedra Branca", "Central", "Northeast", "Northwest", "South…
## $ offname    <chr> NA, NA, NA, NA, NA, NA
## $ boundary   <chr> "administrative", "administrative", "administrative", "admi…
## $ adminlevel <int> 3, 6, 6, 6, 6, 6
## $ wikidata   <chr> "Q1190558", "Q2544592", "Q3710534", "Q5784118", "Q1687545",…
## $ wikimedia  <chr> "en:Pedra Branca", NA, NA, "en:North West Community Develop…
## $ timestamp  <chr> "2019-10-31 23:10:02", "2019-10-31 23:10:02", "2019-10-31 2…
## $ note       <chr> NA, NA, NA, NA, NA, NA
## $ path       <chr> "0,536780,6956978", "0,536780,3831712", "0,536780,3831713",…
## $ rpath      <chr> "6956978,536780,0", "3831712,536780,0", "3831713,536780,0",…
## $ iso3166_2  <chr> NA, "SG-01", "SG-02", "SG-03", "SG-04", "SG-05"
## $ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((104.406 1.3..., MULTIPOLYGON (((103.7926 1.…
```
]

.panel[.panel-name[Data: SG.dengue]

```
## Rows: 123
## Columns: 3
## $ Name        <chr> "kml_1", "kml_2", "kml_3", "kml_4", "kml_5", "kml_6", "kml…
## $ Description <chr> "<center><table><tr><th colspan='2' align='center'><em>Att…
## $ geometry    <POLYGON [°]> POLYGON ((103.951 1.359376,..., POLYGON ((103.8771…
```
]
]

---

### Basic plot

We will first plot the Singapore map using the data frame `SG.sf` (extracted from `poly.shp`), then overlay it with the dengue clusters from the data frame `SG.dengue` (extracted from
`dengue-clusters-geojson.geojson`).

To prevent inheriting unwanted aesthetics, which may produce an error, we will use `ggplot()` without global declarations and declare the Singapore and dengue clusters data frames locally.

```r
ggplot() +
  geom_sf(data = SG.sf) +
  geom_sf(data = SG.dengue, fill = "Darkred")
```

<img src="FurtherSpatialVisualization_files/figure-html/dengue.1-1.png" style="display: block; margin: auto;" />

---

### Cleaning Up: Filling in Colors for Regions (Own Reading)

`locname` is a character variable containing the names of the regions in Singapore. Let's use it as a fill aesthetic.

```r
ggplot() +
  geom_sf(data = SG.sf, aes(fill = locname)) +
  geom_sf(data = SG.dengue, fill = "Darkred")
```

<img src="FurtherSpatialVisualization_files/figure-html/dengue.2-1.png" style="display: block; margin: auto;" />

---

### Cleaning Up: Adjusting the Coordinates (Own Reading)

Let's adjust the coordinates to exclude Pedra Branca from the map. To do so, we specify the longitude and latitude limits using `coord_sf()`.

Below, Pedra Branca is still present in the data but falls outside the plotted range. As such, the legend will still list Pedra Branca even though it is not shown on the map.

.panelset[
.panel[.panel-name[R Code]

```r
ggplot() +
  geom_sf(data = SG.sf, aes(fill = locname)) +
  geom_sf(data = SG.dengue, fill = "Darkred") +
  coord_sf(xlim = c(103.58, 104.11), ylim = c(1.13, 1.5),
           expand = FALSE) # "expand = FALSE" prevents ggplot
                           # from expanding the plot
```
]

.panel[.panel-name[Plot]

<img src="FurtherSpatialVisualization_files/figure-html/dengue.3-out-1.png" style="display: block; margin: auto;" />
]
]

---

### Cleaning Up: Declaring the Titles (Own Reading)

Let's use `labs()` to declare the title, subtitle, axis labels and legend title.
.panelset[
.panel[.panel-name[R Code]

```r
ggplot() +
  geom_sf(data = SG.sf, aes(fill = locname)) +
  geom_sf(data = SG.dengue, fill = "Darkred") +
  coord_sf(xlim = c(103.58, 104.11), ylim = c(1.13, 1.5), expand = FALSE) +
  labs(x = "Longitude", y = "Latitude", fill = "Regions",
       title = "Are You Near A Dengue Hotspot?",
       subtitle = "Dengue clusters across Singapore")
```
]

.panel[.panel-name[Plot]

<img src="FurtherSpatialVisualization_files/figure-html/dengue.4-out-1.png" style="display: block; margin: auto;" />
]
]

---

### Cleaning Up: Declaring the X-Y Labels and Titles (Own Reading)

As a remark, we may use `ggtitle()` to declare the plot title and subtitle (this is equivalent to setting `title` and `subtitle` in `labs()`).

.panelset[
.panel[.panel-name[R Code]

```r
ggplot() +
  geom_sf(data = SG.sf, aes(fill = locname)) +
  geom_sf(data = SG.dengue, fill = "Darkred") +
  coord_sf(xlim = c(103.58, 104.11), ylim = c(1.13, 1.5), expand = FALSE) +
  labs(x = "Longitude", y = "Latitude", fill = "Regions") +
  ggtitle("Are You Near A Dengue Hotspot?",
          subtitle = "Dengue clusters across Singapore")
```
]

.panel[.panel-name[Plot]

<img src="FurtherSpatialVisualization_files/figure-html/dengue.5-out-1.png" style="display: block; margin: auto;" />
]
]

---

### Cleaning Up: Removing Pedra Branca (Own Reading)

Let's remove Pedra Branca. We exclude it by using the filter condition, `!(name %in% 'Pedra Branca')`.

.panelset[
.panel[.panel-name[R Code]

```r
# Remove Pedra Branca from the data frame
SG.sf.1 <- filter(SG.sf, !(name %in% 'Pedra Branca'))
# This filter keeps all the rows whose name is not "Pedra Branca".
# There is no need to adjust the coordinates here as Pedra Branca has been removed.
ggplot() +
  geom_sf(data = SG.sf.1, aes(fill = locname)) +
  geom_sf(data = SG.dengue, fill = "Darkred") +
  labs(x = "Longitude", y = "Latitude", fill = "Regions") +
  ggtitle("Are You Near A Dengue Hotspot?",
          subtitle = "Dengue clusters across Singapore")
```
]

.panel[.panel-name[Plot]

<img src="FurtherSpatialVisualization_files/figure-html/dengue.6-out-1.png" style="display: block; margin: auto;" />
]
]

---

### Marking Locations on the Map

.pull-left[
Let's continue with the previous example, using the data with Pedra Branca filtered out, i.e. `SG.sf.1`.

We can mark/overlay new locations on our maps by passing their latitudes and longitudes as a data frame into `geom_point()`.

In this example, let's mark **Westgate mall** and **SUSS** on the Singapore map.

To do so, let's construct a data frame named `places` that contains the coordinates and names of these places (i.e. Westgate and SUSS)]

.pull-right[

```r
library(ggrepel)
# Construct the data frame for the places to be marked
Westgate <- c(103.7428, 1.3345, 'Westgate')
SUSS <- c(103.7762, 1.3291, 'SUSS')
places <- rbind(Westgate, SUSS)
# ID contains the names of the places
colnames(places) <- c("long", "lat", "ID")
# To convert a matrix into a data frame
places <- as_tibble(places)
# Make sure the coordinates are numeric
places$long <- as.numeric(places$long)
places$lat <- as.numeric(places$lat)
```
]

---

### Marking New Locations on the Map

We will use this map for our exercise later (see the last slide). Let's refer to this map as **MAP A**.
.panelset[
.panel[.panel-name[R Code]

```r
# Save Singapore map
p <- ggplot() +
  geom_sf(data = SG.sf.1, fill = 'gray90', color = 'White') # Singapore Polygon

# Overlay dengue data and add locations
p +
  geom_sf(data = SG.dengue, fill = "Darkred") + # Dengue data
  geom_point(data = places, aes(x = long, y = lat),
             color = "Navyblue", size = 5, alpha = 0.3) + # Add locations using geom_point()
  geom_text_repel(data = places, aes(x = long, y = lat, label = ID),
                  fontface = "bold") + # Add locations text
  labs(x = "Longitude", y = "Latitude", fill = "Regions") +
  ggtitle("Are You Near A Dengue Hotspot?",
          subtitle = "Dengue clusters across Singapore") +
  theme_map()
```
]

.panel[.panel-name[Plot]

<img src="FurtherSpatialVisualization_files/figure-html/dengue.7-out-1.png" style="display: block; margin: auto;" />
]
]

---
class: center, middle, inverse

# The ggmap Package

---

### What is `ggmap`?

The `ggmap` package integrates Google Maps Platform API functionalities to offer a suite of geospatial services. These services include accessing Google maps, OpenStreetMap maps, and Stamen maps, as well as geocoding user-provided addresses to compute coordinates.

There are several advantages to utilizing the `ggmap` package:

1. With `ggmap`, there's no need to manually plot polygons to delineate region boundaries; it provides a base map that includes these details.
2. Maps generated using `ggmap` offer richer features compared to simple polygons, encompassing terrains, roads, satellite imagery, and more.

The syntax for overlaying additional features on these base maps aligns with the familiar style of ggplot2. This facilitates the seamless integration and extension of geospatial visualizations.

---

### Using the ggmap package

The `ggmap` package extracts map data using an **Application Programming Interface** (API) from Google Maps Platform.

To use `ggmap`, you need to first sign up with Google Maps Platform and obtain an API key.
The API key is a unique identifier that allows Google (or a data provider) to identify the user. Therefore, **keep it a secret!**

Once you have your API key, pass it into the `register_google()` function to connect your computer to Google (otherwise, `ggmap` will not work).

```r
#### Replace "Your_key" with your own API key. Otherwise, you cannot generate an output.
# register_google(key = "Your_key")
```

---

### Extracting a Base Map

To extract a base map of a certain location, we must provide the location's coordinates, which can be obtained using the `ggmap::geocode()` command (note: we spell out `ggmap::geocode()` rather than `geocode()`, since the `tidygeocoder` package has a command with the same name).

Below, we search "Singapore" for its coordinates and save them as `singapore`:

```r
# Obtaining Singapore's location
singapore <- ggmap::geocode("Singapore")
singapore # lon = 103.8198, lat = 1.352083
```

```
## # A tibble: 1 × 2
##     lon   lat
##   <dbl> <dbl>
## 1  104.  1.35
```

---

### Extracting a Base Map

After obtaining the coordinates of Singapore, we pass them into `ggmap::get_googlemap()` to fetch a base map for Singapore. Let's save the map as `map`:

```r
map <- get_googlemap(center = c(lon = singapore$lon, lat = singapore$lat),
                     zoom = 11) # Choose the zoom level by trial and error
```

---
class: center, middle, inverse

# Geo-Spatial Visualization with ggmap

---

### Loading the Map

To visualize the base map saved as `map`, we pass it into `ggmap()`.

```r
# map is the base map that is fetched from get_googlemap()
ggmap(map)
```

<img src="FurtherSpatialVisualization_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" />

---

### Marking Locations

The `ggmap` package lets us use ggplot2-like syntax for map plotting. For instance, we can mark locations on the map using `geom_point()`.

Let's illustrate by marking two places - Westgate Mall and SUSS - on a map.
First, let's store the coordinates of these locations in a data frame:

```r
# Save the longitude, latitude and location name
Westgate <- c(103.7428, 1.3345, 'Westgate')
SUSS <- c(103.7762, 1.3291, 'SUSS')

# Construct the data frame for the places to be marked
places <- rbind(Westgate, SUSS) %>%
  as.data.frame()

# ID contains the names of the places
colnames(places) <- c("long", "lat", "ID")

# Ensure that the coordinates are numeric
places$long <- as.numeric(places$long)
places$lat <- as.numeric(places$lat)
```

---

### Marking Locations

Let's mark these places on our base map saved earlier as `map`:

```r
ggmap(map) +
  geom_point(data = places, aes(x = long, y = lat), size = 4, alpha = 0.3) +
  geom_text_repel(data = places, aes(x = long, y = lat, label = ID),
                  fontface = "bold") # Add locations text
```

<img src="FurtherSpatialVisualization_files/figure-html/unnamed-chunk-13-1.png" style="display: block; margin: auto;" />

---

### Overlaying Polygons onto a Base Map from ggmap

We may also overlay polygons on a base map extracted from `ggmap`.

For example, let's overlay the dengue clusters polygons from `SG.dengue`, a simple features (`sf`) data frame, on the Singapore base map, `map`.

Notice that `SG.dengue` is declared as a local data frame, and `inherit.aes = F` is included as a setting in `geom_sf`.

.panelset[
.panel[.panel-name[R Code]

```r
ggmap(map) +
  geom_point(data = places, aes(x = long, y = lat), size = 4, alpha = 0.3) +
  geom_text_repel(data = places, aes(x = long, y = lat, label = ID),
                  fontface = "bold") +
  geom_sf(data = SG.dengue, alpha = 0.3, fill = "tomato", inherit.aes = F) +
  ggtitle("Dengue Clusters in Singapore")
```
]

.panel[.panel-name[Plot]

<img src="FurtherSpatialVisualization_files/figure-html/dengue.ggmap.1-out-1.png" style="display: block; margin: auto;" />
]
]

---

### Remark: Combining ggmap with geom_polygon and geom_sf

When using a standard data frame, we can add `geom_polygon` to `ggmap(map)` to overlay polygons without encountering errors.
However, if we attempt to add `geom_sf` to overlay `sf` multipolygons on `ggmap(map)`, an error arises.

This error may seem unexpected, since there are no issues when adding `sf` polygons through `geom_sf` to `ggplot()`.

So, why does an error occur when adding `geom_sf` to `ggmap(map)`, but not to `ggplot()`?

---

### Remark: Combining ggmap with geom_polygon and geom_sf

The explanation lies in the distinction between global and local aesthetics declarations in ggplot.

When we run `ggplot() + geom_sf(data = SG.dengue)` (with `SG.dengue` being an `sf` data frame), nothing is inherited by `geom_sf` because `ggplot()` is essentially "empty". Hence, plotting the dengue polygons proceeds without issues.

However, when overlaying `sf` polygons onto `ggmap(map)`, `geom_sf` will attempt to inherit an aesthetic from `ggmap(map)` that doesn't exist, namely "geometry".

To prevent `geom_sf` from attempting to inherit a non-existent aesthetic, we should include the setting `inherit.aes = FALSE` in `geom_sf`, as shown here:

```r
ggmap(map) + geom_sf(data = SG.dengue, inherit.aes = F)
```

---

### Remark: Combining ggmap with geom_polygon and geom_sf

This example shows that `sf` polygons can be overlaid onto `ggmap(map)` if we include the `inherit.aes = F` setting in `geom_sf`:

```r
library(sf)
library(tidyverse)
ggmap(map) +
  geom_sf(data = SG.dengue, fill = "tomato", alpha = 0.5, inherit.aes = F)
```

<img src="FurtherSpatialVisualization_files/figure-html/unnamed-chunk-15-1.png" style="display: block; margin: auto;" />

---
class: center, middle, inverse

# Other Useful Functions in ggmap

---

### Geocoding Location Names

Suppose we wish to mark the locations of every MOS Burger outlet in Singapore on a Singapore map. To accomplish this, we must geocode the outlets' addresses to obtain their coordinates.

We start with the file "Mosburger.xlsx", which contains the addresses of all the restaurants.

Let's geocode the locations and mark them on a Google map.
```r
mosburger <- readxl::read_excel("Mosburger.xlsx")
head(mosburger)
```

```
## # A tibble: 6 × 6
##   places                     address          postcode telephone delivery region
##   <chr>                      <chr>            <chr>    <chr>     <chr>    <chr>
## 1 Plaza Singapura            68 Orchard Road… 238839   (+65) 68… 0        Centr…
## 2 China Square               51 Telok Ayer S… 048441   (+65) 62… 0        Centr…
## 3 Millenia Walk              9 Raffles Boule… 039596   (+65) 68… 0        Centr…
## 4 HarbourFront Centre        1 Maritime Squa… 099253   (+65) 62… 1        Centr…
## 5 Cathay Cineleisure Orchard 8 Grange Road #… 239695   (+65) 62… 1        Centr…
## 6 Novena Square              238 Thomson Roa… 307683   (+65) 63… 1        Centr…
```

---

### Geocoding Location Names

To geocode the locations, we use the `mutate_geocode()` function from the `ggmap` package.

```r
mosburger.df <- mutate_geocode(data = mosburger, postcode)
head(mosburger.df)
```

```
## # A tibble: 6 × 8
##   places                  address postcode telephone delivery region   lon   lat
##   <chr>                   <chr>   <chr>    <chr>     <chr>    <chr>  <dbl> <dbl>
## 1 Plaza Singapura         68 Orc… 238839   (+65) 68… 0        Centr…  104.  1.30
## 2 China Square            51 Tel… 048441   (+65) 62… 0        Centr…  104.  1.28
## 3 Millenia Walk           9 Raff… 039596   (+65) 68… 0        Centr…  104.  1.29
## 4 HarbourFront Centre     1 Mari… 099253   (+65) 62… 1        Centr…  104.  1.26
## 5 Cathay Cineleisure Orc… 8 Gran… 239695   (+65) 62… 1        Centr…  104.  1.30
## 6 Novena Square           238 Th… 307683   (+65) 63… 1        Centr…  104.  1.32
```

---

### Geocoding Location Names

We use `geom_point()` to mark locations on a `ggmap` generated base map.

First, we obtain a hybrid-type map and save it as `map.2`. Then, we plot the locations of the restaurants using `geom_point()`.
.panelset[
.panel[.panel-name[R Code]

```r
map.2 <- get_map(center = c(singapore$lon, singapore$lat),
                 maptype = "hybrid", zoom = 11) # We use a hybrid map

ggmap(map.2) +
  geom_point(data = mosburger.df, aes(x = lon, y = lat),
             color = "yellow", size = 3) +
  ggtitle("MOS Burger Restaurants in Singapore")
```
]

.panel[.panel-name[Plot]

<img src="FurtherSpatialVisualization_files/figure-html/map.2-out-1.png" style="display: block; margin: auto;" />
]
]

---

### Geocoding Location Names

We may even plot the restaurant density by using `stat_density2d` (it's fun, but perhaps not too sensible).

.panelset[
.panel[.panel-name[R Code]

```r
map.3 <- get_map(center = c(singapore$lon, singapore$lat),
                 color = "bw", zoom = 11) # We use a black-and-white map

ggmap(map.3) +
  geom_point(data = mosburger.df, aes(x = lon, y = lat),
             color = "yellow", size = 3) +
  stat_density2d(aes(fill = ..level..), alpha = 0.4,
                 geom = "polygon", data = mosburger.df) +
  guides(fill = "none") + # Hide the legend for the density fill
  ggtitle("MOS Burger Restaurants in Singapore")
```
]

.panel[.panel-name[Plot]

<img src="FurtherSpatialVisualization_files/figure-html/map.3-out-1.png" style="display: block; margin: auto;" />
]
]

---

### Applying the Grammar of Graphics

The `ggmap` package allows us to exploit `ggplot2`-like syntax. For instance, below, we facet the map by `delivery`.
.panelset[
.panel[.panel-name[R Code]

```r
ggmap(map.3) +
  geom_point(data = mosburger.df, aes(x = lon, y = lat),
             color = "Green", size = 3) +
  facet_wrap(~ delivery,
             labeller = labeller(delivery = c("0" = "Does Not Deliver",
                                              "1" = "Delivers"))) + # use "labeller" to change the facet titles
  ggtitle("MOS Burger Restaurants in Singapore")
```
]

.panel[.panel-name[Plot]

<img src="FurtherSpatialVisualization_files/figure-html/map.4-out-1.png" style="display: block; margin: auto;" />
]
]

---
class: center, middle, inverse

# Geocoding for Free

---

### The tidygeocoder Package

The `tidygeocoder` package allows us to geocode in R without connecting to the Google API through `ggmap` (Warning: it can be very slow).

The documentation can be found at https://jessecambon.github.io/tidygeocoder/.

As an example, let's geocode three shopping malls in Singapore: Westgate, Takashimaya, and Waterway Point.

.panelset[
.panel[.panel-name[R Code]

```r
some_addresses <- tibble::tribble(
  ~name,            ~addr,
  "Westgate",       "3 Gateway Dr, Singapore 608532",
  "Takashimaya",    "391A Orchard Road, Singapore 238873",
  "Waterway Point", "83 Punggol Central, Singapore 828761"
)

# geocode the addresses
lat_longs <- some_addresses %>%
  tidygeocoder::geocode(addr, method = 'osm', lat = latitude, long = longitude)
lat_longs
```
]

.panel[.panel-name[Output]

```
## # A tibble: 3 × 4
##   name           addr                                 latitude longitude
##   <chr>          <chr>                                   <dbl>     <dbl>
## 1 Westgate       3 Gateway Dr, Singapore 608532           1.33      104.
## 2 Takashimaya    391A Orchard Road, Singapore 238873      1.30      104.
## 3 Waterway Point 83 Punggol Central, Singapore 828761     1.41      104.
```
]
]

---

### The tidygeocoder Package: Exercise

Based on the `mosburger` data frame, geocode the MOS Burger branches using `tidygeocoder`.

You may combine the variables `places` and `postcode` to construct an address list whose entries look, for example, like "Plaza Singapura Singapore 238839".
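
As a hint for the address-building step only, here is a minimal sketch on a two-row toy data frame. The names `toy` and `addr` are hypothetical, and the two rows are copied from the head of `mosburger` shown earlier; the geocoding itself is left as the exercise.

```r
library(dplyr)

# Two rows copied from the head of the mosburger data (toy stand-in for the full file)
toy <- tibble(places   = c("Plaza Singapura", "China Square"),
              postcode = c("238839", "048441"))

# Combine the place name, the word "Singapore" and the postcode into one string
toy <- toy %>%
  mutate(addr = paste(places, "Singapore", postcode))

toy$addr
```

A column built this way could then be passed to `tidygeocoder::geocode()` in the same manner as `addr` in the shopping mall example.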
---

### The OneMap API

We may also geocode Singapore addresses for free using the OneMap API. In fact, this is a much faster approach than using the `tidygeocoder` package.

To do so, we need to construct an API query for each address.

To save space, we only present the main code blocks below. Details can be found in the tutorial `OneMapGeocode.Rmd`.

We first construct an empty data frame, `out.list`, to store our results:

```r
# Declare an empty data frame to store the results for the loop below
out.list <- tibble(address = character(),
                   lat = character(),
                   lng = character())
```

We declare the base url of the API here.

```r
# Base url of the OneMap API to query from
queryurl = "https://www.onemap.gov.sg/api/common/elastic/search"
```

---

### The OneMap API

Using a loop, we build an API link for each address to query the OneMap API.

```r
# Assumed here: GET() comes from httr and fromJSON() from jsonlite;
# df is a data frame with an "address" column (see the tutorial)
library(httr)
library(jsonlite)

# Building a loop to query the addresses
for(i in seq_along(df$address)){
  # Build up the url for the query.
  myquery <- list("searchVal" = df$address[i],
                  "returnGeom" = "Y",
                  "getAddrDetails" = "Y",
                  "pageNum" = "1")
  # Extract results from a GET call. Here, "queryurl" is the base url and "myquery" is the query.
  out.json <- GET(queryurl, query = myquery)
  # Extract text from JSON.
  out.text = fromJSON(rawToChar(out.json$content))
  # Extract the address. There could be multiple addresses in out.text$results
  # based on similarity of addresses from the query.
  # We pick the first one, i.e. out.text$results[1,], as it is the correct one.
  # For an unsuccessful query, out.text$found would be 0. For such cases, assign an NA.
  if(out.text$found != 0){
    out.list[i,] <- out.text$results[1,][c("ADDRESS", "LATITUDE", "LONGITUDE")]
  } else {
    out.list[i,] <- rep(NA, 3)
  }
}
```

---
class: center, middle, inverse

# Further Discussion: Converting Multipolygon Coordinates into a Data Frame (Own Reading)

---

### Extracting the Coordinates from an sf Data Frame

In certain cases where we encounter difficulties visualizing polygons from an `sf` data frame, a workaround involves extracting the coordinates and saving them as a regular data frame.

To illustrate, let's extract the coordinates from `SG.dengue`, which (like any data frame) is a list, here comprising three objects. We will utilize the 2nd object, which contains the names of the places, and the 3rd object, which contains the polygons of those places.

First, we extract the coordinates of each dengue cluster. Then, we extract the names of the locations within the cluster where an incidence was reported.

Below are the coordinates of the 5th location.

```r
k = 5 # fifth location
SG.dengue[[3]][[k]][[1]] %>% head(3)
```

```
##          [,1]     [,2]
## [1,] 103.8817 1.341956
## [2,] 103.8822 1.342886
## [3,] 103.8824 1.343019
```

---

### Extracting the Coordinates from SG.dengue

Since the coordinates for each dengue cluster are stored in `SG.dengue[[3]][[k]][[1]]`, we will iterate over `k` from `1` to 123.

For each cluster indexed by `k`, we will save its coordinates in the data frame `df.x` and assign an ID to it. Then, we will append `df.x` to `df.coord`, which will store the coordinates for all clusters.

Subsequently, we will extract the coordinates for the next cluster `k` into `df.x` and append them to `df.coord`, and continue this process until the loop concludes.

```r
df.coord <- NULL # Initialize an empty object to collect the coordinates
for (k in 1:nrow(SG.dengue)){
  # The coordinates are stored in SG.dengue[[3]][[k]][[1]]. Use trial and error to find out.
  # The group ID will be k
  df.x <- data.frame(SG.dengue[[3]][[k]][[1]], k)
  # Append df.coord with the data in df.x by rows
  df.coord <- rbind(df.coord, df.x)
}
# Clean up the names of the columns
colnames(df.coord) <- c("longitude", "latitude", "ID")
```

---

### Extracting the Location Names from SG.dengue

Finally, we extract the names of the locations from `SG.dengue[[2]]`.

By inspection, notice that the location name appears right after the tag `<th>LOCALITY</th> <td>`.

For example, the fifth location (`k = 5`) is

```r
k = 5 # fifth location
SG.dengue[[2]][k] %>% strwrap(width = 70)
```

```
##  [1] "<center><table><tr><th colspan='2'"
##  [2] "align='center'><em>Attributes</em></th></tr><tr bgcolor=\"#E3E3F3\">"
##  [3] "<th>LOCALITY</th> <td>Upp Paya Lebar Rd (Botanique At Bartley)</td>"
##  [4] "</tr><tr bgcolor=\"\"> <th>CASE_SIZE</th> <td>2</td> </tr><tr"
##  [5] "bgcolor=\"#E3E3F3\"> <th>NAME</th> <td>Dengue_Cluster</td> </tr><tr"
##  [6] "bgcolor=\"\"> <th>HYPERLINK</th>"
##  [7] "<td>https://www.nea.gov.sg/dengue-zika/dengue/dengue-clusters</td>"
##  [8] "</tr><tr bgcolor=\"#E3E3F3\"> <th>HOMES</th> <td></td> </tr><tr"
##  [9] "bgcolor=\"\"> <th>PUBLIC_PLACES</th> <td></td> </tr><tr"
## [10] "bgcolor=\"#E3E3F3\"> <th>CONSTRUCTION_SITES</th> <td></td> </tr><tr"
## [11] "bgcolor=\"\"> <th>INC_CRC</th> <td>C8D56736C876D976</td> </tr><tr"
## [12] "bgcolor=\"#E3E3F3\"> <th>FMEL_UPD_D</th> <td>20200513170817</td>"
## [13] "</tr></table></center>"
```

---

### Extracting the Location Names from SG.dengue

Therefore, we will use the `str_split` function from the `stringr` package to split `SG.dengue[[2]]` based on the pattern `<th>LOCALITY</th> <td>`.

The address will be stored in the second column, before the tag `</td>`.
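
Before applying this to all 123 descriptions, the splitting idea can be tried on a single string. The sketch below uses a shortened, made-up description (`toy.desc` is a hypothetical name) that mimics the structure of the entries in `SG.dengue[[2]]`.

```r
library(stringr)

# A shortened, made-up description with the same tag structure as SG.dengue[[2]]
toy.desc <- "<tr><th>LOCALITY</th> <td>Bangkit Rd (Chestervale)</td> </tr><tr><th>CASE_SIZE</th> <td>2</td>"

# Step 1: split once at the LOCALITY tag; the address starts the second piece
step1 <- str_split(toy.desc, pattern = "<th>LOCALITY</th> <td>",
                   n = 2, simplify = TRUE)[, 2]

# Step 2: split once at the first closing tag; the address is the first piece
toy.address <- str_split(step1, pattern = "</td>",
                         n = 2, simplify = TRUE)[, 1]
toy.address
```

Setting `n = 2` stops after the first match, so later `</td>` tags in the string do not produce extra pieces, and `simplify = TRUE` returns a character matrix that can be indexed by column.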
```r
# Split the string
SG.dengue.address.step1 <- str_split(SG.dengue[[2]],
                                     pattern = "<th>LOCALITY</th> <td>",
                                     n = 2, simplify = T)
# Save the second column of the matrix
SG.dengue.address.step1 <- SG.dengue.address.step1[,2]
# Check the first entry
SG.dengue.address.step1[1] %>% strwrap(width = 70)
```

```
##  [1] "Tampines St 42 (Blk 450A, 450C)</td> </tr><tr bgcolor=\"\">"
##  [2] "<th>CASE_SIZE</th> <td>3</td> </tr><tr bgcolor=\"#E3E3F3\">"
##  [3] "<th>NAME</th> <td>Dengue_Cluster</td> </tr><tr bgcolor=\"\">"
##  [4] "<th>HYPERLINK</th>"
##  [5] "<td>https://www.nea.gov.sg/dengue-zika/dengue/dengue-clusters</td>"
##  [6] "</tr><tr bgcolor=\"#E3E3F3\"> <th>HOMES</th> <td></td> </tr><tr"
##  [7] "bgcolor=\"\"> <th>PUBLIC_PLACES</th> <td></td> </tr><tr"
##  [8] "bgcolor=\"#E3E3F3\"> <th>CONSTRUCTION_SITES</th> <td></td> </tr><tr"
##  [9] "bgcolor=\"\"> <th>INC_CRC</th> <td>519FC6B16BF39FE2</td> </tr><tr"
## [10] "bgcolor=\"#E3E3F3\"> <th>FMEL_UPD_D</th> <td>20200513170817</td>"
## [11] "</tr></table></center>"
```

---

### Extracting the Location Names from SG.dengue

We further utilize `str_split` to split `SG.dengue.address.step1` and extract the addresses, located before the tag `</td>`.

Then, we will save these addresses, assign a group ID to each address, and merge this data frame with `df.coord` by `ID`.

Additionally, we will count the number of cases based on the unique addresses associated with each dengue cluster.

.panelset[
.panel[.panel-name[R Code]

```r
# Split the string
SG.dengue.address <- str_split(string = SG.dengue.address.step1,
                               pattern = "</td>",
                               n = 2, simplify = T)
# Count the number of dengue incidence based on unique address
SG.dengue.cases <- str_count(SG.dengue.address[,1], pattern = "/") +
  str_count(SG.dengue.address[,1], pattern = ",") + 1
# Save the addresses, which is stored in the first column.
# Include the group ID
SG.dengue.address <- data.frame(address = SG.dengue.address[,1],
                                cases = SG.dengue.cases,
                                ID = 1:nrow(SG.dengue.address))
head(SG.dengue.address)
```
]

.panel[.panel-name[Output]

```
## address
## 1 Tampines St 42 (Blk 450A, 450C)
## 2 Fernvale Lk (Blk 413B, 415A)
## 3 Westwood Ave / Westwood Ave (Westwood Residences) / Westwood Cres / Westwood Dr / Westwood Rd / Westwood Ter / Westwood Walk
## 4 Bangkit Rd (Chestervale)
## 5 Upp Paya Lebar Rd (Botanique At Bartley)
## 6 Upp Aljunied Ln (Blk 5)
##   cases ID
## 1     2  1
## 2     2  2
## 3     7  3
## 4     1  4
## 5     1  5
## 6     1  6
```
]
]

---

### Extracting the Location Names from SG.dengue

Combining the coordinates data frame with the addresses:

```r
# Left join df.coord with SG.dengue.address
SG.dengue.df <- left_join(df.coord, SG.dengue.address, by = "ID")
# Check the final output
SG.dengue.df %>% head(3)
```

```
##   longitude latitude ID                         address cases
## 1  103.9510 1.359376  1 Tampines St 42 (Blk 450A, 450C)     2
## 2  103.9512 1.359380  1 Tampines St 42 (Blk 450A, 450C)     2
## 3  103.9512 1.359820  1 Tampines St 42 (Blk 450A, 450C)     2
```

---

### Saving the Addresses to SG.dengue

For convenience, we save the addresses and cases to `SG.dengue`.

```r
# Save the address to SG.dengue
SG.dengue$address <- SG.dengue.address$address
SG.dengue$cases <- SG.dengue.address$cases
# View the output
glimpse(SG.dengue)
```

```
## Rows: 123
## Columns: 5
## $ Name        <chr> "kml_1", "kml_2", "kml_3", "kml_4", "kml_5", "kml_6", "kml…
## $ Description <chr> "<center><table><tr><th colspan='2' align='center'><em>Att…
## $ geometry    <POLYGON [°]> POLYGON ((103.951 1.359376,..., POLYGON ((103.8771…
## $ address     <chr> "Tampines St 42 (Blk 450A, 450C)", "Fernvale Lk (Blk 413B,…
## $ cases       <dbl> 2, 2, 7, 1, 1, 1, 2, 1, 2, 2, 8, 13, 10, 3, 3, 1, 3, 10, 1…
```

---
class: center, middle, inverse

# The leaflet Package

---

### Leaflet

The `leaflet` package enables us to create dynamic and interactive maps.
Here is an example of creating location markers for the MOS Burger restaurants on a leaflet map.

.panelset[
.panel[.panel-name[R Code]

```r
mosburger.df %>%
  leaflet() %>%
  addTiles() %>%
  addMarkers(popup = mosburger.df$places,
             label = mosburger.df$places,
             clusterOptions = markerClusterOptions())
```
]

.panel[.panel-name[Plot]
]
]

---

### Basic Commands

The documentation on leaflet markers can be found at https://rstudio.github.io/leaflet/markers.html.

The starting point of using leaflet is the `leaflet()` function, which creates a basic, storable map widget. To bring up a map, add the `addTiles()` function, which by default adds map tiles from OpenStreetMap.

Here, we create a leaflet object called `mymap` simply by calling the `leaflet()` function and adding tiles:

```r
mymap <- leaflet() %>% addTiles()
mymap
```
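
To illustrate what "storable" means, here is a minimal sketch, assuming the `htmlwidgets` package (which `leaflet` depends on) is available. It rebuilds `mymap`, confirms that it is an htmlwidget object, and saves it as an HTML file. The name `out.file` and the temporary location are illustrative choices, not part of the original slides.

```r
library(leaflet)

# Build the widget and keep it in an object instead of printing it
mymap <- leaflet() %>% addTiles()

# The object is an htmlwidget, so it can be stored, reused and extended later
class(mymap)

# Save the map as an HTML page (out.file is a hypothetical location)
# selfcontained = FALSE writes the JavaScript dependencies to a side folder,
# which avoids the need for pandoc
out.file <- file.path(tempdir(), "mymap.html")
htmlwidgets::saveWidget(mymap, file = out.file, selfcontained = FALSE)
```

Because the widget is an ordinary R object, later slides can keep piping new layers onto `mymap` without rebuilding the base map.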
---

### Marking Locations

Using the object `mymap`, let's add markers to the map. Here, let's mark the location of `SUSS` by supplying its coordinates to the `addMarkers()` function:

.panelset[
.panel[.panel-name[R Code]

```r
# Normally, you pass a data frame containing location points through leaflet().
# Here, mymap was built from an empty leaflet() call plus addTiles(),
# so we supply the coordinates to addMarkers() directly.
# Note that in leaflet, longitude is abbreviated as "lng".
mymap.suss <- mymap %>%
  addMarkers(lat = 1.3291, lng = 103.7762, popup = "SUSS")
mymap.suss
```
]

.panel[.panel-name[Plot]
]
]

---

### Marking Locations

Recall that the data frame `places` contains the coordinates of Westgate and SUSS. Let's consider a slightly more complicated example by plotting these locations and adding a popup that contains each location's name:

.panelset[
.panel[.panel-name[R Code]

```r
# Pass in the places data frame, which contains the coordinates of Westgate and SUSS.
# The popup option in addMarkers() shows a popup containing information
# on the location when you click on its marker.
places %>%
  leaflet() %>%
  addTiles() %>%
  addMarkers(popup = places$ID)
```
]

.panel[.panel-name[Plot]
]

.panel[.panel-name[Data]

```r
head(places)
```

```
##              long    lat       ID
## Westgate 103.7428 1.3345 Westgate
## SUSS     103.7762 1.3291     SUSS
```
]
]

---

### Using Icons as Markers

Let's improve our plot by using icons instead of the default markers to mark locations.

.panelset[
.panel[.panel-name[R Code]

```r
# Name each icon with `=` inside iconList();
# using `<-` here would create stray objects in the workspace instead.
places.icons <- iconList(
  Westgate = makeIcon(iconUrl = "./Westgate.jpg", iconWidth = 40, iconHeight = 40),
  SUSS = makeIcon(iconUrl = "./suss.png", iconWidth = 40, iconHeight = 40)
)

places %>%
  leaflet() %>%
  addTiles() %>%
  addMarkers(popup = places$ID, icon = places.icons)
```
]

.panel[.panel-name[Plot]
]
]

---

### Clickable Links for Sites

Let's include hyperlinks to the Westgate and SUSS websites. Here, each hyperlink is wrapped in an HTML anchor tag, `<a>`, whose `href` attribute holds the URL (`</a>` closes the tag).

.panelset[
.panel[.panel-name[R Code]

```r
# Each href needs the full URL, including https://;
# otherwise the browser treats it as a relative path.
places.links <- c(
  "<a href='https://www.capitaland.com/sg/malls/westgate/en.html'>Westgate Mall</a>",
  "<a href='https://www.suss.edu.sg'>Singapore University of Social Sciences</a>"
)

places %>%
  leaflet() %>%
  addTiles() %>%
  addMarkers(icon = places.icons, popup = places.links)
```
]

.panel[.panel-name[Plot]
]
]

---

class: center, middle, inverse

# Further Example 1: Visualizing sf Polygons on Leaflet

---

### Where are the Dengue Hotspots?

Visualizing polygons from an `sf` data frame on leaflet is relatively straightforward.

Let's visualize the dengue clusters using the polygons and addresses from `SG.dengue`, which now contains two new columns: the address and the number of cases for each cluster.

.panelset[
.panel[.panel-name[R Code]

```r
leaflet() %>%
  addTiles() %>%
  addPolygons(data = SG.dengue,
              fillOpacity = 0.8,
              color = "red",
              popup = paste(SG.dengue$address, "<br>No of cases:", SG.dengue$cases),
              stroke = FALSE)   # do not draw the polygon borders
```
]

.panel[.panel-name[Plot]
]
]

---

### Where are the Dengue Hotspots?

Let's fill the clusters with colors that vary with the number of cases.

To do so, we use the `colorNumeric()` function from leaflet to construct a palette function, `pal_fun()`, which maps values in a given range (the `domain`) to colors from a chosen palette.

Then, we pass the `cases` column of `SG.dengue` into the palette function to construct a choropleth map.

.panelset[
.panel[.panel-name[R Code]

```r
# Map case counts between 0 and 20 onto the magma palette.
# Values outside the domain are drawn in the NA color (grey by default).
pal_fun <- colorNumeric(palette = "magma", domain = c(0, 20), reverse = TRUE)

leaflet() %>%
  addTiles() %>%
  addPolygons(data = SG.dengue,
              fillOpacity = 0.8,
              color = ~pal_fun(cases),   # the formula is evaluated within SG.dengue
              popup = paste(SG.dengue$address, "<br>No of cases:", SG.dengue$cases),
              stroke = FALSE)
```
]

.panel[.panel-name[Plot]
]
]

---

class: center, middle, inverse

# Further Example 2: Visualizing Real-Time Data on Leaflet

---

### Where are the Taxis?

Let's use the `datagovsgR` package, which wraps the data.gov.sg API, to obtain real-time data on the locations of taxis. We pull data from `taxi_availability()` and add markers based on the taxi coordinates.

.panelset[
.panel[.panel-name[R Code]

```r
library(datagovsgR)

# Extracting and assembling the current date and time
# format() drops any fractional seconds before the date and time are split
time.vec <- Sys.time() %>%
  format("%Y-%m-%d %H:%M:%S") %>%
  str_split(" ") %>%
  unlist()
time.current <- paste0(time.vec[1], "T", time.vec[2])

# Extracting the taxi data
taxi.data <- taxi_availability(date_time = time.current)

# Plotting the leaflet map
leaflet() %>%
  addTiles() %>%
  addMarkers(data = taxi.data,
             clusterOptions = markerClusterOptions()) %>% # cluster the data points
  addControl(paste0("Taxi Positions at: ", time.vec[1], ", ", time.vec[2]),
             position = "bottomleft",
             className = "map-title")
```
]

.panel[.panel-name[Plot]
]
]

---

class: center, middle, inverse

# Exercise

---

### Exercise 1

Find your current location (you may use Google Maps: find where you are, right-click and select "What's here?").

**1.** Construct a very simple data frame containing your longitude, latitude and the string value "Me". Then, using ggplot, follow the tutorial on "Marking New Locations" and mark your present location on MAP A.

Hint: Store your latitude, longitude, and the location ID as `lat`, `lng` and `ID`. Then, construct the data frame `MyData <- data.frame(lat, lng, ID)`. Next, use `geom_point()` with `MyData`, where the `x` aesthetic is the longitude and the `y` aesthetic is the latitude. Use `geom_text_repel()` (from the `ggrepel` package) with `ID` as the label aesthetic. Adjust the plot to clean it up.

---

### Exercise 1

<img src="FurtherSpatialVisualization_files/figure-html/unnamed-chunk-29-1.png" style="display: block; margin: auto;" />

---

### Exercise 2

**2.** Mark your location on MAP B using `addMarkers()`.

In MAP B, add `addMarkers(data = MyData, popup = "Me")`. Remember that this is not ggplot: you overlay features on a leaflet map by using `%>%`, not `+`.
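
As a starting point for Exercise 2, here is a minimal sketch of how the pieces fit together. It assumes that MAP B is a plain OpenStreetMap base map (the object `map.b` is a stand-in, not the actual MAP B) and uses the SUSS coordinates from the earlier slides as placeholders for your own location.

```r
library(leaflet)

# Placeholder coordinates (SUSS); replace these with your own location
lat <- 1.3291
lng <- 103.7762
ID  <- "Me"
MyData <- data.frame(lat, lng, ID)

# Stand-in for MAP B: an empty leaflet widget with OpenStreetMap tiles (an assumption)
map.b <- leaflet() %>% addTiles()

# Overlay the marker with %>%, not +
# leaflet picks up the lat and lng columns of MyData automatically
my.map <- map.b %>%
  addMarkers(data = MyData, popup = "Me")
my.map
```

The same `MyData` can be reused for Exercise 1, since `geom_point()` only needs the `lng` and `lat` columns as its `x` and `y` aesthetics.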