Advertisement
lvalnegri

r_mapping-notes.R

May 16th, 2018
707
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
R 24.60 KB | None | 0 0
  1. # # # RESOURCES -----------------------------------------------------------------------------------------------------------------
  2. #
  3. # https://media.readthedocs.org/pdf/a-little-book-of-r-for-time-series/latest/a-little-book-of-r-for-time-series.pdf
  4. # http://brettromero.com/wordpress/data-science-a-kaggle-walkthrough-introduction/
  5. # http://amsantac.co/blog.html
  6. #
  7. # - boundaries
  8. #   * https://stat.ethz.ch/pipermail/r-sig-geo/2011-May/011814.html
  9. #   * http://eco-data-science.github.io/spatial_analysis2_R/
  10. #   * https://mgimond.github.io/Spatial/index.html
  11. #   * https://rpubs.com/BeccaStubbs/bringing_shapefiles_into_R
  12. #   * https://github.com/Robinlovelace/sdvwR
  13. #   *
  14. #
  15. ### - mapping general
  16. #   * http://eriqande.github.io/rep-res-web/lectures/making-maps-with-R.html
  17. #   * http://zevross.com/blog/2014/07/16/mapping-in-r-using-the-ggplot2-package/
  18. #   * https://cran.r-project.org/web/packages/maptools/vignettes/combine_maptools.pdf
  19. #   * https://stat.ethz.ch/pipermail/r-sig-geo/2011-May/011814.html
  20. #   * http://www.molecularecologist.com/2012/09/making-maps-with-r/
  21. #   * http://www.geog.uoregon.edu/GeogR/examples/maps_examples01.htm
  22. #   * http://www.kevjohnson.org/making-maps-in-r/
  23. #   * http://www.computerworld.com/article/3038270/data-analytics/create-maps-in-r-in-10-fairly-easy-steps.html
  24. #   * http://clarkdatalabs.github.io/mapping_R/
  25. #   * http://geog.uoregon.edu/bartlein/courses/geog495/lec06.html    
  26. #   * http://stackoverflow.com/documentation/r/1372/introduction-to-geographical-maps#t=201702271434306481539
  27. #   * http://www.molecularecologist.com/2016/07/making-maps-in-r-volume-2-ggplots/
  28. #   * https://r-resources.massey.ac.nz/lurn/LURNch16.html
  29. #   * http://axismaps.github.io/thematic-cartography/
  30. #   * https://cran.r-project.org/doc/contrib/intro-spatial-rl.pdf
  31. #   * https://cran.r-project.org/web/packages/maptools/vignettes/combine_maptools.pdf
  32. #   * https://github.com/Robinlovelace/Creating-maps-in-R/blob/master/intro-spatial.Rmd
  33. #   * http://urbanspatialanalysis.com/portfolio/predicting-gentrification-using-longitudinal-census-data/
  34. #   * http://axismaps.github.io/thematic-cartography/articles/classification.html
  35. #   *
  36. #   *
  37. #
  38. # - ggplot (for mapping)
  39. #   * http://tutorials.iq.harvard.edu/R/Rgraphics/Rgraphics.html
  40. #   * http://www.gradaanwr.net/
  41. #   * http://stackoverflow.com/questions/19791210/r-ggplot2-merge-with-shapefile-and-csv-data-to-fill-polygons
  42. #   * http://rmhogervorst.nl/cleancode/blog/2017/01/06/plotting-a-map-with-ggplot2.html
  43. #   * http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html
  44. #   *
  45. #
  46. # - ggmap
  47. #   * https://github.com/dkahle/ggmap
  48. #   * https://journal.r-project.org/archive/2013-1/kahle-wickham.pdf
  49. #   * https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/ggmap/ggmapCheatsheet.pdf
  50. #   * https://blog.dominodatalab.com/geographic-visualization-with-rs-ggmaps/
  51. #   * http://www.kevjohnson.org/making-maps-in-r-part-2/
  52. #   * http://bcb.dfci.harvard.edu/~aedin/courses/R/CDC/maps.html
  53. #   * http://urbanspatialanalysis.com/dataviz-tutorial-mapping-san-francisco-home-prices-using-r/
  54. #   *
  55. #   *
  56. #
  57. # - tmap
  58. #   * https://github.com/Arevaju/mapping-tool
  59. #   * https://behrica.github.io/blog/2016/10/food-consumption-tutorial.html
  60. #   * http://www.computerworld.com/article/3175623/data-analytics/mapping-in-r-just-got-a-whole-lot-easier.html
  61. #   *
  62. #
  63. # - leaflet
  64. #   * http://zevross.com/blog/2015/10/14/manipulating-and-mapping-us-census-data-in-r-using-the-acs-tigris-and-leaflet-packages-3/#convert-the-data.frame-back-to-a-spatialpolygonsdataframe
  65. #   * https://lilbigdataboy.wordpress.com/2017/01/14/les-crimes-de-san-francisco-en-temps-reel-shiny-et-leaflet/
  66. #   *
  67. #   *
  68. #
  69.  
  70. # - projects
  71. #   * http://urbanspatialanalysis.com/portfolio/predicting-gentrification-using-longitudinal-census-data/
  72. #   * http://blog.revolutionanalytics.com/2017/02/forecasting-gentrification.html
  73. #   * http://urbanspatialanalysis.com/dataviz-tutorial-mapping-san-francisco-home-prices-using-r/
  74. #   * http://zevross.com/blog/2016/04/19/r-powered-web-applications-with-shiny-a-tutorial-and-cheat-sheet-with-40-example-apps/
  75. #   * http://www.computerworld.com/article/2497464/business-intelligence/business-intelligence-60-r-resources-to-improve-your-data-skills.html
  76. #   *
  77. #
  78.  
  79. # --------------------------------------------------------------------------------
  80. mytext <- "
  81.    The following is the R code to create arbitrary aggregation boundaries, both in shapefile (for tmap and leaflet) and dataframe (for ggplot and ggmap) format, starting from some base shapefile layer.
  82.    Specifically, in the UK the reference is the union of Output Areas in England and Wales (EW), Output Areas in Scotland (SC), and Small Areas in Northern Ireland (NI).
  83.    There is no official global UK map for these census 2011 small areas, so it has to be created binding together the three different maps, each available from the corresponding Statistical Office as for the following links:
  84.    EW: Output Areas (Generalised Clipped) http://geoportal.statistics.gov.uk/datasets/09b8a48426e3482ebbc0b0c49985c0fb_2
  85.    SC: Output Areas https://www.nrscotland.gov.uk/statistics-and-data/geography/our-products/census-datasets/2011-census/2011-boundaries
  86.    NI: Small Areas http://www.nisra.gov.uk/geography/SmallAreas.htm
  87.    Once downloaded, rename each group of files with the corresponding two characters acronym. It's necessary to keep only the files with the four following extensions: shp, shx, prj, dbf.
  88.    One small problem to be aware of is the different projection of the boundaries:
  89.    EW: '+init=epsg:27700'  # http://spatialreference.org/ref/epsg/osgb-1936-british-national-grid/
  90.    SC: '+init=epsg:27700'   # http://spatialreference.org/ref/epsg/osgb-1936-british-national-grid/
  91.    NI: '+init=epsg:29902'   # http://spatialreference.org/ref/epsg/29902/
  92.    but all of the above will be transformed in the standard WGS84 for web mapping.
  93.    The following are the number of features in each set of boundaries:
  94.    EW: 181,408 (E 171,372 + W 10,036)
  95.    SC:    46,351
  96.    NI:       4,537
  97.    for a total UK of 232,296. This is important to know because when simplifying the polygons overall, some of the features could dissolve.
  98.    To successfully aggregate, the script needs also a lookup table that gives back a unique aggregated area code for every output/small area (from now on just OA).
  99.    This table could be built for many types of aggregations using, for example, the Postcodes Directories
  100.    NHS Postcode Directory: http://ons.maps.arcgis.com/home/item.html?id=4e63d9b589604c0ba10fbbd3bef08e6f
  101.    ONS Postcode Directory: http://ons.maps.arcgis.com/home/item.html?id=dfa0ff74981b4b228d2030d852f0b14a
  102.    These are maintained and published every quarter by the ONS, that relates both current and terminated postcodes in the UK to a range of current and past statutory administrative, electoral, health and other geographies. The downside to this approach is that some OA could belong to different aggregations, and it becomes somewhat arbitrary, without more knowledge, decide which is the right destination area.
  103.    When saving to the database, there is an added column 'type' that is part of the foreign key ('id', 'type') connecting to the primary key ('boundary_id', 'type') in the table *boundaries_ids*, used to retrieve the correct id of the feature to link to external data.
  104.    I saved all files in s:\ICS\Projects\Data Management\R_Shiny\Resources\boundaries_aggregations\
  105.    It is finally important to understand that when reproducing ONS content, the output must include a source accreditation to ONS.
  106. "
  107.  
  108.  
  109. # # # STEP 0a - PREPARATION OF BOUNDARIES BY COUNTRIES  -------------------------------------------------------------------
  110.  
  111. # Boundaries from Census 2011
  112. # EW: Output Area (Generalised Clipped) http://geoportal.statistics.gov.uk/datasets/09b8a48426e3482ebbc0b0c49985c0fb_2
  113. # SC: Output Area https://www.nrscotland.gov.uk/statistics-and-data/geography/our-products/census-datasets/2011-census/2011-boundaries
  114. # NI: Small Area http://www.nisra.gov.uk/geography/SmallAreas.htm
  115. # IE: Small Area http://www.cso.ie/en/census/census2011boundaryfiles/  (=> they lack the projection file)
  116. # Once downloaded, rename each group of files with the corresponding two characters acronym
  117.  
  118. # Original projections
  119. # ewgrid = '+init=epsg:27700'  # http://spatialreference.org/ref/epsg/osgb-1936-british-national-grid/
  120. # scgrid = '+init=epsg:27700'  # http://spatialreference.org/ref/epsg/osgb-1936-british-national-grid/
  121. # nigrid = '+init=epsg:29902'  # http://spatialreference.org/ref/epsg/29902/
  122. # iegrid = '+init=epsg:29902'  # http://spatialreference.org/ref/epsg/29902/
  123.  
  124. # Number of features:
  125. #  - EW: 181408 (171372+10036)
  126. #  - SC:  46351
  127. #  - NI:   4537
  128. #  UK => 232296
  129.  
  130. # Reference Country codes
  131. # E92000001 = England;
  132. # W92000004 = Wales;
  133. # S92000003 = Scotland;
  134. # N92000002 = Northern Ireland;
  135.  
  136. # # # STEP 0b - LOAD PACKAGES ---------------------------------------------------------------------------------------------------
  137. pkg <- c('broom', 'classInt', 'data.table', 'ggplot2', 'ggmap', 'maptools', 'RColorBrewer', 'rgeos', 'rgdal', 'rmapshaper', 'RMySQL', 'sf', 'tmap')
  138. invisible( lapply(pkg, require, character.only = TRUE) )
  139.  
  140. # Additional linux libraries needed to install geospatial packages
  141. # - CentOS 7:
  142. #   * sudo yum install epel-release
  143. #   * sudo yum install gdal
  144. #   * sudo yum install gdal-devel
  145. #   * sudo yum install geos-devel
  146. #   * sudo yum install proj-epsg
  147. # - Ubuntu 14.04 LTS:
  148. #   * sudo aptitude install libproj-dev
  149. #   * sudo aptitude install libgdal-dev
  150. #   * sudo apt-get install libv8-dev
  151.  
  152.  
  153. # # # STEP 0c - SET VARIABLES ---------------------------------------------------------------------------------------------------
  154.  
  155. # set the directory of the boundaries shapefiles
  156. boundaries.path <- 'C:/cloud/OneDrive - University College London/data/boundaries/shp'
  157. # set the projection string for WGS84
  158. proj.wgs <- '+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0'
  159. # set UK centroid and bounded box
  160. UK_centre <- c(lon = -2.421976, lat = 53.825564)
  161. UK_bounds <- c(lng1 = 1.8, lat1 = 49.9, lng2 = -8.3, lat2 = 59.0 )
  162.  
  163.  
  164. # # # STEP 1 - LOAD AND MERGE EW, SC AND NI OAs MAP INTO A UNIQUE UK MAP  -------------------------------------------------------
  165.  
  166. # England and Wales
  167. # read polygons from shapefile (using rgdal, projection is automatically applied if the prj file is found)
  168. shp.ew <- readOGR(boundaries.path, layer = 'EW')
  169. # check the projection, and read the field to keep as future id
  170. summary(shp.ew)
  171. # transform the shapefile to WGS84 projection
  172. shp.ew <- spTransform(shp.ew, CRS(proj.wgs))
  173. # keep in the data slot only the ONS Output Area id, renaming it as 'id'
  174. shp.ew <- shp.ew[, 'oa11cd']
  175. colnames(shp.ew@data) <- c('id')
  176. # reassign the polygon IDs
  177. shp.ew <- spChFIDs(shp.ew, as.character(shp.ew$id))
  178.  
  179. # Scotland
  180. shp.sc <- readOGR(boundaries.path, layer = 'SC')
  181. summary(shp.sc)
  182. shp.sc <- spTransform(shp.sc, CRS(proj.wgs))
  183. shp.sc <- shp.sc[, 'code']
  184. colnames(shp.sc@data) <- c('id')
  185. shp.sc <- spChFIDs(shp.sc, as.character(shp.sc$id))
  186.  
  187. # N.Ireland
  188. shp.ni <- readOGR(boundaries.path, layer = 'NI')
  189. summary(shp.ni)
  190. shp.ni <- spTransform(shp.ni, CRS(proj.wgs))
  191. shp.ni <- shp.ni[, 'SA2011']
  192. colnames(shp.ni@data) <- c('id')
  193. shp.ni <- spChFIDs(shp.ni, as.character(shp.ni$id))
  194.  
  195. # UK as a merge of all previous boundaries
  196. shp.uk <- spRbind(spRbind(shp.ew, shp.sc), shp.ni)
  197. # reduce the complexity of the boundaries
  198. # shp.uk <- ms_simplify(shp.uk, keep = 0.05)
  199. # save merged polygons as a unique shapefile
  200. try( do.call(file.remove, as.list(paste0(boundaries.path, '/', list.files(path = boundaries.path, pattern = 'UK') ) ) ), silent = FALSE )
  201. writeOGR(shp.uk, dsn = boundaries.path, layer = 'UK', driver = 'ESRI Shapefile')
  202.  
  203.  
  204. # # # STEP 2 - CREATE AGGREGATION MAPS USING UK MAP AND LOOKUP TABLE ------------------------------------------------------------
  205.  
  206. # Define the area type that will be the base for aggrega
  207. base.loc <- 'OA'
  208. # read polygons from shapefile and apply projection
  209. shp.uk <- readOGR(boundaries.path, layer = base.loc)
  210. # connect to database
  211. db_conn <- dbConnect(MySQL(), user = 'root', password = 'root', dbname = 'geography')
  212. # retrieve lookup table
  213. lookups <- dbReadTable(db_conn, 'lookups')
  214. # close db connection
  215. dbDisconnect(db_conn)
  216. # list of aggregations to process
  217. # areas <- c('LSOA', 'MSOA', 'LAD', 'CTY', 'RGN', 'CTRY')
  218. areas <- c('CCG', 'LAT', 'NHSR', 'CCR', 'CTRY')
  219. for(area in areas){
  220.     # join shapefile data slot and lookup table on the area code
  221.     shp.uk.tmp <- merge(shp.uk, lookups[, c(base.loc, area)], by.x = 'id', by.y = base.loc)
  222.     # Build the list of subareas
  223.     subareas <- sort(unique(shp.uk.tmp[[area]]))
  224.     # Define first area
  225.     subarea <- subareas[1]
  226.     # Print a processing message
  227.     print(paste('Processing', area, 'subarea', subarea, '- number 1 out of', length(subareas)))
  228.     # select all OA contained in first subarea
  229.     shp.area <- subset(shp.uk.tmp, shp.uk.tmp[[area]] == subarea)
  230.     # dissolve submap
  231.     shp.area <- ms_dissolve(shp.area)
  232.     # define object id
  233.     shp.area$id <- subarea
  234.     shp.area <- spChFIDs(shp.area, as.character(shp.area$id))
  235.     # proceed in the same way for all other subareas, attaching every results to previous object
  236.     for(idx in 2:length(subareas)){
  237.         subarea <- subareas[idx]
  238.         print(paste('Processing', area, 'subarea', subarea, '- number', idx, 'out of', length(subareas)))
  239.         shp.tmp <- subset(shp.uk.tmp, shp.uk.tmp[[area]] == subarea)
  240.         shp.tmp <- ms_dissolve(shp.tmp)
  241.         shp.tmp$id <- subarea
  242.         shp.tmp <- spChFIDs(shp.tmp, as.character(shp.tmp$id))
  243.         shp.area <- spRbind(shp.area, shp.tmp)
  244.     }
  245.     print(paste0('Simplifying ', area, 's'))
  246.     # reduce the details of the boundaries for less space (it's a statistical map, not an OS Explorer Map!)
  247.     shp.area <- ms_simplify(shp.area, keep = 0.05)
  248.     # Print a processing message
  249.     print(paste('Saving', area, 'in dataframe format'))
  250.     # save lookup (rmapshaperid, id) adding the area type
  251.     lkp <- shp.area@data
  252.     lkp$type <- area
  253.     colnames(lkp) <- c('boundary_id', 'id', 'type')
  254.     # connect to database
  255.     db_conn <- dbConnect(MySQL(), user = 'root', password = 'root', dbname = 'geography')
  256.     # save lookup to database
  257.     dbSendQuery(db_conn, paste0("DELETE FROM boundaries_ids WHERE type = '", area, "'") )
  258.     dbWriteTable(db_conn, 'boundaries_ids', lkp, row.names = FALSE, append = TRUE)
  259.     # close db connection
  260.     dbDisconnect(db_conn)
  261.     # create dataframe suitable for ggplot
  262.     df.area <- tidy(shp.area)
  263.     # change names in dataframe to coordinates to avoid possible mismatching with programming languages keywords
  264.     setnames(df.area, c('long', 'lat'), c('X_lon', 'Y_lat'))
  265.     # add "type" column
  266.     df.area$type <- area
  267.     # connect to database
  268.     db_conn <- dbConnect(MySQL(), user = 'root', password = 'root', dbname = 'geography')
  269.     # save dataframe to database
  270.     dbSendQuery(db_conn, paste0("DELETE FROM boundaries WHERE type = '", area, "'") )
  271.     dbWriteTable(db_conn, 'boundaries', df.area, row.names = FALSE, append = TRUE)
  272.     # close db connection
  273.     dbDisconnect(db_conn)
  274.     # Print a processing message
  275.     print(paste('Saving', area, 'in shapefile format'))
  276.     # delete the rmapshaperid from Polygons
  277.     shp.area <- shp.area[, 'id']
  278.     # save Polygons as shapefile
  279.     try( do.call(file.remove, as.list(paste0(boundaries.path, '/', list.files(path = boundaries.path, pattern = area) ) ) ), silent = TRUE )
  280.     writeOGR(shp.area, dsn = boundaries.path, layer = area, driver = 'ESRI Shapefile')
  281.     # some polishing...
  282.     rm(df.area)
  283.     rm(shp.area)
  284.     gc()
  285. }
  286.  
  287.  
  288. # # # STEP 2b - ADDING IRELAND ---------------------------------------------------------------------------------------------------------
  289.  
  290. shp.ie <- readOGR(boundaries.path, layer = 'IE_OA')
  291. shp.ie <- spTransform(shp.ie, CRS(proj.wgs))
  292. summary(shp.ie)
  293. shp.ie <- shp.ie[, 'SMALL_AREA']
  294. colnames(shp.ie@data) <- c('id')
  295. shp.ie <- spChFIDs(shp.ie, as.character(shp.ie$id))
  296. try( do.call(file.remove, as.list(paste0(boundaries.path, '/', list.files(path = boundaries.path, pattern = 'IE_OA') ) ) ), silent = TRUE )
  297. writeOGR(shp.ie, dsn = boundaries.path, layer = 'IE_OA', driver = 'ESRI Shapefile')
  298. shp.ie <- ms_dissolve(shp.ie)
  299. shp.ie <- ms_simplify(shp.ie, keep = 0.05)
  300. colnames(shp.ie@data) <- c('id')
  301. shp.ie@data$id <- 'IE'
  302. try( do.call(file.remove, as.list(paste0(boundaries.path, '/', list.files(path = boundaries.path, pattern = '^IE\\.') ) ) ), silent = TRUE )
  303. writeOGR(shp.ie, dsn = boundaries.path, layer = 'IE', driver = 'ESRI Shapefile')
  304. df.ie <- tidy(shp.ie)
  305. setnames(df.ie, c('long', 'lat'), c('X_lon', 'Y_lat'))
  306. df.ie$type <- 'IE'
  307. db_conn <- dbConnect(MySQL(), user = 'root', password = 'root', dbname = 'geography')
  308. dbSendQuery(db_conn, paste0("DELETE FROM boundaries WHERE type = 'IE'") )
  309. dbWriteTable(db_conn, 'boundaries', df.ie, row.names = FALSE, append = TRUE)
  310. dbDisconnect(db_conn)
  311.  
  312. # HOW TO PLOT ALL COUNTRIES
  313. shp.ctry <- readOGR(boundaries.path, layer = 'CTRY')
  314. shp.ctry <- spRbind(shp.ie, shp.ctry)
  315. shp.ctry
  316. # OR
  317. shp.ctry <- dbGetQuery(db_conn, "SELECT * FROM boundaries WHERE type IN ('CTRY', 'IE')")
  318. ggplot(shp.ctry, aes(X_lon, Y_lat, group = group)) + geom_polygon(fill = 'darkred', colour = 'yellow') + coord_map()
  319. # END
  320.  
  321.  
  322. # # # STEP 3 - CALCULATE "METRICS" (using rgeos) --------------------------------------------------------------------------------
  323.  
  324. # Extract subareas codes
  325. y <- shp.area@data
  326.    
  327. # Add (mass) CENTROIDS
  328. y <- cbind(y, as.data.frame(gCentroid(shp.area, byid = TRUE) ) )
  329.  
  330. # Add PERIMETER (Length)
  331. y <- cbind(y, as.data.frame(gLength(shp.area, byid = TRUE) ) )
  332.  
  333. # Add AREA
  334. y <- cbind(y, as.data.frame(gArea(shp.area, byid = TRUE) ) )
  335.  
  336. # Other interesting functions:
  337.  
  338.  
  339.  
  340.  
  341.  
  342. # # # STEP 4 - PLOT MAPS  -------------------------------------------------------------------------------------------------------
  343.  
  344. # function to read and polish dataframe boundaries
  345. load_boundaries <- function(cut.out.shetland = TRUE){
  346.     db_conn <- dbConnect(MySQL(), user = 'root', password = 'root', dbname = 'geography')
  347.     if(cut.out.shetland){
  348.         bnd <- data.table( dbGetQuery(db_conn, 'SELECT * FROM boundaries WHERE Y_lat < 58.74') )
  349.     } else {
  350.         bnd <- data.table( dbReadTable(db_conn, 'boundaries') )
  351.     }
  352.     bnd_lkp <- data.table( dbReadTable(db_conn, 'boundaries_ids') )
  353.     dbDisconnect(db_conn)
  354.     setkeyv(bnd, c('id', 'type'))
  355.     setkeyv(bnd_lkp, c('boundary_id', 'type'))
  356.     bnd <- bnd[bnd_lkp]
  357.     bnd[, id := NULL]
  358.     setnames(bnd, 'i.id', 'id')
  359.     bnd[, `:=`(
  360.         order = as.integer(order),
  361.         hole = as.logical(hole),
  362.         piece = as.factor(piece),
  363.         group = as.factor(group),
  364.         id = as.factor(id),
  365.         type = as.factor(type)
  366.     )]
  367. }
  368. boundaries <- load_boundaries()
  369.  
  370. # read dataframe boundaries and create spatialPolygonDataframes (only for one-piece polygons)
  371. create.poly <- function(area_bid, area_type){
  372.     poly <- boundaries[id == area_bid & type == area_type & piece == 1, .(X_lon, Y_lat) ]
  373.     Polygons( list(Polygon(poly)), bnd_lkp[boundary_id == area_bid & type == area_type, id] )
  374. }
  375. area.types <- c('CCG', 'LAT', 'NHSR', 'CCR', 'CTRY')
  376. sp.polys <- list()
  377. for (area.type in area.types){
  378.     areas <- bnd_lkp[type == area.type & boundary_id %in% unique(boundaries[type == area.type, id]), .(boundary_id, id)]
  379.     sp.poly <- lapply(areas$boundary_id, function(x) create.poly(x, area.type) )  
  380.     sp.poly <- SpatialPolygons(sp.poly, proj4string = CRS(proj.wgs) )
  381.     sp.polys[[area.type]] <- SpatialPolygonsDataFrame(sp.poly, data = data.frame(row.names = areas$id, id = areas$id))
  382. }
  383.  
  384. # # # STEP 4a - PLOT MAPS USING TMAP AND SHAPEFILES -----------------------------------------------------------------------------
  385.  
  386. #
  387.  
  388.  
  389.  
  390.  
  391. tm_shape(
  392.         shp = shape.name                    # defines the object to use for subsequent layer, there could be more than one call
  393.         projection = proj.name              # allows to swap projections (as with coord_map() in ggplot2)
  394.     ) +                                    
  395.         tm_borders(col = 'colname') +
  396.     tm_fill(
  397.         col = 'varname',                    # define which variable / metric to use to color-in the polygons and how
  398.         style = 'quantile'
  399.     ) +      
  400.     tm_bubbles(size = 'varname') +
  401.     tm_raster(
  402.         col = 'layer.name',                 # needed when the object to plot is a raster
  403.         palette = 'pal.name',
  404.     ) +
  405.     tm_dots() +
  406.     tm_text() +
  407.     tm_polygons() +
  408.     tm_lines() +
  409.     tm_grid(n.x = , n.y = ) +               # adds equispaced longitude and latitude lines
  410.     tm_legend(position = 'pos.name') +
  411.     tm_compass() +
  412.     tmap_style('style.name')  +
  413.     tm_style_classic()
  414.  
  415.  
  416. # # # STEP 4b - PLOT MAPS USING GGPLOT/GGMAP AND CSV/DB -------------------------------------------------------------------------
  417.  
  418. # retrieve boundaries, points. and and data
  419. db_conn <- dbConnect(MySQL(), user = 'root', password = 'root', dbname = 'geography')
  420. boundaries <- dbReadTable(db_conn, 'boundaries')
  421. hospitals <- dbReadTable(db_conn, 'hospitals')
  422. # dataset <- dbReadTable(db_conn, 'hospitals')
  423. dbDisconnect(db_conn)
  424.  
  425. # initialize hospitals table for mapping
  426. coordinates(hospitals) <- ~X_lon+Y_lat
  427. # impose a projection on hospitals
  428. proj4string(hospitals) <- CRS(proj.wgs)
  429. # go back to dataframe/datatable
  430. hospitals <- as.data.frame(hospitals)
  431.  
  432. # get some map
  433. UK_map <- get_map(location = UK_centre, zoom = 6, scale = 2, source = 'stamen', maptype = 'toner')
  434.  
  435. # extract boundaries for requested area = 'XXX'
  436. y <- boundaries[type = 'XXX']
  437.  
  438. # join boundaries and data to calculate var.fill
  439. y <- merge(boundaries, dataset[, c('XXX', 'metric')], by.x = 'id', by.y = 'XXX')
  440.  
  441. # build tooltip
  442. y$ttip <- ''
  443.  
  444. # build first layer
  445. g <- ggmap(ggmap = UK_map, base_layer = ggplot(data = boundaries, aes(x = X_lon, y = Y_lat, group = group)) )
  446.  
  447. # add boundaries layer with fill reference, hover tooltip and reaction to click
  448. # the alpha argument in the call allows the map to show through the polygons
  449. g <- g + geom_polygon_interactive(aes(tooltip = ttip, data_id = ttip, fill = var.fill), colour = 'black', size = 0.2, alpha = 0.8)
  450.  
  451. # apply breaks to boundaries (equal: bin with same width, quantile: bin with same number of units, pretty: clear integer limits, fixed ?)
  452.  
  453.  
  454. # apply palette style to boundaries (usually sequential, some times diverging, but never qualitative - unless there is nothing to display)
  455.  
  456.  
  457. # add additional layer drawing thicker borders for outer areas
  458.  
  459.  
  460. # join hospitals and data to calculate var.size or/and var.color
  461.  
  462.  
  463. # build tooltip
  464.  
  465.  
  466. # add hospitals layer
  467. g <- g + geom_point_interactive(data = hospitals, aes(x = X_lon, y = Y_lat, size = var.size, colour = var.color, tooltip = ttip, data_id = ttip))
  468.  
  469. # apply breaks to hospitals
  470.  
  471.  
  472. # apply palette style to hospitals
  473.  
  474.  
  475. # g <- g + coord_map()   ???
  476.  
  477.  
  478.  
  479. # 2- MAPPING
  480.  
  481. # 2A- GGPLOT
  482. # - operates on data frames, therefore polygons objects (shapefiles) need to be translated into a data frame format beforehand.
  483. #   Maps in this format can then be plotted using the polygon geom, geom_polygon() draws lines between subsequent points and closes them up.
  484. #   Of course, x = long and y = lat are the main aesthetics, but the "group" column have to be mapped to the "group" aesthetic to let ggplot understand when a polygon has to be closed, and start with another one
  485. # - ggplot provides the map_data('mapname') function that loads a map found in the maps or mapdata package as a dataframe.
  486. #   The structure of the resulting data frames is prepared as wanted with the following:
  487. #    * long/lat are the coordinates: long is the longitud, and lat is the latitude
  488. #    * order. This just shows in which order ggplot should connect the dots
  489. #    * region and subregion tell what region or subregion a set of points surrounds.
  490. #    * group. nt! ggplots functions can take a group argument which controls (amongst other things) whether adjacent points should be connected by lines. If they are in the same group, then they get connected, but if they are in different groups then they don’t.
  491. #      Essentially, having to points in different groups means that ggplot lifts the pen when going between them.
  492.  
  493.  
  494.  
  495. ggplot() +
  496.     geom_polygon(data = internal, aes(x = X_lon, y = Y_lat, group = group, fill = Y), color = 'black', size = 0.25) +
  497.     geom_polygon(data = external, aes(x = X_lon, y = Y_lat, group = group), fill = NA, color = 'black', size = 0.40) +
  498.     geom_point(data = points, aes(X_lon, Y_lat, group = id, fill = Z1, size = Z2), colour = 'black', size = 0.25) +
  499.     coord_map(xlim = UK_bounds[c('lng2', 'lng1')], ylim = UK_bounds[c('lat2', 'lat1')])
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement