---
title: "Apply dendrogram labels (in cummeRbund)"
author: "Matthew L Bendall"
date: "September 12, 2016"
output: github_document
always_allow_html: yes
---

```{r setup, include=FALSE}
library(cummeRbund)
library(knitr)
library(DT)

options(digits=6)
knit_hooks$set(inline = function(x) {prettyNum(x, big.mark=",")})
knitr::opts_chunk$set(echo = TRUE)
```

### Load cummeRbund data

```{r load-data}
# Use cummeRbund to load data from cuffdiff output.
# This takes a while the first time around, but is faster on subsequent loads.
cuff <- readCufflinks('/Users/bendall/Projects/asthma/results20160218/cuffdiff',
                      genome="hg38")
cuff
```

### Take a look at the replicate-level data

```{r}
DT::datatable(replicates(cuff),
              options=list(searching=FALSE, scrollX=TRUE))
```

### Plot the dendrogram

```{r}
dend.rep <- csDendro(genes(cuff), replicates=TRUE)
```

As you can see here, the tips of the resulting dendrogram have uninformative
names. Plus, these names are not stable across analyses with different
groupings or numbers of samples.

### Set up vector with desired names

Create a named vector where the values are the desired names, and the names are
the labels used in the above dendrogram. In this case, the files used in the
analysis were named with the patient ID, so we can extract the patient IDs from
the `file` column.

```{r}
# goodnames (in the same order as the table):
# fixed=TRUE matches the path fragments literally, so "." is not a regex wildcard
goodnames <- gsub('/lustre/groups/cbi/asthma/host/hg38/', '', replicates(cuff)$file, fixed=TRUE)
goodnames <- gsub('/abundances.cxb', '', goodnames, fixed=TRUE)

# give the vector names using the rep_name column
names(goodnames) <- replicates(cuff)$rep_name
goodnames
```

Check that the new vector matches the `replicates(cuff)` table.

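One way to make this check explicit is sketched below on a toy stand-in for the `replicates(cuff)` table. The data frame `reps`, its file paths, and the patient IDs are made up for illustration; only the column names `file` and `rep_name` come from the real table.

```r
# Toy stand-in for replicates(cuff); paths and IDs are hypothetical
reps <- data.frame(
  file = c("/data/host/P001/abundances.cxb",
           "/data/host/P002/abundances.cxb",
           "/data/host/P003/abundances.cxb"),
  rep_name = c("case_0", "case_1", "control_0"),
  stringsAsFactors = FALSE
)

# Strip the leading directory and the trailing file name
# (fixed = TRUE treats the patterns as literal text)
goodnames <- gsub("/data/host/", "", reps$file, fixed = TRUE)
goodnames <- gsub("/abundances.cxb", "", goodnames, fixed = TRUE)
names(goodnames) <- reps$rep_name

# Every replicate should have exactly one non-missing, unique label
stopifnot(
  identical(names(goodnames), reps$rep_name),
  !anyNA(goodnames),
  !any(duplicated(goodnames))
)

# Side-by-side view for comparing against the table by eye
data.frame(rep_name = names(goodnames), goodname = unname(goodnames))
```

If any assertion fails, the path pattern probably did not match every row, or two replicates map to the same patient ID.
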
### Relabel dendrogram

Set the labels in the new dendrogram to the desired labels. The `labels<-`
assignment function used here is in the `dendextend` package (install using
`install.packages("dendextend")`). Indexing `goodnames` by `labels(newdend)`
puts the new names in the same order as the dendrogram tips.

```{r}
newdend <- dend.rep
dendextend::labels(newdend) <- goodnames[labels(newdend)]
plot(newdend)
```

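If installing `dendextend` is not an option, the same named-vector lookup can be applied with base R alone. The following is a minimal sketch on toy data, not the approach used in this document: the matrix `m`, the tip names, and the `lookup` vector are all hypothetical, and `dendrapply` from the `stats` package stands in for `dendextend`.

```r
# Toy data: four samples with hypothetical replicate-style row names
set.seed(1)
m <- matrix(rnorm(20), nrow = 4,
            dimnames = list(c("A_0", "A_1", "B_0", "B_1"), NULL))
dend <- as.dendrogram(hclust(dist(m)))

# Hypothetical lookup: names are current tip labels, values are desired labels
lookup <- c(A_0 = "P001", A_1 = "P002", B_0 = "P003", B_1 = "P004")

# Indexing by labels(dend) returns the new names in tip order,
# which need not be the order of the lookup vector itself
new_labels <- unname(lookup[labels(dend)])

# Base-R relabel: visit every node and rewrite the label of each leaf
relabeled <- dendrapply(dend, function(node) {
  if (is.leaf(node)) attr(node, "label") <- lookup[[attr(node, "label")]]
  node
})

# The relabeled tips appear in the same order as the lookup result
stopifnot(identical(labels(relabeled), new_labels))
plot(relabeled)
```

Because each leaf is renamed through its own current label, the result does not depend on the order in which `lookup` was built.
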
### Alternate approach using ggdendro

You can make more elaborate plots using the ggdendro package. This allows you to
work with the dendrogram like a ggplot object.

```{r}
library(ggplot2)
library(ggdendro) # Loads `ggdendro::dendro_data` as `dendro_data`

# Convert to dendro data object
ggd <- ggdendro::dendro_data(dend.rep)
# Set up data.frame with the other variables
dvars <- replicates(cuff)
rownames(dvars) <- dvars$rep_name
# Reorder data.frame (same order as ggd)
dvars <- dvars[as.character(ggd$labels$label),]
# Add the good names to data.frame
dvars$goodname <- goodnames[rownames(dvars)]

ggdendrogram(ggd, rotate=TRUE) +
  geom_text(data=ggd$labels, aes(x, y-0.05, label=dvars$goodname, color=dvars$sample_name)) +
  coord_flip() + scale_y_reverse() + theme_dendro() + theme(legend.position="none")
```
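
A variation on this plot keeps the label text and the grouping variable in the same data frame that is passed to `geom_text`, so that `aes()` does not have to reach into a separate object. The sketch below uses toy data and assumes the `ggplot2` and `ggdendro` packages are installed; the matrix `m`, the `meta` table, and its values are hypothetical, with column names chosen to mirror `rep_name` and `sample_name`.

```r
library(ggplot2)
library(ggdendro)

# Toy clustering with hypothetical replicate-style names
set.seed(1)
m <- matrix(rnorm(20), nrow = 4,
            dimnames = list(c("A_0", "A_1", "B_0", "B_1"), NULL))
ggd <- dendro_data(as.dendrogram(hclust(dist(m))))

# Hypothetical metadata, one row per replicate
meta <- data.frame(
  rep_name    = c("A_0", "A_1", "B_0", "B_1"),
  goodname    = c("P001", "P002", "P003", "P004"),
  sample_name = c("A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Attach the metadata to the label positions, matching on the tip label
labs <- ggd$labels
labs$label <- as.character(labs$label)
idx <- match(labs$label, meta$rep_name)
labs$goodname <- meta$goodname[idx]
labs$sample_name <- meta$sample_name[idx]

# Draw segments and labels directly; all aesthetics come from `labs`
p <- ggplot() +
  geom_segment(data = ggd$segments,
               aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_text(data = labs,
            aes(x = x, y = y, label = goodname, color = sample_name),
            hjust = 0) +
  coord_flip() +
  scale_y_reverse(expand = c(0.2, 0)) +
  theme_dendro() +
  theme(legend.position = "none")
p
```

Matching with `match()` makes a missing replicate show up as an `NA` label instead of silently shifting the remaining labels.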