This week only. Pastebin PRO Accounts Christmas Special! Don't miss out!Want more features on Pastebin? Sign Up, it's FREE!
Guest

Untitled

By: a guest on Jan 4th, 2013  |  syntax: None  |  size: 2.65 KB  |  views: 20  |  expires: Never
download  |  raw  |  embed  |  report abuse  |  print
Text below is selected. Please press Ctrl+C to copy to your clipboard. (⌘+C on Mac)
  1. Views on a 10000000-length Rle subject
  2.  
  3.  views:
  4.       start      end   width
  5.  [1]      1     1000    1000 [100 100 100 100 100 100 100 100 100 100 ...]
  6.  [2]   1001     2000    1000 [190 190 190 190 190 190 190 190 190 190 ...]
  7.  [3]   2001     3000    1000 [280 280 280 280 280 280 280 280 280 280 ...]
  8.  [4]   3001     4000    1000 [370 370 370 370 370 370 370 370 370 370 ...]
  9.  [5]   4001     5000    1000 [460 460 460 460 460 460 460 460 460 460 ...]
  10.  ...    ...      ...     ... ...
  11.  [9996] 995001  9996000 9001000 [89650 89650 89650 89650 89650 89650 ...]
  12.  [9997] 996001  9997000 9001000 [89740 89740 89740 89740 89740 89740 ...]
  13.  [9998] 997001  9998000 9001000 [89830 89830 89830 89830 89830 89830 ...]
  14.  [9999] 998001  9999000 9001000 [89920 89920 89920 89920 89920 89920 ...]
  15. [10000] 999001 10000000 9001000 [90010 90010 90010 90010 90010 90010 ...]
  16.        
  17. [1] 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100
  18.        
  19. library('GenomicRanges')
  20. # generating data frame
  21. df=data.frame(chrom=rep('Chr1',100000),start=seq(1,1000000,by=1000),end=seq(1000,10000000,by=1000),strand=rep("+",100000))
  22.  
  23. # making GRanges object
  24. gr=GRanges(seqnames=as.vector(df[,1]),IRanges(start=df[,2],end=df[,3]),strand=df[,4])
  25. # obtaining coverage using function coverage in the form of RLE object
  26. gr.cov=coverage(gr)
  27. # generating views for specific start and end
  28. gr.views=Views(gr.cov[[1]],start=seq(1,1000000,by=1000),end=seq(1000,10000000,by=1000))
  29. # putting in temp variable
  30. d=gr.views
  31.  
  32. # this following code calculates the matrix (where each line is 20 points) for 10 lines
  33. # reduce or increase the number in the outermost sapply loop to increase/decrease the lines to be calculated
  34.  
  35. sapply(1:10,function(j)
  36.    sapply(1:20,
  37.    function(i)as.numeric(
  38.      format(
  39.        mean(
  40.          as(d[[j]][(
  41.            seq(0,length(d[[j]]),floor(length(d[[j]])/20))+1)[i]:
  42.              c((seq(0,length(d[[j]]),floor(length(d[[j]])/20)))[
  43.                -length((seq(0,length(d[[j]]),floor(length(d[[j]])/20))))
  44.                ],length(d[[j]]))[i+1]],
  45.             "RangedData")$score),
  46.        digits=2)
  47.      )
  48.    )
  49. )
  50.        
  51. genes <- GRanges(seqnames, IRanges(geneStarts, geneEnds))
  52.        
  53. n <- 50L
  54. starts0 <- Map(function(...) floor(seq(...)), start(genes), end(genes),
  55.                MoreArgs=list(length.out=n + 1L))
  56. ends <- lapply(starts0, function(x) floor(x)[-1])
  57. starts <- lapply(starts0, function(x) x[-length(x)])
  58.        
  59. v <- Views(gr.cov[[1]], start=unlist(starts), end=unlist(ends))
  60.        
  61. split(viewMeans(v), rep(seq_along(genes), each=n))
  62.        
  63. > (x <- Rle(c(rep(100, 10), rep(200, 10))))
  64. numeric-Rle of length 20 with 2 runs
  65.   Lengths:  10  10
  66.   Values : 100 200
  67. > runValue(x)
  68. [1] 100 200
  69. > runLength(x)
  70. [1] 10 10
clone this paste RAW Paste Data