a guest
Apr 25th, 2015
#!/usr/bin/env ruby

# This is a game that helps people understand genome assembly. Given a string,
# it generates sequence reads giving perfect coverage of the string and with a
# fixed overlap. The idea is to print the generated reads, cut them out, and
# have learners assemble them by hand. Different difficulties can be
# demonstrated by using a string with repeats, low complexity, etc., to mimic
# real assembly problems, or by adjusting the parameters (overlap and number
# of fragments).

# Generate n_fragments approximately equally sized fragments, with each
# consecutive pair of fragments overlapping by k characters, such that the
# entire quote is covered by the resulting fragments.
def sequence_quote_with_overlap(quote, n_fragments, k)
  quote = quote.downcase

  # The quote length may not be divisible by the number of fragments,
  # so we distribute the remainder across the fragments randomly.
  remainder = quote.length % n_fragments
  bump = ([1] * remainder + [0] * (n_fragments - remainder)).shuffle
  fraglen = quote.length / n_fragments + k

  # n fragments share only n - 1 overlaps, so at this length the final
  # fragment would run k characters past the end of the quote and come out
  # too short. We recover those k characters from randomly chosen fragments.
  # Note: sample(k) returns at most n_fragments indices, so if
  # k > n_fragments the final fragment still ends up shorter than the rest.
  (0...bump.length).to_a.sample(k).each { |i| bump[i] -= 1 }

  # For each fragment, adjust the fragment length by its bump and
  # sequence the fragment from the quote.
  fragments = []
  firstchar = 0
  n_fragments.times do |i|
    lastchar = firstchar + fraglen + bump[i] - 1
    fragments << quote[firstchar..lastchar]
    # The next fragment starts k characters before this one ends.
    firstchar = lastchar - k + 1
  end

  fragments
end


# demos

# simple assembly

quote = "Try a thing you haven’t done three times. Once, to get over the fear of doing it. Twice, to learn how to do it. And a third time, to figure out whether you like it or not."

n_fragments = 18
k = 6

frags = sequence_quote_with_overlap(quote, n_fragments, k)
frags.shuffle.each { |f| puts "\"#{f}\"" }

# with repeats ("happiness" and "not in" each occur twice once the quote
# is downcased), reusing n_fragments from above

quote = "Happiness resides not in possessions, and not in gold, happiness dwells in the soul."
k = 3
frags = sequence_quote_with_overlap(quote, n_fragments, k)
frags.shuffle.each { |f| puts "\"#{f}\"" }
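
The script above only produces the shuffled reads; putting them back together is left to the learners. As a rough illustration of that hand-assembly step, here is a minimal sketch of a greedy assembler. The name `assemble_reads` is hypothetical and not part of the original script. It assumes k >= 1, that every read is at least k characters long, and that each k-character overlap is unique among the reads, so it deliberately gives up on input with repeats, which is the case the second demo is meant to make hard.

```ruby
# Hypothetical helper, not part of the original script.
# Rebuilds a string from shuffled reads that overlap by exactly k characters.
# Assumes k >= 1, reads of at least k characters, and unique overlaps.
def assemble_reads(reads, k)
  # The starting read is one whose first k characters are not the
  # last k characters of any read.
  suffixes = reads.map { |r| r[-k..-1] }
  start = reads.find { |r| !suffixes.include?(r[0, k]) }
  raise ArgumentError, "no unambiguous starting read" if start.nil?

  remaining = reads.dup
  remaining.delete_at(remaining.index(start))
  assembled = start.dup

  until remaining.empty?
    # Look for the single read that begins with the current k-character tail.
    tail = assembled[-k..-1]
    candidates = remaining.select { |r| r.start_with?(tail) }
    unless candidates.length == 1
      raise ArgumentError, "ambiguous or missing overlap for #{tail.inspect}"
    end

    nxt = candidates.first
    # Append only the part of the read that lies beyond the overlap.
    assembled << nxt[k..-1]
    remaining.delete_at(remaining.index(nxt))
  end

  assembled
end

# Hand-made reads of "hello world" with an overlap of 2, in shuffled order.
puts assemble_reads(["lo wo", "world", "hello"], 2)
# prints "hello world"
```

When two reads share the same k-character prefix the sketch raises an `ArgumentError` instead of guessing, which mirrors the ambiguity learners run into with the repeated phrases in the second demo.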