Untitled

a guest Jun 25th, 2018 59 Never
require 'redis'

r = Redis.new

# Arithmetic mean of a list of numbers.
def average(vals)
  vals.sum / vals.size.to_f
end

# Population standard deviation, computed in a single pass
# (Welford's running mean / sum-of-squares update).
def std_dev(vals)
  mean = 0.0
  s = 0.0
  vals.each_with_index do |x, idx|
    delta = x - mean
    mean += delta / (idx + 1)
    s += delta * (x - mean)
  end
  variance = s / vals.size.to_f
  Math.sqrt(variance)
end

# on small sets
puts <<-TEXT
Smallish Batches
================
For each line below, SRANDMEMBER was run 1000 times against a set of 20 items, and
the items are listed from least to most often picked. Items 20 and 11 are consistently
picked less often than the rest, and there is a strong bias towards item 18, and
sometimes 16, being picked more often.
TEXT
1.upto(20) { |i| r.sadd "smallish", i }
allcounts = []
50.times do
  # Start every member at zero so items that are never picked still show up.
  counts = r.smembers('smallish').inject({}) { |m, i| m.update(i => 0) }
  1000.times { counts[r.srandmember("smallish")] += 1 }
  # Print the items ordered from least to most often picked.
  puts counts.sort_by { |_, n| n }.map(&:first).join(" ")
  allcounts += counts.values
end
puts "\nStatistics across all randoms: min: #{allcounts.min}, max: #{allcounts.max}, avg: #{average(allcounts)}, stddev: #{std_dev(allcounts)}"

# on larger sets
puts <<-TEXT
Largish Batches
===============
The same experiment, except with a set of 2000 items sampled 50,000 times per line.
Only the 7 least popular and the 7 most popular items are shown. Which items get
picked more or less often seems to vary more from run to run, but if you look closely
at the values below you will notice some trends.
TEXT

1.upto(2000) { |i| r.sadd "largish", i }
allcounts = []
30.times do
  counts = r.smembers('largish').inject({}) { |m, i| m.update(i => 0) }
  50000.times { counts[r.srandmember('largish')] += 1 }
  sorted = counts.sort_by { |_, n| n }.map(&:first)
  puts "#{sorted.take(7).join(' ')} ... #{sorted.last(7).join(' ')}"
  allcounts += counts.values
end
puts "\nStatistics across all randoms: min: #{allcounts.min}, max: #{allcounts.max}, avg: #{average(allcounts)}, stddev: #{std_dev(allcounts)}"
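
The script above judges bias by eye from the sorted item lists and from the pooled min/max/stddev. One way to put a number on it, sketched here as an assumption rather than anything the paste itself does, is a chi-square goodness-of-fit check on each run's counts, plus the standard deviation a single item's count would have under truly uniform sampling. The helper names chi_square_uniform and expected_std_dev and the constant are hypothetical, and the counts are simulated with Ruby's own PRNG so the sketch runs without a Redis server.

```ruby
# Pearson's chi-square statistic of observed counts against a uniform expectation.
# `counts` is a Hash of item => times picked, shaped like the one built per run above.
def chi_square_uniform(counts)
  total = counts.values.sum
  expected = total / counts.size.to_f
  counts.values.sum { |observed| (observed - expected)**2 / expected }
end

# Standard deviation of one item's count if every draw were uniform
# (a binomial count with success probability 1/items).
def expected_std_dev(draws, items)
  prob = 1.0 / items
  Math.sqrt(draws * prob * (1 - prob))
end

# Chi-square critical value for 19 degrees of freedom (20 items) at the 5% level.
CHI2_CRITICAL_DF19_P05 = 30.144

# Assumption: Ruby's PRNG stands in for SRANDMEMBER so no server is needed.
rng = Random.new(42)
counts = (1..20).to_h { |i| [i.to_s, 0] }
1000.times { counts[(rng.rand(20) + 1).to_s] += 1 }

stat = chi_square_uniform(counts)
verdict = stat > CHI2_CRITICAL_DF19_P05 ? "looks non-uniform" : "consistent with uniform"
puts "chi-square: #{stat.round(2)} (#{verdict} at the 5% level)"
puts "expected per-item stddev for 1000 draws over 20 items: #{expected_std_dev(1000, 20).round(2)}"
```

To apply this to the real data, pass each run's counts hash from the loops above to chi_square_uniform. With 50 runs, a few statistics above the 5% critical value are expected by chance alone, so a persistent bias should show up as most runs exceeding it, or as the same items sitting at the extremes run after run.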