creativity_404

RDataMarket Instructions

Mar 15th, 2014
The rdatamarket package works with DataMarket.com and is useful for pulling data sets from the site directly into R.

For example, suppose we want the data from BBC News Visitor Traffic: http://datamarket.com/data/set/4e87/bbc-news-visitor-traffic#!ds=4e87!7ja7&display=line

One way of obtaining this information is as follows.

library(rdatamarket)
bbcinfo <- dminfo('http://data.is/1gmWeFg')  # use the short URL obtained from the website

Now we want to extract some of the data. Here are two ways of going about it.

bbclist <- dmlist(bbcinfo, Region=c('African','Asian'))
bbclist2 <- dmlist('http://data.is/1gmWeFg', Region=c('African','Asian'))

Note that the Region parameter will not do anything in the second case, so `bbclist2` comes back unfiltered. `bbclist` should be a 7132 by 3 data frame, and `bbclist2` should be 19137 by 3. The first column is the region, the second column is the timestamp, and the third column is the value (how far viewership is above or below normal levels; I'm not sure how they get values of less than -100%, though).
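
If the filter is ignored, as in the second call, one possible workaround is to subset the data frame after downloading it. This is only a sketch on a made-up stand-in for `bbclist2`: the column names `Region`, `DateTime` and `Value` and every number in it are assumptions for illustration, not real output from DataMarket.

```r
# Toy stand-in for bbclist2: same three-column layout, invented values.
toy <- data.frame(
  Region   = rep(c("African", "Asian", "European"), times = 2),
  DateTime = as.POSIXct(rep(c("2014-03-01 00:00:00", "2014-03-01 01:00:00"), each = 3), tz = "UTC"),
  Value    = c(-12.5, 3.0, 8.1, -20.0, 5.5, 1.2),
  stringsAsFactors = FALSE
)

# Keep only the regions of interest, which is what the Region argument was meant to do.
toy_subset <- toy[toy$Region %in% c("African", "Asian"), ]
nrow(toy_subset)
```

The same `%in%` test should work on the real data frame, provided its region column really is called `Region`.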

If you have multiple regions, you can use the `reshape2` package to make the data easier to work with.

library(reshape2)
bbccast <- dcast(bbclist, 'DateTime~Region')

This gives one row per timestamp, with each region in its own column. For a data frame with 3 variables, `dcast` takes the one not mentioned in the formula as the value variable (otherwise it has to be given explicitly through the `value.var` parameter).
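
To show the explicit form, here is a small sketch that names the value column instead of letting `dcast` pick it. The data frame is invented, and the column name `Value` is an assumption; check `names(bbclist)` for the real one.

```r
library(reshape2)

# Toy long-format data with the same layout as bbclist (invented values).
toy <- data.frame(
  Region   = rep(c("African", "Asian"), times = 2),
  DateTime = rep(c("2014-03-01 00:00", "2014-03-01 01:00"), each = 2),
  Value    = c(-12.5, 3.0, -20.0, 5.5)
)

# One row per DateTime, one column per Region, cells filled from Value.
toy_wide <- dcast(toy, DateTime ~ Region, value.var = "Value")
toy_wide
```

Naming `value.var` also avoids surprises when the data frame has more than three columns.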

There is also the `dmseries` function, which seems to work well for univariate time series data, but I've had trouble getting it to work with more complicated data.