josephxsxn

Pig HBase Analysis Script

Jun 9th, 2017
--HBASE SEARCH

--Load every file under hbase/ as raw text, one log line per record.
raw = LOAD 'hbase/*' USING TextLoader();
--Split each line into timestamp, log level, and the rest of the message.
--Lines that do not match the pattern yield nulls and are dropped by the time filter below.
columns = FOREACH raw GENERATE FLATTEN(REGEX_EXTRACT_ALL($0, '(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}) (\\w{0,5}) (.*)'));

--Example timestamp: 2017-06-08 22:21:42,853
named = FOREACH columns GENERATE ToDate($0, 'yyyy-MM-dd HH:mm:ss,SSS') AS timstim, $1 AS logger, $2 AS payload;
--Keep only the window of interest. Log timestamps carry no zone, so they are read in Pig's default time zone.
fil1 = FILTER named BY timstim > ToDate('2017-06-08T16:05:00.000-04:00') AND timstim < ToDate('2017-06-08T17:45:00.000-04:00');


--GC INFO
gcstops = FILTER fil1 BY payload MATCHES '.*util.JvmPauseMonitor.*';


--SCANS
scans = FILTER fil1 BY payload MATCHES '.*scan.*';


--COMPACTIONS
compactions = FILTER fil1 BY payload MATCHES '.*compactions.*';


--OTHER

--Merge the three event streams and put them in time order.
unions = UNION gcstops, scans, compactions;
sorted = ORDER unions BY timstim;
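
--OUTPUT (a minimal sketch, not part of the original paste)
--Pig evaluates lazily, so nothing above runs until a STORE or DUMP is reached.
--Assumption: 'hbase_events' is a hypothetical output directory; pick any path that does not exist yet.
--Assumption: tab-separated text is an acceptable output format for the merged timeline.
STORE sorted INTO 'hbase_events' USING PigStorage('\t');
--For a quick look in the grunt shell, DUMP sorted; could be used in place of the STORE.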