josephxsxn

Pig Python Stream Example

Aug 16th, 2017
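-- Register the Python streaming UDF file under the namespace "pythonudfs"
register 'hdfs:///user/jniemie8/pigpythonudf.py' using org.apache.pig.scripting.streaming.python.PythonScriptEngine as pythonudfs;
-- Load each input line as a single chararray field
rawdata = LOAD 'pigpythonudfexample/*' using TextLoader();
-- Sort the values in each line and flatten the returned tuple into columns
sortedrecords = FOREACH rawdata GENERATE FLATTEN(pythonudfs.reorderColumns($0));
-- Note: Pig will not store into a directory that already exists, so this path may need to point somewhere new
store sortedrecords into 'pigpythonudfexample/';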

/*##datafile## <- remove this line
1, 7, 5, 4, 3
10, 2, 5, 4, 7
1, 9, 4, 4, 3
*/

# pigpythonudf.py
from pig_util import outputSchema

@outputSchema("t1:tuple(col0:int, col1:int, col2:int, col3:int, col4:int)")
def reorderColumns(line):
    # Parse the comma-separated line into ints and sort them ascending
    columns = sorted([int(x.strip()) for x in line.split(',')])
    return tuple(columns)
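
To check the UDF logic without a Pig install, the function can be exercised in plain Python. This is only a rough sketch: the `outputSchema` defined here is a hypothetical stand-in for the decorator that `pig_util` provides at runtime, and the rows are the three lines from the sample data file above.

```python
# Hypothetical stand-in for pig_util.outputSchema (an assumption for local
# testing only); it just attaches the schema string to the function.
def outputSchema(schema):
    def wrap(func):
        func.outputSchema = schema
        return func
    return wrap


@outputSchema("t1:tuple(col0:int, col1:int, col2:int, col3:int, col4:int)")
def reorderColumns(line):
    # Parse the comma-separated line into ints and sort them ascending
    columns = sorted(int(x.strip()) for x in line.split(','))
    return tuple(columns)


# The three rows from the sample data file
for row in ["1, 7, 5, 4, 3", "10, 2, 5, 4, 7", "1, 9, 4, 4, 3"]:
    print(reorderColumns(row))
# prints:
# (1, 3, 4, 5, 7)
# (2, 4, 5, 7, 10)
# (1, 3, 4, 4, 9)
```

Each input line comes back as a five-element tuple, which is what FLATTEN expands into five int columns in the Pig script.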