- register 'hdfs:///user/jniemie8/pigpythonudf.py' using org.apache.pig.scripting.streaming.python.PythonScriptEngine as pythonudfs;
- rawdata = LOAD 'pigpythonudfexample/*' using TextLoader();
- sortedrecords = FOREACH rawdata GENERATE FLATTEN(pythonudfs.reorderColumns($0));
- store sortedrecords into 'pigpythonudfexample_sorted/'; -- output directory must not already exist, so it cannot be the input directory
- /*##datafile## <- remove this line
- 1, 7, 5, 4, 3
- 10, 2, 5, 4, 7
- 1, 9, 4, 4, 3
- */
- # pigpythonudf.py
- from pig_util import outputSchema
- @outputSchema("t1:tuple(col0:int, col1:int, col2:int, col3:int, col4:int)")
- def reorderColumns(line):
-     # Parse the comma-separated line into ints and sort them ascending.
-     columns = sorted([int(x.strip()) for x in line.split(',')])
-     return tuple(columns)
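
A quick way to sanity-check the UDF before registering it is to run it outside Pig on the sample rows from the data file. The sketch below is only an illustration: `pig_util` is normally available only when Pig runs the script, so the fallback `outputSchema` decorator here is a hypothetical stand-in (not part of Pig) that stores the schema string and returns the function unchanged.

```python
# Local check of reorderColumns without a Pig runtime.
try:
    from pig_util import outputSchema  # supplied by Pig when the UDF is streamed
except ImportError:
    # Hypothetical stand-in (assumption): keeps the schema string on the
    # function and otherwise leaves it untouched.
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("t1:tuple(col0:int, col1:int, col2:int, col3:int, col4:int)")
def reorderColumns(line):
    # Parse the comma-separated line into ints and sort them ascending.
    columns = sorted(int(x.strip()) for x in line.split(','))
    return tuple(columns)


if __name__ == "__main__":
    # Sample rows copied from the data file in the paste.
    for row in ["1, 7, 5, 4, 3", "10, 2, 5, 4, 7", "1, 9, 4, 4, 3"]:
        print(reorderColumns(row))
```

Run directly with a Python interpreter, this prints one sorted tuple per sample row. It exercises only the parsing and sorting logic, not Pig's schema handling or FLATTEN.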