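from pyspark import SparkContext

# Reuse the active SparkContext if one exists, otherwise create one.
sc = SparkContext.getOrCreate()

# Count rows per value of the 4th pipe-delimited field, skipping 'other',
# then sort by count in descending order.
result = sc.textFile("users.csv") \
    .map(lambda x: (x.split('|')[3], 1)) \
    .filter(lambda x: x[0] != 'other') \
    .reduceByKey(lambda x, y: x + y) \
    .sortBy(lambda x: -x[1]) \
    .collect()

for line in result:
    print(line)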