Untitled

a guest
Feb 17th, 2020
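from pyspark import SparkContext, SparkConf

sparkConf = SparkConf().setAppName("CCA 175 Problem 84")
sc = SparkContext(conf=sparkConf)

# Read the input file, one RDD element per line
contentRDD = sc.textFile("Content.txt")
# Drop empty lines (filter is an RDD method, not a SparkContext method)
nonEmptyLines = contentRDD.filter(lambda line: len(line) > 0)
# Split each line into words on single spaces
words = nonEmptyLines.flatMap(lambda x: x.split(" "))
# Keep only words longer than two characters
finalRDD = words.filter(lambda x: len(x) > 2)
# An RDD is not directly iterable; collect() brings the words to the driver
for word in finalRDD.collect():
    print(word)
finalRDD.saveAsTextFile("problem84")

# Submit with:
# spark-submit --master yarn problem84.py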
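
The filtering logic of the script above can be checked locally without a cluster. The following is a minimal plain-Python sketch of the same three steps (drop empty lines, split on spaces, keep words longer than two characters). The function name `long_words` and the sample lines are hypothetical, chosen only for illustration; they are not part of the original paste.

```python
from itertools import chain


def long_words(lines):
    """Mirror the RDD pipeline: filter -> flatMap -> filter (hypothetical helper)."""
    # Same predicate as nonEmptyLines: drop empty lines
    non_empty = (line for line in lines if len(line) > 0)
    # Same as flatMap(lambda x: x.split(" ")): one flat stream of words
    words = chain.from_iterable(line.split(" ") for line in non_empty)
    # Same predicate as finalRDD: keep words longer than two characters
    return [w for w in words if len(w) > 2]


# Made-up sample input; the double space yields an empty string, which is filtered out
sample = ["Hello this is a test", "", "of Spark  on yarn"]
print(long_words(sample))
```

Because `split(" ")` splits on single spaces, consecutive spaces produce empty strings; the length filter removes them along with one- and two-character words.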