Untitled

a guest
Mar 22nd, 2018
import sys
sys.path.insert(0, '.')  # make the local Utils module importable

from pyspark import SparkContext, SparkConf
from Utils import Utils

SparkContext.setSystemProperty('spark.executor.memory', '100g')


def splitComma(line):
    # Return the airport name (column 1) and latitude (column 6) as one output line.
    splits = Utils.COMMA_DELIMITER.split(line)
    return "{}, {}".format(splits[1], splits[6])


'''
Create a Spark program to read the airport data from in/airports.text and find all the airports whose latitude is greater than 40.
Then output the airport's name and the airport's latitude to out/airports_by_latitude.text.

Each row of the input file contains the following columns:
Airport ID, Name of airport, Main city served by airport, Country where airport is located, IATA/FAA code,
ICAO Code, Latitude, Longitude, Altitude, Timezone, DST, Timezone in Olson format

Sample output:
"St Anthony", 51.391944
"Tofino", 49.082222
...
'''


if __name__ == "__main__":
    conf = SparkConf().setAppName("airports").setMaster("local[4]")
    sc = SparkContext(conf=conf)

    # Read the raw airport rows, one line per airport.
    airports = sc.textFile("in/airports.text")
    # Keep only the rows whose latitude (column 6) is greater than 40.
    latitudeOver40 = airports.filter(lambda line: float(Utils.COMMA_DELIMITER.split(line)[6]) > 40)

    # Format each remaining row as "name, latitude" and write it to the path named in the task.
    airportsNameAndLatitude = latitudeOver40.map(splitComma)
    airportsNameAndLatitude.saveAsTextFile("out/airports_by_latitude.text")
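
The script depends on Utils.COMMA_DELIMITER, which is not included in the paste. The block below is a minimal sketch of what it might look like, assuming it is a compiled regular expression that splits on commas outside double quotes. It applies the same filter and formatting steps to a few in-memory rows, so the logic can be checked without a Spark cluster. The names latitude_above, name_and_latitude and sample_lines are hypothetical, and every field value other than the two names and latitudes from the sample output is made up for illustration.

```python
import re

# Assumed stand-in for Utils.COMMA_DELIMITER: match a comma only when it is
# followed by an even number of double quotes, i.e. a comma outside quotes.
COMMA_DELIMITER = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')


def latitude_above(line, threshold=40.0):
    """Return True when column 6 (latitude) exceeds the threshold."""
    return float(COMMA_DELIMITER.split(line)[6]) > threshold


def name_and_latitude(line):
    """Format column 1 (name) and column 6 (latitude) like the sample output."""
    splits = COMMA_DELIMITER.split(line)
    return "{}, {}".format(splits[1], splits[6])


# Illustrative rows only: apart from the two names and latitudes quoted in the
# task's sample output, the field values are invented for this sketch.
sample_lines = [
    '1,"St Anthony","City A","Country A","AAA","AAAA",51.391944,-56.0,100,-4,"A","Zone/A"',
    '2,"Tofino","City B","Country B","BBB","BBBB",49.082222,-125.0,80,-8,"A","Zone/B"',
    '3,"Low, Latitude Field","City C","Country C","CCC","CCCC",-6.5,145.0,5000,10,"U","Zone/C"',
]

# Same two steps as the Spark job: filter on latitude, then format.
results = [name_and_latitude(line) for line in sample_lines if latitude_above(line)]
for row in results:
    print(row)
```

The third row has a comma inside a quoted name, which is why a plain line.split(",") would shift the column indexes and a quote-aware delimiter is the safer choice here.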