Untitled

a guest
Jul 7th, 2017
```
id	first_name	last_name	email	department	ip_salary
1	Randy	Griffin	rgriffin0@discuz.net	Product Management	€3244,84
2	Teresa	Perry	tperry1@microsoft.com	Services	€2138,48
3	Matthew	Ellis	mellis2@livejournal.com	Human Resources	€2431,08
4	Joyce	Rogers	jrogers3@miibeian.gov.cn	Marketing	€3100,05
5	Joyce	Parker	jparker4@delicious.com	Engineering	€1718,04
6	Kenneth	Willis	kwillis5@canalblog.com	Support	€2579,97
7	Nicole	Armstrong	narmstrong6@nbcnews.com	Support	€3679,80
8	Harry	Chavez	hchavez7@symantec.com	Product Management	€4101,69
9	Frances	Wright	fwright8@bigcartel.com	Training	€2055,84
10	James	Freeman	jfreeman9@prlog.org	Research and Development	€4039,92
11	Emily	Mason	emasona@cnn.com	Engineering	€1309,85
12	Willie	Alexander	walexanderb@state.gov	Legal	€1800,52
13	Gerald	Weaver	gweaverc@imageshack.us	Accounting	€2776,68
14	Teresa	Burns	tburnsd@fastcompany.com	Research and Development	€2390,65
15	Mary	Lee	mleee@tumblr.com	Product Management	€2313,01
16	Carolyn	Cooper	ccooperf@addtoany.com	Business Development	€2521,03
17	Cheryl	Fox	cfoxg@cargocollective.com	Product Management	€1420,52
18	Diane	Lawrence	dlawrenceh@diigo.com	Sales	€1883,02
19	Susan	Porter	sporteri@shareasale.com	Research and Development	€3540,25
20	Irene	Turner	iturnerj@sbwire.com	Product Management	€3162,10
```
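import org.apache.spark.sql.functions.{col, translate}
import org.apache.spark.sql.types.DoubleType
// Read the tab-delimited file, drop the leading "€", swap the decimal comma for a dot, then average salary per department.
val df = spark.read.option("header", "true").option("delimiter", "\t").csv("file:///Users/mprescha/salaries.txt")
val df2 = df.withColumn("salary", translate(col("ip_salary").substr(2, 10), ",", ".").cast(DoubleType))
df2.groupBy("department").avg("salary").show()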