Untitled

a guest
Oct 23rd, 2017
from pyspark.sql.functions import col, desc

# Average Zhvi per state. show() returns None, so keep the DataFrame in a
# variable and call show() on it separately.
stateByZhvi = home.select("State", "Zhvi").groupBy(col("State")).avg("Zhvi")
stateByZhvi.show()

# +-----+------------------+
# |State|         avg(Zhvi)|
# +-----+------------------+
# |   AZ|246687.01298701297|
# |   SC|143188.94736842104|
# |   LA|159991.74311926606|
# |   MN|236449.40239043825|
# |   NJ| 367156.5637065637|
# |   DC| 586109.5238095238|
# |   OR| 306646.3768115942|
# |   VA| 282764.4986449864|

# Same aggregation through Spark SQL, rounded and sorted from highest to lowest.
home.createOrReplaceTempView("home")

spark.sql("select State, round(avg(Zhvi)) as avg_Zhvi from home group by State order by 2 desc").show()

# input dataframe
# +-----+------------------+
# |State|               avg|
# +-----+------------------+
# |   AZ|246687.01298701297|
# |   SC|143188.94736842104|
# |   LA|159991.74311926606|
# +-----+------------------+

df.orderBy(desc("avg")).show()

# output, sorted by avg in descending order
# +-----+------------------+
# |State|               avg|
# +-----+------------------+
# |   AZ|246687.01298701297|
# |   LA|159991.74311926606|
# |   SC|143188.94736842104|
# +-----+------------------+
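
To sanity-check the group-by average and the descending sort without a Spark session, the same two steps can be sketched in plain Python. This is only an illustration, not part of the Spark code above: the helper names `average_by_key` and `sort_desc` and the `sample_rows` values are hypothetical, while `paste_averages` reuses the three averages listed in the input dataframe.

```python
from collections import defaultdict


def average_by_key(rows):
    """Return {key: mean of values} for an iterable of (key, value) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [running sum, count]
    for key, value in rows:
        totals[key][0] += value
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def sort_desc(averages):
    """Return (key, average) pairs ordered from highest to lowest average."""
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw (State, Zhvi) rows, made up for this sketch.
sample_rows = [
    ("AZ", 200000.0),
    ("AZ", 300000.0),
    ("SC", 140000.0),
    ("LA", 150000.0),
    ("LA", 170000.0),
]
sample_averages = average_by_key(sample_rows)

# The three per-state averages from the input dataframe in the paste.
paste_averages = {
    "AZ": 246687.01298701297,
    "SC": 143188.94736842104,
    "LA": 159991.74311926606,
}

for state, avg in sort_desc(paste_averages):
    print(state, avg)
```

The loop prints the states in the order AZ, LA, SC, which matches the row order of the sorted table produced by `orderBy(desc("avg"))`.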