Javi

Spark: RDD vs Dataframe

Jul 17th, 2017
RDD example (Java)
==================================
http://www.agildata.com/apache-spark-rdd-vs-dataframe-vs-dataset/

// Keep people under 21, project their last names, and save them as a serialized object file.
rdd.filter(p -> p.getAge() < 21)
   .map(p -> p.getLast())
   .saveAsObjectFile("under21.bin");

DataFrame example (Scala)
==================================
https://databricks.com/blog/2016/07/14/a-tale-of-three-apache-spark-apis-rdds-dataframes-and-datasets.html

Computes the average temperature and humidity for each country, using only the readings whose temperature is above 25 degrees.

// Assumes import spark.implicits._ is in scope (needed for the $"..." syntax and the tuple encoder).
val dsAvgTmp = ds.filter(d => d.temp > 25)
  .map(d => (d.temp, d.humidity, d.cca3))
  .groupBy($"_3")   // _3 is the country code (cca3)
  .avg()            // averages the numeric columns _1 (temp) and _2 (humidity)
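
To make the expected result of that aggregation concrete without a cluster, here is a minimal sketch of the same filter / group / average logic in plain Java using java.util.stream. It is not Spark code and not taken from either linked article; the Reading class, the AvgByCountry class, and the sample values are hypothetical and only mirror the field names (temp, humidity, cca3) used in the Scala snippet.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AvgByCountry {

    // Hypothetical record type; fields mirror d.temp, d.humidity, d.cca3 from the Scala example.
    static final class Reading {
        final double temp;
        final double humidity;
        final String cca3;

        Reading(double temp, double humidity, String cca3) {
            this.temp = temp;
            this.humidity = humidity;
            this.cca3 = cca3;
        }
    }

    // Returns country code -> {average temp, average humidity},
    // using only the readings whose temperature is above 25.
    static Map<String, double[]> averageByCountry(List<Reading> readings) {
        return readings.stream()
            .filter(r -> r.temp > 25)                       // same predicate as ds.filter
            .collect(Collectors.groupingBy(
                r -> r.cca3,                                // same key as groupBy($"_3")
                TreeMap::new,                               // sorted keys, for stable output
                Collectors.collectingAndThen(
                    Collectors.toList(),
                    (List<Reading> group) -> new double[] {
                        group.stream().mapToDouble(r -> r.temp).average().orElse(Double.NaN),
                        group.stream().mapToDouble(r -> r.humidity).average().orElse(Double.NaN)
                    })));
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<Reading> readings = Arrays.asList(
            new Reading(30.0, 60.0, "ESP"),
            new Reading(28.0, 40.0, "ESP"),
            new Reading(20.0, 90.0, "ESP"),   // dropped by the filter (temp <= 25)
            new Reading(26.0, 70.0, "USA"));

        averageByCountry(readings).forEach((country, avg) ->
            System.out.println(country + " " + avg[0] + " " + avg[1]));
    }
}
```

With the sample data above, the third ESP reading is filtered out, so ESP averages to 29.0 and 50.0 and USA to 26.0 and 70.0. The Spark version produces the same per-country averages, but as DataFrame columns computed across the cluster rather than as an in-memory map.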