Mar 22nd, 2018
+-------+--------------------+----+
|User-ID|            Location| Age|
+-------+--------------------+----+
|      1|  nyc, new york, usa|NULL|
|      2|stockton, califor...|  18|
|      3|moscow, yukon ter...|NULL|
|      4|porto, v.n.gaia, ...|  17|
|      5|farnborough, hant...|NULL|
|      6|santa monica, cal...|  61|
|      7| washington, dc, usa|NULL|
|      8|timmins, ontario,...|NULL|
|      9|germantown, tenne...|NULL|
|     10|albacete, wiscons...|  26|
|     11|melbourne, victor...|  14|
|     12|fort bragg, calif...|NULL|
|     13|barcelona, barcel...|  26|
|     14|mediapolis, iowa,...|NULL|
|     15|calgary, alberta,...|NULL|
|     16|albuquerque, new ...|NULL|
|     17|chesapeake, virgi...|NULL|
|     18|rio de janeiro, r...|  25|
|     19|           weston, ,|  14|
|     20|langhorne, pennsy...|  19|
+-------+--------------------+----+

+-------+----------+-----------+
|User-ID|      ISBN|Book-Rating|
+-------+----------+-----------+
| 276725|034545104X|          0|
| 276726|0155061224|          5|
| 276727|0446520802|          0|
| 276729|052165615X|          3|
| 276729|0521795028|          6|
| 276733|2080674722|          0|
| 276736|3257224281|          8|
| 276737|0600570967|          6|
| 276744|038550120X|          7|
| 276745| 342310538|         10|
| 276746|0425115801|          0|
| 276746|0449006522|          0|
| 276746|0553561618|          0|
| 276746|055356451X|          0|
| 276746|0786013990|          0|
| 276746|0786014512|          0|
| 276747|0060517794|          9|
| 276747|0451192001|          0|
| 276747|0609801279|          0|
| 276747|0671537458|          9|
+-------+----------+-----------+
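
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # reuses the active session if one exists

def crearDataFrame(nombre_fichero):
    # Read a semicolon-delimited CSV file with a header row into a DataFrame.
    df = spark.read.format("csv").option("header", "true").option("delimiter", ";").load(nombre_fichero)
    return df

dfUser = crearDataFrame("BX-Users.csv")
dfBooks_Rating = crearDataFrame("BX-Book-Ratings.csv")

# "User-ID" contains a hyphen, so dfUser.User-ID parses as (dfUser.User - ID); index by name instead.
# show() returns None, so keep the joined DataFrame in df_ and display it separately.
df_ = dfUser.join(dfBooks_Rating, dfUser["User-ID"] == dfBooks_Rating["User-ID"], "inner")
df_.show()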