+-------+--------------------+----+
|User-ID|            Location| Age|
+-------+--------------------+----+
|      1|  nyc, new york, usa|NULL|
|      2|stockton, califor...|  18|
|      3|moscow, yukon ter...|NULL|
|      4|porto, v.n.gaia, ...|  17|
|      5|farnborough, hant...|NULL|
|      6|santa monica, cal...|  61|
|      7| washington, dc, usa|NULL|
|      8|timmins, ontario,...|NULL|
|      9|germantown, tenne...|NULL|
|     10|albacete, wiscons...|  26|
|     11|melbourne, victor...|  14|
|     12|fort bragg, calif...|NULL|
|     13|barcelona, barcel...|  26|
|     14|mediapolis, iowa,...|NULL|
|     15|calgary, alberta,...|NULL|
|     16|albuquerque, new ...|NULL|
|     17|chesapeake, virgi...|NULL|
|     18|rio de janeiro, r...|  25|
|     19|           weston, ,|  14|
|     20|langhorne, pennsy...|  19|
+-------+--------------------+----+
+-------+----------+-----------+
|User-ID|      ISBN|Book-Rating|
+-------+----------+-----------+
| 276725|034545104X|          0|
| 276726|0155061224|          5|
| 276727|0446520802|          0|
| 276729|052165615X|          3|
| 276729|0521795028|          6|
| 276733|2080674722|          0|
| 276736|3257224281|          8|
| 276737|0600570967|          6|
| 276744|038550120X|          7|
| 276745| 342310538|         10|
| 276746|0425115801|          0|
| 276746|0449006522|          0|
| 276746|0553561618|          0|
| 276746|055356451X|          0|
| 276746|0786013990|          0|
| 276746|0786014512|          0|
| 276747|0060517794|          9|
| 276747|0451192001|          0|
| 276747|0609801279|          0|
| 276747|0671537458|          9|
+-------+----------+-----------+
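from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()  # reuses the active session if one exists

def crearDataFrame(nombre_fichero):
    df = spark.read.format("csv").option("header", "true").option("delimiter", ";").load(nombre_fichero)
    return df

dfUser = crearDataFrame("BX-Users.csv")
dfBooks_Rating = crearDataFrame("BX-Book-Ratings.csv")
# "User-ID" contains a hyphen, so use bracket access instead of attribute access.
df_ = dfUser.join(dfBooks_Rating, dfUser["User-ID"] == dfBooks_Rating["User-ID"], "inner")
df_.show()  # show() returns None, so call it separately to keep the joined DataFrame in df_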
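
An attribute-style reference such as `dfUser.User-ID` does not work for a column whose name contains a hyphen, because Python reads the hyphen as subtraction. The sketch below is a minimal way to confirm this using only the standard library (no Spark session needed). It only parses the expression and never evaluates it; the variable names `is_subtraction`, `left` and `right` are illustrative, and `ast.unparse` assumes Python 3.9 or later.

```python
import ast

# Parse the expression without evaluating it, so no DataFrame is required.
node = ast.parse("dfUser.User-ID", mode="eval").body

# The hyphen is the subtraction operator, so the top-level node is a BinOp:
# the left operand is the attribute access, the right operand is a bare name.
is_subtraction = isinstance(node, ast.BinOp) and isinstance(node.op, ast.Sub)
left = ast.unparse(node.left)
right = ast.unparse(node.right)

print(is_subtraction, left, right)  # True dfUser.User ID
```

Python therefore looks for an attribute `User` and a separate variable `ID`. Bracket access, `dfUser["User-ID"]`, treats the whole string as the column name and avoids the problem.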