Untitled

a guest, Jan 23rd, 2020
// Task 3
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Load both CSVs (first row as header), assuming an existing SparkSession `spark`.
Dataset<Row> department = spark.read().option("header", "true").csv("departments.csv");
Dataset<Row> products = spark.read().option("header", "true").csv("products.csv");

// Join on the shared department_id key, then count products per department.
Dataset<Row> products_department = products.join(department, "department_id");
Dataset<Row> pdCtr = products_department.groupBy("department").count();

// Each department's count as a percentage of all products.
Dataset<Row> all = pdCtr.select(pdCtr.col("department"), pdCtr.col("count"),
        pdCtr.col("count").divide(products.count()).multiply(100).as("percent"));
all.show();