Untitled

a guest
Aug 22nd, 2017
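user_1, {"question_id": "choice_id", ...}
user_2, {"question_id": "choice_id", ...}

from pyspark import SparkContext
from pyspark.mllib.linalg.distributed import RowMatrix

# Reuse the running SparkContext, or create one if none exists.
sc = SparkContext.getOrCreate()

# Each inner list is one row of the distributed matrix (4 rows x 3 columns).
rows = sc.parallelize([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
mat = RowMatrix(rows)

# columnSimilarities returns a CoordinateMatrix holding the cosine similarity
# of each column pair (upper triangular). A threshold of 0.0 gives exact
# results; larger values enable DIMSUM sampling, trading accuracy for speed.
threshold = 0.0
sims = mat.columnSimilarities(threshold)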
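
For reference, the quantity that mat.columnSimilarities computes is the cosine similarity between each pair of columns. The sketch below reproduces that calculation in plain Python on the same 4 x 3 example, without Spark. The function name column_cosine_similarities is hypothetical and is not part of PySpark; it assumes a small, dense, in-memory list of rows and ignores the sampling that a non-zero threshold enables.

```python
import math

def column_cosine_similarities(rows):
    """Return {(i, j): cosine similarity} for every column pair with i < j."""
    n_cols = len(rows[0])
    # Transpose the row-major data into a list of columns.
    cols = [[row[j] for row in rows] for j in range(n_cols)]
    norms = [math.sqrt(sum(x * x for x in col)) for col in cols]
    sims = {}
    for i in range(n_cols):
        for j in range(i + 1, n_cols):
            if norms[i] == 0 or norms[j] == 0:
                continue  # similarity is undefined for an all-zero column
            dot = sum(a * b for a, b in zip(cols[i], cols[j]))
            sims[(i, j)] = dot / (norms[i] * norms[j])
    return sims

rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
exact = column_cosine_similarities(rows)
for (i, j), value in sorted(exact.items()):
    print(i, j, value)
```

Only pairs with i < j are stored, which mirrors the upper-triangular layout of the Spark result.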