- user_1, {"question_id": "choice_id", ...}
- user_2, {"question_id": "choice_id", ...}
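- from pyspark import SparkContext
- from pyspark.mllib.linalg.distributed import RowMatrix
- sc = SparkContext.getOrCreate()  # reuse the active context, or start one
- rows = sc.parallelize([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
- mat = RowMatrix(rows)
- threshold = 0.1  # assumed value; 0.0 computes exact similarities, higher values sample for speed
- sims = mat.columnSimilarities(threshold)  # CoordinateMatrix of column cosine similarities (upper triangular)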
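
For reference, here is a minimal plain-Python sketch of what columnSimilarities computes on the small matrix above when no sampling is applied: the cosine similarity of every pair of columns, keeping only the pairs (i, j) with i < j. The helper name column_cosine_similarities is made up for this illustration and is not part of Spark; it is only meant for checking results on tiny inputs, not as a replacement for the distributed call.

```python
import math

def column_cosine_similarities(rows):
    """Return {(i, j): cosine similarity} for every column pair with i < j."""
    n_cols = len(rows[0])
    # Transpose the row-oriented data into a list of columns.
    cols = [[row[j] for row in rows] for j in range(n_cols)]
    # Euclidean norm of each column.
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    sims = {}
    for i in range(n_cols):
        for j in range(i + 1, n_cols):
            if norms[i] == 0 or norms[j] == 0:
                continue  # cosine similarity is undefined for an all-zero column
            dot = sum(a * b for a, b in zip(cols[i], cols[j]))
            sims[(i, j)] = dot / (norms[i] * norms[j])
    return sims

rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
sims = column_cosine_similarities(rows)
for (i, j), value in sorted(sims.items()):
    print(i, j, round(value, 4))
```

All three column pairs come out close to 1 here because the columns of this toy matrix point in nearly the same direction. With a nonzero threshold, Spark's result is an estimate, so small differences from this exact computation are to be expected.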