- wordsList = ['cat', 'elephant', 'rat', 'rat', 'cat']
- wordsRDD = sc.parallelize(wordsList, 4)
- wordPairs = wordsRDD.map(lambda word: (word, 1))
- wordCounts = wordPairs.reduceByKey(lambda x,y:x+y)
- print(wordCounts.collect())
- #PRINTS--> [('rat', 2), ('elephant', 1), ('cat', 2)]
- from operator import add
- totalCount = (wordCounts
- .map(<< FILL IN >>)
- .reduce(<< FILL IN >>))
- #SHOULD PRINT 5
- #(wordCounts.values().sum()) does the trick, but I want to do this with map() and reduce()
- I need to use a reduce() action to sum the counts in wordCounts and then divide by the number of unique words.
- .map(lambda x:x.values())
- .reduce(lambda x:sum(x)))
- and also:
- .map(lambda d:d[k] for k in d)
- .reduce(lambda x:sum(x)))
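Both attempts above fail for the same two reasons: each element of wordCounts is a (word, count) tuple, not a dict, so it has no values() method and cannot be indexed by key; and reduce() expects a function of two arguments that combines two values, not a one-argument function that calls sum(). A minimal sketch of the intended map/reduce follows. It uses a plain Python list as a stand-in for wordCounts.collect() so it runs without a Spark context; the names word_counts, total_count and average are illustrative, not from the original paste. On the RDD itself the same two functions should carry over as wordCounts.map(lambda pair: pair[1]).reduce(add).

```python
from functools import reduce
from operator import add

# Stand-in for wordCounts.collect(): a list of (word, count) pairs.
word_counts = [('rat', 2), ('elephant', 1), ('cat', 2)]

# map step: keep only the count from each (word, count) pair.
counts = map(lambda pair: pair[1], word_counts)

# reduce step: fold the counts together with a two-argument function.
total_count = reduce(add, counts)

# Average = total occurrences divided by the number of unique words.
average = total_count / float(len(word_counts))

print(total_count)  # 5
print(average)
```

With the RDD, the number of unique words would come from wordCounts.count() rather than len(), since reduceByKey leaves one pair per distinct word.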