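from pyspark.mllib.linalg import Vectors
from pyspark.mllib.stat import Statistics
df_new = df.applymap(lambda s: dic.get(s, s))  # df is a pandas dataframe; values missing from dic are kept as-is
sdf = sqlCtx.createDataFrame(df_new)  # pandas dataframe -> Spark DataFrame
rdd = sdf.rdd.map(lambda row: Vectors.dense(list(row)))  # colStats needs an RDD of Vectors; assumes every column is numeric after the mapping
summary = Statistics.colStats(rdd)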