Untitled

a guest
Dec 15th, 2018
from pyspark.ml.feature import StringIndexer

# Map each distinct Country string to a numeric index stored in Country_ID
indexer = StringIndexer(inputCol='Country', outputCol='Country_ID')
modified_df = indexer.fit(df).transform(df)

# Filter on modified_df (the original df has no Country_ID column), then select UserId
modified_df.filter(modified_df['Country_ID'] == 2).select('UserId').show()

modified_df.columns
# ['UserId', 'Country', 'Country_ID']
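
To make clearer which rows a Country_ID of 2 refers to, here is a minimal pure-Python sketch of how StringIndexer assigns indices under its default ordering (most frequent label gets 0.0). The function name string_index and the sample data are hypothetical, and the alphabetical tie-breaking is an assumption of this sketch rather than something taken from the paste.

```python
from collections import Counter

def string_index(values):
    """Map each distinct string to a float index, most frequent first."""
    counts = Counter(values)
    # Sort by descending count; ties are broken alphabetically
    # (an assumption of this sketch).
    ordered = sorted(counts, key=lambda label: (-counts[label], label))
    return {label: float(i) for i, label in enumerate(ordered)}

countries = ["US", "DE", "US", "FR", "DE", "US"]  # hypothetical sample data
mapping = string_index(countries)
print(mapping)
```

Under this ordering, filtering on index 2 would select rows for the third most frequent country in the data, not any fixed country code.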