ArcheontPB

Feature engineering with PySpark

Mar 15th, 2020
# Name and value of the column with the maximum correlation.
# Starting from 0 means only positive correlations are considered.
corr_max = 0
corr_max_col = columns[0]

# Loop over every column contained in the list
for col in columns:
    # Pearson correlation between the target and the current column
    corr_val = df.corr('SALESCLOSEPRICE', col)
    # Compare corr_max with the current corr_val
    if corr_val > corr_max:
        # Update the column name and correlation value
        corr_max = corr_val
        corr_max_col = col

print(corr_max_col)