Jun 16th, 2019
import numpy as np
import pandas as pd


def apply_transformation(dataframe, train_target):
    """
    Apply transformations to the features and calculate each one's correlation with the target.

    :param dataframe: pandas DataFrame of numeric features
    :param train_target: pandas Series aligned row-for-row with dataframe
    :return: pandas DataFrame holding the correlation between each feature (rows) and the
        target for each applied transformation (columns)
    """
    # Work on a float copy so the caller's dataframe is not modified in place
    dataframe = dataframe.astype(float)

    # Shift every column so its minimum is at least 1; log and fractional powers need positive values
    for col_i in dataframe.columns:
        dataframe[col_i] += abs(dataframe[col_i].min()) + 1

    # 1 means the original (shifted) values; any other number n means x ** n
    transformation_type = [1, "log", 0.25, 0.5, 0.75, 2, 3, 4]
    correlation_dataframe = pd.DataFrame(
        index=dataframe.columns, columns=[str(x) for x in transformation_type], dtype=float
    )

    for trans_i in transformation_type:
        if trans_i == "log":
            dataframe_trans = np.log(dataframe)
        else:
            dataframe_trans = dataframe ** trans_i

        # Pearson correlation of each transformed feature with the target, rounded to 2 decimals
        correlation_dataframe[str(trans_i)] = [
            round(np.corrcoef(dataframe_trans[x], train_target)[0, 1], 2)
            for x in dataframe.columns
        ]
    return correlation_dataframe
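
The shift-then-transform idea can be checked on a single feature. The sketch below is not part of the paste: the helper name `shifted_correlations`, the reduced set of powers, and the synthetic feature and target are all hypothetical, chosen so that the square-root transform should correlate best with the target.

```python
import numpy as np
import pandas as pd


def shifted_correlations(feature, target, powers=(1, 0.5, 2)):
    """Hypothetical single-feature helper: shift to positive values, then correlate."""
    # Same shift as in the paste: add |min| + 1 so every value is at least 1
    shifted = feature + abs(feature.min()) + 1
    result = {"log": np.corrcoef(np.log(shifted), target)[0, 1]}
    for p in powers:
        result[str(p)] = np.corrcoef(shifted ** p, target)[0, 1]
    return pd.Series(result).round(2)


# Synthetic data (an assumption for illustration): the feature has negative values,
# and the target is the square root of the shifted feature.
feature = pd.Series(np.arange(-10.0, 40.0))
target = np.sqrt(feature + 11)  # shift is abs(-10) + 1 = 11

scores = shifted_correlations(feature, target)
print(scores)
```

Because the shift is applied before the nonlinear transforms, the reported correlations describe the shifted feature, not the raw one; only the untransformed column ("1") is unaffected by the shift.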