Untitled

a guest
Feb 2nd, 2018
user_id  ZIP      City    email
105      100051   Lond.   jsmith@hotmail.com
382      251574           jgjefferson@gmail.com
225      0100051  London  john.smith@hotmail.com

user_id  ZIP      City    email                   new_id
105      100051   Lond.   jsmith@hotmail.com      105
382      251574           jgjefferson@gmail.com   382
225      0100051  London  john.smith@hotmail.com  105
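
# Find each address's nearest neighbour by Jaro-Winkler distance
library(stringdist)

email <- c("jsmith@hotmail.com", "jgjefferson@gmail.com", "john.smith@hotmail.com")

# Pairwise distances between all addresses (0 = identical, 1 = nothing in common)
dist <- stringdistmatrix(email, email, method = "jw")
# Mask only the self-comparisons on the diagonal, so exact duplicates still match
diag(dist) <- Inf

# cbind() coerces everything to character, so the distances print as strings
cbind(email, email_near = email[apply(dist, 1, which.min)], dist = apply(dist, 1, FUN = min))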

# Result:
     email                    email_near               dist
[1,] "jsmith@hotmail.com"     "john.smith@hotmail.com" "0.208754208754209"
[2,] "jgjefferson@gmail.com"  "jsmith@hotmail.com"     "0.281746031746032"
[3,] "john.smith@hotmail.com" "jsmith@hotmail.com"     "0.208754208754209"
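
The nearest-neighbour table above does not yet produce the new_id column from the second table. One possible way to get there, as a rough sketch and not necessarily the intended method: group addresses whose distance falls below a cut-off and give every member of a group the first user_id in it. The threshold of 0.25 and the use of single-linkage clustering (hclust/cutree from base R's stats package) are assumptions made for illustration; the cut-off in particular would need tuning on real data.

```r
library(stringdist)

users <- data.frame(
  user_id = c(105, 382, 225),
  email   = c("jsmith@hotmail.com", "jgjefferson@gmail.com", "john.smith@hotmail.com"),
  stringsAsFactors = FALSE
)

# Pairwise Jaro-Winkler distances as a square, symmetric matrix
d <- stringdistmatrix(users$email, users$email, method = "jw")

# Assumed cut-off: pairs closer than this are treated as the same person
threshold <- 0.25

# Single linkage: an address joins a group if it is close to any member of it
groups <- cutree(hclust(as.dist(d), method = "single"), h = threshold)

# new_id = the first user_id that appears in each group
users$new_id <- ave(users$user_id, groups, FUN = function(x) x[1])
users
```

With the distances shown above (0.2088 between the two hotmail addresses, 0.2817 at best for the gmail one), a cut-off of 0.25 links user 225 to user 105 and leaves user 382 on its own. Single linkage can chain loosely related addresses together on larger data sets, so a stricter linkage or a second key such as ZIP may be worth considering.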