- user_id  ZIP      City    email
- 105      100051   Lond.   jsmith@hotmail.com
- 382      251574           jgjefferson@gmail.com
- 225      0100051  London  john.smith@hotmail.com
- user_id  ZIP      City    email                   new_id
- 105      100051   Lond.   jsmith@hotmail.com      105
- 382      251574           jgjefferson@gmail.com   382
- 225      0100051  London  john.smith@hotmail.com  105
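- library(stringdist)  # provides stringdistmatrix()
- email <- c("jsmith@hotmail.com", "jgjefferson@gmail.com", "john.smith@hotmail.com")
- # Pairwise string distances between all emails (method "jw", Jaro-Winkler)
- dist <- stringdistmatrix(email, email, method = "jw")
- # Mask each email's zero distance to itself so it is not picked as its own nearest match
- diag(dist) <- 1
- # For each email, report the nearest other email and the distance to it
- cbind(email, email_near = email[apply(dist, 1, which.min)], dist = apply(dist, 1, FUN = min))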
-      email                    email_near               dist
- [1,] "jsmith@hotmail.com"     "john.smith@hotmail.com" "0.208754208754209"
- [2,] "jgjefferson@gmail.com"  "jsmith@hotmail.com"     "0.281746031746032"
- [3,] "john.smith@hotmail.com" "jsmith@hotmail.com"     "0.208754208754209"
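One possible way to get from the nearest-email distances above to the new_id column is to cluster the emails and give every record in a cluster the user_id of its first member. This is only a sketch: the 0.25 cutoff, the single-linkage clustering, and the names `users`, `threshold`, and `cluster` are assumptions made for illustration and are not part of the paste. The cutoff was picked so that the 0.2088 pair is merged and the 0.2817 pair is not, and it would need tuning on real data.

```r
library(stringdist)

# Sample records from the tables above (City is missing for user 382)
users <- data.frame(
  user_id = c(105, 382, 225),
  ZIP     = c("100051", "251574", "0100051"),
  City    = c("Lond.", NA, "London"),
  email   = c("jsmith@hotmail.com", "jgjefferson@gmail.com",
              "john.smith@hotmail.com"),
  stringsAsFactors = FALSE
)

# Pairwise distances between emails, same method as in the snippet above
d <- stringdistmatrix(users$email, users$email, method = "jw")

# Assumed cutoff: emails closer than this are treated as the same person
threshold <- 0.25

# Single-linkage clustering, then cut the tree at the cutoff
hc <- hclust(as.dist(d), method = "single")
cluster <- cutree(hc, h = threshold)

# Each record takes the user_id of the first record in its cluster
users$new_id <- ave(users$user_id, cluster, FUN = function(x) x[1])

print(users[, c("user_id", "email", "new_id")])
```

With the three sample emails, users 105 and 225 end up in one cluster and user 382 stays alone, which matches the new_id column in the second table. Taking the first user_id in each cluster is one convention among several; the smallest id, or the id of the most complete record, would work the same way.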