Advertisement
Guest User

Untitled

a guest
Feb 23rd, 2019
80
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 3.36 KB | None | 0 0
  1. def pagerank2(E, return_vector = False, max_iters = 1000, tolerance = 1e-6):
  2. """compute page rank from dense adjacency matrix
  3.  
  4. Inputs:
  5. E: adjacency matrix with links going from cols to rows.
  6. E is a matrix of 0s and 1s, where E[i,j] = 1 means
  7. that web page (vertex) j has a link to web page i.
  8. return_vector = False: If True, return the eigenvector as well as the ranking.
  9. max_iters = 1000: Maximum number of power iterations to do.
  10. tolerance = 1e-6: Stop when the eigenvector norm changes by less than this.
  11.  
  12. Outputs:
  13. ranking: Permutation giving the ranking, most important first
  14. vector (only if return_vector is True): Dominant eigenvector of PageRank matrix
  15.  
  16. This computes page rank by the following steps:
  17. 1. Add links from any dangling vertices to all vertices.
  18. 2. Scale the columns to sum to 1.
  19. 3. Add a constant matrix to represent jumping at random 15% of the time.
  20. 4. Find the dominant eigenvector with the power method.
  21. 5. Sort the eigenvector to get the rankings.
  22.  
  23. The homework problem due February 22 asks you to rewrite this code so
  24. it takes input E as a scipy csr_sparse matrix, and then never creates
  25. a full matrix or any large matrix other than E.
  26. """
  27.  
  28. #E = sparse.csr_matrix(E)
  29.  
  30.  
  31. #nnz = np.count_nonzero(E) # This call for sparse E may be different
  32. # outdegree = sparse.csr_matrix.sum(E, axis=0).T # This call for sparse E may be different
  33.  
  34.  
  35.  
  36. #assert np.max(E) == 1 and np.sum(E) == nnz, 'E must contain only zeros and ones'
  37.  
  38. # 1. Add links from any dangling vertices to all other vertices.
  39. # 3. Add a constant matrix to represent jumping at random 15% of the time.
  40. nrows, n = E.shape
  41. outdegree = E.sum(axis=0)# This call for sparse E may be different
  42. outdegree = np.squeeze(np.asarray(outdegree))
  43. outdegreef = np.zeros(n)
  44. assert nrows == n, 'E must be square'
  45.  
  46. for j in range(n):
  47. if outdegree[j] == 0:
  48. outdegreef[j] = 1/(n-1)
  49. outdegree[j] = 1
  50.  
  51. s = 1.0/n
  52. m = 0.15
  53. #E = (1 - m) * E/sparse.csr_matrix.sum(E, 0) + m*s
  54.  
  55. e = np.ones(n)
  56. v = e / npla.norm(e)
  57. #v = np.array(v)
  58.  
  59. for iteration in range(max_iters):
  60. oldv = v
  61. Ev = E.dot((v/outdegree))
  62. Fv = np.ones(n) * (outdegreef.dot(v)) - (outdegreef*v)
  63. Sv = np.ones(n) * (np.sum(v)* m/n)
  64.  
  65. v = (1-m)*(Ev+Fv)+Sv
  66. #np.squeeze(np.asarray(v))
  67. #v = sparse.csr_matrix(v)
  68. eigval = npla.norm(v)
  69. v = v / eigval
  70.  
  71. if npla.norm(v - oldv) < tolerance:
  72. break
  73.  
  74. if npla.norm(v - oldv) < tolerance:
  75. print('Dominant eigenvalue is %f after %d iterations.\n' % (eigval, iteration+1))
  76. else:
  77. print('Did not converge to tolerance %e after %d iterations.\n' % (tolerance, max_iters))
  78.  
  79. # Check that the eigenvector elements are all the same sign, and make them positive
  80. #assert np.all(v > 0) or np.all(v < 0), 'Error: eigenvector is not all > 0 or < 0'
  81. vector = np.abs(v)
  82.  
  83. # 5. Sort the eigenvector and reverse the permutation to get the rankings.
  84. ranking = np.argsort(vector)[::-1]
  85.  
  86. if return_vector:
  87. return ranking, vector
  88. else:
  89. return ranking
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement