Untitled

a guest
Jul 30th, 2014
% Read in the data (Matrix Market file)
A = mmread('data/yelpData.mtx');

% Extract the rating of each review (first column), then drop that column
ratings = full(A(:,1));
A(:,1) = [];

% Split into train/test (70/30); should replace with a better Julia way of doing it
numData = size(A,1);
data = randperm(numData);
ind = floor(numData*0.7);
training = data(1:ind);
test = data(ind+1:end);
trainReviews = A(training,:);
trainRatings = ratings(training,:);
testReviews = A(test,:);
testRatings = ratings(test,:);

% Pick some value of lambda; I think 100 should work
lambda = 100;

% CVX fails here; I used matrix stuffing in the actual code, but lsq should be better
numFeatures = size(trainReviews,2);
cvx_begin
    variables w(numFeatures) v(1)
    minimize( sum_square(trainReviews*w + v - trainRatings) + lambda*sum_square(w) )
cvx_end

% Calculate root mean squared error for train/test
yhat = trainReviews*w + v;
trainRMS = sqrt(mean((trainRatings - yhat).^2));

yhat2 = testReviews*w + v;
testRMS = sqrt(mean((testRatings - yhat2).^2));
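
The comment above the CVX block suggests that a plain least-squares solve might work better than CVX. A minimal sketch of that idea follows. It assumes the same objective (squared error plus lambda times the squared norm of w, with the offset v left unpenalized) and rewrites it as one stacked least-squares system solved with backslash. X, y, Aaug, baug, and theta are illustrative names, and the data is small and synthetic rather than the Yelp matrix.

```matlab
% Sketch: solve  min ||X*w + v - y||^2 + lambda*||w||^2  as a single
% stacked least-squares problem. X and y are synthetic stand-ins (assumed).
m = 200; n = 20; lambda = 100;
X = sparse(double(rand(m, n) > 0.8));   % sparse 0/1 feature matrix
y = 1 + 4*rand(m, 1);                   % ratings between 1 and 5

% Append a column of ones for the offset v, plus rows sqrt(lambda)*I that
% penalize w only (the zero column leaves v unpenalized).
Aaug = [X, ones(m, 1); sqrt(lambda)*speye(n), sparse(n, 1)];
baug = [y; zeros(n, 1)];

theta = Aaug \ baug;    % least-squares solution of the stacked system
w = theta(1:n);
v = theta(n+1);

% First-order optimality check: both gradients should be near zero.
r = X*w + v - y;
gradW = 2*(X'*r) + 2*lambda*w;
gradV = 2*sum(r);
```

With the variables from the script above, X would be trainReviews and y would be trainRatings; the resulting w and v can then go straight into the RMS error lines.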