Untitled

a guest
Dec 21st, 2014
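import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
# In scikit-learn 0.18 and later this import lives in sklearn.model_selection.
from sklearn.cross_validation import train_test_split
from sklearn.naive_bayes import MultinomialNB

# df is a DataFrame loaded elsewhere, with 'quote' and 'fresh' columns.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(df.quote)  # learn the vocabulary once, on the full corpus
X = X.tocsc()
Y = (df.fresh == 'fresh').values.astype(int)

xtrain, xtest, ytrain, ytest = train_test_split(X, Y)

clf = MultinomialNB().fit(xtrain, ytrain)

new_review = ['this is a new review, movie was awesome']
# transform (not fit_transform) so the columns match the training vocabulary
new_review = vectorizer.transform(new_review)

print(df.quote[15])
# predict() needs a feature matrix; passing the raw string df.quote[10]
# is what raised the TypeError in the traceback below
print(clf.predict(vectorizer.transform([df.quote[10]])))  # predict existing review in dataframe
print(clf.predict(new_review))  # predict new review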

Technically, Toy Story is nearly flawless.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-91-27a0698bbd1f> in <module>()
     15
     16 print df.quote[15]
---> 17 print(clf.predict(df.quote[10])) #predict existing quote in dataframe
     18 print(clf.predict(new_review)) #predict new review

//anaconda/lib/python2.7/site-packages/sklearn/naive_bayes.pyc in predict(self, X)
     60 Predicted target values for X
     61 """
---> 62 jll = self._joint_log_likelihood(X)
     63 return self.classes_[np.argmax(jll, axis=1)]
     64

//anaconda/lib/python2.7/site-packages/sklearn/naive_bayes.pyc in _joint_log_likelihood(self, X)
    439 """Calculate the posterior log probability of the samples X"""
    440 X = atleast2d_or_csr(X)
--> 441 return (safe_sparse_dot(X, self.feature_log_prob_.T)
    442 + self.class_log_prior_)
    443

//anaconda/lib/python2.7/site-packages/sklearn/utils/extmath.pyc in safe_sparse_dot(a, b, dense_output)
    178 return ret
    179 else:
--> 180 return fast_dot(a, b)
    181
    182

TypeError: Cannot cast array data from dtype('float64') to dtype('S32') according to the rule 'safe'
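
The TypeError above is consistent with clf.predict receiving the raw string df.quote[10] instead of a vectorized row: dtype('S32') is a NumPy byte-string type, and it cannot be multiplied against the float64 log-probabilities. A minimal sketch of the fit-once, transform-later pattern follows, assuming a recent scikit-learn on Python 3. The toy corpus, labels, and variable names are invented for illustration and stand in for df.quote and df.fresh.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for df.quote and df.fresh.
quotes = [
    "a wonderful, awesome movie",
    "awesome acting and a great story",
    "a dull and boring mess",
    "boring plot, terrible acting",
]
labels = [1, 1, 0, 0]  # 1 = fresh, 0 = rotten

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(quotes)  # learns the vocabulary from the training text
clf = MultinomialNB().fit(X, labels)

# New text is mapped onto the SAME vocabulary with transform(), not fit_transform().
# transform() takes an iterable of strings, so a single review goes in a list.
new_X = vectorizer.transform(["this is a new review, movie was awesome"])

# The new matrix has one row and as many columns as the training matrix.
same_width = new_X.shape[1] == X.shape[1]
prediction = clf.predict(new_X)
print(same_width, prediction)
```

Calling fit_transform a second time would relearn a vocabulary from the new review alone, so the resulting matrix would generally have a different number of columns than the one the classifier was trained on.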