lalkaed

SportsCategorizer

Sep 4th, 2018
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Two newsgroups to tell apart: PC hardware vs. hockey.
categories = ['comp.sys.ibm.pc.hardware', 'rec.sport.hockey']

# shuffle expects a boolean; the string 'true' only worked because it is truthy.
train_emails = fetch_20newsgroups(categories=categories, subset='train', shuffle=True, random_state=108)
test_emails = fetch_20newsgroups(categories=categories, subset='test', shuffle=True, random_state=108)

# Build the vocabulary from the training posts only, so no test data leaks into the model.
counter = CountVectorizer()
train_counts = counter.fit_transform(train_emails.data)
test_counts = counter.transform(test_emails.data)

# Fit a multinomial Naive Bayes classifier on the word counts.
classifier = MultinomialNB()
classifier.fit(train_counts, train_emails.target)

# Mean accuracy on the held-out test posts.
print(classifier.score(test_counts, test_emails.target))
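
The script relies on CountVectorizer and MultinomialNB to do the counting and the probability estimates. As a rough illustration of what that combination computes, here is a minimal pure-Python sketch of multinomial Naive Bayes with add-one smoothing. It is not scikit-learn's implementation: the whitespace tokenizer, the helper names (tokenize, train_nb, predict_nb), and the four toy sentences are all assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Crude lowercase whitespace split (an assumption for this sketch);
    # CountVectorizer uses its own token pattern.
    return text.lower().split()


def train_nb(docs, labels, alpha=1.0):
    # Count documents per class and word occurrences per class.
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        word_counts[label].update(tokens)
        vocab.update(tokens)

    model = {}
    for label in class_docs:
        # Smoothed denominator: class token count plus alpha per vocabulary word.
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        log_prior = math.log(class_docs[label] / len(docs))
        log_lik = {
            w: math.log((word_counts[label][w] + alpha) / total) for w in vocab
        }
        model[label] = (log_prior, log_lik)
    return model


def predict_nb(model, doc):
    # Words never seen in training are skipped, much as a vectorizer
    # fitted on training data would drop them.
    scores = {}
    for label, (log_prior, log_lik) in model.items():
        scores[label] = log_prior + sum(
            log_lik[w] for w in tokenize(doc) if w in log_lik
        )
    return max(scores, key=scores.get)


# Hypothetical toy corpus, two posts per class.
toy_docs = [
    "the goalie made a great save",
    "the team scored in overtime",
    "the motherboard has a bad bus",
    "install the scsi controller card",
]
toy_labels = ["hockey", "hockey", "hardware", "hardware"]

model = train_nb(toy_docs, toy_labels)
print(predict_nb(model, "goalie scored in overtime"))
print(predict_nb(model, "scsi controller on the motherboard"))
```

Working in log space avoids underflow when many small word probabilities are multiplied, and the smoothing term keeps a word that never appeared in one class from driving that class's probability to zero.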