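# Tokenizing text: build a document-term matrix of raw token counts.
# twenty_train is assumed to be loaded beforehand,
# e.g. with sklearn.datasets.fetch_20newsgroups(subset='train').
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

count_vect = CountVectorizer()
X_train_counts = count_vect.fit_transform(twenty_train.data)

# Term frequencies only (use_idf=False): rows are length-normalized, no IDF weighting.
tf_transformer = TfidfTransformer(use_idf=False).fit(X_train_counts)
X_train_tf = tf_transformer.transform(X_train_counts)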
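
To make the two steps above concrete, here is a rough pure-Python sketch of what they compute, assuming scikit-learn's default settings (lowercasing, tokens of two or more word characters, alphabetically sorted vocabulary, L2 row normalization). The toy corpus `docs` and the helper names are hypothetical stand-ins, not part of the paste, and the sketch ignores sparse storage and other details of the real classes.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus standing in for twenty_train.data.
docs = ["The cat sat on the mat", "The dog sat"]

# Assumed default tokenization: lowercase, keep tokens of 2+ word characters.
TOKEN = re.compile(r"\b\w\w+\b")


def tokenize(text):
    return TOKEN.findall(text.lower())


# Vocabulary: unique tokens in sorted order, each mapped to a column index.
all_terms = sorted({term for doc in docs for term in tokenize(doc)})
vocabulary = {term: index for index, term in enumerate(all_terms)}


def count_row(text):
    """Raw token counts for one document, one entry per vocabulary term."""
    counts = Counter(tokenize(text))
    return [counts.get(term, 0) for term in vocabulary]


def l2_normalize(row):
    """Scale a row to unit Euclidean length; all-zero rows are left as-is."""
    norm = math.sqrt(sum(value * value for value in row))
    return [value / norm for value in row] if norm else row


counts = [count_row(doc) for doc in docs]   # analogue of X_train_counts
tf = [l2_normalize(row) for row in counts]  # analogue of X_train_tf

print(vocabulary)
print(counts)
print(tf)
```

Normalizing each row means a long document and a short one with the same word proportions end up with the same feature vector, which is the point of moving from raw counts to frequencies.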