Untitled

a guest
Aug 21st, 2019
import multiprocessing

import numpy as np
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MultiLabelBinarizer

# Vectorize ticket descriptions using Doc2Vec
# (each entry of unformattedList is assumed to be a list of tokens)
textList = [TaggedDocument(doc, [i]) for i, doc in enumerate(unformattedList)]
numCores = multiprocessing.cpu_count()  # Number of logical processors, used as Doc2Vec training workers
model = Doc2Vec(textList, workers=numCores, vector_size=150)
# Collect one learned vector per document (gensim 4.x; on gensim 3.x use model.docvecs[i])
x = np.array([model.dv[i] for i in range(len(textList))])

# Vectorize ticket labels/tags using MultiLabelBinarizer
tagList = relevantDF.Tags
vectorizer2 = MultiLabelBinarizer()
y = vectorizer2.fit_transform(tagList)

# Split the feature and label arrays into training and test sets
xTrain, xTest, yTrain, yTest = train_test_split(x, y, test_size=0.20)
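
To make the shape of `y` concrete, here is a small dependency-free sketch of the multi-hot encoding that the MultiLabelBinarizer step produces. It is an illustration only, not scikit-learn's implementation: `sample_tags` and `multi_hot` are hypothetical names, and the tag values are made up as stand-ins for `relevantDF.Tags`.

```python
# Hypothetical tag lists, one per ticket (stand-ins for relevantDF.Tags).
sample_tags = [["network", "vpn"], ["printer"], ["vpn", "printer"]]


def multi_hot(tag_lists):
    """Return (classes, rows): the sorted unique tags and one 0/1 row per ticket."""
    # Column order is the sorted set of all tags seen across tickets.
    classes = sorted({tag for tags in tag_lists for tag in tags})
    index = {tag: col for col, tag in enumerate(classes)}
    rows = []
    for tags in tag_lists:
        row = [0] * len(classes)
        for tag in tags:
            row[index[tag]] = 1  # mark every tag present on this ticket
        rows.append(row)
    return classes, rows


classes, y_demo = multi_hot(sample_tags)
print(classes)
for row in y_demo:
    print(row)
```

Each row has one column per distinct tag, so a ticket with several tags gets several 1s. That is the kind of label matrix `train_test_split` receives as `y` above.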