Untitled

a guest
Apr 24th, 2019
## Made in Python
Used the following modules:
* nltk
* flask

NLTK
nltk.tag.tnt
TnT - Statistical POS tagger

TnT uses a second-order Markov model to produce tags for a sequence of input.
The set of possible tags for a given word is derived from the training data: it is the set of all tags that exact word has been assigned.
TnT DOES NOT AUTOMATICALLY DEAL WITH UNSEEN WORDS
TnT SHOULD BE USED WITH SENTENCE-DELIMITED INPUT
Input for the tag function is a single sentence. Input for the tagdata function is a list of sentences. Output is of a similar form, with each token encoded as a tuple ('WORD', 'TAG').

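The notes above can be tried out with a short sketch like the following. The training sentences are made up for illustration and far too small for real use, and the choice of DefaultTagger("NN") as the fallback for unseen words is an assumption, not something these notes prescribe. Tokens are pre-split lists here, so no tokenizer data is needed.

```python
from nltk.tag import DefaultTagger, tnt

# Tiny hand-written training set (hypothetical data, not from the notes).
# Each sentence is a list of (word, tag) tuples.
TRAIN = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("big", "JJ"), ("dog", "NN"), ("sleeps", "VBZ")],
    [("dogs", "NNS"), ("are", "VBP"), ("big", "JJ")],
]

# Plain TnT: unseen words are not guessed, they come back tagged 'Unk'.
plain = tnt.TnT()
plain.train(TRAIN)
known = plain.tag(["the", "dog", "sleeps"])     # tag(): one sentence in
unseen = plain.tag(["the", "zebra", "sleeps"])  # 'zebra' is not in TRAIN

# TnT with a fallback tagger for unseen words. Trained=True tells TnT
# not to call train() on the fallback, which DefaultTagger does not need.
backed = tnt.TnT(unk=DefaultTagger("NN"), Trained=True)
backed.train(TRAIN)

# tagdata(): a list of sentences in, a list of tagged sentences out.
batch = backed.tagdata([["a", "cat", "barks"], ["the", "zebra", "sleeps"]])

print(known)
print(unseen)
print(batch)
```

With a real corpus the fallback would more likely be a tagger trained on suffixes or word shape; the default tagger is only the simplest thing that fits the unk parameter.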
WORKING
The set of possible tags for a given word is derived from the training data. It is the set of all tags that exact word has been assigned.
The probability of a tag for a given word is the linear interpolation of three Markov models: a zero-order, a first-order, and a second-order model.

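To make the interpolation concrete, here is a minimal pure-Python sketch that does not use NLTK. It estimates the zero-order P(t3), first-order P(t3 | t2), and second-order P(t3 | t1, t2) probabilities from raw counts and mixes them with three weights. The function names, the toy tag sequences, and the fixed weights (0.2, 0.3, 0.5) are all assumptions for illustration; the tagger derives its own weights from the training data.

```python
from collections import Counter


def mle_estimates(tag_sequences, t1, t2, t3):
    """Return (P(t3), P(t3 | t2), P(t3 | t1, t2)) from relative frequencies."""
    uni, bi, tri = Counter(), Counter(), Counter()
    bi_context, tri_context = Counter(), Counter()
    for tags in tag_sequences:
        # Pad with two begin-of-sentence markers so every tag has a history.
        padded = ["BOS", "BOS"] + list(tags)
        for i in range(2, len(padded)):
            a, b, c = padded[i - 2], padded[i - 1], padded[i]
            uni[c] += 1
            bi[(b, c)] += 1
            bi_context[b] += 1
            tri[(a, b, c)] += 1
            tri_context[(a, b)] += 1
    total = sum(uni.values())
    p_uni = uni[t3] / total if total else 0.0
    p_bi = bi[(t2, t3)] / bi_context[t2] if bi_context[t2] else 0.0
    p_tri = (
        tri[(t1, t2, t3)] / tri_context[(t1, t2)]
        if tri_context[(t1, t2)]
        else 0.0
    )
    return p_uni, p_bi, p_tri


def interpolate(p_uni, p_bi, p_tri, lambdas):
    """Linear interpolation: l1*P(t3) + l2*P(t3|t2) + l3*P(t3|t1,t2)."""
    l1, l2, l3 = lambdas
    return l1 * p_uni + l2 * p_bi + l3 * p_tri


# Toy tag sequences (hypothetical).
sequences = [["DT", "NN", "VBZ"], ["DT", "JJ", "NN", "VBZ"]]
estimates = mle_estimates(sequences, "DT", "JJ", "NN")
# Placeholder weights that sum to 1.
p = interpolate(*estimates, lambdas=(0.2, 0.3, 0.5))
print(estimates, p)
```

The point of mixing the three models is that a trigram never seen in training still gets a non-zero probability through the lower-order terms.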
Functions used
* word_tokenize(s)
Tokenize a string to split off punctuation other than periods.
* train(data)
Uses a set of tagged data to train the tagger.
* tag(data)
Determine the most appropriate tag sequence for the given token sequence, and return a corresponding list of tagged tokens. A tagged token is encoded as a tuple ('WORD', 'TAG').
* evaluate(gold)
Score the accuracy of the tagger against the gold standard: strip the tags from the gold standard text, retag it using the tagger, then compute the accuracy score.
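
As a rough sketch of how train and evaluate fit together, the snippet below trains on a handful of made-up sentences and scores the tagger against a small made-up gold set. Both data sets are hypothetical. As far as I know, newer NLTK releases offer the same computation under the name accuracy(gold) and mark evaluate as deprecated, so the sketch uses accuracy when it is present and falls back to evaluate otherwise.

```python
from nltk.tag import tnt

# Hypothetical training data: lists of (word, tag) tuples.
TRAIN = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("a", "DT"), ("big", "JJ"), ("dog", "NN"), ("sleeps", "VBZ")],
    [("dogs", "NNS"), ("are", "VBP"), ("big", "JJ")],
]

# Hypothetical gold standard. 'zebra' never occurs in TRAIN, so a plain
# TnT tagger is expected to miss it.
GOLD = [
    [("a", "DT"), ("cat", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("zebra", "NN"), ("sleeps", "VBZ")],
]

tagger = tnt.TnT()
tagger.train(TRAIN)

# evaluate(gold) is the name used in these notes; prefer accuracy(gold)
# when the installed NLTK version provides it.
score_fn = getattr(tagger, "accuracy", None) or tagger.evaluate
score = score_fn(GOLD)  # fraction of gold tokens tagged correctly
print(f"accuracy: {score:.3f}")
```

Scoring against sentences the tagger was trained on would overstate accuracy, which is why the gold set here includes a word the tagger has not seen.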