alvations

toktok moses test

Mar 4th, 2016
alvas@ubi:~/git/nltk$ python2.7 -m doctest nltk/tokenize/toktok.py -v
Trying:
    toktok = ToktokTokenizer()
Expecting nothing
ok
Trying:
    text = u'Is 9.5 or 525,600 my favorite number?'
Expecting nothing
ok
Trying:
    print (toktok.tokenize(text))
Expecting:
    Is 9.5 or 525,600 my favorite number ?
ok
Trying:
    text = u'The https://github.com/jonsafari/tok-tok/blob/master/tok-tok.pl is a website with/and/or slashes and sort of weird : things'
Expecting nothing
ok
Trying:
    print (toktok.tokenize(text))
Expecting:
    The https://github.com/jonsafari/tok-tok/blob/master/tok-tok.pl is a website with/and/or slashes and sort of weird : things
ok
Trying:
    text = u'This, is a sentence with weird» symbols\u2026 appearing everywhere¿'
Expecting nothing
ok
Trying:
    expected = u'This , is a sentence with weird » symbols \u2026 appearing everywhere ¿'
Expecting nothing
ok
Trying:
    assert toktok.tokenize(text) == expected
Expecting nothing
ok
2 items had no tests:
    toktok
    toktok.ToktokTokenizer.tokenize
1 items passed all tests:
   8 tests in toktok.ToktokTokenizer
8 tests in 3 items.
8 passed and 0 failed.
Test passed.
alvas@ubi:~/git/nltk$ python3.4 -m doctest nltk/tokenize/toktok.py -v
Trying:
    toktok = ToktokTokenizer()
Expecting nothing
ok
Trying:
    text = u'Is 9.5 or 525,600 my favorite number?'
Expecting nothing
ok
Trying:
    print (toktok.tokenize(text))
Expecting:
    Is 9.5 or 525,600 my favorite number ?
ok
Trying:
    text = u'The https://github.com/jonsafari/tok-tok/blob/master/tok-tok.pl is a website with/and/or slashes and sort of weird : things'
Expecting nothing
ok
Trying:
    print (toktok.tokenize(text))
Expecting:
    The https://github.com/jonsafari/tok-tok/blob/master/tok-tok.pl is a website with/and/or slashes and sort of weird : things
ok
Trying:
    text = u'This, is a sentence with weird» symbols… appearing everywhere¿'
Expecting nothing
ok
Trying:
    expected = u'This , is a sentence with weird » symbols … appearing everywhere ¿'
Expecting nothing
ok
Trying:
    assert toktok.tokenize(text) == expected
Expecting nothing
ok
2 items had no tests:
    toktok
    toktok.ToktokTokenizer.tokenize
1 items passed all tests:
   8 tests in toktok.ToktokTokenizer
8 tests in 3 items.
8 passed and 0 failed.
Test passed.
alvas@ubi:~/git/nltk$ python3.5 -m doctest nltk/tokenize/toktok.py -v
Trying:
    toktok = ToktokTokenizer()
Expecting nothing
ok
Trying:
    text = u'Is 9.5 or 525,600 my favorite number?'
Expecting nothing
ok
Trying:
    print (toktok.tokenize(text))
Expecting:
    Is 9.5 or 525,600 my favorite number ?
ok
Trying:
    text = u'The https://github.com/jonsafari/tok-tok/blob/master/tok-tok.pl is a website with/and/or slashes and sort of weird : things'
Expecting nothing
ok
Trying:
    print (toktok.tokenize(text))
Expecting:
    The https://github.com/jonsafari/tok-tok/blob/master/tok-tok.pl is a website with/and/or slashes and sort of weird : things
ok
Trying:
    text = u'This, is a sentence with weird» symbols… appearing everywhere¿'
Expecting nothing
ok
Trying:
    expected = u'This , is a sentence with weird » symbols … appearing everywhere ¿'
Expecting nothing
ok
Trying:
    assert toktok.tokenize(text) == expected
Expecting nothing
ok
2 items had no tests:
    toktok
    toktok.ToktokTokenizer.tokenize
1 items passed all tests:
   8 tests in toktok.ToktokTokenizer
8 tests in 3 items.
8 passed and 0 failed.
Test passed.
alvas@ubi:~/git/nltk$ python2.7 -m doctest nltk/tokenize/moses.py -v
Trying:
    mtokenizer = MosesTokenizer()
Expecting nothing
ok
Trying:
    text = u'Is 9.5 or 525,600 my favorite number?'
Expecting nothing
ok
Trying:
    print (mtokenizer.tokenize(text))
Expecting:
    Is 9.5 or 525,600 my favorite number ?
ok
Trying:
    text = u'The https://github.com/jonsafari/tok-tok/blob/master/tok-tok.pl is a website with/and/or slashes and sort of weird : things'
Expecting nothing
ok
Trying:
    print (mtokenizer.tokenize(text))
Expecting:
    The https : / / github.com / jonsafari / tok-tok / blob / master / tok-tok.pl is a website with / and / or slashes and sort of weird : things
ok
Trying:
    text = u'This, is a sentence with weird» symbols\u2026 appearing everywhere¿'
Expecting nothing
ok
Trying:
    expected = u'This , is a sentence with weird » symbols \u2026 appearing everywhere ¿'
Expecting nothing
ok
Trying:
    assert mtokenizer.tokenize(text) == expected
Expecting nothing
ok
11 items had no tests:
    moses
    moses.MosesTokenizer
    moses.MosesTokenizer.__init__
    moses.MosesTokenizer.escape_xml
    moses.MosesTokenizer.handles_nonbreaking_prefixes
    moses.MosesTokenizer.has_numeric_only
    moses.MosesTokenizer.isalpha
    moses.MosesTokenizer.islower
    moses.MosesTokenizer.penn_tokenize
    moses.MosesTokenizer.replace_multidots
    moses.MosesTokenizer.restore_multidots
1 items passed all tests:
   8 tests in moses.MosesTokenizer.tokenize
8 tests in 12 items.
8 passed and 0 failed.
Test passed.
alvas@ubi:~/git/nltk$ python3.4 -m doctest nltk/tokenize/moses.py -v
Trying:
    mtokenizer = MosesTokenizer()
Expecting nothing
ok
Trying:
    text = u'Is 9.5 or 525,600 my favorite number?'
Expecting nothing
ok
Trying:
    print (mtokenizer.tokenize(text))
Expecting:
    Is 9.5 or 525,600 my favorite number ?
ok
Trying:
    text = u'The https://github.com/jonsafari/tok-tok/blob/master/tok-tok.pl is a website with/and/or slashes and sort of weird : things'
Expecting nothing
ok
Trying:
    print (mtokenizer.tokenize(text))
Expecting:
    The https : / / github.com / jonsafari / tok-tok / blob / master / tok-tok.pl is a website with / and / or slashes and sort of weird : things
ok
Trying:
    text = u'This, is a sentence with weird» symbols… appearing everywhere¿'
Expecting nothing
ok
Trying:
    expected = u'This , is a sentence with weird » symbols … appearing everywhere ¿'
Expecting nothing
ok
Trying:
    assert mtokenizer.tokenize(text) == expected
Expecting nothing
ok
11 items had no tests:
    moses
    moses.MosesTokenizer
    moses.MosesTokenizer.__init__
    moses.MosesTokenizer.escape_xml
    moses.MosesTokenizer.handles_nonbreaking_prefixes
    moses.MosesTokenizer.has_numeric_only
    moses.MosesTokenizer.isalpha
    moses.MosesTokenizer.islower
    moses.MosesTokenizer.penn_tokenize
    moses.MosesTokenizer.replace_multidots
    moses.MosesTokenizer.restore_multidots
1 items passed all tests:
   8 tests in moses.MosesTokenizer.tokenize
8 tests in 12 items.
8 passed and 0 failed.
Test passed.
alvas@ubi:~/git/nltk$ python3.5 -m doctest nltk/tokenize/moses.py -v
Trying:
    mtokenizer = MosesTokenizer()
Expecting nothing
ok
Trying:
    text = u'Is 9.5 or 525,600 my favorite number?'
Expecting nothing
ok
Trying:
    print (mtokenizer.tokenize(text))
Expecting:
    Is 9.5 or 525,600 my favorite number ?
ok
Trying:
    text = u'The https://github.com/jonsafari/tok-tok/blob/master/tok-tok.pl is a website with/and/or slashes and sort of weird : things'
Expecting nothing
ok
Trying:
    print (mtokenizer.tokenize(text))
Expecting:
    The https : / / github.com / jonsafari / tok-tok / blob / master / tok-tok.pl is a website with / and / or slashes and sort of weird : things
ok
Trying:
    text = u'This, is a sentence with weird» symbols… appearing everywhere¿'
Expecting nothing
ok
Trying:
    expected = u'This , is a sentence with weird » symbols … appearing everywhere ¿'
Expecting nothing
ok
Trying:
    assert mtokenizer.tokenize(text) == expected
Expecting nothing
ok
11 items had no tests:
    moses
    moses.MosesTokenizer
    moses.MosesTokenizer.__init__
    moses.MosesTokenizer.escape_xml
    moses.MosesTokenizer.handles_nonbreaking_prefixes
    moses.MosesTokenizer.has_numeric_only
    moses.MosesTokenizer.isalpha
    moses.MosesTokenizer.islower
    moses.MosesTokenizer.penn_tokenize
    moses.MosesTokenizer.replace_multidots
    moses.MosesTokenizer.restore_multidots
1 items passed all tests:
   8 tests in moses.MosesTokenizer.tokenize
8 tests in 12 items.
8 passed and 0 failed.
Test passed.