TakesxiSximada

nltk data

Jan 24th, 2016
Downloader> l

Packages:
[*] abc................. Australian Broadcasting Commission 2006
[*] alpino.............. Alpino Dutch Treebank
[*] averaged_perceptron_tagger Averaged Perceptron Tagger
[*] basque_grammars..... Grammars for Basque
[*] biocreative_ppi..... BioCreAtIvE (Critical Assessment of Information
                         Extraction Systems in Biology)
[*] bllip_wsj_no_aux.... BLLIP Parser: WSJ Model
[*] book_grammars....... Grammars from NLTK Book
[*] brown............... Brown Corpus
[*] brown_tei........... Brown Corpus (TEI XML Version)
[*] cess_cat............ CESS-CAT Treebank
[*] cess_esp............ CESS-ESP Treebank
[*] chat80.............. Chat-80 Data Files
[*] city_database....... City Database
[*] cmudict............. The Carnegie Mellon Pronouncing Dictionary (0.6)
[*] comparative_sentences Comparative Sentence Dataset
[*] comtrans............ ComTrans Corpus Sample
[*] conll2000........... CONLL 2000 Chunking Corpus
[*] conll2002........... CONLL 2002 Named Entity Recognition Corpus
[*] conll2007........... Dependency Treebanks from CoNLL 2007 (Catalan
                         and Basque Subset)
Hit Enter to continue:
[ ] crubadan............ Crubadan Corpus
[ ] dependency_treebank. Dependency Parsed Treebank
[ ] europarl_raw........ Sample European Parliament Proceedings Parallel
                         Corpus
[ ] floresta............ Portuguese Treebank
[ ] framenet_v15........ FrameNet 1.5
[ ] gazetteers.......... Gazeteer Lists
[*] genesis............. Genesis Corpus
[*] gutenberg........... Project Gutenberg Selections
[*] hmm_treebank_pos_tagger Treebank Part of Speech Tagger (HMM)
[*] ieer................ NIST IE-ER DATA SAMPLE
[*] inaugural........... C-Span Inaugural Address Corpus
[*] indian.............. Indian Language POS-Tagged Corpus
[*] jeita............... JEITA Public Morphologically Tagged Corpus (in
                         ChaSen format)
[*] kimmo............... PC-KIMMO Data Files
[*] knbc................ KNB Corpus (Annotated blog corpus)
[*] large_grammars...... Large context-free and feature-based grammars
                         for parser comparison
[*] lin_thesaurus....... Lin's Dependency Thesaurus
[*] mac_morpho.......... MAC-MORPHO: Brazilian Portuguese news text with
                         part-of-speech tags
Hit Enter to continue:
[ ] machado............. Machado de Assis -- Obra Completa
[ ] masc_tagged......... MASC Tagged Corpus
[ ] maxent_ne_chunker... ACE Named Entity Chunker (Maximum entropy)
[ ] maxent_treebank_pos_tagger Treebank Part of Speech Tagger (Maximum entropy)
[ ] moses_sample........ Moses Sample Models
[ ] movie_reviews....... Sentiment Polarity Dataset Version 2.0
[*] mte_teip5........... MULTEXT-East 1984 annotated corpus 4.0
[*] names............... Names Corpus, Version 1.3 (1994-03-29)
[*] nombank.1.0......... NomBank Corpus 1.0
[*] nps_chat............ NPS Chat
[*] oanc_masc........... Open American National Corpus: Manually
                         Annotated Sub-Corpus
[*] omw................. Open Multilingual Wordnet
[*] opinion_lexicon..... Opinion Lexicon
[-] panlex_lite......... PanLex Lite Corpus
[*] panlex_swadesh...... PanLex Swadesh Corpora
[*] paradigms........... Paradigm Corpus
[*] pe08................ Cross-Framework and Cross-Domain Parser
                         Evaluation Shared Task
[*] pil................. The Patient Information Leaflet (PIL) Corpus
[*] pl196x.............. Polish language of the XX century sixties
Hit Enter to continue:
[ ] ppattach............ Prepositional Phrase Attachment Corpus
[ ] problem_reports..... Problem Report Corpus
[ ] product_reviews_1... Product Reviews (5 Products)
[ ] product_reviews_2... Product Reviews (9 Products)
[ ] propbank............ Proposition Bank Corpus 1.0
[ ] pros_cons........... Pros and Cons
[*] ptb................. Penn Treebank
[*] punkt............... Punkt Tokenizer Models
[*] qc.................. Experimental Data for Question Classification
[*] reuters............. The Reuters-21578 benchmark corpus, ApteMod
                         version
[*] rslp................ RSLP Stemmer (Removedor de Sufixos da Lingua
                         Portuguesa)
[*] rte................. PASCAL RTE Challenges 1, 2, and 3
[*] sample_grammars..... Sample Grammars
[*] semcor.............. SemCor 3.0
[*] senseval............ SENSEVAL 2 Corpus: Sense Tagged Text
[*] sentence_polarity... Sentence Polarity Dataset v1.0
[*] sentiwordnet........ SentiWordNet
[*] shakespeare......... Shakespeare XML Corpus Sample
[*] sinica_treebank..... Sinica Treebank Corpus Sample
Hit Enter to continue:
[ ] smultron............ SMULTRON Corpus Sample
[ ] snowball_data....... Snowball Data
[ ] spanish_grammars.... Grammars for Spanish
[ ] state_union......... C-Span State of the Union Address Corpus
[ ] stopwords........... Stopwords Corpus
[ ] subjectivity........ Subjectivity Dataset v1.0
[*] swadesh............. Swadesh Wordlists
[*] switchboard......... Switchboard Corpus Sample
[*] tagsets............. Help on Tagsets
[*] timit............... TIMIT Corpus Sample
[*] toolbox............. Toolbox Sample Files
[*] treebank............ Penn Treebank Sample
[*] twitter_samples..... Twitter Samples
[*] udhr2............... Universal Declaration of Human Rights Corpus
                         (Unicode Version)
[*] udhr................ Universal Declaration of Human Rights Corpus
[*] unicode_samples..... Unicode Samples
[*] universal_tagset.... Mappings to the Universal Part-of-Speech Tagset
[*] universal_treebanks_v20 Universal Treebanks Version 2.0
[*] verbnet............. VerbNet Lexicon, Version 2.1
[*] webtext............. Web Text Corpus
Hit Enter to continue:
[ ] word2vec_sample..... Word2Vec Sample
[ ] wordnet............. WordNet
[ ] wordnet_ic.......... WordNet-InfoContent
[*] words............... Word Lists
[*] ycoe................ York-Toronto-Helsinki Parsed Corpus of Old
                         English Prose

Collections:
[-] all-corpora......... All the corpora
[-] all................. All packages
[P] book................ Everything used in the NLTK Book

([*] marks installed packages; [-] marks out-of-date or corrupt packages;
[P] marks partially installed collections)
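
To work with a listing like the one above outside the interactive prompt, the status markers can be read back into structured records. The following is only a rough sketch in Python using the standard library: the names `parse_listing`, `STATUS`, `ENTRY`, and `SAMPLE` are made up for illustration and are not part of NLTK, the sample is a short excerpt of the output above, and treating `[ ]` as "not installed" is an assumption (the legend does not define it). Wrapped description lines are assumed to be indented.

```python
import re

# Marker meanings from the legend; " " (not installed) is an assumption,
# since the legend only explains [*], [-] and [P].
STATUS = {
    "*": "installed",
    "-": "out-of-date or corrupt",
    "P": "partially installed",
    " ": "not installed",
}

# "[x] identifier....... Description" (the dot padding may be absent
# when the identifier is long).
ENTRY = re.compile(r"^\s*\[(.)\]\s+(\S+?)\.*\s+(.*)$")


def parse_listing(text):
    """Return a list of dicts with id, status and description."""
    entries = []
    for line in text.splitlines():
        if line.lstrip().startswith("("):
            break  # the legend begins here; nothing after it is an entry
        match = ENTRY.match(line)
        if match:
            mark, ident, desc = match.groups()
            entries.append({
                "id": ident,
                "status": STATUS.get(mark, "unknown"),
                "description": desc.strip(),
            })
        elif entries and line.startswith(" ") and line.strip():
            # Indented line: continuation of the previous description.
            entries[-1]["description"] += " " + line.strip()
    return entries


SAMPLE = """\
Packages:
[*] abc................. Australian Broadcasting Commission 2006
[*] biocreative_ppi..... BioCreAtIvE (Critical Assessment of Information
                         Extraction Systems in Biology)
Hit Enter to continue:
[ ] crubadan............ Crubadan Corpus
[-] panlex_lite......... PanLex Lite Corpus
[*] nombank.1.0......... NomBank Corpus 1.0
Collections:
[P] book................ Everything used in the NLTK Book
"""

if __name__ == "__main__":
    for entry in parse_listing(SAMPLE):
        print(entry["id"], "->", entry["status"])
```

Filtering the result for anything whose status is not "installed" gives a quick list of packages that may still need downloading or refreshing.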