- $ ./unisearch index unisearch.idx -f test.txt -c unisearch-index-config_ru.xml -m morphology_ru.txt
- 0.000 Use file system as files source
- 0.000 Memory used before indexing: 0 Mib
- 0.000 Indexing
- 0.000 Config file: 'unisearch-index-config_ru.xml'
- 0.000 Morphology file:
- 0.000 'morphology_ru.txt'
- 0.000 Feed input files:
- 0.000 'test.txt'
- 0.128 Mode: Real indexing
- 0.128 Loading morphologies
- 4.255 Memory used after loading morphology: 0 Mib
- 4.256 Feeding file: 'test.txt'
- 4.256 Feeds count : 0
- Warning: Unknown table name: adm_div.settlement; line: 12
- 4.256 Total feeds count : 20
- 4.256 Memory used after feeding data: 0 Mib
- 4.256 Creating index file 'unisearch.idx'
- 4.256 Branch weights calculation
- 4.256 max_center_distance = -1.0000009537
- 4.256 max_flamp_rating = -1.0000009537
- 4.256 max_olap_weight = -1.0000009537
- 4.256 max_parent_olap_weight = -1.0000009537
- 4.256 max_children_count = 0.0000000000
- 4.257 Convert Geometry
- 4.257 Handle IndexData (Produce detailed adresses, phones, Filter synonyms, Process named after)
- 4.257 Building Keyboards
- 4.257 Building Keyboard Data
- 4.257 Building Dictionary
- 4.257 Building Charset
- 4.276 Feeding Dictionary
- 4.278 Building Dictionary
- 4.625 Building Morphology (decoder)
- 4.851 Lexemes before homonymy optimization: 7715
- 4.869 Removed lexemes as duplicate: 121
- 4.916 Removed lexemes as absorbed: 139
- 5.244 IndexData: Filter trivial cascades
- 5.244 IndexData: Calculate street weights
- 5.244 IndexData: Filter hierarchy words
- 5.244 IndexData: Filter addresses in organizations' names
- 5.244 IndexData: Filter well-known objects' duplicates
- 5.245 IndexData: Filter spatial data
- 5.245 IndexData: Remove Garbage Suggests
- 5.245 IndexData: Check synonyms congeniality
- 5.245 IndexData: Set token positions
- 5.245 IndexData: Sort links and titles
- 5.245 Building Morpho Encoder
- 5.245 Building Morphology (encoder)
- 5.245 Building StringMap
- 5.245 Building Transcription
- 5.247 Building Schema Data
- 5.247 Building Clusters
- 5.247 Detecting Relation Candidates
- 5.247 Cluster Relation Candidates: 0
- 5.247 Cluster Committed Relations: 0
- 5.247 Building Advertisment Data
- 5.247 Building Lexicon
- 5.250 Building Entries
- 5.250 Lexeme count: 28
- Strong by markup count: 0
- Fields word count: 0
- Strong by table count: 0
- Strong single word count: 12
- Strong by Lexicon count: 0
- Weak by Lexicon count: 0
- Weak optional count: 0
- Strong number count: 0
- Weak number count: 0
- Weak single letter count: 0
- Weak lexeme count: 4
- Weak by markup count: 0
- Selectivity Calculator count: 12
- 5.250 Building Entries Data
- 5.250 Calculating sections
- 5.251 Writing Header
- 5.251 Writing MetaData
- 5.251 Writing Keyboard Data
- 5.251 Writing Normalizer
- 5.252 Writing Tokenizer
- 5.252 Writing NumeralParser
- 5.252 Writing Charset
- 5.252 Writing Transcription
- 5.252 Writing Dictionary
- 5.253 Writing Morphology (decoder)
- 5.254 Writing Morphology (encoder)
- 5.254 Writing Optional Words
- 5.254 Writing Clusters
- 5.254 Writing Entries Data
- 5.254 Writing Schema data
- 5.254 Writing StringMap
- 5.254 Writing Advertisment Data
- 5.254 Done
- 5.255 Memory used after indexing finished: 0 Mib
- 5.255 Index file 'unisearch.idx' was created
- 5.281 All done