Untitled

a guest
Nov 9th, 2015
  1. name value description
  2. editor_image_xpos 590 Editor image X Pos
  3. editor_image_ypos 10 Editor image Y Pos
  4. editor_image_menuheight 50 Add to image height for menu bar
  5. editor_image_word_bb_color 7 Word bounding box colour
  6. editor_image_blob_bb_color 4 Blob bounding box colour
  7. editor_image_text_color 2 Correct text colour
  8. editor_dbwin_xpos 50 Editor debug window X Pos
  9. editor_dbwin_ypos 500 Editor debug window Y Pos
  10. editor_dbwin_height 24 Editor debug window height
  11. editor_dbwin_width 80 Editor debug window width
  12. editor_word_xpos 60 Word window X Pos
  13. editor_word_ypos 510 Word window Y Pos
  14. editor_word_height 240 Word window height
  15. editor_word_width 655 Word window width
  16. textord_debug_tabfind 0 Debug tab finding
  17. textord_debug_bugs 0 Turn on output related to bugs in tab finding
  18. textord_testregion_left -1 Left edge of debug reporting rectangle
  19. textord_testregion_top -1 Top edge of debug reporting rectangle
  20. textord_testregion_right 2147483647 Right edge of debug rectangle
  21. textord_testregion_bottom 2147483647 Bottom edge of debug rectangle
  22. textord_tabfind_show_partitions 0 Show partition bounds, waiting if >1
  23. devanagari_split_debuglevel 0 Debug level for split shiro-rekha process.
  24. edges_max_children_per_outline 10 Max number of children inside a character outline
  25. edges_max_children_layers 5 Max layers of nested children inside a character outline
  26. edges_children_per_grandchild 10 Importance ratio for chucking outlines
  27. edges_children_count_limit 45 Max holes allowed in blob
  28. edges_min_nonhole 12 Min pixels for potential char in box
  29. edges_patharea_ratio 40 Max lensq/area for acceptable child outline
  30. edges_maxedgelength 16000 Max steps in any outline
  31. textord_fp_chop_error 2 Max allowed bending of chop cells
  32. textord_tabfind_show_images 0 Show image blobs
  33. textord_skewsmooth_offset 2 For smooth factor
  34. textord_skewsmooth_offset2 1 For smooth factor
  35. textord_test_x -1 coord of test pt
  36. textord_test_y -1 coord of test pt
  37. textord_min_blobs_in_row 4 Min blobs before gradient counted
  38. textord_spline_minblobs 8 Min blobs in each spline segment
  39. textord_spline_medianwin 6 Size of window for spline segmentation
  40. textord_max_blob_overlaps 4 Max number of blobs a big blob can overlap
  41. textord_min_xheight 10 Min credible pixel xheight
  42. textord_lms_line_trials 12 Number of line fits to do
  43. oldbl_holed_losscount 10 Max lost before fallback line used
  44. pitsync_linear_version 6 Use new fast algorithm
  45. pitsync_fake_depth 1 Max advance fake generation
  46. textord_tabfind_show_strokewidths 0 Show stroke widths
  47. textord_dotmatrix_gap 3 Max pixel gap for broken fixed pitch
  48. textord_debug_block 0 Block to do debug on
  49. textord_pitch_range 2 Max range test on pitch
  50. textord_words_veto_power 5 Rows required to outvote a veto
  51. wordrec_display_segmentations 0 Display Segmentations
  52. classify_radius_gyr_min_man 255 Minimum Radius of Gyration Mantissa 0-255:
  53. classify_radius_gyr_min_exp 0 Minimum Radius of Gyration Exponent 0-255:
  54. classify_radius_gyr_max_man 158 Maximum Radius of Gyration Mantissa 0-255:
  55. classify_radius_gyr_max_exp 8 Maximum Radius of Gyration Exponent 0-255:
  56. classify_num_cp_levels 3 Number of Class Pruner Levels
  57. image_default_resolution 300 Image resolution dpi
  58. equationdetect_save_bi_image 0 Save input bi image
  59. equationdetect_save_spt_image 0 Save special character image
  60. equationdetect_save_seed_image 0 Save the seed image
  61. equationdetect_save_merged_image 0 Save the merged image
  62. textord_debug_images 0 Use greyed image background for debug
  63. textord_debug_printable 0 Make debug windows printable
  64. textord_space_size_is_variable 0 If true, word delimiter spaces are assumed to have variable width, even though characters have fixed pitch.
  65. textord_tabfind_show_initial_partitions 0 Show partition bounds
  66. textord_tabfind_show_reject_blobs 0 Show blobs rejected as noise
  67. textord_tabfind_show_columns 0 Show column bounds
  68. textord_tabfind_show_blocks 0 Show final block bounds
  69. textord_tabfind_find_tables 1 run table detection
  70. textord_tabfind_show_color_fit 0 Show stroke widths
  71. devanagari_split_debugimage 0 Whether to create a debug image for split shiro-rekha process.
  72. textord_show_fixed_cuts 0 Draw fixed pitch cell boundaries
  73. edges_use_new_outline_complexity 0 Use the new outline complexity module
  74. edges_debug 0 turn on debugging for this module
  75. edges_children_fix 0 Remove boxy parents of char-like children
  76. gapmap_debug 0 Say which blocks have tables
  77. gapmap_use_ends 0 Use large space at start and end of rows
  78. gapmap_no_isolated_quanta 0 Ensure gaps not less than 2 quanta wide
  79. textord_heavy_nr 0 Vigorously remove noise
  80. textord_show_initial_rows 0 Display row accumulation
  81. textord_show_parallel_rows 0 Display page correlated rows
  82. textord_show_expanded_rows 0 Display rows after expanding
  83. textord_show_final_rows 0 Display rows after final fitting
  84. textord_show_final_blobs 0 Display blob bounds after pre-ass
  85. textord_test_landscape 0 Tests refer to land/port
  86. textord_parallel_baselines 1 Force parallel baselines
  87. textord_straight_baselines 0 Force straight baselines
  88. textord_old_baselines 1 Use old baseline algorithm
  89. textord_old_xheight 0 Use old xheight algorithm
  90. textord_fix_xheight_bug 1 Use spline baseline
  91. textord_fix_makerow_bug 1 Prevent multiple baselines
  92. textord_debug_xheights 0 Test xheight algorithms
  93. textord_biased_skewcalc 1 Bias skew estimates with line length
  94. textord_interpolating_skew 1 Interpolate across gaps
  95. textord_new_initial_xheight 1 Use test xheight mechanism
  96. textord_really_old_xheight 0 Use original wiseowl xheight
  97. textord_oldbl_debug 0 Debug old baseline generation
  98. textord_debug_baselines 0 Debug baseline generation
  99. textord_oldbl_paradef 1 Use para default mechanism
  100. textord_oldbl_split_splines 1 Split stepped splines
  101. textord_oldbl_merge_parts 1 Merge suspect partitions
  102. oldbl_corrfix 1 Improve correlation of heights
  103. oldbl_xhfix 0 Fix bug in modes threshold for xheights
  104. textord_ocropus_mode 0 Make baselines for ocropus
  105. textord_tabfind_only_strokewidths 0 Only run stroke widths
  106. textord_tabfind_vertical_text 1 Enable vertical detection
  107. textord_tabfind_force_vertical_text 0 Force using vertical text page mode
  108. textord_tabfind_vertical_horizontal_mix 1 find horizontal lines such as headers in vertical page mode
  109. textord_tabfind_show_initialtabs 0 Show tab candidates
  110. textord_tabfind_show_finaltabs 0 Show tab vectors
  111. textord_dump_table_images 0 Paint table detection output
  112. textord_show_tables 0 Show table regions
  113. textord_tablefind_show_mark 0 Debug table marking steps in detail
  114. textord_tablefind_show_stats 0 Show page stats used in table finding
  115. textord_tablefind_recognize_tables 0 Enables the table recognizer for table layout and filtering.
  116. textord_all_prop 0 All doc is proportional text
  117. textord_debug_pitch_test 0 Debug on fixed pitch test
  118. textord_disable_pitch_test 0 Turn off dp fixed pitch algorithm
  119. textord_fast_pitch_test 0 Do even faster pitch algorithm
  120. textord_debug_pitch_metric 0 Write full metric stuff
  121. textord_show_row_cuts 0 Draw row-level cuts
  122. textord_show_page_cuts 0 Draw page-level cuts
  123. textord_pitch_cheat 0 Use correct answer for fixed/prop
  124. textord_blockndoc_fixed 0 Attempt whole doc/block fixed pitch
  125. textord_show_initial_words 0 Display separate words
  126. textord_show_new_words 0 Display separate words
  127. textord_show_fixed_words 0 Display forced fixed pitch words
  128. textord_blocksall_fixed 0 Moan about prop blocks
  129. textord_blocksall_prop 0 Moan about fixed pitch blocks
  130. textord_blocksall_testing 0 Dump stats when moaning
  131. textord_test_mode 0 Do current test
  132. textord_pitch_scalebigwords 0 Scale scores on big words
  133. textord_restore_underlines 1 Chop underlines & put back
  134. textord_fp_chopping 1 Do fixed pitch chopping
  135. textord_force_make_prop_words 0 Force proportional word segmentation on all rows
  136. textord_chopper_test 0 Chopper is being tested.
  137. wordrec_display_all_blobs 0 Display Blobs
  138. wordrec_display_all_words 0 Display Words
  139. wordrec_blob_pause 0 Blob pause
  140. poly_debug 0 Debug old poly
  141. poly_wide_objects_better 1 More accurate approx on wide things
  142. wordrec_display_splits 0 Display splits
  143. editor_image_win_name EditorImage Editor image window name
  144. editor_dbwin_name EditorDBWin Editor debug window name
  145. editor_word_name BlnWords BL normalized word window
  146. editor_debug_config_file Config file to apply to single words
  147. fx_debugfile FXDebug Name of debugfile
  148. classify_font_name UnknownFont Default font name to be used in training
  149. classify_training_file MicroFeatures Training file
  150. debug_file File to send tprintf output to
  151. textord_underline_threshold 0.5 Fraction of width occupied
  152. edges_childarea 0.5 Min area fraction of child outline
  153. edges_boxarea 0.875 Min area fraction of grandchild for box
  154. textord_fp_chop_snap 0.5 Max distance of chop pt from vertex
  155. gapmap_big_gaps 1.75 xht multiplier
  156. textord_spline_shift_fraction 0.02 Fraction of line spacing for quad
  157. textord_spline_outlier_fraction 0.1 Fraction of line spacing for outlier
  158. textord_skew_ile 0.5 Ile of gradients for page skew
  159. textord_skew_lag 0.01 Lag for skew on row accumulation
  160. textord_linespace_iqrlimit 0.2 Max iqr/median for linespace
  161. textord_width_limit 8 Max width of blobs to make rows
  162. textord_chop_width 1.5 Max width before chopping
  163. textord_expansion_factor 1 Factor to expand rows by in expand_rows
  164. textord_overlap_x 0.5 Fraction of linespace for good overlap
  165. textord_minxh 0.25 fraction of linesize for min xheight
  166. textord_min_linesize 1.25 * blob height for initial linesize
  167. textord_excess_blobsize 1.3 New row made if blob makes row this big
  168. textord_occupancy_threshold 0.4 Fraction of neighbourhood
  169. textord_underline_width 2 Multiple of line_size for underline
  170. textord_min_blob_height_fraction 0.75 Min blob height/top to include blob top into xheight stats
  171. textord_xheight_mode_fraction 0.4 Min pile height to make xheight
  172. textord_ascheight_mode_fraction 0.08 Min pile height to make ascheight
  173. textord_descheight_mode_fraction 0.08 Min pile height to make descheight
  174. textord_ascx_ratio_min 1.25 Min cap/xheight
  175. textord_ascx_ratio_max 1.8 Max cap/xheight
  176. textord_descx_ratio_min 0.25 Min desc/xheight
  177. textord_descx_ratio_max 0.6 Max desc/xheight
  178. textord_xheight_error_margin 0.1 Accepted variation
  179. oldbl_xhfract 0.4 Fraction of est allowed in calc
  180. oldbl_dot_error_size 1.26 Max aspect ratio of a dot
  181. textord_oldbl_jumplimit 0.15 X fraction for new partition
  182. pitsync_joined_edge 0.75 Dist inside big blob for chopping
  183. pitsync_offset_freecut_fraction 0.25 Fraction of cut for free cuts
  184. textord_tabfind_vertical_text_ratio 0.5 Fraction of textlines deemed vertical to use vertical page mode
  185. textord_tabfind_aligned_gap_fraction 0.75 Fraction of height used as a minimum gap for aligned blobs.
  186. textord_tabvector_vertical_gap_fraction 0.5 max fraction of mean blob width allowed for vertical gaps in vertical text
  187. textord_tabvector_vertical_box_ratio 0.5 Fraction of box matches required to declare a line vertical
  188. textord_projection_scale 0.2 Ding rate for mid-cuts
  189. textord_balance_factor 1 Ding rate for unbalanced char cells
  190. textord_wordstats_smooth_factor 0.05 Smoothing gap stats
  191. textord_width_smooth_factor 0.1 Smoothing width stats
  192. textord_words_width_ile 0.4 Ile of blob widths for space est
  193. textord_words_maxspace 4 Multiple of xheight
  194. textord_words_default_maxspace 3.5 Max believable third space
  195. textord_words_default_minspace 0.6 Fraction of xheight
  196. textord_words_min_minspace 0.3 Fraction of xheight
  197. textord_words_default_nonspace 0.2 Fraction of xheight
  198. textord_words_initial_lower 0.25 Max initial cluster size
  199. textord_words_initial_upper 0.15 Min initial cluster spacing
  200. textord_words_minlarge 0.75 Fraction of valid gaps needed
  201. textord_words_pitchsd_threshold 0.04 Pitch sync threshold
  202. textord_words_def_fixed 0.016 Threshold for definite fixed
  203. textord_words_def_prop 0.09 Threshold for definite prop
  204. textord_pitch_rowsimilarity 0.08 Fraction of xheight for sameness
  205. words_initial_lower 0.5 Max initial cluster size
  206. words_initial_upper 0.15 Min initial cluster spacing
  207. words_default_prop_nonspace 0.25 Fraction of xheight
  208. words_default_fixed_space 0.75 Fraction of xheight
  209. words_default_fixed_limit 0.6 Allowed size variance
  210. textord_words_definite_spread 0.3 Non-fuzzy spacing region
  211. textord_spacesize_ratiofp 2.8 Min ratio space/nonspace
  212. textord_spacesize_ratioprop 2 Min ratio space/nonspace
  213. textord_fpiqr_ratio 1.5 Pitch IQR/Gap IQR threshold
  214. textord_max_pitch_iqr 0.2 Xh fraction noise in pitch
  215. textord_fp_min_width 0.5 Min width of decent blobs
  216. textord_underline_offset 0.1 Fraction of x to ignore
  217. classify_cp_angle_pad_loose 45 Class Pruner Angle Pad Loose
  218. classify_cp_angle_pad_medium 20 Class Pruner Angle Pad Medium
  219. classify_cp_angle_pad_tight 10 Class Pruner Angle Pad Tight
  220. classify_cp_end_pad_loose 0.5 Class Pruner End Pad Loose
  221. classify_cp_end_pad_medium 0.5 Class Pruner End Pad Medium
  222. classify_cp_end_pad_tight 0.5 Class Pruner End Pad Tight
  223. classify_cp_side_pad_loose 2.5 Class Pruner Side Pad Loose
  224. classify_cp_side_pad_medium 1.2 Class Pruner Side Pad Medium
  225. classify_cp_side_pad_tight 0.6 Class Pruner Side Pad Tight
  226. classify_pp_angle_pad 45 Proto Pruner Angle Pad
  227. classify_pp_end_pad 0.5 Proto Pruner End Pad
  228. classify_pp_side_pad 2.5 Proto Pruner Side Pad
  229. classify_min_slope 0.414214 Slope below which lines are called horizontal
  230. classify_max_slope 2.41421 Slope above which lines are called vertical
  231. classify_norm_adj_midpoint 32 Norm adjust midpoint …
  232. classify_norm_adj_curl 2 Norm adjust curl …
  233. classify_pico_feature_length 0.05 Pico Feature Length
  234. speckle_large_max_size 0.3 Max large speckle size
  235. speckle_small_penalty 10 Small speckle penalty
  236. speckle_large_penalty 10 Large speckle penalty
  237. speckle_small_certainty -1 Small speckle certainty
  238. ambigs_debug_level 0 Debug level for unichar ambiguities
  239. tessedit_single_match 0 Top choice only from CP
  240. classify_debug_level 0 Classify debug level
  241. classify_norm_method 1 Normalization Method …
  242. matcher_debug_level 0 Matcher Debug Level
  243. matcher_debug_flags 0 Matcher Debug Flags
  244. classify_learning_debug_level 0 Learning Debug Level:
  245. matcher_permanent_classes_min 1 Min # of permanent classes
  246. matcher_min_examples_for_prototyping 3 Reliable Config Threshold
  247. matcher_sufficient_examples_for_prototyping 5 Enable adaption even if the ambiguities have not been seen
  248. classify_adapt_proto_threshold 230 Threshold for good protos during adaptive 0-255
  249. classify_adapt_feature_threshold 230 Threshold for good features during adaptive 0-255
  250. classify_class_pruner_threshold 229 Class Pruner Threshold 0-255
  251. classify_class_pruner_multiplier 30 Class Pruner Multiplier 0-255:
  252. classify_cp_cutoff_strength 7 Class Pruner CutoffStrength:
  253. classify_integer_matcher_multiplier 14 Integer Matcher Multiplier 0-255:
  254. il1_adaption_test 0 Don’t adapt to i/I at beginning of word
  255. dawg_debug_level 0 Set to 1 for general debug info, to 2 for more details, to 3 to see all the debug messages
  256. hyphen_debug_level 0 Debug level for hyphenated words.
  257. max_viterbi_list_size 10 Maximum size of viterbi list.
  258. stopper_smallword_size 2 Size of dict word to be treated as non-dict word
  259. stopper_debug_level 0 Stopper debug level
  260. tessedit_truncate_wordchoice_log 10 Max words to keep in list
  261. fragments_debug 0 Debug character fragments
  262. segment_debug 0 Debug the whole segmentation process
  263. max_permuter_attempts 10000 Maximum number of different character choices to consider during permutation. This limit is especially useful when user patterns are specified, since overly generic patterns can result in dawg search exploring an overly large number of options.
  264. wordrec_num_seg_states 30 Segmentation states
  265. repair_unchopped_blobs 1 Fix blobs that aren’t chopped
  266. chop_debug 0 Chop debug
  267. chop_split_length 10000 Split Length
  268. chop_same_distance 2 Same distance
  269. chop_min_outline_points 6 Min Number of Points on Outline
  270. chop_inside_angle -50 Min Inside Angle Bend
  271. chop_min_outline_area 2000 Min Outline Area
  272. chop_x_y_weight 3 X / Y length weight
  273. segment_adjust_debug 0 Segmentation adjustment debug
  274. wordrec_debug_level 0 Debug level for wordrec
  275. segsearch_debug_level 0 SegSearch debug level
  276. segsearch_max_pain_points 2000 Maximum number of pain points stored in the queue
  277. segsearch_max_futile_classifications 10 Maximum number of pain point classifications per word that did not result in finding a better word choice.
  278. language_model_debug_level 0 Language model debug level
  279. language_model_ngram_order 8 Maximum order of the character ngram model
  280. language_model_viterbi_list_max_num_prunable 10 Maximum number of prunable (those for which PrunablePath() is true) entries in each viterbi list recorded in BLOB_CHOICEs
  281. language_model_viterbi_list_max_size 500 Maximum size of viterbi lists recorded in BLOB_CHOICEs
  282. language_model_min_compound_length 3 Minimum length of compound words
  283. language_model_fixed_length_choices_depth 3 Depth of blob choice lists to explore when fixed length dawgs are on
  284. tessedit_pageseg_mode 6 Page seg mode: 0=osd only, 1=auto+osd, 2=auto, 3=col, 4=block, 5=line, 6=word, 7=char (Values from PageSegMode enum in publictypes.h)
  285. tessedit_ocr_engine_mode 0 Which OCR engine(s) to run (Tesseract, Cube, both). Defaults to loading and running only Tesseract (no Cube, no combiner). (Values from OcrEngineMode enum in tesseractclass.h)
  286. pageseg_devanagari_split_strategy 0 Whether to use the top-line splitting process for Devanagari documents while performing page-segmentation.
  287. ocr_devanagari_split_strategy 0 Whether to use the top-line splitting process for Devanagari documents while performing ocr.
  288. bidi_debug 0 Debug level for BiDi
  289. applybox_debug 1 Debug level
  290. applybox_page 0 Page number to apply boxes from
  291. tessedit_bigram_debug 0 Amount of debug output for bigram correction.
  292. debug_x_ht_level 0 Reestimate debug
  293. quality_min_initial_alphas_reqd 2 alphas in a good word
  294. tessedit_tess_adaption_mode 39 Adaptation decision algorithm for tess
  295. tessedit_test_adaption_mode 3 Adaptation decision algorithm for tess
  296. paragraph_debug_level 0 Print paragraph debug info.
  297. cube_debug_level 0 Print cube debug info.
  298. tessedit_preserve_min_wd_len 2 Only preserve wds longer than this
  299. crunch_rating_max 10 For adj length in rating per ch
  300. crunch_pot_indicators 1 How many potential indicators needed
  301. crunch_leave_lc_strings 4 Don’t crunch words with long lower case strings
  302. crunch_leave_uc_strings 4 Don’t crunch words with long upper case strings
  303. crunch_long_repetitions 3 Crunch words with long repetitions
  304. crunch_debug 0 As it says
  305. fixsp_non_noise_limit 1 How many non-noise blobs either side?
  306. fixsp_done_mode 1 What constitutes done for spacing
  307. debug_fix_space_level 0 Contextual fixspace debug
  308. x_ht_acceptance_tolerance 8 Max allowed deviation of blob top outside of font data
  309. x_ht_min_change 8 Min change in xht before actually trying it
  310. suspect_level 99 Suspect marker level
  311. suspect_space_level 100 Min suspect level for rejecting spaces
  312. suspect_short_words 2 Don’t suspect dict wds longer than this
  313. tessedit_reject_mode 0 Rejection algorithm
  314. tessedit_ok_mode 5 Acceptance decision algorithm
  315. tessedit_image_border 2 Rej blobs near image edge limit
  316. min_sane_x_ht_pixels 8 Reject any x-ht lt or eq than this
  317. tessedit_page_number -1 -1 -> All pages, else specific page to process
  318. tessdata_manager_debug_level 0 Debug level for TessdataManager functions.
  319. tosp_debug_level 0 Debug data
  320. tosp_enough_space_samples_for_median 3 or should we use mean
  321. tosp_redo_kern_limit 10 No.samples reqd to reestimate for row
  322. tosp_few_samples 40 No.gaps reqd with 1 large gap to treat as a table
  323. tosp_short_row 20 No.gaps reqd with few cert spaces to use certs
  324. tosp_sanity_method 1 How to avoid being silly
  325. textord_max_noise_size 7 Pixel size of noise
  326. textord_noise_sizefraction 10 Fraction of size for maxima
  327. textord_noise_translimit 16 Transitions for normal blob
  328. textord_noise_sncount 1 super norm blobs to save row
  329. use_definite_ambigs_for_classifier 0 Use definite ambiguities when running character classifier
  330. use_ambigs_for_adaption 0 Use ambigs for deciding whether to adapt to a character
  331. prioritize_division 0 Prioritize blob division over chopping
  332. classify_enable_learning 1 Enable adaptive classifier
  333. tess_cn_matching 0 Character Normalized Matching
  334. tess_bn_matching 0 Baseline Normalized Matching
  335. classify_enable_adaptive_matcher 1 Enable adaptive classifier
  336. classify_use_pre_adapted_templates 0 Use pre-adapted classifier templates
  337. classify_save_adapted_templates 0 Save adapted templates to a file
  338. classify_enable_adaptive_debugger 0 Enable match debugger
  339. disable_character_fragments 1 Do not include character fragments in the results of the classifier
  340. classify_debug_character_fragments 0 Bring up graphical debugging windows for fragments training
  341. matcher_debug_separate_windows 0 Use two different windows for debugging the matching: One for the protos and one for the features.
  342. classify_bln_numeric_mode 0 Assume the input is numbers [0-9].
  343. load_system_dawg 1 Load system word dawg.
  344. load_freq_dawg 1 Load frequent word dawg.
  345. load_unambig_dawg 1 Load unambiguous word dawg.
  346. load_punc_dawg 1 Load dawg with punctuation patterns.
  347. load_number_dawg 1 Load dawg with number patterns.
  348. load_fixed_length_dawgs 1 Load fixed length dawgs (e.g. for non-space delimited languages)
  349. load_bigram_dawg 1 Load dawg with special word bigrams.
  350. use_only_first_uft8_step 0 Use only the first UTF8 step of the given string when computing log probabilities.
  351. stopper_no_acceptable_choices 0 Make AcceptableChoice() always return false. Useful when there is a need to explore all segmentations
  352. save_raw_choices 1 Save all explored raw choices
  353. permute_debug 0 Debug char permutation process
  354. permute_script_word 0 Turn on word script consistency permuter
  355. segment_segcost_rating 0 incorporate segmentation cost in word rating?
  356. segment_nonalphabetic_script 0 Don’t use any alphabetic-specific tricks. Set to true in the traineddata config file for scripts that are cursive or inherently fixed-pitch.
  357. permute_fixed_length_dawg 0 Turn on fixed-length phrasebook search permuter
  358. permute_chartype_word 0 Turn on character type (property) consistency permuter
  359. save_doc_words 0 Save Document Words
  360. doc_dict_enable 1 Enable Document Dictionary
  361. ngram_permuter_activated 0 Activate character-level n-gram-based permuter
  362. permute_only_top 0 Run only the top choice permuter
  363. merge_fragments_in_matrix 1 Merge the fragments in the ratings matrix and delete them after merging
  364. wordrec_no_block 0 Don’t output block information
  365. wordrec_enable_assoc 1 Associator Enable
  366. force_word_assoc 0 force associator to run regardless of what enable_assoc is. This is used for CJK where component grouping is necessary.
  367. fragments_guide_chopper 0 Use information from fragments to guide chopping process
  368. chop_enable 1 Chop enable
  369. chop_vertical_creep 0 Vertical creep
  370. assume_fixed_pitch_char_segment 0 include fixed-pitch heuristics in char segmentation
  371. use_new_state_cost 0 use new state cost heuristics for segmentation state evaluation
  372. wordrec_debug_blamer 0 Print blamer debug messages
  373. wordrec_run_blamer 0 Try to set the blame for errors
  374. enable_new_segsearch 0 Enable new segmentation search path.
  375. save_alt_choices 1 Save alternative paths found during chopping and segmentation search
  376. language_model_ngram_on 0 Turn on/off the use of character ngram model
  377. language_model_ngram_use_only_first_uft8_step 0 Use only the first UTF8 step of the given string when computing log probabilities.
  378. language_model_ngram_space_delimited_language 1 Words are delimited by space
  379. language_model_use_sigmoidal_certainty 0 Use sigmoidal score for certainty
  380. tessedit_resegment_from_boxes 0 Take segmentation and labeling from box file
  381. tessedit_resegment_from_line_boxes 0 Conversion of word/line box file to char box file
  382. tessedit_train_from_boxes 0 Generate training data from boxed chars
  383. tessedit_make_boxes_from_boxes 0 Generate more boxes from boxed chars
  384. tessedit_dump_pageseg_images 0 Dump intermediate images made during page segmentation
  385. tessedit_ambigs_training 0 Perform training for ambiguities
  386. tessedit_adapt_to_char_fragments 1 Adapt to words that contain a character composed from fragments
  387. tessedit_adaption_debug 0 Generate and print debug information for adaption
  388. applybox_learn_chars_and_char_frags_mode 0 Learn both character fragments (as is done in the special low exposure mode) as well as unfragmented characters.
  389. applybox_learn_ngrams_mode 0 Each bounding box is assumed to contain ngrams. Only learn the ngrams whose outlines overlap horizontally.
  390. tessedit_display_outwords 0 Draw output words
  391. tessedit_training_tess 0 Call Tess to learn blobs
  392. tessedit_dump_choices 0 Dump char choices
  393. tessedit_fix_fuzzy_spaces 1 Try to improve fuzzy spaces
  394. tessedit_unrej_any_wd 0 Don’t bother with word plausibility
  395. tessedit_fix_hyphens 1 Crunch double hyphens?
  396. tessedit_redo_xheight 1 Check/Correct x-height
  397. tessedit_enable_doc_dict 1 Add words to the document dictionary
  398. tessedit_debug_fonts 0 Output font info per char
  399. tessedit_debug_block_rejection 0 Block and Row stats
  400. tessedit_enable_bigram_correction 1 Enable correction based on the word bigram dictionary.
  401. debug_acceptable_wds 0 Dump word pass/fail chk
  402. tessedit_tess_adapt_to_rejmap 0 Use reject map to control Tesseract adaption
  403. tessedit_minimal_rej_pass1 0 Do minimal rejection on pass 1 output
  404. tessedit_test_adaption 0 Test adaption criteria
  405. tessedit_matcher_log 0 Log matcher activity
  406. save_blob_choices 0 Save the results of the recognition step (blob_choices) within the corresponding WERD_CHOICE
  407. test_pt 0 Test for point
  408. docqual_excuse_outline_errs 0 Allow outline errs in unrejection?
  409. tessedit_good_quality_unrej 1 Reduce rejection on good docs
  410. tessedit_use_reject_spaces 1 Reject spaces?
  411. tessedit_preserve_blk_rej_perfect_wds 1 Only rej partially rejected words in block rejection
  412. tessedit_preserve_row_rej_perfect_wds 1 Only rej partially rejected words in row rejection
  413. tessedit_dont_blkrej_good_wds 0 Use word segmentation quality metric
  414. tessedit_dont_rowrej_good_wds 0 Use word segmentation quality metric
  415. tessedit_row_rej_good_docs 1 Apply row rejection to good docs
  416. tessedit_reject_bad_qual_wds 1 Reject all bad quality wds
  417. tessedit_debug_doc_rejection 0 Page stats
  418. tessedit_debug_quality_metrics 0 Output data to debug file
  419. bland_unrej 0 unrej potential with no checks
  420. unlv_tilde_crunching 1 Mark v.bad words for tilde crunch
  421. crunch_early_merge_tess_fails 1 Before word crunch?
  422. crunch_early_convert_bad_unlv_chs 0 Take out ~^ early?
  423. crunch_terrible_garbage 1 As it says
  424. crunch_pot_garbage 1 POTENTIAL crunch garbage
  425. crunch_leave_ok_strings 1 Don’t touch sensible strings
  426. crunch_accept_ok 1 Use acceptability in okstring
  427. crunch_leave_accept_strings 0 Don’t pot crunch sensible strings
  428. crunch_include_numerals 0 Fiddle alpha figures
  429. tessedit_prefer_joined_punct 0 Reward punctuation joins
  430. tessedit_write_block_separators 0 Write block separators in output
  431. tessedit_write_rep_codes 0 Write repetition char code
  432. tessedit_write_unlv 0 Write .unlv output file
  433. tessedit_create_hocr 0 Write .html hOCR output file
  434. suspect_constrain_1Il 0 UNLV keep 1Il chars rejected
  435. tessedit_minimal_rejection 0 Only reject tess failures
  436. tessedit_zero_rejection 0 Don’t reject ANYTHING
  437. tessedit_word_for_word 0 Make output have exactly one word per WERD
  438. tessedit_zero_kelvin_rejection 0 Don’t reject ANYTHING AT ALL
  439. tessedit_consistent_reps 1 Force all rep chars the same
  440. tessedit_rejection_debug 0 Adaption debug
  441. tessedit_flip_0O 1 Contextual 0O O0 flips
  442. rej_trust_doc_dawg 0 Use DOC dawg in 11l conf. detector
  443. rej_1Il_use_dict_word 0 Use dictword test
  444. rej_1Il_trust_permuter_type 1 Don’t double check
  445. rej_use_tess_accepted 1 Individual rejection control
  446. rej_use_tess_blanks 1 Individual rejection control
  447. rej_use_good_perm 1 Individual rejection control
  448. rej_use_sensible_wd 0 Extend permuter check
  449. rej_alphas_in_number_perm 0 Extend permuter check
  450. tessedit_create_boxfile 0 Output text with boxes
  451. tessedit_write_images 0 Capture the image from the IPE
  452. interactive_display_mode 0 Run interactively?
  453. tessedit_override_permuter 1 According to dict_word
  454. textord_tabfind_show_vlines 0 Debug line finding
  455. textord_use_cjk_fp_model 0 Use CJK fixed pitch model
  456. tessedit_init_config_only 0 Only initialize with the config file. Useful if the instance is not going to be used for OCR but say only for layout analysis.
  457. textord_equation_detect 0 Turn on equation detector
  458. textord_single_height_mode 0 Script has no xheight, so use a single mode
  459. tosp_old_to_method 0 Space stats use prechopping?
  460. tosp_old_to_constrain_sp_kn 0 Constrain relative values of inter and intra-word gaps for old_to_method.
  461. tosp_only_use_prop_rows 1 Block stats to use fixed pitch rows?
  462. tosp_force_wordbreak_on_punct 0 Force word breaks on punct to break long lines in non-space delimited langs
  463. tosp_use_pre_chopping 0 Space stats use prechopping?
  464. tosp_old_to_bug_fix 0 Fix suspected bug in old code
  465. tosp_block_use_cert_spaces 1 Only stat OBVIOUS spaces
  466. tosp_row_use_cert_spaces 1 Only stat OBVIOUS spaces
  467. tosp_narrow_blobs_not_cert 1 Only stat OBVIOUS spaces
  468. tosp_row_use_cert_spaces1 1 Only stat OBVIOUS spaces
  469. tosp_recovery_isolated_row_stats 1 Use row alone when inadequate cert spaces
  470. tosp_only_small_gaps_for_kern 0 Better guess
  471. tosp_all_flips_fuzzy 0 Pass ANY flip to context?
  472. tosp_fuzzy_limit_all 1 Don’t restrict kn->sp fuzzy limit to tables
  473. tosp_stats_use_xht_gaps 1 Use within xht gap for wd breaks
  474. tosp_use_xht_gaps 1 Use within xht gap for wd breaks
  475. tosp_only_use_xht_gaps 0 Only use within xht gap for wd breaks
  476. tosp_rule_9_test_punct 0 Don’t change kn to space next to punct
  477. tosp_flip_fuzz_kn_to_sp 1 Default flip
  478. tosp_flip_fuzz_sp_to_kn 1 Default flip
  479. tosp_improve_thresh 0 Enable improvement heuristic
  480. textord_no_rejects 0 Don’t remove noise blobs
  481. textord_show_blobs 0 Display unsorted blobs
  482. textord_show_boxes 0 Display unsorted blobs
  483. textord_noise_rejwords 1 Reject noise-like words
  484. textord_noise_rejrows 1 Reject noise-like rows
  485. textord_noise_debug 0 Debug row garbage detector
  486. m_data_sub_dir tessdata/ Directory for data files
  487. classify_learn_debug_str Class str to debug learning
  488. user_words_suffix A list of user-provided words.
  489. user_patterns_suffix A list of user-provided patterns.
  490. output_ambig_words_file Output file for ambiguities found in the dictionary
  491. word_to_debug Word for which stopper debug information should be printed to stdout
  492. word_to_debug_lengths Lengths of unichars in word_to_debug
  493. tessedit_char_blacklist Blacklist of chars not to recognize
  494. tessedit_char_whitelist Whitelist of chars to recognize
  495. tessedit_write_params_to_file Write all parameters to the given file.
  496. applybox_exposure_pattern .exp Exposure value follows this pattern in the image filename. The names of the image files are expected to be in the form [lang].[fontname].exp[num].tif
  497. chs_leading_punct (’`” Leading punctuation
  498. chs_trailing_punct1 ).,;:?! 1st Trailing punctuation
  499. chs_trailing_punct2 )’`” 2nd Trailing punctuation
  500. outlines_odd %| Non standard number of outlines
  501. outlines_2 ij!?%”:; Non standard number of outlines
  502. numeric_punctuation ., Punct. chs expected WITHIN numbers
  503. unrecognised_char | Output char for unidentified blobs
  504. ok_repeated_ch_non_alphanum_wds -?*= Allow NN to unrej
  505. conflict_set_I_l_1 Il1[] Il1 conflict set
  506. file_type .tif Filename extension
  507. tessedit_load_sublangs List of languages to load with this one
  508. classify_char_norm_range 0.2 Character Normalization Range …
  509. classify_min_norm_scale_x 0 Min char x-norm scale …
  510. classify_max_norm_scale_x 0.325 Max char x-norm scale …
  511. classify_min_norm_scale_y 0 Min char y-norm scale …
  512. classify_max_norm_scale_y 0.325 Max char y-norm scale …
  513. matcher_good_threshold 0.125 Good Match (0-1)
  514. matcher_great_threshold 0 Great Match (0-1)
  515. matcher_perfect_threshold 0.02 Perfect Match (0-1)
  516. matcher_bad_match_pad 0.15 Bad Match Pad (0-1)
  517. matcher_rating_margin 0.1 New template margin (0-1)
  518. matcher_avg_noise_size 12 Avg. noise blob length
  519. matcher_clustering_max_angle_delta 0.015 Maximum angle delta for prototype clustering
  520. classify_misfit_junk_penalty 0 Penalty to apply when a non-alnum is vertically out of its expected textline position
  521. rating_scale 1.5 Rating scaling factor
  522. certainty_scale 20 Certainty scaling factor
  523. tessedit_class_miss_scale 0.00390625 Scale factor for features not used
  524. classify_character_fragments_garbage_certainty_threshold -3 Exclude fragments that do not look like whole characters from training and adaption
  525. segment_penalty_dict_frequent_word 1 Score multiplier for word matches which have good case and are frequent in the given language (lower is better).
  526. segment_penalty_dict_case_ok 1.1 Score multiplier for word matches that have good case (lower is better).
  527. segment_penalty_dict_case_bad 1.3125 Default score multiplier for word matches, which may have case issues (lower is better).
  528. segment_penalty_ngram_best_choice 1.24 Multiplier for the best choice from the ngram model.
  529. segment_penalty_dict_nonword 1.25 Score multiplier for glyph fragment segmentations which do not match a dictionary word (lower is better).
  530. segment_penalty_garbage 1.5 Score multiplier for poorly cased strings that are not in the dictionary and generally look like garbage (lower is better).
  531. certainty_scale 20 Certainty scaling factor
  532. stopper_nondict_certainty_base -2.5 Certainty threshold for non-dict words
  533. stopper_phase2_certainty_rejection_offset 1 Reject certainty offset
  534. stopper_certainty_per_char -0.5 Certainty to add for each dict char above small word size.
  535. stopper_allowable_character_badness 3 Max certainty variation allowed in a word (in sigma)
  536. stopper_ambiguity_threshold_gain 8 Gain factor for ambiguity threshold.
  537. stopper_ambiguity_threshold_offset 1.5 Certainty offset for ambiguity threshold.
  538. bestrate_pruning_factor 2 Multiplying factor of current best rate to prune other hypotheses
  539. segment_reward_script 0.95 Score multiplier for script consistency within a word. Being a ‘reward’ factor, it should be <= 1. Smaller value implies bigger reward.
  540. segment_reward_chartype 0.97 Score multiplier for char type consistency within a word.
  541. segment_reward_ngram_best_choice 0.99 Score multiplier for ngram permuter’s best choice (only used in the Han script path).
  542. doc_dict_pending_threshold 0 Worst certainty for using pending dictionary
  543. doc_dict_certainty_threshold -2.25 Worst certainty for words that can be inserted into the document dictionary
  544. wordrec_worst_state 1 Worst segmentation state
  545. tessedit_certainty_threshold -2.25 Good blob limit
  546. chop_split_dist_knob 0.5 Split length adjustment
  547. chop_overlap_knob 0.9 Split overlap adjustment
  548. chop_center_knob 0.15 Split center adjustment
  549. chop_sharpness_knob 0.06 Split sharpness adjustment
  550. chop_width_change_knob 5 Width change adjustment
  551. chop_ok_split 100 OK split limit
  552. chop_good_split 50 Good split limit
  553. heuristic_segcost_rating_base 1.25 base factor for adding segmentation cost into word rating. It’s a multiplying factor, the larger the value above 1, the bigger the effect of segmentation cost.
  554. heuristic_weight_rating 1 weight associated with char rating in combined cost of state
  555. heuristic_weight_width 1000 weight associated with width evidence in combined cost of state
  556. heuristic_weight_seamcut 0 weight associated with seam cut in combined cost of state
  557. heuristic_max_char_wh_ratio 2 max char width-to-height ratio allowed in segmentation
  558. segsearch_max_char_wh_ratio 2 Maximum character width-to-height ratio
  559. segsearch_max_fixed_pitch_char_wh_ratio 2 Maximum character width-to-height ratio for fixed-pitch fonts
  560. language_model_ngram_small_prob 0.000001 To avoid overly small denominators use this as the floor of the probability returned by the ngram model.
  561. language_model_ngram_nonmatch_score -40 Average classifier score of a non-matching unichar.
  562. language_model_ngram_scale_factor 0.03 Strength of the character ngram model relative to the character classifier
  563. language_model_penalty_non_freq_dict_word 0.1 Penalty for words not in the frequent word dictionary
  564. language_model_penalty_non_dict_word 0.15 Penalty for non-dictionary words
  565. language_model_penalty_punc 0.2 Penalty for inconsistent punctuation
  566. language_model_penalty_case 0.1 Penalty for inconsistent case
  567. language_model_penalty_script 0.5 Penalty for inconsistent script
  568. language_model_penalty_chartype 0.3 Penalty for inconsistent character type
  569. language_model_penalty_font 0 Penalty for inconsistent font
  570. language_model_penalty_spacing 0.05 Penalty for inconsistent spacing
  571. language_model_penalty_increment 0.01 Penalty increment
  572. quality_rej_pc 0.08 good_quality_doc lte rejection limit
  573. quality_blob_pc 0 good_quality_doc gte good blobs limit
  574. quality_outline_pc 1 good_quality_doc lte outline error limit
  575. quality_char_pc 0.95 good_quality_doc gte good char limit
  576. test_pt_x 100000 xcoord
  577. test_pt_y 100000 ycoord
  578. tessedit_reject_doc_percent 65 %rej allowed before rej whole doc
  579. tessedit_reject_block_percent 45 %rej allowed before rej whole block
  580. tessedit_reject_row_percent 40 %rej allowed before rej whole row
  581. tessedit_whole_wd_rej_row_percent 70 Number of row rejects in whole word rejects which prevents whole row rejection
  582. tessedit_good_doc_still_rowrej_wd 1.1 rej good doc wd if more than this fraction rejected
  583. quality_rowrej_pc 1.1 good_quality_doc gte good char limit
  584. crunch_terrible_rating 80 crunch rating lt this
  585. crunch_poor_garbage_cert -9 crunch garbage cert lt this
  586. crunch_poor_garbage_rate 60 crunch garbage rating lt this
  587. crunch_pot_poor_rate 40 POTENTIAL crunch rating lt this
  588. crunch_pot_poor_cert -8 POTENTIAL crunch cert lt this
  589. crunch_del_rating 60 POTENTIAL crunch rating lt this
  590. crunch_del_cert -10 POTENTIAL crunch cert lt this
  591. crunch_del_min_ht 0.7 Del if word ht lt xht x this
  592. crunch_del_max_ht 3 Del if word ht gt xht x this
  593. crunch_del_min_width 3 Del if word width lt xht x this
  594. crunch_del_high_word 1.5 Del if word gt xht x this above bl
  595. crunch_del_low_word 0.5 Del if word gt xht x this below bl
  596. crunch_small_outlines_size 0.6 Small if lt xht x this
  597. fixsp_small_outlines_size 0.28 Small if lt xht x this
  598. suspect_rating_per_ch 999.9 Don’t touch bad rating limit
  599. suspect_accept_rating -999.9 Accept good rating limit
  600. tessedit_lower_flip_hyphen 1.5 Aspect ratio dot/hyphen test
  601. tessedit_upper_flip_hyphen 1.8 Aspect ratio dot/hyphen test
  602. rej_whole_of_mostly_reject_word_fract 0.85 if >this fract
  603. min_orientation_margin 7 Min acceptable orientation margin
  604. tosp_old_sp_kn_th_factor 2 Factor for defining space threshold in terms of space and kern sizes
  605. tosp_threshold_bias1 0 how far between kern and space?
  606. tosp_threshold_bias2 0 how far between kern and space?
  607. tosp_narrow_fraction 0.3 Fract of xheight for narrow
  608. tosp_narrow_aspect_ratio 0.48 narrow if w/h less than this
  609. tosp_wide_fraction 0.52 Fract of xheight for wide
  610. tosp_wide_aspect_ratio 0 wide if w/h less than this
  611. tosp_fuzzy_space_factor 0.6 Fract of xheight for fuzz sp
  612. tosp_fuzzy_space_factor1 0.5 Fract of xheight for fuzz sp
  613. tosp_fuzzy_space_factor2 0.72 Fract of xheight for fuzz sp
  614. tosp_gap_factor 0.83 gap ratio to flip sp->kern
  615. tosp_kern_gap_factor1 2 gap ratio to flip kern->sp
  616. tosp_kern_gap_factor2 1.3 gap ratio to flip kern->sp
  617. tosp_kern_gap_factor3 2.5 gap ratio to flip kern->sp
  618. tosp_ignore_big_gaps -1 xht multiplier
  619. tosp_ignore_very_big_gaps 3.5 xht multiplier
  620. tosp_rep_space 1.6 rep gap multiplier for space
  621. tosp_enough_small_gaps 0.65 Fract of kerns reqd for isolated row stats
  622. tosp_table_kn_sp_ratio 2.25 Min difference of kn & sp in table
  623. tosp_table_xht_sp_ratio 0.33 Expect spaces bigger than this
  624. tosp_table_fuzzy_kn_sp_ratio 3 Fuzzy if less than this
  625. tosp_fuzzy_kn_fraction 0.5 New fuzzy kn alg
  626. tosp_fuzzy_sp_fraction 0.5 New fuzzy sp alg
  627. tosp_min_sane_kn_sp 1.5 Don’t trust spaces less than this times kn
  628. tosp_init_guess_kn_mult 2.2 Thresh guess – mult kn by this
  629. tosp_init_guess_xht_mult 0.28 Thresh guess – mult xht by this
  630. tosp_max_sane_kn_thresh 5 Multiplier on kn to limit thresh
  631. tosp_flip_caution 0 Don’t autoflip kn to sp when large separation
  632. tosp_large_kerning 0.19 Limit use of xht gap with large kns
  633. tosp_dont_fool_with_small_kerns -1 Limit use of xht gap with odd small kns
  634. tosp_near_lh_edge 0 Don’t reduce box if the top left is non blank
  635. tosp_silly_kn_sp_gap 0.2 Don’t let sp minus kn get too small
  636. tosp_pass_wide_fuzz_sp_to_context 0.75 How wide fuzzies need context
  637. textord_blob_size_bigile 95 Percentile for large blobs
  638. textord_noise_area_ratio 0.7 Fraction of bounding box for noise
  639. textord_blob_size_smallile 20 Percentile for small blobs
  640. textord_initialx_ile 0.75 Ile of sizes for xheight guess
  641. textord_initialasc_ile 0.9 Ile of sizes for xheight guess
  642. textord_noise_sizelimit 0.5 Fraction of x for big t count
  643. textord_noise_normratio 2 Dot to norm ratio for deletion
  644. textord_noise_syfract 0.2 xh fract height error for norm blobs
  645. textord_noise_sxfract 0.4 xh fract width error for norm blobs
  646. textord_noise_hfract 0.015625 Height fraction to discard outlines as speckle noise
  647. textord_noise_rowratio 6 Dot to norm ratio for deletion
  648. textord_blshift_maxshift 0 Max baseline shift
  649. textord_blshift_xfraction 9.99 Min size of baseline shift
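
The parameters listed above are typically overridden through a plain-text config file with one `name value` pair per line. The following is a minimal sketch of generating such a file's contents, assuming that format; the helper name `render_config` and the chosen override values are illustrative assumptions, not part of this dump or of Tesseract itself.

```python
def render_config(params):
    """Render a dict of parameter overrides as config-file text.

    Assumes the "name value" one-pair-per-line layout; booleans are
    written as 0/1 to match how the listing above shows them.
    """
    lines = []
    for name, value in params.items():
        if isinstance(value, bool):
            value = int(value)  # True -> 1, False -> 0
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical overrides; the parameter names come from the listing above.
overrides = {
    "tessedit_char_whitelist": "0123456789",
    "tessedit_create_boxfile": True,
    "tessedit_reject_doc_percent": 65,
}

config_text = render_config(overrides)
print(config_text, end="")
```

The resulting text can be saved to a file and passed to the engine as a config; where that file must live depends on the installation, so the location is left open here.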