solr.xml-lostCommitIssue

a guest
Feb 5th, 2020
<?xml version="1.0" encoding="UTF-8" ?>

<!--
    For more details about configuration options that may appear in
    this file, see http://wiki.apache.org/solr/SolrConfigXml.
-->
<config>
    <!-- Controls what version of Lucene various components of Solr
        adhere to.  Generally, you want to use the latest version to
        get all bug fixes and improvements. It is highly recommended
        that you fully re-index after changing this setting as it can
        affect both how text is indexed and queried.
   -->
    <luceneMatchVersion>8.4.1</luceneMatchVersion>


    <!-- A 'dir' option by itself adds any files found in the directory
        to the classpath; this is useful for including all jars in a
        directory.

        When a 'regex' is specified in addition to a 'dir', only the
        files in that directory which completely match the regex
        (anchored on both ends) will be included.

        If a 'dir' option (with or without a regex) is used and nothing
        is found that matches, a warning will be logged.

        The examples below can be used to load some solr-contribs along
        with their external dependencies.
     -->
    <lib dir="${solr.install.dir:../../../..}/contrib/extraction/lib" regex=".*\.jar"/>
    <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-cell-\d.*\.jar"/>

    <lib dir="${solr.install.dir:../../../..}/contrib/clustering/lib/" regex=".*\.jar"/>
    <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-clustering-\d.*\.jar"/>

    <lib dir="${solr.install.dir:../../../..}/contrib/langid/lib/" regex=".*\.jar"/>
    <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-langid-\d.*\.jar"/>

    <lib dir="${solr.install.dir:../../../..}/contrib/velocity/lib" regex=".*\.jar"/>
    <lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-velocity-\d.*\.jar"/>

    <!-- Data Directory

        Used to specify an alternate directory to hold all index data
        other than the default ./data under the Solr home.  If
        replication is in use, this should match the replication
        configuration.
     -->
    <dataDir>${solr.data.dir:}</dataDir>


    <!-- The DirectoryFactory to use for indexes.

        solr.StandardDirectoryFactory is filesystem
        based and tries to pick the best implementation for the current
        JVM and platform.  solr.NRTCachingDirectoryFactory, the default,
        wraps solr.StandardDirectoryFactory and caches small files in memory
        for better NRT performance.

        One can force a particular implementation via solr.MMapDirectoryFactory,
        solr.NIOFSDirectoryFactory, or solr.SimpleFSDirectoryFactory.

        solr.RAMDirectoryFactory is memory based and not persistent.
     -->
    <directoryFactory name="DirectoryFactory"
                     class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>

    <!-- The CodecFactory for defining the format of the inverted index.
        The default implementation is SchemaCodecFactory, which is the official Lucene
        index format, but hooks into the schema to provide per-field customization of
        the postings lists and per-document values in the fieldType element
        (postingsFormat/docValuesFormat). Note that most of the alternative implementations
        are experimental, so if you choose to customize the index format, it's a good
        idea to convert back to the official format e.g. via IndexWriter.addIndexes(IndexReader)
        before upgrading to a newer version to avoid unnecessary reindexing.
        A "compressionMode" string element can be added to <codecFactory> to choose
        between the existing compression modes in the default codec: "BEST_SPEED" (default)
        or "BEST_COMPRESSION".
   -->
    <codecFactory class="solr.SchemaCodecFactory"/>
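
    <!-- Illustrative sketch, not part of the original paste: the
        "compressionMode" element described above could be set as in the
        commented-out block below. Choosing BEST_COMPRESSION is only an
        assumption for the example; the active configuration keeps the
        default (BEST_SPEED).
     -->
    <!--
      <codecFactory class="solr.SchemaCodecFactory">
        <str name="compressionMode">BEST_COMPRESSION</str>
      </codecFactory>
     -->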

    <!-- To disable dynamic schema REST APIs, use the following for <schemaFactory>: -->

    <schemaFactory class="ClassicIndexSchemaFactory"/>


    <!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        Index Config - These settings control low-level behavior of indexing
        Most example settings here show the default value, but are commented
        out, to more easily see where customizations have been made.

        Note: This replaces <indexDefaults> and <mainIndex> from older versions
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -->
    <indexConfig>
        <!-- LockFactory

            This option specifies which Lucene LockFactory implementation
            to use.

            single = SingleInstanceLockFactory - suggested for a
                     read-only index or when there is no possibility of
                     another process trying to modify the index.
            native = NativeFSLockFactory - uses OS native file locking.
                     Do not use when multiple solr webapps in the same
                     JVM are attempting to share a single index.
            simple = SimpleFSLockFactory  - uses a plain file for locking

            Defaults: 'native' is default for Solr 3.6 and later, otherwise
                      'simple' is the default

            More details on the nuances of each LockFactory...
            http://wiki.apache.org/lucene-java/AvailableLockFactories
       -->
        <lockType>${solr.lock.type:native}</lockType>

    </indexConfig>


    <!-- JMX

        This example enables JMX if and only if an existing MBeanServer
        is found; use this if you want to configure JMX through JVM
        parameters. Remove this to disable exposing Solr configuration
        and statistics to JMX.

        For more details see http://wiki.apache.org/solr/SolrJmx
     -->
    <jmx/>
    <!-- If you want to connect to a particular server, specify the
        agentId
     -->
    <!-- <jmx agentId="myAgent" /> -->
    <!-- If you want to start a new MBeanServer, specify the serviceUrl -->
    <!-- <jmx serviceUrl="service:jmx:rmi:///jndi/rmi://localhost:9999/solr"/>
     -->

    <!-- The default high-performance update handler -->
    <updateHandler class="solr.DirectUpdateHandler2">

        <!-- Enables a transaction log, used for real-time get, durability,
            and SolrCloud replica recovery.  The log can grow as big as
            uncommitted changes to the index, so use of a hard autoCommit
            is recommended (see below).
            "dir" - the target directory for transaction logs, defaults to the
                   solr data directory.
            "numVersionBuckets" - sets the number of buckets used to keep
                   track of max version values when checking for re-ordered
                   updates; increase this value to reduce the cost of
                   synchronizing access to version buckets during high-volume
                   indexing. This requires 8 bytes (long) * numVersionBuckets
                   of heap space per Solr core.
       -->
        <updateLog>
            <str name="dir">${solr.ulog.dir:}</str>
            <int name="numVersionBuckets">${solr.ulog.numVersionBuckets:65536}</int>
        </updateLog>

        <!-- AutoCommit

            Perform a hard commit automatically under certain conditions.
            Instead of enabling autoCommit, consider using "commitWithin"
            when adding documents.

            http://wiki.apache.org/solr/UpdateXmlMessages

            maxDocs - Maximum number of documents to add since the last
                      commit before automatically triggering a new commit.

            maxTime - Maximum amount of time in ms that is allowed to pass
                      since a document was added before automatically
                      triggering a new commit.
            openSearcher - if false, the commit causes recent index changes
              to be flushed to stable storage, but does not cause a new
              searcher to be opened to make those changes visible.

            If the updateLog is enabled, then it's highly recommended to
            have some sort of hard autoCommit to limit the log size.
         -->
        <autoCommit>
            <maxDocs>${solr.autoCommit.maxDocs:10000}</maxDocs>
            <openSearcher>false</openSearcher>
        </autoCommit>
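
        <!-- Illustrative sketch, not part of the original paste: the block
            above sets only maxDocs, so a hard commit is triggered by
            document count alone. A time-based limit, as described under
            maxTime, could be added as in the commented-out block below.
            The 15000 ms value and the solr.autoCommit.maxTime property
            name are assumptions for the example.
         -->
        <!--
          <autoCommit>
            <maxDocs>${solr.autoCommit.maxDocs:10000}</maxDocs>
            <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
            <openSearcher>false</openSearcher>
          </autoCommit>
         -->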

        <!-- softAutoCommit is like autoCommit except it causes a
            'soft' commit which only ensures that changes are visible
            but does not ensure that data is synced to disk.  This is
            faster and more near-realtime friendly than a hard commit.
         -->

        <autoSoftCommit>
            <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
        </autoSoftCommit>
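
        <!-- Illustrative sketch, not part of the original paste: with the
            fallback of -1 above, automatic soft commits stay disabled
            unless solr.autoSoftCommit.maxTime is set at startup. Since
            the hard autoCommit in this file uses openSearcher=false, new
            documents would then become searchable only through an
            explicit commit or commitWithin. A time-based soft commit
            could look like the commented-out block below; the 1000 ms
            fallback is an assumption for the example.
         -->
        <!--
          <autoSoftCommit>
            <maxTime>${solr.autoSoftCommit.maxTime:1000}</maxTime>
          </autoSoftCommit>
         -->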

    </updateHandler>

    <!-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        Query section - these settings control query time things like caches
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -->
    <query>

        <!-- Maximum number of clauses in each BooleanQuery; an exception
            is thrown if exceeded.  It is safe to increase or remove this setting,
            since it is purely an arbitrary limit to try and catch user errors where
            large boolean queries may not be the best implementation choice.
         -->
        <maxBooleanClauses>1024</maxBooleanClauses>

        <!-- Solr Internal Query Caches

            There are two implementations of cache available for Solr,
            LRUCache, based on a synchronized LinkedHashMap, and
            FastLRUCache, based on a ConcurrentHashMap.

            FastLRUCache has faster gets and slower puts in single
            threaded operation and thus is generally faster than LRUCache
            when the hit ratio of the cache is high (> 75%), and may be
            faster under other scenarios on multi-cpu systems.
       -->

        <!-- Filter Cache

            Cache used by SolrIndexSearcher for filters (DocSets),
            unordered sets of *all* documents that match a query.  When a
            new searcher is opened, its caches may be prepopulated or
            "autowarmed" using data from caches in the old searcher.
            autowarmCount is the number of items to prepopulate.  For
            LRUCache, the autowarmed items will be the most recently
            accessed items.

            Parameters:
              class - the SolrCache implementation
                  (LRUCache or FastLRUCache)
              size - the maximum number of entries in the cache
              initialSize - the initial capacity (number of entries) of
                  the cache.  (see java.util.HashMap)
              autowarmCount - the number of entries to prepopulate from
                  an old cache.
              maxRamMB - the maximum amount of RAM (in MB) that this cache is allowed
                         to occupy. Note that when this option is specified, the size
                         and initialSize parameters are ignored.
         -->
        <filterCache class="solr.FastLRUCache"
                    size="512"
                    initialSize="512"
                    autowarmCount="0"/>

        <!-- Query Result Cache

            Caches results of searches - ordered lists of document ids
            (DocList) based on a query, a sort, and the range of documents requested.
            Additional parameter supported by LRUCache:
               maxRamMB - the maximum amount of RAM (in MB) that this cache is allowed
                          to occupy
         -->
        <queryResultCache class="solr.LRUCache"
                         size="512"
                         initialSize="512"
                         autowarmCount="0"/>
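
        <!-- Illustrative sketch, not part of the original paste: a
            queryResultCache bounded by memory through the maxRamMB
            parameter described above, instead of by entry count. The
            64 MB figure is an assumption for the example, not a
            recommendation.
         -->
        <!--
          <queryResultCache class="solr.LRUCache"
                            maxRamMB="64"
                            autowarmCount="0"/>
         -->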

        <!-- Document Cache

            Caches Lucene Document objects (the stored fields for each
            document).  Since Lucene internal document ids are transient,
            this cache will not be autowarmed.
         -->
        <documentCache class="solr.LRUCache"
                      size="512"
                      initialSize="512"
                      autowarmCount="0"/>

        <!-- custom cache currently used by block join -->
        <cache name="perSegFilter"
              class="solr.search.LRUCache"
              size="10"
              initialSize="0"
              autowarmCount="0"
              regenerator="solr.NoOpRegenerator"/>

        <!-- Field Value Cache

            Cache used to hold field values that are quickly accessible
            by document id.  The fieldValueCache is created by default
            even if not configured here.
         -->
        <fieldValueCache class="solr.FastLRUCache"
                        size="512"
                        autowarmCount="0"
                        showItems="32" />

        <!-- Lazy Field Loading

            If true, stored fields that are not requested will be loaded
            lazily.  This can result in a significant speed improvement
            if the usual case is to not load all stored fields,
            especially if the skipped fields are large compressed text
            fields.
       -->
        <enableLazyFieldLoading>true</enableLazyFieldLoading>

        <!-- Use Filter For Sorted Query

            A possible optimization that attempts to use a filter to
            satisfy a search.  If the requested sort does not include
            score, then the filterCache will be checked for a filter
            matching the query. If found, the filter will be used as the
            source of document ids, and then the sort will be applied to
            that.

            For most situations, this will not be useful unless you
            frequently get the same search repeatedly with different sort
            options, and none of them ever use "score"
         -->
        <!--
          <useFilterForSortedQuery>true</useFilterForSortedQuery>
         -->

        <!-- Result Window Size

            An optimization for use with the queryResultCache.  When a search
            is requested, a superset of the requested number of document ids
            is collected.  For example, if a search for a particular query
            requests matching documents 10 through 19, and queryResultWindowSize
            is 50, then documents 0 through 49 will be collected and cached.
            Any further requests in that range can be satisfied via the cache.
         -->
        <queryResultWindowSize>20</queryResultWindowSize>

        <!-- Maximum number of documents to cache for any entry in the
            queryResultCache.
         -->
        <queryResultMaxDocsCached>200</queryResultMaxDocsCached>

        <!-- Use Cold Searcher

            If a search request comes in and there is no current
            registered searcher, then immediately register the still
            warming searcher and use it.  If "false" then all requests
            will block until the first searcher is done warming.
         -->
        <useColdSearcher>true</useColdSearcher>


        <!-- Unfortunately, Solr does not allow different slow request
            thresholds for write (/update) and read (/select) requests.
            While we accept batch updates that take 10s or 20s, read
            queries should take less than 1s. If we set the threshold to
            1s, the log is spammed with thousands of entries during the
            indexing phase. Therefore we use the following configuration:
            - in solrconfig.xml of the collections: slowQueryThresholdMillis = 1000.
              This logs slow queries with log level "warn".
            - in log4j2.xml: log level "error", i.e. by default no slow
              queries are logged.
            If you need to debug slow queries, you can temporarily set the log level for class
            org.apache.solr.core.SolrCore.SlowRequest to "warn" or lower using the Solr Admin UI
            (see https://lucene.apache.org/solr/guide/7_7/configuring-logging.html#temporary-logging-settings) -->
        <slowQueryThresholdMillis>1000</slowQueryThresholdMillis>
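
        <!-- Illustrative sketch, not part of the original paste: the
            log4j2.xml side of the setup described above could look like
            the commented-out logger below. It belongs in log4j2.xml, not
            in this file, and the exact logger definition used by the
            original deployment is an assumption.
         -->
        <!--
          <Logger name="org.apache.solr.core.SolrCore.SlowRequest" level="error"/>
         -->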

    </query>


    <!-- Request Dispatcher

        This section contains instructions for how the SolrDispatchFilter
        should behave when processing requests for this SolrCore.

     -->
    <requestDispatcher handleSelect="false" >
        <!-- HTTP Caching

            Set HTTP caching related parameters (for proxy caches and clients).

            The options below instruct Solr not to output any HTTP Caching
            related headers
         -->
        <httpCaching never304="true"/>
        <!-- If you include a <cacheControl> directive, it will be used to
            generate a Cache-Control header (as well as an Expires header
            if the value contains "max-age=")

            By default, no Cache-Control header is generated.

            You can use the <cacheControl> option even if you have set
            never304="true"
         -->
        <!--
          <httpCaching never304="true" >
            <cacheControl>max-age=30, public</cacheControl>
          </httpCaching>
         -->
        <!-- To enable Solr to respond with automatically generated HTTP
            Caching headers, and to respond to Cache Validation requests
            correctly, set the value of never304="false"

            This will cause Solr to generate Last-Modified and ETag
            headers based on the properties of the Index.

            The following options can also be specified to affect the
            values of these headers...

            lastModFrom - the default value is "openTime" which means the
            Last-Modified value (and validation against If-Modified-Since
            requests) will all be relative to when the current Searcher
            was opened.  You can change it to lastModFrom="dirLastMod" if
            you want the value to exactly correspond to when the physical
            index was last modified.

            etagSeed="..." is an option you can change to force the ETag
            header (and validation against If-None-Match requests) to be
            different even if the index has not changed (i.e., when making
            significant changes to your config file)

            (lastModifiedFrom and etagSeed are both ignored if you use
            the never304="true" option)
         -->
        <!--
          <httpCaching lastModifiedFrom="openTime"
                       etagSeed="Solr">
            <cacheControl>max-age=30, public</cacheControl>
          </httpCaching>
         -->
    </requestDispatcher>

    <!-- Request Handlers

        http://wiki.apache.org/solr/SolrRequestHandler

        Incoming queries will be dispatched to a specific handler by name
        based on the path specified in the request.

        If a Request Handler is declared with startup="lazy", then it will
        not be initialized until the first request that uses it.

     -->
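    <!-- Illustrative sketch, not part of the original paste: a handler
        declared with startup="lazy", as described above, is not
        initialized until the first request reaches it. The handler name
        "/lazy-example" is hypothetical and only serves the example.
     -->
    <!--
      <requestHandler name="/lazy-example" class="solr.SearchHandler" startup="lazy">
        <lst name="defaults">
          <str name="echoParams">explicit</str>
        </lst>
      </requestHandler>
     -->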
    <!-- SearchHandler

        http://wiki.apache.org/solr/SearchHandler

        For processing Search Queries, the primary Request Handler
        provided with Solr is "SearchHandler". It delegates to a sequence
        of SearchComponents (see below) and supports distributed
        queries across multiple shards
     -->
    <requestHandler name="/select" class="solr.SearchHandler">
        <!-- default values for query parameters can be specified; these
            will be overridden by parameters in the request
         -->
        <lst name="defaults">
            <str name="echoParams">explicit</str>
            <int name="rows">10</int>
        </lst>
    </requestHandler>

    <!-- A request handler that returns indented JSON by default -->
    <requestHandler name="/query" class="solr.SearchHandler">
        <lst name="defaults">
            <str name="echoParams">explicit</str>
            <str name="wt">json</str>
            <str name="indent">true</str>
        </lst>
    </requestHandler>


    <initParams path="/update/**,/query,/select">
        <lst name="defaults">
            <str name="df">_text_</str>
        </lst>
    </initParams>


    <!-- Update Processors

        Chains of Update Processor Factories for dealing with Update
        Requests can be declared, and then used by name in Update
        Request Processors

        http://wiki.apache.org/solr/UpdateRequestProcessor

     -->
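
    <!-- Illustrative sketch, not part of the original paste: this file
        declares no chain of its own. A minimal one could be declared as
        in the commented-out block below and then selected per request
        with the update.chain parameter. The chain name "example-chain"
        is hypothetical.
     -->
    <!--
      <updateRequestProcessorChain name="example-chain">
        <processor class="solr.LogUpdateProcessorFactory"/>
        <processor class="solr.DistributedUpdateProcessorFactory"/>
        <processor class="solr.RunUpdateProcessorFactory"/>
      </updateRequestProcessorChain>
     -->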


    <queryResponseWriter name="json" class="solr.JSONResponseWriter" />


</config>