- <!--
- Licensed to the Apache Software Foundation (ASF) under one or more
- contributor license agreements. See the NOTICE file distributed with
- this work for additional information regarding copyright ownership.
- The ASF licenses this file to You under the Apache License, Version 2.0
- (the "License"); you may not use this file except in compliance with
- the License. You may obtain a copy of the License at
- http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing, software
- distributed under the License is distributed on an "AS IS" BASIS,
- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- See the License for the specific language governing permissions and
- limitations under the License.
- -->
- <!--
- For more details about configuration options that may appear in
- this file, see http://wiki.apache.org/solr/SolrConfigXml.
- -->
- <config>
- <!--
- In all configuration below, a prefix of "solr." for class names
- is an alias that causes Solr to search appropriate packages,
- including org.apache.solr.(search|update|request|core|analysis)
- You may also specify a fully qualified Java classname if you
- have your own custom plugins.
- -->
- <!--
- Controls what version of Lucene various components of Solr
- adhere to. Generally, you want to use the latest version to
- get all bug fixes and improvements. It is highly recommended
- that you fully re-index after changing this setting as it can
- affect both how text is indexed and queried.
- -->
- <luceneMatchVersion>LUCENE_41</luceneMatchVersion>
- <!--
- <lib/> directives can be used to instruct Solr to load any Jars
- identified and use them to resolve any "plugins" specified in
- your solrconfig.xml or schema.xml (ie: Analyzers, Request
- Handlers, etc...).
- All directories and paths are resolved relative to the
- instanceDir.
- Please note that <lib/> directives are processed in the order
- that they appear in your solrconfig.xml file, and are "stacked"
- on top of each other when building a ClassLoader - so if you have
- plugin jars with dependencies on other jars, the "lower level"
- dependency jars should be loaded first.
- If a "./lib" directory exists in your instanceDir, all files
- found in it are included as if you had used the following
- syntax...
- <lib dir="./lib" />
- -->
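- <!--
- A sketch of explicit <lib/> directives, assuming a stock Solr 4.x
- directory layout. The paths and the jar name pattern below are
- assumptions and must be adjusted to this installation. The "regex"
- attribute restricts loading to matching file names inside "dir".
- The /update/extract handler declared later in this file needs the
- extraction (Solr Cell) jars on the classpath, so it is a likely
- candidate for directives of this kind.
- <lib dir="../../contrib/extraction/lib" regex=".*\.jar" />
- <lib dir="../../dist/" regex="solr-cell-\d.*\.jar" />
- -->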
- <dataDir>${solr.data.dir:}</dataDir>
- <!--
- The DirectoryFactory to use for indexes.
- solr.StandardDirectoryFactory is filesystem
- based and tries to pick the best implementation for the current
- JVM and platform. solr.NRTCachingDirectoryFactory, the default,
- wraps solr.StandardDirectoryFactory and caches small files in memory
- for better NRT performance.
- One can force a particular implementation via solr.MMapDirectoryFactory,
- solr.NIOFSDirectoryFactory, or solr.SimpleFSDirectoryFactory.
- solr.RAMDirectoryFactory is memory based, not
- persistent, and doesn't work with replication.
- -->
- <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
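- <!--
- A sketch of forcing one implementation instead of letting Solr
- choose. It is illustrative only and the active element above is
- unchanged:
- <directoryFactory name="DirectoryFactory" class="solr.MMapDirectoryFactory"/>
- Because the active element reads the solr.directoryFactory system
- property, the same effect should be achievable without editing this
- file by starting the JVM with
- -Dsolr.directoryFactory=solr.MMapDirectoryFactory
- -->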
- <!--
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Index Config - These settings control low-level behavior of indexing
- Most example settings here show the default value, but are commented
- out, to more easily see where customizations have been made.
- Note: This replaces <indexDefaults> and <mainIndex> from older versions
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- -->
- <indexConfig>
- <!--
- maxFieldLength was removed in 4.0. To get similar behavior, include a
- LimitTokenCountFilterFactory in your fieldType definition. E.g.
- <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10000"/>
- -->
- <!--
- Maximum time to wait for a write lock (ms) for an IndexWriter. Default: 1000
- -->
- <!-- <writeLockTimeout>1000</writeLockTimeout> -->
- <!--
- The maximum number of simultaneous threads that may be
- indexing documents at once in IndexWriter; if more than this
- many threads arrive they will wait for others to finish.
- Default in Solr/Lucene is 8.
- -->
- <maxIndexingThreads>2</maxIndexingThreads>
- <useCompoundFile>true</useCompoundFile>
- <!--
- ramBufferSizeMB sets the amount of RAM that may be used by Lucene
- indexing for buffering added documents and deletions before they are
- flushed to the Directory.
- maxBufferedDocs sets a limit on the number of documents buffered
- before flushing.
- If both ramBufferSizeMB and maxBufferedDocs are set, then
- Lucene will flush based on whichever limit is hit first.
- -->
- <ramBufferSizeMB>20</ramBufferSizeMB>
- <maxBufferedDocs>10000</maxBufferedDocs>
- <!--
- Expert: Merge Policy
- The Merge Policy in Lucene controls how merging of segments is done.
- The default since Solr/Lucene 3.3 is TieredMergePolicy.
- From Lucene 2.3 until then, the default was LogByteSizeMergePolicy.
- Even older versions of Lucene used LogDocMergePolicy.
- -->
- <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
- <int name="maxMergeAtOnce">4</int>
- <int name="segmentsPerTier">4</int>
- </mergePolicy>
- <unlockOnStartup>true</unlockOnStartup>
- </indexConfig>
- <updateHandler class="solr.DirectUpdateHandler2">
- <!--
- Enables a transaction log, used for real-time get, durability, and
- SolrCloud replica recovery. The log can grow as big as
- uncommitted changes to the index, so use of a hard autoCommit
- is recommended (see below).
- "dir" - the target directory for transaction logs, defaults to the
- solr data directory.
- -->
- <updateLog>
- <str name="dir">${solr.ulog.dir:}</str>
- </updateLog>
- <autoCommit>
- <maxTime>15000</maxTime>
- <openSearcher>false</openSearcher>
- </autoCommit>
- <autoSoftCommit>
- <maxTime>5000</maxTime>
- </autoSoftCommit>
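- <!--
- A sketch of a hard autoCommit that is also bounded by document
- count. maxDocs triggers a commit once that many documents have been
- added since the last commit, maxTime (ms) triggers on elapsed time,
- and whichever limit is reached first applies. The value 10000 is an
- illustrative assumption, not a recommendation for this index.
- <autoCommit>
- <maxDocs>10000</maxDocs>
- <maxTime>15000</maxTime>
- <openSearcher>false</openSearcher>
- </autoCommit>
- -->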
- </updateHandler>
- <query>
- <maxBooleanClauses>1024</maxBooleanClauses>
- <!--
- Solr Internal Query Caches
- There are two implementations of cache available for Solr,
- LRUCache, based on a synchronized LinkedHashMap, and
- FastLRUCache, based on a ConcurrentHashMap.
- FastLRUCache has faster gets and slower puts in single
- threaded operation and thus is generally faster than LRUCache
- when the hit ratio of the cache is high (> 75%), and may be
- faster under other scenarios on multi-cpu systems.
- -->
- <!--
- Filter Cache
- Cache used by SolrIndexSearcher for filters (DocSets),
- unordered sets of *all* documents that match a query. When a
- new searcher is opened, its caches may be prepopulated or
- "autowarmed" using data from caches in the old searcher.
- autowarmCount is the number of items to prepopulate. For
- LRUCache, the autowarmed items will be the most recently
- accessed items.
- Parameters:
- class - the SolrCache implementation
- (LRUCache or FastLRUCache)
- size - the maximum number of entries in the cache
- initialSize - the initial capacity (number of entries) of
- the cache. (see java.util.HashMap)
- autowarmCount - the number of entries to prepopulate from
- an old cache.
- -->
- <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
- <!--
- Query Result Cache
- Caches results of searches - ordered lists of document ids
- (DocList) based on a query, a sort, and the range of documents requested.
- -->
- <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
- <!--
- Document Cache
- Caches Lucene Document objects (the stored fields for each
- document). Since Lucene internal document ids are transient,
- this cache will not be autowarmed.
- -->
- <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
- <!--
- Lazy Field Loading
- If true, stored fields that are not requested will be loaded
- lazily. This can result in a significant speed improvement
- if the usual case is to not load all stored fields,
- especially if the skipped fields are large compressed text
- fields.
- -->
- <enableLazyFieldLoading>true</enableLazyFieldLoading>
- <!--
- Use Filter For Sorted Query
- A possible optimization that attempts to use a filter to
- satisfy a search. If the requested sort does not include
- score, then the filterCache will be checked for a filter
- matching the query. If found, the filter will be used as the
- source of document ids, and then the sort will be applied to
- that.
- For most situations, this will not be useful unless you
- frequently get the same search repeatedly with different sort
- options, and none of them ever use "score"
- -->
- <!--
- <useFilterForSortedQuery>true</useFilterForSortedQuery>
- -->
- <!--
- Result Window Size
- An optimization for use with the queryResultCache. When a search
- is requested, a superset of the requested number of document ids
- is collected. For example, if a search for a particular query
- requests matching documents 10 through 19, and queryResultWindowSize is 50,
- then documents 0 through 49 will be collected and cached. Any further
- requests in that range can be satisfied via the cache.
- -->
- <queryResultWindowSize>20</queryResultWindowSize>
- <!--
- Maximum number of documents to cache for any entry in the
- queryResultCache.
- -->
- <queryResultMaxDocsCached>200</queryResultMaxDocsCached>
- <!--
- Query Related Event Listeners
- Various IndexSearcher related events can trigger Listeners to
- take actions.
- newSearcher - fired whenever a new searcher is being prepared
- and there is a current searcher handling requests (aka
- registered). It can be used to prime certain caches to
- prevent long request times for certain requests.
- firstSearcher - fired whenever a new searcher is being
- prepared but there is no current registered searcher to handle
- requests or to gain autowarming data from.
- -->
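- <!--
- A sketch of a newSearcher listener using solr.QuerySenderListener,
- which runs the listed queries against the new searcher to warm its
- caches. The query and facet field are placeholders borrowed from
- fields that appear elsewhere in this file, not tuned warming queries.
- <listener event="newSearcher" class="solr.QuerySenderListener">
- <arr name="queries">
- <lst>
- <str name="q">*:*</str>
- <str name="facet">true</str>
- <str name="facet.field">format</str>
- </lst>
- </arr>
- </listener>
- -->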
- <!--
- Use Cold Searcher
- If a search request comes in and there is no current
- registered searcher, then immediately register the still
- warming searcher and use it. If "false" then all requests
- will block until the first searcher is done warming.
- -->
- <useColdSearcher>false</useColdSearcher>
- <!--
- Max Warming Searchers
- Maximum number of searchers that may be warming in the
- background concurrently. An error is returned if this limit
- is exceeded.
- Recommend values of 1-2 for read-only slaves, higher for
- masters w/o cache warming.
- -->
- <maxWarmingSearchers>5</maxWarmingSearchers>
- </query>
- <!--
- Request Dispatcher
- This section contains instructions for how the SolrDispatchFilter
- should behave when processing requests for this SolrCore.
- handleSelect is a legacy option that affects the behavior of requests
- such as /select?qt=XXX
- handleSelect="true" will cause the SolrDispatchFilter to process
- the request and dispatch the query to a handler specified by the
- "qt" param, assuming "/select" isn't already registered.
- handleSelect="false" will cause the SolrDispatchFilter to
- ignore "/select" requests, resulting in a 404 unless a handler
- is explicitly registered with the name "/select"
- handleSelect="true" is not recommended for new users, but is the default
- for backwards compatibility
- -->
- <requestDispatcher handleSelect="false">
- <!--
- Request Parsing
- These settings indicate how Solr Requests may be parsed, and
- what restrictions may be placed on the ContentStreams from
- those requests
- enableRemoteStreaming - enables use of the stream.file
- and stream.url parameters for specifying remote streams.
- multipartUploadLimitInKB - specifies the max size (in KiB) of
- Multipart File Uploads that Solr will allow in a Request.
- formdataUploadLimitInKB - specifies the max size (in KiB) of
- form data (application/x-www-form-urlencoded) sent via
- POST. You can use POST to pass request parameters not
- fitting into the URL.
- *** WARNING ***
- enableRemoteStreaming authorizes Solr to fetch remote files. You
- should make sure your system has some authentication before
- using enableRemoteStreaming="true"
- -->
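- <!--
- A sketch of a requestParsers element using the attributes described
- above. The limits shown are illustrative assumptions, not values
- taken from this deployment, and remote streaming is left disabled.
- <requestParsers enableRemoteStreaming="false"
- multipartUploadLimitInKB="2048000"
- formdataUploadLimitInKB="2048"/>
- -->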
- <!--
- HTTP Caching
- Set HTTP caching related parameters (for proxy caches and clients).
- The options below instruct Solr not to output any HTTP Caching
- related headers
- -->
- <httpCaching never304="true"/>
- <!--
- If you include a <cacheControl> directive, it will be used to
- generate a Cache-Control header (as well as an Expires header
- if the value contains "max-age=")
- By default, no Cache-Control header is generated.
- You can use the <cacheControl> option even if you have set
- never304="true"
- -->
- <!--
- <httpCaching never304="true" >
- <cacheControl>max-age=30, public</cacheControl>
- </httpCaching>
- -->
- <!--
- To enable Solr to respond with automatically generated HTTP
- Caching headers, and to respond to Cache Validation requests
- correctly, set the value of never304="false"
- This will cause Solr to generate Last-Modified and ETag
- headers based on the properties of the Index.
- The following options can also be specified to affect the
- values of these headers...
- lastModifiedFrom - the default value is "openTime" which means the
- Last-Modified value (and validation against If-Modified-Since
- requests) will all be relative to when the current Searcher
- was opened. You can change it to lastModifiedFrom="dirLastMod" if
- you want the value to exactly correspond to when the physical
- index was last modified.
- etagSeed="..." is an option you can change to force the ETag
- header (and validation against If-None-Match requests) to be
- different even if the index has not changed (ie: when making
- significant changes to your config file)
- (lastModifiedFrom and etagSeed are both ignored if you use
- the never304="true" option)
- -->
- <!--
- <httpCaching lastModifiedFrom="openTime"
- etagSeed="Solr">
- <cacheControl>max-age=30, public</cacheControl>
- </httpCaching>
- -->
- </requestDispatcher>
- <!--
- Request Handlers
- http://wiki.apache.org/solr/SolrRequestHandler
- Incoming queries will be dispatched to a specific handler by name
- based on the path specified in the request.
- Legacy behavior: If the request path uses "/select" but no Request
- Handler has that name, and if handleSelect="true" has been specified in
- the requestDispatcher, then the Request Handler is dispatched based on
- the qt parameter. Handlers without a leading '/' are accessed this way
- like so: http://host/app/[core/]select?qt=name If no qt is
- given, then the requestHandler that declares default="true" will be
- used or the one named "standard".
- If a Request Handler is declared with startup="lazy", then it will
- not be initialized until the first request that uses it.
- -->
- <!--
- SearchHandler
- http://wiki.apache.org/solr/SearchHandler
- For processing Search Queries, the primary Request Handler
- provided with Solr is "SearchHandler". It delegates to a sequence
- of SearchComponents (see below) and supports distributed
- queries across multiple shards.
- -->
- <requestHandler name="search" class="solr.SearchHandler" default="true">
- <!--
- default values for query parameters can be specified, these
- will be overridden by parameters in the request
- -->
- <lst name="defaults">
- <str name="defType">dismax</str>
- <str name="echoParams">explicit</str>
- <int name="rows">10</int>
- <str name="q.alt">*:*</str>
- <str name="mm">2&lt;-1 5&lt;-2 6&lt;90%</str>
- <str name="fl">
- id, score, author_display, author_vern_display, format, isbn_t, language_facet, lc_callnum_display, material_type_display, published_display, published_vern_display, pub_date, title_display, title_vern_display, subject_topic_facet, subject_geo_facet, subject_era_facet, subtitle_display, subtitle_vern_display, url_fulltext_display, url_suppl_display, material
- </str>
- <str name="material_qf">
- material_unstem_search^200 material_addl_unstem_search^50 material_t^20 material_addl_t
- </str>
- <str name="material_pf">
- material_unstem_search^2000 material_addl_unstem_search^500 material_t^200 material_addl_t^10
- </str>
- <str name="facet">true</str>
- <str name="facet.mincount">1</str>
- <str name="facet.limit">10</str>
- <str name="facet.field">format</str>
- <str name="facet.field">pub_date</str>
- <str name="facet.field">material</str>
- </lst>
- </requestHandler>
- <requestHandler name="/select" class="solr.SearchHandler">
- <arr name="last-components">
- <str>spellcheck</str>
- </arr>
- </requestHandler>
- <!--
- A request handler that returns indented JSON by default
- -->
- <requestHandler name="/query" class="solr.SearchHandler">
- <lst name="defaults">
- <str name="echoParams">explicit</str>
- <str name="wt">json</str>
- <str name="indent">true</str>
- <str name="df">text</str>
- </lst>
- <arr name="last-components">
- <str>spellcheck</str>
- </arr>
- </requestHandler>
- <!--
- realtime get handler, guaranteed to return the latest stored fields of
- any document, without the need to commit or open a new searcher. The
- current implementation relies on the updateLog feature being enabled.
- -->
- <requestHandler name="/get" class="solr.RealTimeGetHandler">
- <lst name="defaults">
- <str name="omitHeader">true</str>
- <str name="wt">json</str>
- <str name="indent">true</str>
- </lst>
- </requestHandler>
- <requestHandler name="/update" class="solr.UpdateRequestHandler"></requestHandler>
- <requestHandler name="/update/json" class="solr.JsonUpdateRequestHandler">
- <lst name="defaults">
- <str name="stream.contentType">application/json</str>
- </lst>
- </requestHandler>
- <requestHandler name="/update/csv" class="solr.CSVRequestHandler">
- <lst name="defaults">
- <str name="stream.contentType">application/csv</str>
- </lst>
- </requestHandler>
- <requestHandler name="/update/extract" startup="lazy" class="solr.extraction.ExtractingRequestHandler">
- <lst name="defaults">
- <str name="lowernames">true</str>
- <str name="uprefix">ignored_</str>
- <!-- capture link hrefs but ignore div attributes -->
- <str name="captureAttr">true</str>
- <str name="fmap.a">links</str>
- <str name="fmap.div">ignored_</str>
- </lst>
- </requestHandler>
- <requestHandler name="/analysis/field" startup="lazy" class="solr.FieldAnalysisRequestHandler"/>
- <requestHandler name="/analysis/document" class="solr.DocumentAnalysisRequestHandler" startup="lazy"/>
- <!-- ping/healthcheck -->
- <requestHandler name="/admin/ping" class="solr.PingRequestHandler">
- <lst name="invariants">
- <str name="q">solrpingquery</str>
- </lst>
- <lst name="defaults">
- <str name="echoParams">all</str>
- </lst>
- </requestHandler>
- <requestHandler name="/debug/dump" class="solr.DumpRequestHandler">
- <lst name="defaults">
- <str name="echoParams">explicit</str>
- <str name="echoHandler">true</str>
- </lst>
- </requestHandler>
- <requestHandler name="/replication" class="solr.ReplicationHandler"></requestHandler>
- <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
- <str name="queryAnalyzerFieldType">textSpell</str>
- <!--
- Multiple "Spell Checkers" can be declared and used by this
- component
- -->
- <!--
- a spellchecker built from a field of the main index
- -->
- <lst name="spellchecker">
- <str name="name">default</str>
- <str name="field">name</str>
- <str name="classname">solr.DirectSolrSpellChecker</str>
- <!--
- the spellcheck distance measure used, the default is the internal levenshtein
- -->
- <str name="distanceMeasure">internal</str>
- <!--
- minimum accuracy needed to be considered a valid spellcheck suggestion
- -->
- <float name="accuracy">0.5</float>
- <!--
- the maximum #edits we consider when enumerating terms: can be 1 or 2
- -->
- <int name="maxEdits">2</int>
- <!-- the minimum shared prefix when enumerating terms -->
- <int name="minPrefix">1</int>
- <!-- maximum number of inspections per result. -->
- <int name="maxInspections">5</int>
- <!--
- minimum length of a query term to be considered for correction
- -->
- <int name="minQueryLength">4</int>
- <!--
- maximum fraction of documents a query term can appear in and still be considered for correction
- -->
- <float name="maxQueryFrequency">0.01</float>
- <!--
- uncomment this to require suggestions to occur in 1% of the documents
- <float name="thresholdTokenFrequency">.01</float>
- -->
- </lst>
- <!--
- a spellchecker that can break or combine words. See "/spell" handler below for usage
- -->
- <lst name="spellchecker">
- <str name="name">wordbreak</str>
- <str name="classname">solr.WordBreakSolrSpellChecker</str>
- <str name="field">name</str>
- <str name="combineWords">true</str>
- <str name="breakWords">true</str>
- <int name="maxChanges">10</int>
- </lst>
- </searchComponent>
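- <!--
- The wordbreak spellchecker comment above refers to a "/spell"
- handler, which this file does not define. A sketch of such a handler
- follows, assuming the "default" and "wordbreak" dictionaries declared
- above. The df value and the counts are illustrative assumptions.
- <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
- <lst name="defaults">
- <str name="df">text</str>
- <str name="spellcheck">on</str>
- <str name="spellcheck.dictionary">default</str>
- <str name="spellcheck.dictionary">wordbreak</str>
- <str name="spellcheck.count">10</str>
- <str name="spellcheck.collate">true</str>
- </lst>
- <arr name="last-components">
- <str>spellcheck</str>
- </arr>
- </requestHandler>
- -->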
- <searchComponent name="tvComponent" class="solr.TermVectorComponent"/>
- <searchComponent name="terms" class="solr.TermsComponent"/>
- <searchComponent class="solr.HighlightComponent" name="highlight">
- <highlighting>
- <!-- Configure the standard fragmenter -->
- <!--
- This could most likely be commented out in the "default" case
- -->
- <fragmenter name="gap" default="true" class="solr.highlight.GapFragmenter">
- <lst name="defaults">
- <int name="hl.fragsize">100</int>
- </lst>
- </fragmenter>
- <!--
- A regular-expression-based fragmenter
- (for sentence extraction)
- -->
- <fragmenter name="regex" class="solr.highlight.RegexFragmenter">
- <lst name="defaults">
- <!--
- slightly smaller fragsizes work better because of slop
- -->
- <int name="hl.fragsize">70</int>
- <!-- allow 50% slop on fragment sizes -->
- <float name="hl.regex.slop">0.5</float>
- <!-- a basic sentence pattern -->
- <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str>
- </lst>
- </fragmenter>
- <!-- Configure the standard formatter -->
- <formatter name="html" default="true" class="solr.highlight.HtmlFormatter">
- <lst name="defaults">
- <str name="hl.simple.pre">
- <![CDATA[ <em> ]]>
- </str>
- <str name="hl.simple.post">
- <![CDATA[ </em> ]]>
- </str>
- </lst>
- </formatter>
- <!-- Configure the standard encoder -->
- <encoder name="html" class="solr.highlight.HtmlEncoder"/>
- <!-- Configure the standard fragListBuilder -->
- <fragListBuilder name="simple" class="solr.highlight.SimpleFragListBuilder"/>
- <!-- Configure the single fragListBuilder -->
- <fragListBuilder name="single" class="solr.highlight.SingleFragListBuilder"/>
- <!-- Configure the weighted fragListBuilder -->
- <fragListBuilder name="weighted" default="true" class="solr.highlight.WeightedFragListBuilder"/>
- <!-- default tag FragmentsBuilder -->
- <fragmentsBuilder name="default" default="true" class="solr.highlight.ScoreOrderFragmentsBuilder">
- <!--
- <lst name="defaults">
- <str name="hl.multiValuedSeparatorChar">/</str>
- </lst>
- -->
- </fragmentsBuilder>
- <!-- multi-colored tag FragmentsBuilder -->
- <fragmentsBuilder name="colored" class="solr.highlight.ScoreOrderFragmentsBuilder">
- <lst name="defaults">
- <str name="hl.tag.pre">
- <![CDATA[
- <b style="background:yellow">,<b style="background:lawngreen">, <b style="background:aquamarine">,<b style="background:magenta">, <b style="background:palegreen">,<b style="background:coral">, <b style="background:wheat">,<b style="background:khaki">, <b style="background:lime">,<b style="background:deepskyblue">
- ]]>
- </str>
- <str name="hl.tag.post">
- <![CDATA[ </b> ]]>
- </str>
- </lst>
- </fragmentsBuilder>
- <boundaryScanner name="default" default="true" class="solr.highlight.SimpleBoundaryScanner">
- <lst name="defaults">
- <str name="hl.bs.maxScan">10</str>
- <str name="hl.bs.chars">.,!?</str>
- </lst>
- </boundaryScanner>
- <boundaryScanner name="breakIterator" class="solr.highlight.BreakIteratorBoundaryScanner">
- <lst name="defaults">
- <!--
- type should be one of CHARACTER, WORD(default), LINE and SENTENCE
- -->
- <str name="hl.bs.type">WORD</str>
- <!--
- language and country are used when constructing the Locale object,
- which in turn is used when getting an instance of BreakIterator
- -->
- <str name="hl.bs.language">en</str>
- <str name="hl.bs.country">US</str>
- </lst>
- </boundaryScanner>
- </highlighting>
- </searchComponent>
- <requestHandler class="solr.MoreLikeThisHandler" name="/mlt">
- <lst name="defaults">
- <str name="mlt.mintf">1</str>
- <str name="mlt.mindf">2</str>
- </lst>
- </requestHandler>
- <!--
- Admin Handlers - This will register all the standard admin RequestHandlers.
- -->
- <requestHandler name="/admin/" class="solr.admin.AdminHandlers"/>
- </config>