Index source configuration

An index source defines which Cms resources are indexed, which indexer processes them, and which document types are included.

<indexsource>
    <name>...</name>
    <indexer class="..." />
    <resources>
        <resource>...</resource>
    </resources>
    <documenttypes-indexed>
        <name>...</name>
        ...
    </documenttypes-indexed>
</indexsource>

Configuration nodes

  • The <name> node gives the index source a unique name. The value of a <source> node in an index configuration must match the value of the <name> node of an index source configuration.
  • The <indexer> node specifies the fully qualified class name of the indexer that pulls the content of Cms resources into Lucene index documents, using the document factories configured in the <documenttype> nodes.
  • Each <resource> node specifies a Cms resource (typically a folder) to be indexed, including its full site root.
  • Each <name> node below the <documenttypes-indexed> node specifies the name of a resource type to be indexed within the specified Cms resources. The value of a <name> node here must match the value of a <name> node configured above in the <documenttype> nodes.
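
The two name matches described above can be sketched as follows. This is only a minimal illustration: the <sources> and <documenttypes> wrapper nodes and the omitted child nodes are assumptions that are not shown in this section, and source1 and pdf are example values.

```xml
<!-- Index configuration: <source> refers to an index source by name.
     The <sources> wrapper is an assumption; other child nodes are omitted. -->
<index>
    <name>...</name>
    <sources>
        <source>source1</source>
    </sources>
</index>

<!-- Document type configuration: defines the name "pdf".
     The <documenttypes> wrapper is an assumption; other child nodes are omitted. -->
<documenttypes>
    <documenttype>
        <name>pdf</name>
        ...
    </documenttype>
</documenttypes>

<!-- Index source: its <name> matches the <source> value above, and each
     <name> below <documenttypes-indexed> matches a <documenttype> name. -->
<indexsource>
    <name>source1</name>
    <indexer class="org.opencms.search.CmsVfsIndexer" />
    <resources>
        <resource>/sites/default/</resource>
    </resources>
    <documenttypes-indexed>
        <name>pdf</name>
    </documenttypes-indexed>
</indexsource>
```

If either name has no counterpart, the index source is presumably not picked up by the index, or the resource type is not indexed, so it is worth checking both matches whenever a name is changed.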

Example

This example shows how to configure the index source for all VFS resources in the default site:

<indexsource>
    <name>source1</name>
    <indexer class="org.opencms.search.CmsVfsIndexer" />
    <resources>
        <resource>/sites/default/</resource>
    </resources>
    <documenttypes-indexed>
        <name>xmlpage</name>
        <name>page</name>
        <name>text</name>
        <name>pdf</name>
        <name>msword</name>
        <name>msexcel</name>
        <name>image</name>
        <name>generic</name>
    </documenttypes-indexed>
</indexsource>
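
A narrower variant may help to show how the <resource> and <documenttypes-indexed> nodes limit what is indexed. The following sketch assumes two hypothetical folders, /sites/default/documents/ and /sites/default/manuals/, and a hypothetical source name source2. It restricts indexing to PDF and Word documents within those folders:

```xml
<indexsource>
    <!-- hypothetical name; must match a <source> value in an index configuration -->
    <name>source2</name>
    <indexer class="org.opencms.search.CmsVfsIndexer" />
    <resources>
        <!-- hypothetical folders, each given with its full site root -->
        <resource>/sites/default/documents/</resource>
        <resource>/sites/default/manuals/</resource>
    </resources>
    <documenttypes-indexed>
        <!-- only these resource types are indexed within the folders above -->
        <name>pdf</name>
        <name>msword</name>
    </documenttypes-indexed>
</indexsource>
```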

Available indexers

  • org.opencms.search.CmsVfsIndexer
    is used to index VFS resources