org.opencms.search.documents
Class A_CmsVfsDocument

java.lang.Object
  extended by org.opencms.search.documents.A_CmsVfsDocument
All Implemented Interfaces:
I_CmsDocumentFactory, I_CmsSearchExtractor
Direct Known Subclasses:
CmsDocumentGeneric, CmsDocumentHtml, CmsDocumentMsExcel, CmsDocumentMsPowerPoint, CmsDocumentMsWord, CmsDocumentOpenOffice, CmsDocumentPdf, CmsDocumentPlainText, CmsDocumentRtf, CmsDocumentXmlContent, CmsDocumentXmlPage

public abstract class A_CmsVfsDocument
extends java.lang.Object
implements I_CmsDocumentFactory

Base document factory class for a VFS CmsResource. Subclasses only need to provide a specialized implementation of I_CmsSearchExtractor.extractContent(CmsObject, CmsResource, CmsSearchIndex) that extracts text from the binary document content.

Since:
6.0.0
Version:
$Revision: 1.26 $
Author:
Carsten Weinholz, Alexander Kandzior

Field Summary
protected  java.lang.String m_name
          Name of the document type.
static java.lang.String SEARCH_PRIORITY_HIGH_VALUE
          Deprecated. use CmsSearchFieldConfiguration.SEARCH_PRIORITY_HIGH_VALUE instead
static java.lang.String SEARCH_PRIORITY_LOW_VALUE
          Deprecated. use CmsSearchFieldConfiguration.SEARCH_PRIORITY_LOW_VALUE instead
static java.lang.String SEARCH_PRIORITY_MAX_VALUE
          Deprecated. use CmsSearchFieldConfiguration.SEARCH_PRIORITY_MAX_VALUE instead
static java.lang.String SEARCH_PRIORITY_NORMAL_VALUE
          Deprecated. use CmsSearchFieldConfiguration.SEARCH_PRIORITY_NORMAL_VALUE instead
static java.lang.String VFS_DOCUMENT_KEY_PREFIX
          Deprecated. use CmsSearchFieldConfiguration.VFS_DOCUMENT_KEY_PREFIX instead
 
Constructor Summary
A_CmsVfsDocument(java.lang.String name)
          Creates a new instance of this Lucene document factory.
 
Method Summary
 org.apache.lucene.document.Document createDocument(CmsObject cms, CmsResource resource, CmsSearchIndex index)
          Generates a new Lucene document instance from the contents of the given resource for the provided index.
 CmsExtractionResultCache getCache()
          Returns the disk-based cache used to store the raw extraction results.
static java.lang.String getDocumentKey(java.lang.String type, java.lang.String mimeType)
          Creates a document factory lookup key for the given resource type name / MIME type configuration.
 java.util.List<java.lang.String> getDocumentKeys(java.util.List<java.lang.String> resourceTypes, java.util.List<java.lang.String> mimeTypes)
          Returns the list of accepted keys for the resource types that can be indexed using this document factory.
 java.lang.String getName()
          Returns the name of this document type factory.
protected  CmsFile readFile(CmsObject cms, CmsResource resource)
          Upgrades the given resource to a CmsFile with content.
 void setCache(CmsExtractionResultCache cache)
          Sets the disk-based cache used to store the raw extraction results.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.opencms.search.documents.I_CmsDocumentFactory
isLocaleDependend, isUsingCache
 
Methods inherited from interface org.opencms.search.documents.I_CmsSearchExtractor
extractContent
 

Field Detail

SEARCH_PRIORITY_HIGH_VALUE

public static final java.lang.String SEARCH_PRIORITY_HIGH_VALUE
Deprecated. use CmsSearchFieldConfiguration.SEARCH_PRIORITY_HIGH_VALUE instead
Value for "high" search priority.

See Also:
Constant Field Values

SEARCH_PRIORITY_LOW_VALUE

public static final java.lang.String SEARCH_PRIORITY_LOW_VALUE
Deprecated. use CmsSearchFieldConfiguration.SEARCH_PRIORITY_LOW_VALUE instead
Value for "low" search priority.

See Also:
Constant Field Values

SEARCH_PRIORITY_MAX_VALUE

public static final java.lang.String SEARCH_PRIORITY_MAX_VALUE
Deprecated. use CmsSearchFieldConfiguration.SEARCH_PRIORITY_MAX_VALUE instead
Value for "maximum" search priority.

See Also:
Constant Field Values

SEARCH_PRIORITY_NORMAL_VALUE

public static final java.lang.String SEARCH_PRIORITY_NORMAL_VALUE
Deprecated. use CmsSearchFieldConfiguration.SEARCH_PRIORITY_NORMAL_VALUE instead
Value for "normal" search priority.

See Also:
Constant Field Values

VFS_DOCUMENT_KEY_PREFIX

public static final java.lang.String VFS_DOCUMENT_KEY_PREFIX
Deprecated. use CmsSearchFieldConfiguration.VFS_DOCUMENT_KEY_PREFIX instead
The VFS prefix for document keys.

See Also:
Constant Field Values

m_name

protected java.lang.String m_name
Name of the document type.

Constructor Detail

A_CmsVfsDocument

public A_CmsVfsDocument(java.lang.String name)
Creates a new instance of this Lucene document factory.

Parameters:
name - name of the document type
Method Detail

getDocumentKey

public static java.lang.String getDocumentKey(java.lang.String type,
                                              java.lang.String mimeType)
Creates a document factory lookup key for the given resource type name / MIME type configuration.

If the given mimeType is null, the key matches all VFS resources of the given resource type, regardless of their MIME type.

Parameters:
type - the resource type name to use
mimeType - the MIME type to use
Returns:
a document factory lookup key for the given resource type name / MIME type configuration
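
The following is a minimal, self-contained sketch of how such a lookup key could be composed. It is not the OpenCms implementation: the class DocumentKeySketch, the prefix value "VFS", and the '_' and ':' separators are assumptions made for illustration only. What it shows is the documented null handling, where a null mimeType produces a key that covers every MIME type of the resource type.

```java
import java.util.Objects;

/** Hypothetical stand-in for illustration; not part of OpenCms. */
final class DocumentKeySketch {

    // Assumed prefix value; the real constant is VFS_DOCUMENT_KEY_PREFIX.
    static final String PREFIX = "VFS";

    /**
     * Builds a lookup key from a resource type name and an optional MIME type.
     * The '_' and ':' separators are assumptions of this sketch.
     */
    static String documentKey(String type, String mimeType) {
        Objects.requireNonNull(type, "type");
        StringBuilder key = new StringBuilder(PREFIX).append('_').append(type);
        if (mimeType != null) {
            // Only a non-null MIME type narrows the key.
            key.append(':').append(mimeType);
        }
        return key.toString();
    }

    public static void main(String[] args) {
        System.out.println(documentKey("plain", "text/plain")); // VFS_plain:text/plain
        System.out.println(documentKey("plain", null));         // VFS_plain
    }
}
```

Under these assumptions, the second call yields the type-only key, which is the form a factory would register when it accepts a resource type regardless of MIME type.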

createDocument

public org.apache.lucene.document.Document createDocument(CmsObject cms,
                                                          CmsResource resource,
                                                          CmsSearchIndex index)
                                                   throws CmsException
Generates a new Lucene document instance from the contents of the given resource for the provided index.

Specified by:
createDocument in interface I_CmsDocumentFactory
Parameters:
cms - the OpenCms user context used to access the OpenCms VFS
resource - the search index resource to create the Lucene document from
index - the search index to create the Document for
Returns:
the Lucene Document for the given index resource and the given search index
Throws:
CmsException - if something goes wrong
See Also:
I_CmsDocumentFactory.createDocument(CmsObject, CmsResource, CmsSearchIndex), CmsSearchFieldConfiguration.createDocument(CmsObject, CmsResource, CmsSearchIndex, I_CmsExtractionResult)

getCache

public CmsExtractionResultCache getCache()
Description copied from interface: I_CmsDocumentFactory
Returns the disk-based cache used to store the raw extraction results.

If null is returned, result caching is not supported for this factory.

Specified by:
getCache in interface I_CmsDocumentFactory
Returns:
the disk-based cache used to store the raw extraction results
See Also:
I_CmsDocumentFactory.getCache()

getDocumentKeys

public java.util.List<java.lang.String> getDocumentKeys(java.util.List<java.lang.String> resourceTypes,
                                                        java.util.List<java.lang.String> mimeTypes)
                                                 throws CmsException
Description copied from interface: I_CmsDocumentFactory
Returns the list of accepted keys for the resource types that can be indexed using this document factory.

The result List contains String objects. Each String is later matched against getDocumentKey(String, String) to find the corresponding I_CmsDocumentFactory for a resource to index.

The list of accepted resource types may contain a catch-all entry "*". In this case, keys for all possible resource types are returned, computed by logic that depends on the document handler class.

Specified by:
getDocumentKeys in interface I_CmsDocumentFactory
Parameters:
resourceTypes - list of accepted resource types
mimeTypes - list of accepted MIME types
Returns:
the list of accepted keys for the resource types that can be indexed using this document factory (String objects)
Throws:
CmsException - if something goes wrong
See Also:
I_CmsDocumentFactory.getDocumentKeys(java.util.List, java.util.List)
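
As a rough illustration of the key expansion described above, the sketch below combines resource types and MIME types into a list of keys and expands the catch-all entry "*". It is a sketch under stated assumptions, not the OpenCms code: the class DocumentKeysSketch, the key format, the allKnownTypes parameter (standing in for whatever type registry the real factory consults), and the choice to emit a type-only key when no MIME types are given are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for illustration; not part of OpenCms. */
final class DocumentKeysSketch {

    // Assumed key format: "VFS_" + type, plus ":" + mimeType when present.
    static String key(String type, String mimeType) {
        return mimeType == null ? "VFS_" + type : "VFS_" + type + ":" + mimeType;
    }

    /**
     * Expands the accepted resource types and MIME types into lookup keys.
     * allKnownTypes is a hypothetical parameter replacing the type registry.
     */
    static List<String> documentKeys(
        List<String> resourceTypes,
        List<String> mimeTypes,
        List<String> allKnownTypes) {

        // The catch-all entry replaces the configured list with every known type.
        List<String> types = resourceTypes.contains("*") ? allKnownTypes : resourceTypes;
        List<String> keys = new ArrayList<>();
        for (String type : types) {
            if (mimeTypes.isEmpty()) {
                // Assumption: no MIME types means one key matching any MIME type.
                keys.add(key(type, null));
            } else {
                for (String mimeType : mimeTypes) {
                    keys.add(key(type, mimeType));
                }
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> known = Arrays.asList("plain", "binary");
        System.out.println(documentKeys(
            Arrays.asList("*"), Collections.<String>emptyList(), known));
        System.out.println(documentKeys(
            Arrays.asList("binary"), Arrays.asList("application/pdf"), known));
    }
}
```

The first call expands "*" to one type-only key per known type; the second produces a single key restricted to one MIME type.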

getName

public java.lang.String getName()
Description copied from interface: I_CmsDocumentFactory
Returns the name of this document type factory.

Specified by:
getName in interface I_CmsDocumentFactory
Returns:
the name of this document type factory
See Also:
I_CmsDocumentFactory.getName()

setCache

public void setCache(CmsExtractionResultCache cache)
Description copied from interface: I_CmsDocumentFactory
Sets the disk-based cache used to store the raw extraction results.

This should only be used for factories where I_CmsDocumentFactory.isUsingCache() returns true.

Specified by:
setCache in interface I_CmsDocumentFactory
Parameters:
cache - the disk-based cache used to store the raw extraction results
See Also:
I_CmsDocumentFactory.setCache(org.opencms.search.documents.CmsExtractionResultCache)
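
The cache contract (set a cache only when isUsingCache() returns true, and treat a null result from getCache() as "caching not supported") can be sketched as follows. Every type here is a hypothetical stand-in defined inside the example so that it compiles on its own; none of it is OpenCms code, and the in-memory map merely substitutes for the disk-based cache.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-ins for illustration; not part of OpenCms. */
final class CacheGuardSketch {

    /** In-memory substitute for the disk-based extraction result cache. */
    static final class ResultCache {
        private final Map<String, String> entries = new HashMap<>();
        void put(String key, String text) { entries.put(key, text); }
        String get(String key) { return entries.get(key); }
    }

    /** Mirrors only the three cache-related factory methods. */
    interface Factory {
        boolean isUsingCache();
        ResultCache getCache();
        void setCache(ResultCache cache);
    }

    static final class SimpleFactory implements Factory {
        private final boolean usingCache;
        private ResultCache cache;
        SimpleFactory(boolean usingCache) { this.usingCache = usingCache; }
        public boolean isUsingCache() { return usingCache; }
        public ResultCache getCache() { return cache; }
        public void setCache(ResultCache cache) { this.cache = cache; }
    }

    /** Assigns the cache only to factories that declare they use one. */
    static boolean configureCache(Factory factory, ResultCache cache) {
        if (!factory.isUsingCache()) {
            return false;
        }
        factory.setCache(cache);
        return true;
    }

    /** Returns the cached text, or null if caching is unsupported or the key is absent. */
    static String cachedText(Factory factory, String key) {
        ResultCache cache = factory.getCache();
        return cache == null ? null : cache.get(key);
    }

    public static void main(String[] args) {
        ResultCache shared = new ResultCache();
        shared.put("doc-1", "extracted text");
        Factory caching = new SimpleFactory(true);
        Factory plain = new SimpleFactory(false);
        System.out.println(configureCache(caching, shared)); // true
        System.out.println(configureCache(plain, shared));   // false
        System.out.println(cachedText(caching, "doc-1"));    // extracted text
        System.out.println(cachedText(plain, "doc-1"));      // null
    }
}
```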

readFile

protected CmsFile readFile(CmsObject cms,
                           CmsResource resource)
                    throws CmsException,
                           CmsIndexException
Upgrades the given resource to a CmsFile with content.

Parameters:
cms - the current user's OpenCms context
resource - the resource to upgrade
Returns:
the given resource upgraded to a CmsFile with content
Throws:
CmsException - if the resource could not be read
CmsIndexException - if the resource has no content