Thursday, June 18, 2015

UCM: Indexing

To make UCM place a .hcst file in the weblayout directory instead of a copy of the native file, set IndexVaultFile=true. This works only for passthru files, that is, files that did not go through IBR (Inbound Refinery). The .hcst file in the weblayout directory only points to the vault file.
IndexVaultFile=true
NOTE: IndexVaultFile=true was replaced with UseNativeFormatInIndex=true. Either of these configuration settings will force the indexer to index the native file.
NOTE: When using webless storage, use UseNativeFormatInIndex=true. IndexVaultFile=true should not be used at all.

If the above setting is true but you still want documents of certain formats to be copied to the weblayout directory, list those formats in IndexVaultExclusionWildcardFormats:
IndexVaultExclusionWildcardFormats=*/hcs*|*/ttp|*/xsl|*/wml|*template*|*/jsp*|*/gif|*/png|*/pdf|*/doc*|*/msword|*/*ms-excel|text/plain
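To see which formats the exclusion list above would cover, here is a minimal Python sketch. It assumes the patterns behave like shell-style wildcards, are separated by "|", and are compared case-insensitively; the function name is_excluded is hypothetical, and this is not UCM's own matching code.

```python
from fnmatch import fnmatchcase

# Value copied from the configuration entry above.
EXCLUSION_FORMATS = (
    "*/hcs*|*/ttp|*/xsl|*/wml|*template*|*/jsp*|*/gif|*/png|*/pdf|"
    "*/doc*|*/msword|*/*ms-excel|text/plain"
)


def is_excluded(file_format: str, patterns: str = EXCLUSION_FORMATS) -> bool:
    """Return True if file_format matches any '|'-separated wildcard pattern.

    Assumption (not confirmed by the source): shell-style globbing,
    compared case-insensitively.
    """
    fmt = file_format.lower()
    return any(fnmatchcase(fmt, p.lower()) for p in patterns.split("|"))


if __name__ == "__main__":
    for fmt in ("application/pdf", "image/jpeg", "application/vnd.ms-excel"):
        print(fmt, is_excluded(fmt))
```

Under these assumptions, application/pdf and application/vnd.ms-excel match the list, while image/jpeg does not.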


If textexport times out while a large file is being indexed, you can increase the timeout (to 60 seconds in this example). The default value is 15 seconds.
TextExtractorTimeoutInSec=60
IndexerTextExtractionGuardTimeout=60


By default, UCM will not index files larger than 10485760 bytes (10 MB). To change the limit, set the configuration entry MaxIndexableFileSize (20 MB in this example). Setting it to 0 (zero) stops full-text indexing but still allows the use of Oracle Text Search. This is useful if you still need case-insensitive searches but do not need full-text indexing.
MaxIndexableFileSize=20971520
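The value is given in bytes, with 1 MB taken as 1024 * 1024 bytes (10 MB = 10485760). A small Python sketch of the arithmetic follows; the helper name max_indexable_file_size_entry is hypothetical and only formats the config line.

```python
BYTES_PER_MB = 1024 * 1024  # 1 MB = 1048576 bytes, consistent with 10 MB = 10485760


def max_indexable_file_size_entry(megabytes: int) -> str:
    """Build the MaxIndexableFileSize config line for a limit given in MB."""
    if megabytes < 0:
        raise ValueError("size limit must not be negative")
    return f"MaxIndexableFileSize={megabytes * BYTES_PER_MB}"


if __name__ == "__main__":
    print(max_indexable_file_size_entry(20))
```

For a 50 MB limit, for example, the same arithmetic gives 52428800.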


The following parameter lists the formats that are text indexed. If a file's format is not on the list, textexport is not invoked and the file is indexed as metadata only.
TextIndexerFilterFormats=pdf,msword,ms-word,doc*,ms-excel,xls*,ms-powerpoint,powerpoint,ppt*,rtf,xml,msg,zip
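The decision described above can be illustrated with a rough Python sketch. How UCM actually compares a file's format with the list is not stated here, so the sketch assumes that each comma-separated entry is a shell-style wildcard matched against the part of the format after the last "/". The function name index_mode is hypothetical.

```python
from fnmatch import fnmatchcase

# Value copied from the configuration entry above.
TEXT_INDEXER_FILTER_FORMATS = (
    "pdf,msword,ms-word,doc*,ms-excel,xls*,ms-powerpoint,"
    "powerpoint,ppt*,rtf,xml,msg,zip"
)


def index_mode(file_format: str,
               filter_formats: str = TEXT_INDEXER_FILTER_FORMATS) -> str:
    """Return "full text" or "metadata only" for a format such as "application/pdf".

    Assumption (not confirmed by the source): entries are wildcards matched,
    case-insensitively, against the text after the last "/" in the format.
    """
    subtype = file_format.lower().rsplit("/", 1)[-1]
    patterns = [p.strip().lower() for p in filter_formats.split(",")]
    if any(fnmatchcase(subtype, p) for p in patterns):
        return "full text"
    return "metadata only"


if __name__ == "__main__":
    for fmt in ("application/pdf", "application/msword", "image/png"):
        print(fmt, "->", index_mode(fmt))
```

If a format you need is indexed as metadata only, the likely fix is to add an entry for it to TextIndexerFilterFormats.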

For more in-depth information, see Doc ID 445871.1.
