Sitemap

Showing posts with label OracleTextSearch. Show all posts
Showing posts with label OracleTextSearch. Show all posts

Monday, June 22, 2015

SortSpec when Configured for OracleTextSearch

When the Content Server is set for SearchIndexerEngineName=OracleTextSearch, only fields in SDATA sections can be used for sorting. Using SortSpec to sort the search results, the correct syntax is:
&SortSpec=<Field> <ASC or DESC>, <Field> <ASC or DESC>

Fields that are optimized as SDATA sections will require the prefix 'sd'. Two standard fields that are already optimized and will require the prefix are dDocName and dDocTitle.
&SortSpec=sddDocTitle ASC,sddDocName ASC,dInDate DESC

and this is how we will be passing the SortSpec key in the binder:
requestBinder.putLocal("SortSpec","sddDocName ASC, dInDate DESC");


To get the full list of SDATA sections
1. Go to UCM Administration --> Configuration. Note the Active Index value. It will be either ots1 or ots2.
2. On the Content Server database schema run the following spool script on the active index:
set long 2000000 
set pages 0 
set heading off 
set feedback off 
spool /tmp/outputfile.txt 
select ctx_report.create_index_script('<Active Index>') from dual; 
spool off

NOTE: The <Active Index> is the index the script will be run against. If the active index is ots1 the ctx_report.create_index_script will be run using FT_IDCTEXT1. If the active index is ots2 the report will be run using FT_IDCTEXT2.
Example: select ctx_report.create_index_script('FT_IDCTEXT1') from dual; 

After the spool completes edit the outputfile.txt file. Look for these entries: ctx_ddl.add_sdata_section. These are the fields that have been configured to be SDATA sections. These will include any fields that were set to be optimized in the Text Search Admin page. Also look for these entries: ctx_ddl.add_sdata_column. These also are SDATA sections. If the number of ctx_ddl.add_sdata_section and ctx_ddl.add_sdata_column entries equals 32 then no new fields can be set to be optimized.


To workaround this limit
Since the 32 limit cannot be increased, one way to reduce the number of SDATA section fields is to disable indexing on the custom metadata fields that won't be required to be searched on. In the ctx_ddl.add_sdata_section and ctx_ddl.add_sdata_column entries, these are the fields that will have an x or sdx prefix.

1. Go into Configuration Manager --> Information Fields
2. Select the field that isn't required for searches
3. Click the Edit button
4. Uncheck the Enable for Search Index box
5. Repeat the previous steps for the other unneeded fields
6. Rebuild the search index
7. Run the select ctx_report.create_index_script again and confirm that the field or fields are no longer marked as an SDATA section

NOTE: There is a fixed limit of 32 SDATA sections that can be present at one time. This limitation of Oracle Text has been solved as of database version 12.1.0.1.0.  This fix has also been backported to version 11.2.0.3.0.  It's also included in the 11.2.0.4.0 patchset for the database. See Note 1562142.1.
After upgrading or patching the database to raise this limit, there is another change needed for WCC to recognize the new limit.  See the Note:1607548.1

Thursday, May 21, 2015

UCM: contains operator in DATABASE.METADATA SearchEngine

While working on a content server where the SearchEngine is set as DATABASE.METADATA, we might require the <contains> operator for searching contents, just like in OracleTextSearch. This might come in handy when lets say there is a multivalued metadata field (xDepartment), and there are multiple values for the same (AB, BC, CD). In such a case, we normally think of enabling the OracleTextSearch engine and then go ahead. But in OracleTextSearch , the indexing is slow and the content's status might take time to be Released. Also the system doesn't need a fulltext search.

Oracle have provided a component DBSearchContainsOpSupport. Enable it and restart. Follow the "Zone Fields Configuration" link on Administration page and define indexes that would be full text indexed. That's it.

You can then check your results by modifying the querytext in the Advanced Query builder form
`BC` <contains> xDepartment

Wednesday, May 6, 2015

OracleTextSearch Thesaurus Search

Configure OracleTextSearch for UCM
SearchIndexerEngineName=OracleTextSearch
IndexerDatabaseProviderName=SystemDatabase
AdditionalEscapeChars=-:#

Place the thesaurus file in a proper location, and then execute the below command in the terminal
ctxload -user USERNAME/PASSWORD -thes -name cbi_thesaurus -file NAME_OF_FILE

where
     USERNAME: username of the database
     PASSWORD: password for the database
     NAME_OF_FILE: name of the thesaurus file

In order to check whether the contents of the file are properly loaded or not, execute the queries in SQL Developer:

1. The following query will display the names of the thesaurus (in this case, only CBI_THESAURUS)
select * from CTX_THESAURI

2. The following query will display the list of the words that were loaded in the thesaurus:
select * from CTX_THES_PHRASES

This is the Java method:

    public void cbiThesaurusSearch() throws DataException, ServiceException, IOException {
        SystemUtils.trace(COMPONENT_DEBUG, "=====CBIThesaurusSearch STARTS=====");

        String searchKeyword = "";
        String searchKeyword2 = "";
        String queryText = "";

        SystemUtils.trace(COMPONENT_DEBUG, "Original QueryText: " + m_binder.getLocal("QueryText"));
        try {
            if (m_binder.getLocal(SEARCH_KEY) != null) {
                searchKeyword = m_binder.getLocal(SEARCH_KEY).toUpperCase();
                SystemUtils.trace(COMPONENT_DEBUG, "Search Keyword:" + searchKeyword);
                
                

                if (m_binder.getLocal(SEARCH_KEY_WITHIN) != null && (m_binder.getLocal(SEARCH_KEY_WITHIN).length() > 0)) {
                    searchKeyword2 = m_binder.getLocal(SEARCH_KEY_WITHIN).toUpperCase();
                    SystemUtils.trace(COMPONENT_DEBUG, "Search Keyword Within:" + searchKeyword2);
                }

                int index = m_binder.getLocal("QueryText").indexOf("(");
                queryText = m_binder.getLocal("QueryText").substring(0, index);
                SystemUtils.trace(COMPONENT_DEBUG, "Altered QueryText: " + queryText);

                DataBinder db = new DataBinder();
                db.putLocal("searchKey", searchKeyword);
                db.putLocal("thName", THESAURUS_NAME);
                db.putLocal("level", LEVEL);

                String expandedKeywords = "";

                ResultSet rs = m_workspace.createResultSet("CBIThesaurusSearch", db);
                DataResultSet dataContainer = new DataResultSet();
                dataContainer.copy(rs);
                for (dataContainer.first(); dataContainer.isRowPresent(); dataContainer.next()) {
                    String expandedKeyword = dataContainer.getStringValueByName("THEVALUES");
                    expandedKeywords = expandedKeywords + expandedKeyword + ",";
                }
                if (expandedKeywords.endsWith(",")) {
                    expandedKeywords = expandedKeywords.substring(0, expandedKeywords.length() - 1);
                }
                SystemUtils.trace(COMPONENT_DEBUG, "ExpandedKeywords: " + expandedKeywords);

                if (searchKeyword2 == "") {
                    m_binder.putLocal("QueryText", queryText + "(" + expandedKeywords + ")");
                } else {
                    m_binder.putLocal("QueryText",
                                      queryText + "((" + expandedKeywords + ")  (" + searchKeyword2 + "))");
                }

                SystemUtils.trace(COMPONENT_DEBUG, "Final QueryText: " + m_binder.getLocal("QueryText"));


                //m_binder.putLocal("IdcService", "GET_SEARCH_RESULTS");
                m_binder.putLocal("IdcService", "CBIDispSortSearch");
                executeService(m_binder, "sysadmin", false);
            }
        } catch (ServiceException s) {
            SystemUtils.trace(COMPONENT_DEBUG, "ServiceException " + s.getMessage());
        } catch (DataException d) {
            SystemUtils.trace(COMPONENT_DEBUG, "DataException " + d.getMessage());
        } catch (Exception e) {
            SystemUtils.trace(COMPONENT_DEBUG, "Exception " + e.getMessage());
            e.printStackTrace();
        } finally {
            m_workspace.releaseConnection();
        }
        SystemUtils.trace(COMPONENT_DEBUG, "=====CBIThesaurusSearch ENDS=====");
    }


And this is the query which is being called by the method:
SELECT DISTINCT INITCAP (val) THEVALUES
  FROM (SELECT *
          FROM (    SELECT REGEXP_SUBSTR (
                              (SELECT ctx_thes.syn (?, ?)
                                 FROM DUAL),
                              '[^{|}]+',
                              1,
                              LEVEL,
                              'i')
                              val
                      FROM DUAL
                CONNECT BY LEVEL <=
                              REGEXP_COUNT (
                                 (SELECT ctx_thes.syn (?, ?)
                                    FROM DUAL),
                                 '[^|]+'))
         WHERE val IS NOT NULL
        UNION
        SELECT *
          FROM (    SELECT REGEXP_SUBSTR (
                              (SELECT ctx_thes.bt (?, ?, ?)
                                 FROM DUAL),
                              '[^{|}]+',
                              1,
                              LEVEL,
                              'i')
                              val
                      FROM DUAL
                CONNECT BY LEVEL <=
                              REGEXP_COUNT (
                                 (SELECT ctx_thes.bt (?, ?, ?)
                                    FROM DUAL),
                                 '[^|]+'))
         WHERE val IS NOT NULL
        UNION
        SELECT *
          FROM (    SELECT REGEXP_SUBSTR (
                              (SELECT ctx_thes.nt (?, ?, ?)
                                 FROM DUAL),
                              '[^{|}]+',
                              1,
                              LEVEL,
                              'i')
                              val
                      FROM DUAL
                CONNECT BY LEVEL <=
                              REGEXP_COUNT (
                                 (SELECT ctx_thes.nt (?, ?, ?)
                                    FROM DUAL),
                                 '[^|]+'))
         WHERE val IS NOT NULL)

      searchKey varchar
      thName varchar
      searchKey varchar
      thName varchar
      searchKey varchar
      level int
      thName varchar
      searchKey varchar
      level int
      thName varchar
      searchKey varchar
      level int
      thName varchar
      searchKey varchar
      level int
      thName varchar

Sunday, April 5, 2015

UCM: Maximum Search Results Using ORACLETEXTSEARCH limited to 2048 in UCM10g

There is a database patch that addresses this OracleTextSearch issue. Please download and apply patch 12582138 from the link below:

https://updates.oracle.com/Orion/SimpleSearch/process_form?search_type=patch&patch_number=12582138&plat_lang=226P&display_type=&search_style=8&orderby=&direction=&old_type_list=&gobuttonpressed=&sortcolpressed=&tab_number=&c_release_parent=product&c_product_child=release

NOTE: Upgrading to Oracle Database 11.2.0.4 or higher has the patch included.


UPDATE: The value of (ORACLETEXTSEARCH)MaxResultCount is set as 2048 in SearchEngineRules table. So you can also override this limit by creating a custom component.