Customize Elasticsearch and Search by Synonyms in Liferay

Q: Have you ever wondered whether you can customize Elasticsearch so that your searches in Liferay return not only the words you searched for but also synonyms of those words?

A: Yes, you can!

Below, I'll show you an example of how I did it (by customizing my index and mapping settings).

Let's say that I have a web content article containing the word "small" and I search for the word "tiny".

 
  1. Navigate to Control Panel → Configuration → System Settings → Foundation
     
  2. Search for the com.liferay.portal.search.elasticsearch.configuration.ElasticsearchConfiguration system setting.
     
  3. Go to "Additional index configurations" and add your own.

    You can start by copying the contents of your index-settings.json file there. The index-settings.json file is packaged in the Elasticsearch module of your Liferay bundle.

    Now, modify it as this document describes: https://www.elastic.co/guide/en/elasticsearch/guide/current/using-synonyms.html .

    It should look like the image below (note the "synonyms" part of the JSON, where we wrote our list of synonyms):
      Inline image 1
     
  4. Now go to "Override type mappings" and paste into the text area the contents of the file "liferay-type-mappings.json", which is also packaged in the Elasticsearch module (a JAR that you will find in your Liferay bundle).

    Modify it, also following the steps this document describes: https://www.elastic.co/guide/en/elasticsearch/guide/current/using-synonyms.html (as in the previous step).

    For example, you could add something like this:
      "content": {
          "index": "analyzed",
          "store": "yes",
          "search_analyzer": "my_synonym_analyzer",
          "analyzer": "my_synonym_analyzer",
          "term_vector": "with_positions_offsets",
          "type": "string"
      },
    and something like this:
      "title": {
          "index": "analyzed",
          "store": "yes",
          "search_analyzer": "my_synonym_analyzer",
          "analyzer": "my_synonym_analyzer",
          "term_vector": "with_positions_offsets",
          "type": "string"
      },
    to it:
    Inline image 2
      
     
  5. Save your changes
     
  6. Navigate to Control Panel → Configuration → Server Administration and execute "Reindex all search indexes" under the section "Index Actions"
    Inline image 3

     
  7. Perform a search and... voilà, the magic happens:
    Inline image 4

    Easy, right?

    If Elasticsearch can do it, Liferay can do it too (since it relies on Elasticsearch to index its documents). You just need to know that it can be done, and where in the Control Panel to configure it.
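The index settings from step 3 can be sketched as follows. This is only a minimal Python sketch that builds the JSON for the "small"/"tiny" example and prints it so it can be pasted into "Additional index configurations". The analyzer name my_synonym_analyzer matches the mappings in step 4; the filter name my_synonym_filter, the "standard" tokenizer and the "lowercase" filter follow the Elastic guide linked above and are assumptions about what your own settings should contain. The function name build_synonym_settings is made up for illustration. A commenter below reports that on Elasticsearch 6 this block had to be wrapped in an outer "index" object.

```python
import json


def build_synonym_settings(synonym_rules):
    """Return analysis settings with a synonym filter and an analyzer that uses it."""
    return {
        "analysis": {
            "filter": {
                # Assumed filter name, taken from the Elastic synonyms guide.
                "my_synonym_filter": {
                    "type": "synonym",
                    "synonyms": list(synonym_rules),
                }
            },
            "analyzer": {
                # Must match the analyzer name referenced in the type mappings.
                "my_synonym_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "my_synonym_filter"],
                }
            },
        }
    }


# One rule per string; comma-separated words are treated as equivalent.
settings = build_synonym_settings(["small,tiny"])
print(json.dumps(settings, indent=2))
```

The printed JSON is meant to be merged into the settings you copied from index-settings.json, not to replace them.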
     

Hello Carlos,

 

Thank you very much for this blog post. It could prove to be an important case for our customer. We are currently looking into how to handle synonym updates in a live environment. Our customer already has some content available. As this content grows and they learn which search queries users run, they would like a quick way to adjust the synonym list. This means a reindex needs to occur.

Can you perhaps describe the impact of triggering a reindex? How long would it normally take (given the amount of data)? Will it temporarily affect the performance of DXP?

Is the default Liferay configuration for Elasticsearch optimized for such reindexes in a live environment?

How would you suggest we best approach this change in a live environment?

 

Kind regards,

 

Koen

Hello Koen, this is more of a general Elasticsearch problem than a Liferay-specific one. You may want to take a look at this post, which explains different ways to do it: https://blog.codecentric.de/en/2014/09/elasticsearch-zero-downtime-reindexing-problems-solutions/

Hello Carlos,

 

Thank you very much for the feedback. I am currently creating a PoC for our customer where we show the search functionalities making use of synonyms.  Unfortunately I cannot seem to get the synonyms working.

 

I have added the configurations below in the fields you mention.

- Additional Index Configurations

    {
        "analysis": {
            "filter": {
                "my_synonym_filter": {
                    "type": "synonym",
                    "synonyms": [
                        "dagelijkse,alledaags"
                    ]
                }
            },
            "analyzer": {
                "my_synonym_analyzer": {
                    "tokenizer": "keyword",
                    "filter": "my_synonym_filter"
                }
            }
        }
    }

 

- Override Type Mappings, replacing the description part of the entire LiferayDocumentType that I found in the elasticsearch6-impl.jar of the liferay-dxp-connector-elasticsearch-6-1.0.0.lpkg

    "description": {
        "store": true,
        "search_analyzer": "my_synonym_analyzer",
        "analyzer": "my_synonym_analyzer",
        "term_vector": "with_positions_offsets",
        "type": "text"
    },

 

After I save and reindex, I cannot even find my content anymore using the term "dagelijkse". Before adding the settings, or when I clear both of them, I can find my content.

When I add the "index":"analyzed" as per your screenshot, an error is logged while performing the reindex: "MapperParsingException[Failed to parse mapping [LiferayDocumentType]: Could not convert [description.index] to boolean]"

So I tried setting it to true, but that does not work either.

 

Could you maybe help me towards the solution?

Mind you that I am currently using an embedded ES 6. Also every time I restart the portal, the configurations are deleted.
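One thing that may be worth checking in the configuration above is the "keyword" tokenizer, which emits the whole field value as a single token, so a one-word query might never match a longer text. The following is a rough pure-Python imitation of that behavior, not Elasticsearch itself. The sample sentence and all function names are made up for illustration, and the regular-expression split only approximates the "standard" tokenizer.

```python
import re


def keyword_tokenize(text):
    # Imitates the keyword tokenizer: the entire input becomes one token.
    return [text] if text else []


def standard_tokenize(text):
    # Rough stand-in for the standard tokenizer: split on non-word characters.
    return re.findall(r"\w+", text)


def expand_synonyms(tokens, groups):
    # Add every member of a synonym group whenever a token belongs to it.
    expanded = set()
    for token in tokens:
        expanded.add(token)
        for group in groups:
            if token in group:
                expanded.update(group)
    return expanded


def matches(document, query, tokenize, groups):
    # The same analysis is applied at index time and at search time.
    indexed = expand_synonyms(tokenize(document), groups)
    searched = expand_synonyms(tokenize(query), groups)
    return bool(indexed & searched)


SYNONYMS = [{"dagelijkse", "alledaags"}]
DOCUMENT = "Onze dagelijkse nieuwsbrief"  # hypothetical field value

print(matches(DOCUMENT, "dagelijkse", keyword_tokenize, SYNONYMS))   # False
print(matches(DOCUMENT, "dagelijkse", standard_tokenize, SYNONYMS))  # True
print(matches(DOCUMENT, "alledaags", standard_tokenize, SYNONYMS))   # True
```

In this imitation the keyword variant fails because the only indexed token is the full sentence, which equals neither "dagelijkse" nor "alledaags".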

 

Kind regards,

 

Koen

Carlos,

 

I kinda have the feeling that the Override Type Mappings file I use is not fully in line with the Liferay default. Is there a way to find the default content of this value on the server itself (with an embedded ES)?

 

Regards,

 

Koen

Hello Carlos,

 

To come back to my own comment, I have been able to get synonyms working. The config I need to provide is different from what you describe. With the help of this guide (https://dev.liferay.com/discover/deployment/-/knowledge_base/7-0/configuring-elasticsearch-for-liferay-0) I figured out the changes I needed to make. The mentioned service http://[HOST]:[ES_PORT]/liferay-[COMPANY_ID] proved to be especially helpful. From its response I noticed that my custom analyzer was not part of the configuration. Below is a short description of the changes I needed to make.

 

- Additional Index configurations (mind the "index" object)

    {
        "index": {
            "analysis": {
                "filter": {
                    "my_synonym_filter": {
                        "type": "synonym",
                        "synonyms": [
                            "dagelijkse,alledaags"
                        ]
                    }
                },
                "analyzer": {
                    "my_synonym_analyzer": {
                        "filter": "my_synonym_filter",
                        "tokenizer": "standard"
                    }
                }
            }
        }
    }

 

- Override Type Mappings: The "LiferayDocumentType" needs to be removed from the content of liferay-type-mappings.json. However it's even better to use the response from the mentioned service and adapt that accordingly.
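Checking the response of that service for the custom analyzer can be sketched as follows. This is a minimal Python sketch, assuming the response has the usual Elasticsearch get-index shape (index name, then "settings", "index", "analysis"). The index name liferay-12345, the sample response and both helper names are hypothetical; fetch_index is only usable once the [HOST], [ES_PORT] and [COMPANY_ID] placeholders in the URL are replaced with real values.

```python
import json
from urllib.request import urlopen


def find_custom_analyzers(index_response):
    """Map each index name in the response to the analyzer names in its settings."""
    found = {}
    for index_name, body in index_response.items():
        analysis = body.get("settings", {}).get("index", {}).get("analysis", {})
        found[index_name] = sorted(analysis.get("analyzer", {}))
    return found


def fetch_index(url):
    # url: the service mentioned above, with the placeholders filled in.
    with urlopen(url) as response:
        return json.load(response)


# Hypothetical, heavily trimmed response used instead of a live server.
sample = {
    "liferay-12345": {
        "mappings": {},
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        "my_synonym_analyzer": {
                            "filter": "my_synonym_filter",
                            "tokenizer": "standard",
                        }
                    }
                }
            }
        },
    }
}

print(find_custom_analyzers(sample))
```

An empty list for an index would suggest that the additional index configuration was not applied to it.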

 

I still have the issue that the synonym is not used for the title. I have the following config, and with it the synonym is used for the content.

    "content": {
        "store": true,
        "search_analyzer": "my_synonym_analyzer",
        "analyzer": "my_synonym_analyzer",
        "term_vector": "with_positions_offsets",
        "type": "text"
    },

    "title": {
        "store": true,
        "search_analyzer": "my_synonym_analyzer",
        "analyzer": "my_synonym_analyzer",
        "term_vector": "with_positions_offsets",
        "type": "text"
    },

 

If you still have any pointers, please share them with me. I will still continue investigating how to properly get everything working.

 

 

Kind regards,

 

Koen

Sorry Koen,

 

I was travelling and couldn't reply to your message. Jonas Choi encountered the same problem you did, because my solution worked with a previous version of Elasticsearch. He realized it didn't work in newer versions of ES, and explained in another blog article (you can probably find it by googling his name and this same topic) how to make it work in ES6. In any case, I'm happy that you found the solution yourself and that you shared it. Thank you.
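The mapping differences that surface in this thread can be sketched in code. The snippet below is a minimal Python sketch, not an official migration tool: it assumes that the older style uses "type": "string" with "index": "analyzed" / "not_analyzed" / "no" and "store": "yes", and that ES 6 expects "text" or "keyword" types with boolean "index" and "store" values. The function name upgrade_field_mapping is made up for illustration, and only the cases seen in this thread are handled.

```python
def upgrade_field_mapping(legacy):
    """Translate an older string-style field mapping into the form ES 6 accepts."""
    upgraded = dict(legacy)
    # Older mappings used "analyzed" / "not_analyzed" / "no" for "index".
    index = upgraded.pop("index", "analyzed")
    if upgraded.get("type") == "string":
        upgraded["type"] = "keyword" if index == "not_analyzed" else "text"
    if index == "no" or index is False:
        upgraded["index"] = False
    if "store" in upgraded:
        # Normalize "yes" / "true" strings to a real boolean.
        upgraded["store"] = upgraded["store"] in (True, "yes", "true")
    return upgraded


legacy_title = {
    "index": "analyzed",
    "store": "yes",
    "search_analyzer": "my_synonym_analyzer",
    "analyzer": "my_synonym_analyzer",
    "term_vector": "with_positions_offsets",
    "type": "string",
}

print(upgrade_field_mapping(legacy_title))
```

Applied to the "title" mapping from the post, this yields the same shape as the ES 6 mapping Koen shared: "type": "text", "store": true and no "index" entry.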