Creating a Google Like Search Part V: Finale
Company Blogs · November 27, 2017 · By Petteri Karttunen (Staff)
Previous parts of this series can be found here (part 1), here (part 2), here (part 3) and here (part 4).
In this final part of the blog series, a few more interesting features are added to the previously created search portlet: the possibility to use Liferay Audience Targeting to make segmented content more relevant, the possibility to configure sort and facet fields (to any indexed fields), and the ability to fully configure search fields and their boosts. There's also the possibility to make non-Liferay content findable through this search portlet.
There were also quite a few generic improvements I made along the way, so at the end of the day we have a custom Liferay search portlet with the following features:
- Google-like appearance
- Fully Ajax-based interface (no page transitions)
- 3 selectable search result layouts (image card layout available for files / images)
- Sortable search results (not available in default Liferay search)
- Bookmarkable searches with short URLs that can easily be tracked in Google Analytics
- Autocompletion & query suggestions
- Automatic alternate search
- Support for Boolean operators and Lucene syntax
- Configurable asset types to search for
- Configurable facets to retrieve
- Configurable sort fields
- Configurable search fields and their boosts
- Audience Targeting support to boost contents matching the user's segments
- Ability to include non-Liferay resources in the search results
I also added a few notes on how to make this work on CE. Depending on the interest, that could be on the roadmap anyway. I also split the application into separate modules for a cleaner architecture, so if you'd just like to use the backend and build your own UI, that's now possible too.
Results Image Layout
Non-Liferay Assets in the Search Results
Customizing the Liferay Elasticsearch Adapter
Why would you want to modify the Liferay search adapter? The search adapter implementation in Liferay is responsible for implementing the portal search API for a specific search engine. It takes care of, for example, the communication link between the portal and the search engine, the implementation of index searchers and writers, and the translation of portal search queries into native, engine-specific queries. The following diagram roughly illustrates the layering:
The following diagram, on the other hand, illustrates the physical placement of search functionality in bundles and modules:
In this project I had to customize the adapter to get full support for the most versatile and powerful single query type, the Elasticsearch QueryStringQuery.
The Liferay search API evolves all the time, but at the moment, support for QueryStringQuery is sparse: it only allows setting the query string and doesn't let you control any of the roughly 30 other parameters. With that query type you therefore cannot control, for example, boosting, the fields to target the query to, fuzziness, and so on.
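To illustrate what those missing parameters buy you, here is a sketch of a native Elasticsearch query_string query using a few of them (the field names and boost values are made up for the example):

```json
{
  "query": {
    "query_string": {
      "query": "liferay search",
      "fields": ["title^2.0", "description^1.0", "content^1.5"],
      "default_operator": "and",
      "fuzziness": "AUTO",
      "phrase_slop": 2
    }
  }
}
```

None of the parameters beyond the query string itself can be set through the standard Liferay StringQuery type.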
So, for this purpose, I did two things. First, I created a new search query type, QueryStringQuery, extending Liferay's standard query type StringQuery. This new query type is introduced in the gsearch-query-api package; you can think of it as a Liferay search API extension.
The other thing I did was extend the Elasticsearch adapter. While creating a new query type was super easy, this was not. Sure, you can write your own adapter, but how about just extending the standard one a little? Extending a search adapter is currently not as flexible as it could be. When you take a look at the Elasticsearch adapter source code, especially the ElasticsearchQueryTranslator, you see service references to the individual query translators. The first thought would be to use those extension points: just create an alternative implementation of the StringQuery translator with a higher service ranking to replace the standard one. That is how I would have liked to do it. That way, we wouldn't have to modify the adapter at all, and we would keep the maintainability of both our portlet and the portal intact. However, that's not possible, at least at the moment, for two reasons. First, the references in the ElasticsearchQueryTranslator use STATIC and RELUCTANT injection by default.
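For reference, here is a sketch of what a replaceable translator component could look like if the adapter's references were declared dynamic and greedy; the class name, service interface, and property are illustrative, not the actual adapter code:

```java
// Illustrative OSGi Declarative Services sketch, not the real adapter source.
@Component(
    property = "search.engine.impl=Elasticsearch",
    service = StringQueryTranslator.class
)
public class HigherRankedStringQueryTranslator
    implements StringQueryTranslator {

    // The adapter's current references are effectively:
    //   @Reference (policy = STATIC, policyOption = RELUCTANT)
    // which binds once and ignores later, higher-ranked candidates.
    // A DYNAMIC + GREEDY reference, by contrast, would rebind to a
    // higher-ranked service as soon as it appears:
    @Reference(
        policy = ReferencePolicy.DYNAMIC,
        policyOption = ReferencePolicyOption.GREEDY
    )
    private volatile StringQueryTranslator _delegate;

    // Interface method implementations omitted from this sketch.
}
```

With STATIC and RELUCTANT injection, deploying an alternative translator with a higher service ranking has no effect on the already-wired adapter, which is exactly the limitation described above.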
The other problem is that the Elasticsearch adapter module only exposes the com.liferay.search.elasticsearch.settings subpackage. So if you need to reference any other package inside the adapter from your custom service, you will get an import-package error at deploy time, because those packages are private. David Nebinger wrote an excellent piece on overcoming package access restrictions, but there's actually a third, minor problem: the adapter's dependencies are not all OSGi-compliant and are instead embedded in the module. Using, for example, Elasticsearch classes from your custom service, say from a fragment module, currently leads to class-loading problems. Hopefully these limitations will be addressed in the future, but at the moment, modifying the adapter source code seems unavoidable for extending practically any part of a search adapter's functionality.
You can see the details of my adapter customization in GitHub. Basically, I created a custom StringQuery translator implementation which, in the case of our extended StringQuery type, QueryStringQuery, uses a specialized translator, and in the case of the standard StringQuery falls back to the default implementation.
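The dispatch can be sketched roughly like this (the field names are approximations, not the exact project code; see GitHub for the real implementation):

```java
// Rough sketch of the custom StringQuery translator's dispatch logic.
public QueryBuilder translate(StringQuery stringQuery) {

    // Our extended type: hand off to a translator that maps all of
    // the QueryStringQuery parameters to the native Elasticsearch query.
    if (stringQuery instanceof QueryStringQuery) {
        return _queryStringQueryTranslator.translate(
            (QueryStringQuery)stringQuery);
    }

    // A plain StringQuery: preserve the standard adapter behavior.
    return _defaultStringQueryTranslator.translate(stringQuery);
}
```

This keeps the standard behavior untouched for every existing caller while enabling the richer query type for our portlet.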
As there will be official Liferay support for Elasticsearch 6 in the near future, I decided not to upgrade the Elasticsearch adapter to 5.6.
Making Search More Intelligent
I was not completely satisfied with the relevancy of the portal's standard search and wanted to play with the idea of improving hit relevancy by making field options, like boosting, configurable and by implementing some machine-learning features. I tried to make the API easily customizable so you can implement your own features on top of it.
So one of the first thoughts was that it would be great to integrate the Audience Targeting feature with search. That way you could boost contents segmented for your user segments and, even better, have a dynamic way to control search hit relevancy. As this is a DXP-only feature, you can enable and disable it in the configuration options. By default, it's disabled.
How does it work? If the current user belongs to any user segment, a condition is added to the query giving the configured boost to any contents matching that segment.
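In Elasticsearch terms, the effect is roughly an optional clause like the following; the field name userSegmentIds, the segment ID values, and the boost are assumptions for illustration, not the project's actual field mapping:

```json
{
  "bool": {
    "should": [
      {
        "terms": {
          "userSegmentIds": ["35421", "35427"],
          "boost": 2.0
        }
      }
    ]
  }
}
```

Because the clause is a should (not a must), contents outside the user's segments still match; segmented contents simply score higher.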
Making Non-Liferay Resources Findable
When starting this project, I planned to do some experiments on getting non-Liferay assets into the search results in conjunction with search engine federation, but decided to leave federation out of scope, as it's something that should usually be transparent to the client (the portal, in this case). I'll just mention here that both Elasticsearch and Solr have means to make that possible.
So how do you make search find things in the index that are not Liferay assets? For example, you might have several integration points in your portal with resources that should be findable through Liferay. Usually the options are: make those resources portal assets so that they can be found by portal search, or create a dedicated, custom search for these external resources. Both of these solutions have plenty of challenges or usability issues, so how about getting everything onto the same result list?
In this simple, imaginary scenario, external resources are indexed in the portal search index. Basically, what you have to do is make sure these external documents have the fields our custom search needs to find them. You can find an example and more information on the GitHub page.
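As a rough illustration, an external document pushed into the portal index might carry fields like these; the exact field set depends on your search configuration, and all values below are hypothetical:

```json
{
  "entryClassName": "com.example.external.ExternalResource",
  "companyId": "20101",
  "title": "External resource title",
  "content": "Searchable body text of the external resource.",
  "modified": "20171127095213"
}
```

The point is simply that the document must expose the same fields (type, company scope, searchable text, and so on) that the portlet's queries and filters target; see the GitHub page for the working example.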
Just a reminder: this solution only works with our custom portlet, not with the standard Liferay search portlet. Also, in a real-world scenario, indexing external documents in the Liferay portal index would not be recommended by any means. To make our custom portlet find resources in custom indexes, you would just need to customize the Elasticsearch search adapter a little further, mainly the ElasticsearchIndexSearcher class, where the indexes to search are determined.
So, that's it for this blog series. For more information and details, please see the project's GitHub page.