
How to Index External Sites for Use in Liferay Searches

Mik Cantrell, modified 6 years ago.


New Member Posts: 4 Join Date: 09-4-9
I have a project that I would love to use Liferay for. Here are the requirements:

1 - Must be able to store, index and search documents
2 - Must be able to programmatically load files to be indexed and searched
3 - Must be able to index and search external sites

I think Liferay handles the first one pretty well right out of the box, and I've seen documentation suggesting the second is possible. The part that's missing, or that I'm missing, is the third. Is there some way to search external web sites that have been indexed or crawled by third-party solutions such as Manifold, Lucene, or Solr, or is there already something in Liferay for this that I'm not aware of? If so, I would greatly appreciate any guidance you could point me to.

Thanks,
Michael
Jorge Díaz, modified 6 years ago.

RE: How to Index External Sites for Use in Liferay Searches

Liferay Master Posts: 753 Join Date: 14-1-9
Hi Mik,

You have to implement it yourself. Some ideas:

See https://web.liferay.com/es/community/forums/-/message_boards/message/87242969

Another idea would be to create a new ServiceBuilder entity that stores the external URL of a page (similar to the bookmark entity).
That entity would have an Indexer that retrieves the content at the external URL and sends it to the index.
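
To make the "retrieve the page and send it to the index" step more concrete, here is a rough sketch in plain Java of turning a fetched page into the fields an Indexer might put in a search document. `ExternalPageFields`, `toFields`, and the field names `url`, `title`, and `content` are made up for illustration and are not Liferay APIs. Fetching the page and wiring this into an actual Indexer are left out, and the regex-based tag stripping is a simplification (it does not decode HTML entities); a real HTML parser would be more robust.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a Liferay class): converts raw HTML into
// plain-text fields that an Indexer could copy into its search document.
final class ExternalPageFields {

    private static final Pattern TITLE = Pattern.compile(
        "<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern SCRIPT_OR_STYLE = Pattern.compile(
        "<(script|style)[^>]*>.*?</\\1>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    private ExternalPageFields() {
    }

    // Field names are assumptions chosen for this example.
    static Map<String, String> toFields(String url, String html) {
        // Use the <title> element if present, otherwise fall back to the URL.
        Matcher titleMatcher = TITLE.matcher(html);
        String title = titleMatcher.find() ? collapse(titleMatcher.group(1)) : url;

        // Drop scripts, styles, and the title, then strip the remaining tags.
        String body = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        body = TITLE.matcher(body).replaceAll(" ");
        body = TAG.matcher(body).replaceAll(" ");

        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("url", url);
        fields.put("title", title);
        fields.put("content", collapse(body));
        return fields;
    }

    // Collapse runs of whitespace so the indexed text is compact.
    private static String collapse(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }
}
```

The Indexer of the new entity would then call something like `ExternalPageFields.toFields(entity URL, fetched HTML)` and copy the values into the document it sends to the search engine.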

As a final step, you also have to integrate a crawler. It would retrieve all the pages of the site and create them inside Liferay using the new ServiceBuilder entity.
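
The crawler part could look roughly like the sketch below: a small breadth-first crawl that stays on the start URL's host and stops after a page limit. `SiteCrawler` and its `fetcher` parameter are hypothetical names for this example. The fetcher is passed in as a function, so the HTTP client is your choice and the crawl logic can be tried without network access. It ignores robots.txt, redirects, and rate limiting, which a real crawler (or a third-party one) should handle.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch (not a Liferay class): breadth-first, same-host crawler.
final class SiteCrawler {

    // Simplified link extraction; fragment-only links such as "#top" do not match.
    private static final Pattern HREF = Pattern.compile(
        "href\\s*=\\s*[\"']([^\"'#]+)", Pattern.CASE_INSENSITIVE);

    private final Function<String, String> fetcher; // URL -> HTML, or null if unavailable
    private final int maxPages;

    SiteCrawler(Function<String, String> fetcher, int maxPages) {
        this.fetcher = fetcher;
        this.maxPages = maxPages;
    }

    // Returns URL -> HTML for every page reached, in crawl order.
    Map<String, String> crawl(String startUrl) {
        String startHost = URI.create(startUrl).getHost();
        Map<String, String> pages = new LinkedHashMap<>();
        Set<String> seen = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        seen.add(startUrl);
        queue.add(startUrl);

        while (!queue.isEmpty() && pages.size() < maxPages) {
            String url = queue.poll();
            String html = fetcher.apply(url);
            if (html == null) {
                continue; // page could not be fetched
            }
            pages.put(url, html);

            URI base = URI.create(url);
            Matcher matcher = HREF.matcher(html);
            while (matcher.find()) {
                URI resolved;
                try {
                    resolved = base.resolve(matcher.group(1).trim());
                } catch (IllegalArgumentException e) {
                    continue; // skip malformed links
                }
                String host = resolved.getHost();
                // Only follow links on the same host, and only once each.
                if (host != null && host.equalsIgnoreCase(startHost)
                        && seen.add(resolved.toString())) {
                    queue.add(resolved.toString());
                }
            }
        }
        return pages;
    }
}
```

Each entry in the returned map would then be saved as one row of the new ServiceBuilder entity, so that its Indexer picks the page up.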