Discussion Forums

How to Index External Sites for Use in Liferay Searches

Mik Cantrell, modified 6 years ago.


New Member Posts: 4 Join Date: 09/04/09
I have a project that I would love to use Liferay for. Here are the requirements:

1 - Must be able to store, index and search documents
2 - Must be able to programmatically load files to be indexed and searched
3 - Must be able to index and search external sites

I think Liferay does the first one pretty well right out of the box, and I've seen documentation suggesting the second is possible. The part that is missing, or that I'm missing, is the third. Is there some way to search external websites that have been indexed or crawled by third-party tools such as Manifold, Lucene, or Solr, or is there already something in Liferay for this that I'm not aware of? I would greatly appreciate any guidance you could point me to.

Thanks,
Michael
Jorge Díaz, modified 6 years ago.

RE: How to Index External Sites for Use in Liferay Searches

Liferay Master Posts: 753 Join Date: 09/01/14
Hi Mik,

You have to implement it yourself. Some ideas:

See https://web.liferay.com/es/community/forums/-/message_boards/message/87242969

Another idea would be to create a new ServiceBuilder entity that stores the URL of an external page (similar to the bookmark entity).
That entity would have an Indexer that retrieves the content at the external URL and sends it to the search index.
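
To give a rough idea of what such an Indexer would have to do with each URL, here is a minimal sketch in plain Java (11+). It is not Liferay API code: the class name `ExternalPageFields` and the field names `url`, `title` and `content` are made up for illustration, and the regex-based tag stripping is a simplification that a real HTML parser would replace. In an actual Indexer you would copy these values into the search document instead of a map.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns one external page into indexable text fields.
public class ExternalPageFields {

    private static final Pattern TITLE = Pattern.compile(
        "<title[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern SCRIPT_OR_STYLE = Pattern.compile(
        "<(script|style)[^>]*>.*?</\\1>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Downloads the raw HTML of the external page.
    public static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Extracts a title and the visible text from the HTML.
    public static Map<String, String> toFields(String url, String html) {
        Matcher titleMatcher = TITLE.matcher(html);
        // Fall back to the URL when the page has no <title> element.
        String title = titleMatcher.find() ? titleMatcher.group(1).trim() : url;

        // Drop script/style bodies first, then all remaining tags.
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        text = text.replaceAll("\\s+", " ").trim();

        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("url", url);
        fields.put("title", title);
        fields.put("content", text);
        return fields;
    }
}
```

You would call `toFields(url, fetch(url))` when the entity is reindexed. Note that the title text also ends up in `content`, because only the tags themselves are stripped.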

As a final step, you also have to integrate a crawler. It will retrieve all pages of the site and create an entry for each one inside Liferay using the new ServiceBuilder entity.
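
The crawler part could look something like the sketch below: a breadth-first walk that stays on the starting host and returns the URLs it managed to fetch. Again this is plain Java, not Liferay code, and `SiteCrawler`, `crawl` and `extractLinks` are names I made up. The page fetcher is passed in as a function so you can plug in whatever HTTP client you like. For each returned URL you would then create one row of the new ServiceBuilder entity. A real crawler should also respect robots.txt and throttle its requests, which this sketch does not do.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical same-host crawler; the fetcher returns HTML or null on failure.
public class SiteCrawler {

    // Captures the href value up to the closing quote or a '#' fragment.
    private static final Pattern HREF = Pattern.compile(
        "href\\s*=\\s*[\"']([^\"'#]+)", Pattern.CASE_INSENSITIVE);

    // Resolves every href in the page against the page's own URL.
    static List<String> extractLinks(String pageUrl, String html) {
        List<String> links = new ArrayList<>();
        URI base = URI.create(pageUrl);
        Matcher matcher = HREF.matcher(html);
        while (matcher.find()) {
            try {
                links.add(base.resolve(matcher.group(1).trim()).normalize().toString());
            } catch (IllegalArgumentException e) {
                // Malformed href: skip it.
            }
        }
        return links;
    }

    // Breadth-first crawl; startUrl should include a path, e.g. "http://example.com/".
    public static Set<String> crawl(
            String startUrl, Function<String, String> fetcher, int maxPages) {
        String host = URI.create(startUrl).getHost();
        Set<String> seen = new LinkedHashSet<>();
        Set<String> crawled = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(startUrl);

        while (!queue.isEmpty() && crawled.size() < maxPages) {
            String url = queue.poll();
            if (!seen.add(url)) {
                continue; // already processed
            }
            String html = fetcher.apply(url);
            if (html == null) {
                continue; // fetch failed, nothing to index
            }
            crawled.add(url);
            for (String link : extractLinks(url, html)) {
                // Only follow links that stay on the starting host.
                if (host != null && host.equals(URI.create(link).getHost())
                        && !seen.contains(link)) {
                    queue.add(link);
                }
            }
        }
        return crawled;
    }
}
```

The `maxPages` limit is there so a large site cannot keep the crawl running indefinitely.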