Search Engine Optimization
Table of Contents
- Sitemap Protocol
- Friendly URLs
- Title, keywords and description
- Ideas for improvements
Search engine optimization (SEO) is the process of improving the volume or quality of traffic to a web site from search engines via "natural" or un-paid ("organic" or "algorithmic") search results (Source: wikipedia.org).
SEO in Liferay #
Liferay is often used to build public websites, and for this reason it provides a wide range of features to help make such sites show up at the top of the search results.
These features have been improved over the last few versions of the product, and as of version 5.2 Liferay is one of the best tools available for building SEO-friendly sites.
This wiki page describes all the available features.
Sitemap Protocol #
Liferay implements the Sitemap Protocol to notify Google or Yahoo of the sitemap of a web site.
According to the Sitemap Protocol website: "Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site." (Source: sitemaps.org)
Reporting our Sitemap to search engines #
Web site administrators can access this feature by going to "Manage Pages" for their organization or community (this can be done, for example, from the Control Panel). Once within the Manage Pages UI, click the "Settings" tab and then the "Sitemap" tab.
The UI shown allows the site administrator to preview the generated sitemap or send it to Google or Yahoo. This last option requires the user to be logged in. Once the sitemap is sent, the search engine will use it to understand the structure of the site, make sure it crawls all pages, and offer specific results to users.
Sending the sitemap only needs to be done once per site, since the search engines keep the URL and ask for the sitemap again periodically. Even so, some people like to resend it to make sure the search engine is up to date.
Customizing the sitemap #
Liferay allows administrators to specify that some pages are more important than others, or even that some pages should not be part of the sitemap. By doing this, they can make sure that the pages containing the information people will be looking for rank higher within the search engines.
These options can be set for each page when editing them from within the "Manage Pages" UI, specifically in a form section called "Robots".
This section has three fields:
- Include: whether the page being edited should be included in the sitemap or not.
- Priority: priority of the page compared to others of the same site, from 0 to 1.0.
- Frequency of change: from always (meaning very often) to never. The default is daily.
Note that the search engines interpret this information relative to the other pages of the same site. Giving all pages maximum priority or maximum frequency of change is the same as giving them all the minimum of both. Therefore, use these options only to highlight specific pages that you know people will be searching for.
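To make the three fields concrete, here is a minimal sketch of how they map onto the elements defined by the Sitemap Protocol (loc, changefreq and priority). It is not Liferay's actual implementation: the build_sitemap function, the page dictionaries and the example.com URLs are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap Protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for the given pages (hypothetical helper).

    Each page is a dict with a "loc" key and optional "include",
    "priority" and "changefreq" keys, mirroring the three fields above.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        # "Include" field: excluded pages never reach the sitemap.
        if not page.get("include", True):
            continue
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # "Frequency of change" field: daily is the default described above.
        ET.SubElement(url, "changefreq").text = page.get("changefreq", "daily")
        # "Priority" field: a value from 0 to 1.0, relative to the same site.
        if "priority" in page:
            ET.SubElement(url, "priority").text = f"{page['priority']:.1f}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages: one highlighted, one de-emphasized, one excluded.
pages = [
    {"loc": "http://www.example.com/web/guest/home", "priority": 1.0},
    {"loc": "http://www.example.com/web/guest/archive",
     "priority": 0.2, "changefreq": "yearly"},
    {"loc": "http://www.example.com/web/guest/internal", "include": False},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Only the first two pages appear in the output, and their differing priority values are what give a search engine a relative signal within the site.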
Friendly URLs #
Search engines read the words present in the URL of a given web page and give them a higher relevance than those found in the body of the document. For this reason, using URLs that reflect the contents of a page is a very good way to increase the chances of that page appearing at the top of the search results when users search for those words.
Liferay provides friendly URLs extensively for this purpose. Specifically a Liferay URL has the following structure:
- General prefix: Either /web if it's a public site or /group if it's private.
- Friendly URL part for the community, organization or user to which the pages belong: For example /guest. This part is set by default based on the name of the community, organization or user. It can be changed from within Manage Pages > Settings > Virtual Host. From this same UI it's also possible to assign a virtual host to the site so that the general prefix and this URL part are hidden.
- Friendly URL part for the specific page: For example /home. This part is set by default based on the name of the page and can be changed in the form to edit the page.
- Friendly URL part for the specific application being navigated. This part may be more or less friendly depending on how the portlet was implemented. Most Liferay portlets have friendly URLs, and in several cases the URLs even include the name of the content being shown.
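The four parts listed above can be sketched as a simple concatenation. The friendly_url function and its parameter names are hypothetical, and the application path used in the second call is an invented example, not a path that any particular Liferay portlet is known to produce.

```python
def friendly_url(site, page, private=False, app_path=""):
    """Compose a URL path from the four parts described above (sketch)."""
    # General prefix: /web for public sites, /group for private ones.
    prefix = "/group" if private else "/web"
    # Friendly URL parts for the site (community, organization or user)
    # and for the specific page.
    parts = [prefix, "/" + site.strip("/"), "/" + page.strip("/")]
    # Optional part contributed by the application being navigated.
    if app_path:
        parts.append("/" + app_path.strip("/"))
    return "".join(parts)


print(friendly_url("guest", "home"))
print(friendly_url("guest", "news", private=True,
                   app_path="blogs/my-first-entry"))
```

With a virtual host assigned to the site, the prefix and the site part would be hidden, leaving only the page and application parts visible in the path.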
Title, keywords and description #
Besides the URL, search engines also read certain HTML tags, called the meta HTML tags, to find specific information about the contents of the page. Specifically, the most important of these meta tags are:
- Title: Short description of the contents of the page. Should be unique across the site.
- Description: Longer description of the contents of the page (but it shouldn't be too long either).
- Keywords: a comma-separated list of words that identify the key subjects covered in the site.
If the page provides proper values for these three items the search engine will give them a higher relevance than the words within the body of the page.
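As an illustration of what these three items look like in the generated HTML, the following sketch renders them as a head fragment. The render_head function and the sample values are hypothetical; this is not how Liferay produces its markup, only the standard HTML form of the tags.

```python
from html import escape


def render_head(title, description, keywords):
    """Render the title, description and keywords as HTML (sketch)."""
    # Keywords are emitted as a single comma-separated list.
    keyword_list = ", ".join(keywords)
    return "\n".join([
        # escape() protects against characters such as & and " in values.
        f"<title>{escape(title)}</title>",
        f'<meta name="description" content="{escape(description)}">',
        f'<meta name="keywords" content="{escape(keyword_list)}">',
    ])


head = render_head(
    "Home - Example Site",
    "Overview of products & services offered by Example Site.",
    ["portal", "cms", "seo"],
)
print(head)
```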
In Liferay (as well as most other portals and CMSs) there are pages that are specifically created by an administrator while others are generated automatically as a result of the links clicked by the end user. For the former, Liferay provides the administrator ways to specify the meta information. For the latter, the portlet being navigated should be smart enough to generate them automatically. The next sections provide more information about both scenarios.
Setting the title, keywords and description of the pages of a site #
To be written.
Generation of title, keywords and description by Liferay's portlets #
To be written.
Framework to make custom portlets SEO friendly #
To be written.
The old days #
A long time ago, all portal products, including Liferay, were known for having very ugly URLs. Liferay was one of the first Java portal platforms to show that portals can have URLs as pretty as those of any other web platform.
First improvements #
Liferay started adding SEO-friendly features in versions 4.6 to 5.1. First, the ability to configure specific friendly URLs for certain pages was added. Version 5.1 was the first to use friendly URLs throughout. These versions also brought several improvements to the generated metadata, as well as the implementation of the Sitemap Protocol.
But it was in version 5.2 that all these features were taken to the next level.
Improvements in 5.2 #
- Friendlier URLs by default: The friendly URLs generated for pages, communities and organizations are now based by default on their names. No more friendly URLs with numbers!
- Automatic generation of META tags in Asset Publisher:
  - Keywords: Whenever content is being shown, its categories and tags are added to the keywords meta tag.
  - Description: Whenever content is being shown, its description is added to the description meta tag.
- Unique titles for all pages: The title now has information about the portal, the community/org, the page and the portlet. Note that this has required a change to the default
- Framework to set the title and meta tags values from custom portlets
- Friendly URLs in Asset Publisher when viewing a specific asset (for assets that support it)
Ideas for improvements #
Here are some ideas that have been suggested to improve the SEO capabilities of Liferay:
- The sitemap implementation currently generates only one entry per Liferay page. In some cases a more in-depth sitemap is desired, one that includes an entry for each piece of content and for all views generated through navigation within the portlets. Some people have reported achieving this by using an external crawler. It's yet to be seen if or how this could be done within Liferay.
- Better friendly URL support for asset types such as message boards, image gallery, document library, ... which don't have specific friendly URLs for individual assets.
References #
- Wikipedia Article: http://en.wikipedia.org/wiki/Search_engine_optimization
- Sitemap Protocol: http://sitemaps.org/