Liferay 6.1 simple cluster

Setting up a Liferay cluster is simply a matter of putting some properties in portal-ext.properties. First you'll add these three properties:

cluster.link.enabled=true

cluster.link.autodetect.address=<some-address>:<some-port>

lucene.replicate.write=true

The first will enable Liferay to replicate both database caches (ehcache) and search indexes (lucene) through the ClusterLink channel.

The second tells Liferay which network interface to use for the multicast communication. You must specify an IP address and a port so that Liferay can open a connection ("ping") to that socket (IP + port) and detect which local interface to use.

The third will enable Lucene to replicate the indexes across the nodes.

This kind of IP+port verification may seem a bit awkward (I guess it's so that both Windows and Unix systems work the same way), but the idea is to point it to a proxy or gateway so that all nodes can use the same IP+port pair. If, for some reason, you can't give an IP+port to Liferay, due to a firewall or anything else, you may use the IP+port of the database. I don't see any reason to block the database's IP from Liferay ;)
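For example, if your database listens on 192.168.1.10 port 3306 (these values are just an illustration; use your own database host and port), every node could simply use:

cluster.link.autodetect.address=192.168.1.10:3306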

In some scenarios, you may want to use a specific interface exclusively for the multicast communication. That's fine, and you can always use the IP+port of that interface on each node individually. The only question is which port to use. On Linux you can use port 22, since it is a commonly open port in many distributions. Use "netstat" to see which ports could be used for this purpose on your system.
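For example, a quick way to list the ports already listening on a node (the exact flags may vary between distributions) is:

$ netstat -tln

Any listening address and port shown there that Liferay can reach is a candidate for the autodetect check.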

Documents and Images

Unlike what some folks may think, documents and images aren't stored in the database; they are stored in the filesystem, so performance can be much, much better.

Of course, documents and images must be accessible from all Liferay nodes. The good practice for that is to use a shared storage area. Then you can point Liferay to the mounted storage directory. This is done in portal-ext.properties like this:

dl.store.file.system.root.dir=/my/storage/path/document_library

Or, for Windows servers:

dl.store.file.system.root.dir=G:/document_library

Note that you have to use forward slashes instead of backslashes.

You will end up with these few simple properties in the portal-ext.properties of each node.
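For reference, the portal-ext.properties of each node could end up looking something like this (the address and path below are only examples; adjust them to your environment):

cluster.link.enabled=true
cluster.link.autodetect.address=192.168.1.1:80
lucene.replicate.write=true
dl.store.file.system.root.dir=/my/storage/path/document_library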

Deploy portlets, plugins, themes...

Since the very first version of Liferay, the replication of portlets and plugins has always been a task for the application server you're using to run Liferay.

Most of them have a feature called "farm" deployment that does this job, and Liferay is aware of this.

But, to simplify things, you can write a script for that. I wrote one that does it over an SSH connection using the rsync tool, which is available in most Linux distributions.

This script can be downloaded from here: https://github.com/ricardofunke/ndeploy/blob/master/ndeploy.sh

To install it, copy the script to all your Liferay nodes. I recommend creating a folder called "ndeploy" inside your LIFERAY_HOME and copying the ndeploy.sh script into it.

Next to the script, inside the same "ndeploy" folder, create another folder called "deploy", where the script will watch for .war files to be replicated to all Liferay nodes.
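The layout inside LIFERAY_HOME would then look something like this:

LIFERAY_HOME/
  ndeploy/
    ndeploy.sh
    deploy/        <- watched by the script for new .war files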

Grant execute permission to the script with "chmod +x ndeploy.sh"

Edit the script and set the NODES variable with the IPs of all your Liferay nodes except the local one. Don't forget to do this on every node.

Change the APPSRV_HOME variable to the correct path of your application server; this is the path of your LIFERAY_HOME.
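As an illustration, on node1 the variables might be set like this (the IPs and the path are placeholders; check the script itself for the exact format it expects):

NODES="192.168.1.42 192.168.1.43"
APPSRV_HOME="/opt/liferay-portal-6.1.1"

On node2 the NODES variable would list node1 and node3 instead, and so on.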

You must create SSH public keys and distribute them between all nodes using the liferay user. This is necessary to eliminate the need for a password between the nodes. For example, suppose you're using "liferay" as the user that runs the portal (the Java process) on the "node1" server.

As root:

# su - liferay

$ ssh-keygen

Press Enter without setting any passphrase

$ ssh-copy-id -i .ssh/id_rsa.pub liferay@node2

Do the same for all Liferay nodes you have.

Now you have to make sure that this script runs side by side with Liferay as a daemon. Use the -d option to run the script as a daemon; you can put it in your Liferay startup script. Remember to run this script as the same Liferay user.
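For example, assuming you copied the script to LIFERAY_HOME/ndeploy (the path below is just an example), you could start it like this, or add the equivalent line to your startup script:

# su - liferay

$ /opt/liferay-portal-6.1.1/ndeploy/ndeploy.sh -d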

Finally, Liferay must be set to deploy applications not directly to the application server, but first to ndeploy.sh's "deploy" folder, so that the script can copy them to all the other nodes. The deployment flow will be like this:

  1. You copy your application to the Liferay deploy folder (or upload it through the Control Panel)
  2. Liferay copies the application to the ndeploy.sh "deploy" folder
  3. ndeploy.sh copies the application to the application server on all nodes, locally and remotely.

To do this, put this property into your portal-ext.properties:

auto.deploy.dest.dir=/path/to/ndeploy/deploy

Change the value to the path of the ndeploy.sh "deploy" folder. Alternatively, you can set this in the Control Panel, under Control Panel -> Plugins Installation -> Install More Plugins -> Configuration, in the "Destination Directory" field.

That's all. Let me know if you have any trouble with this installation.

Comments
Ricardo, thanks a lot for sharing the script to do indirect farming; I was looking for that kind of solution for a long time for the clustered environment.
Awesome... I don't have a specific query, but I assume the approach you are following is to copy the WAR from a shared location to the "deploy" directory of the app servers in the cluster, right?
Well, the idea is to have this script on all the nodes, watching the ndeploy/deploy folder.

Then you set Liferay to copy to this folder, so for Liferay users the script is transparent. If you want to deploy a portlet, you deploy it as usual to Liferay's deploy folder (or upload it via the Control Panel); Liferay then copies it to the ndeploy/deploy folder, and the script copies it from there to the application server on all nodes.

This is the same for all Liferay nodes, so you won't have a unique shared location.
Hi Ricardo, it's a very helpful post!!
About the property cluster.link.autodetect.address in portal-ext.properties, I verified that there is a connection to <some-address> and <some-port> with the command telnet <some-address> <some-port> (port 80 in this case).
But when I'm starting up Liferay there is a message like "Did not receive a response to the deployment operation within the allowed timeout period" and Liferay eventually stops starting up.
What am I doing wrong? Is there something else that I must configure?
And... how can I check if Liferay has been correctly configured as a simple cluster?
Thanks for your answer.
Are you sure that port 80 is open so that Liferay can check it successfully?
It seems to be a JBoss error, not related to the Liferay cluster. It looks like JBoss couldn't deploy something, perhaps Liferay itself.
Hi again. Sorry but I'm very new about configuring Liferay in cluster mode.
My previous issue was a hardware problem (memory leak) rather than a JBoss problem; I'm running Liferay bundled with JBoss in a virtual machine.
Now I'm trying to do my best configuring Liferay bundled with Tomcat (liferay-portal-tomcat-6.1.1-ce-ga2) and have some questions:
Do I have to put the same portal-ext.properties in both nodes?
Should I run a clean copy of Liferay (first run)?
Is there an order for starting up both Liferay instances?

Well, at this point I did everything I'm asking about and found that the Liferay instance started on the second node needs to be configured again (the welcome screen, defining a new user password, the password reminder, etc.), which I think is wrong because it's in "cluster mode".
I'm confused about this. Is there another way to do it? On other pages I found instructions to do this by modifying some XML files, installing the Liferay WAR, etc. (sometimes confusing too), but not for Liferay 6.1 and JBoss 7.1.

Thanks again for your answer.
Hi Daniel,

"Do I have to put the same portal-ext.properties in both nodes?"
Yes, you do

"Should I run a clean copy of Liferay (first run)?"
Sorry, I didn't get this question

"Is there an order for starting up both Liferay instances?"
No, but it's good to wait for one node to finish starting before starting the others.

"Well, at this point I did everything I'm asking about and found that the Liferay instance started on the second node needs to be configured again (the welcome screen, defining a new user password, the password reminder, etc.), which I think is wrong because it's in "cluster mode"."
No, it's because you set the nodes to different databases. All the nodes must be set to the same database.

"I'm confused about this. Is there another way to do it? On other pages I found instructions to do this by modifying some XML files, installing the Liferay WAR, etc. (sometimes confusing too), but not for Liferay 6.1 and JBoss 7.1."
No, all the cluster configuration is done here in this blog post.

Thanks for the questions ;)
Hi Ricardo, it's me again. I'm glad to tell you that I'm making some progress setting up this Liferay cluster.
Just one more question: must the path for the document library folder be created on each node of the cluster, or must it be created on a shared folder, let's say a file server? If the second option is correct, how can I do that on a Red Hat file server?
Thanks again, really.
Hi Ricardo,

Thanks for your tutorial. With it, it is very easy to configure a cluster.

But I have a problem with the shared folder.
Liferay doesn't use the correct path; it drops the separator after the drive letter in the path.

Error:
com.liferay.portlet.documentlibrary.NoSuchFileException: Z:document_library\10155\10181\6\1.0

I'm working on Windows with Liferay CE 6.1.1 GA2.
These are the properties used in portal-ext.properties:
dl.store.file.system.root.dir=Z:\document_library
dl.store.impl=com.liferay.portlet.documentlibrary.store.FileSystemStore

Have you run into this problem?
Hi Ricardo, it's me again. Finally my "simple" cluster is working with Liferay CE 6.1.1 bundled with JBoss, but I'm getting a fatal error when I try to do the same with the 6.1.20 EE version. I know it has to do with the license, and the question is: what type of license do I need for clustering? The message in the log is the following:
"Clustering has been detected. Developer licenses do not allow for clustering. Local server is shutting down."

The other question is: does this simple cluster set the session mode to STICKY sessions in Liferay? Or do I have to configure something else in ROOT.war or JBoss to get this feature enabled?
I ask because I'm afraid that in production they are using a different load balancer, a hardware balancer, instead of the software balancer that I'm using.

Well, I hope you can answer my questions. Thanks again.
Hi Daniel,

Sticky sessions are just a way of doing load balancing; there are a lot of balancers (hardware or software) that support this.

Liferay works better with sticky sessions when there's a need for authentication/authorization or any other use of the user session.

About the license: you'll need to purchase a production or non-production license in order to use clustering, or you may contact Liferay's commercial area.

Regards
Hi Ricardo - This is awesome, as it is so easy to follow. However, how does each node know about the other nodes? Where do I put the multicast IP address? Can you provide some test cases that I can run to validate that the cluster is set up correctly?

Thanks
Hi Hoa La,

The nodes communicate through multicast. If you take a look at portal.properties you'll see, in the Multicast section:

##
## Multicast
##

#
# Consolidate multicast address and port settings in one location for easier
# maintenance. These settings must correlate to your physical network
# configuration (i.e. firewall, switch, and other network hardware matter)
# to ensure speedy and accurate communication across a cluster.
#
# Each address and port combination represent a conversation that is made
# between different nodes. If they are not unique or correctly set, there
# will be a potential of unnecessary network traffic that may cause slower
# updates or inaccurate updates.
#

#
# See the property "cluster.link.channel.properties.control".
#
multicast.group.address["cluster-link-control"]=239.255.0.1
multicast.group.port["cluster-link-control"]=23301

#
# See the properties "cluster.link.channel.properties.transport.0" and
# "cluster.link.channel.system.properties".
#
multicast.group.address["cluster-link-udp"]=239.255.0.2
multicast.group.port["cluster-link-udp"]=23302

#
# See the property "cluster.link.channel.system.properties".
#
multicast.group.address["cluster-link-mping"]=239.255.0.3
multicast.group.port["cluster-link-mping"]=23303

#
# See the properties "net.sf.ehcache.configurationResourceName" and
# "net.sf.ehcache.configurationResourceName.peerProviderProperties".
#
multicast.group.address["hibernate"]=239.255.0.4
multicast.group.port["hibernate"]=23304

#
# See the properties "ehcache.multi.vm.config.location" and
# "ehcache.multi.vm.config.location.peerProviderProperties".
#
multicast.group.address["multi-vm"]=239.255.0.5
multicast.group.port["multi-vm"]=23305

So if you want to change these values, just put these properties in your portal-ext.properties with your own settings.
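For example, to move the cluster onto different multicast groups you could add overrides like these to the portal-ext.properties of every node (the addresses and ports below are arbitrary examples; the remaining keys follow the same pattern):

multicast.group.address["cluster-link-control"]=239.255.10.1
multicast.group.port["cluster-link-control"]=24301
multicast.group.address["cluster-link-udp"]=239.255.10.2
multicast.group.port["cluster-link-udp"]=24302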

For the test cases, just add a portlet to any page on one node, then go to the other node to see if the portlet is there. This will test the EhCache clustering.

For Lucene, add a web content in one node, then search for that content on the other node.

Also, you can put the lines below in $TOMCAT/webapps/ROOT/WEB-INF/classes/log4j.properties to debug the cluster communication. Don't do that in a production environment!

log4j.logger.net.sf.ehcache=INFO
log4j.logger.net.sf.ehcache.config=DEBUG
log4j.logger.net.sf.ehcache.distribution=DEBUG

Best regards
Hi Ricardo, it's... me... again. Happy because the cluster works, but a little sad because I noticed that when the firewall is disabled (running Liferay on Fedora) the cache replication works correctly, I mean changes on one node are reflected on the other node, but... when the firewall is enabled, cache replication is NOT working; it's like each node doesn't know about the other one.

My portal-ext.properties is as simple as this:
##-----------------------------------------------------------------------------------------------------
cluster.link.enabled=true
cluster.link.autodetect.address=192.168.1.40:80

#
# MySQL
#
jdbc.default.driverClassName=com.mysql.jdbc.Driver
jdbc.default.url=jdbc:mysql://192.168.1.40:3306/lportal?useUnicode=true&characterEncoding=UTF-8&useFastDateParsing=false
jdbc.default.username=us3r
jdbc.default.password=passw0rd
##--------------------------------------------------------------------------------------

I made sure that my local network configuration and the /etc/hosts file are fine. Can you tell me what changes I must make to resolve this issue? Thanks again.
I suppose you should enable the multicast communication on the firewall. Refer to the Multicast section in portal.properties for the multicast IPs and ports.
And how do I enable the multicast communication? At the moment I have opened UDP ports 23301 to 23305 in my firewall, and port 5353 for multicast is also open, I guess; what else? What are those IPs 239.255.0.1, 239.255.0.2, ..., 239.255.0.5 used for? What is the correct port for multicast communication?
I have the JBoss server bound to all interfaces and the output of the netstat command is as follows:

[liferay@node01 Documents]$ netstat -an | grep -i udp
udp        0      0 0.0.0.0:631            0.0.0.0:*
udp        0      0 127.0.0.1:41679        0.0.0.0:*
udp        0      0 239.255.0.1:23301      0.0.0.0:*
udp        0      0 239.255.0.2:23302      0.0.0.0:*
udp   214016      0 0.0.0.0:23304          0.0.0.0:*
udp        0      0 0.0.0.0:23304          0.0.0.0:*
udp   212992      0 0.0.0.0:23305          0.0.0.0:*
udp        0      0 0.0.0.0:23305          0.0.0.0:*
udp        0      0 127.0.0.1:60210        0.0.0.0:*
udp        0      0 0.0.0.0:5353           0.0.0.0:*
udp        0      0 0.0.0.0:53488          0.0.0.0:*
udp        0      0 224.0.75.75:7500       0.0.0.0:*
udp        0      0 224.0.75.75:7500       0.0.0.0:*

Maybe something clearer could help me. Thanks Ricardo!
Hi Daniel,

These are the multicast addresses. I suggest you do some research about multicast communication.

Liferay's multicast communication is already enabled; that's why it works without the firewall.
Thank you Ricardo. It seems my problem was solved by adding the following line to the iptables file:
-A INPUT -m pkttype --pkt-type multicast -j ACCEPT

I'll give credits to https://www.ibm.com/developerworks/mydeveloperworks/blogs/fe313521-2e95-46f2-817d-44a4f27eba32/entry/configuring_iptables_for_ip_multicast1?lang=en
This might be of some use: we published an article on how to create a highly available cluster for Liferay in the cloud with Jelastic.

http://blog.jelastic.com/2013/06/06/liferay-cluster/
Hi, Ricardo, we are using Liferay 6.1 CE and I was wondering if there is a limit to the number of servers you can cluster with ClusterLink. We set up as shown in the blog entry and we have 8 servers. Sometimes the last server that starts up will hang while it is trying to establish the Liferay control channel for multicasting.
Yes Edmund, this setup has a scaling limit, but be aware that it is not a fixed number of nodes. It depends on the data transmitted by your portal through the channel; the behavior of your whole portal will determine this limit.

As an alternative, you can use Terracotta and Solr (or another search engine) to bring more scalability to your portal in this context, as they decouple these components onto separate servers that can grow horizontally.

Did this answer your question?
Hi Ricardo
I'd like to add my thanks for this post - very helpful. I do have a couple of questions though:
1) In a 2-node cluster, in the portal-ext files, I have the
cluster.link.autodetect.address set up thusly:
On node 1:
cluster.link.autodetect.address=NODE_2_ADDRESS:8080
and on node 2:
cluster.link.autodetect.address=NODE_1_ADDRESS:8080
Is this correct?

And secondly
to use your "ndeploy" script, wouldn't it be preferable to actually deploy it outside the Liferay hierarchy? In case of an upgrade, you would just need to change the locations within the script, rather than salvaging the script and directories from the old hierarchy and re-deploying into the upgraded one. I know - not a LOT of work either way, but maybe saves a bit of planning....
Hi Rick,

The answer to the first question is no; I don't recommend using port 8080 of the other nodes, because Liferay will not start if the other node is down.

If you use Linux, I suggest using port 22 (or whatever SSH port you are using) and the IP of the local node, not the remote one.

This IP:port is just used to detect which network interface Liferay will use for the multicast communication. It's best to use the same IP:port on all nodes, because those files are easier to maintain if all of them are exactly the same.

For the second question, you're right; you can put the ndeploy folder anywhere you prefer.

Best regards
Hi Ricardo,

I followed your clear guide.
I've got two machines in a Liferay cluster, 6.1 CE GA3, installed on Tomcat 7.0.40.
I have a third machine with Solr and a shared folder used for the document library. All the servers are Windows 2008 R2 Standard Edition.
I've configured the property dl.store.file.system.root.dir and all the other necessary properties, but when I try to upload a file I receive this error:
Caused by: java.io.FileNotFoundException: X:\document_library\10253\10279\70\701.afsh\701_1.0.afsh (The system cannot find the path specified)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:194)
at java.io.FileOutputStream.<init>(FileOutputStream.java:145)
at com.liferay.portal.util.FileImpl.write(FileImpl.java:857)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.liferay.portal.security.lang.DoPrivilegedHandler.doInvoke(DoPrivilegedHandler.java:88)
at com.liferay.portal.security.lang.DoPrivilegedHandler.invoke(DoPrivilegedHandler.java:56)
at com.sun.proxy.$Proxy61.write(Unknown Source)
at com.liferay.portal.kernel.util.FileUtil.write(FileUtil.java:388)
at com.liferay.portlet.documentlibrary.store.FileSystemStore.addFile(FileSystemStore.java:79)
at com.liferay.portlet.documentlibrary.store.StoreProxyImpl.addFile(StoreProxyImpl.java:65)
at com.liferay.portlet.documentlibrary.store.SafeFileNameStoreWrapper.addFile(SafeFileNameStoreWrapper.java:85)
at com.liferay.portlet.documentlibrary.store.DLStoreImpl.addFile(DLStoreImpl.java:130)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

I checked that the user has permission to write to the shared folder.

Can you help me?

Thanks in advance

Marco
Hi Ricardo,
I am new to clustering; please clarify how to set the session timeout in a cluster.

Thanks in advance