The right tool for the job? Chapter 1: Instances

Liferay comes with so many features that it's hard to judge when a feature is a good solution for a given problem. I'd like to shine some light on some of these features and on common misconceptions about them, because it's easy to abuse them for purposes they're not well suited for - despite giving the impression that they might be.
CC BY-ND 2.0 by S. Benno

Today I'm starting with Liferay Instances.

TL;DR? Skip to the last section, which lists the common "wrong problems" that instances are used to solve. While they are a great feature, they don't necessarily solve every problem thrown at them in the expected way.

What is an instance?

If you go to Control Panel / Configuration / Portal Instances, you'll find the list of available instances and can add more. A new instance introduces a totally separate data area - you'll have a new user database, new content, new sites etc. It has nothing in common with the other instances that you have in the portal. Nothing? Well, they share the same application server - so at least the portlets and plugins are shared, and naturally the maintenance intervals & downtime. But on the data side: Nothing.

As you would expect, Liferay starts with a single instance that handles all requests. You only use multiple instances once you actively introduce a second one. The reason for this is that instances are detected through the server name: If you connect to the server depicted above with spaceprogram.liferay.com, the server will respond with the content of the second instance. All hostnames that can't be associated with an instance will be handled by the default instance.
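The host-based lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Liferay's implementation: the function resolve_instance, the mapping and the instance names are made up for this example; only the hostname spaceprogram.liferay.com comes from the text.

```python
# Illustrative sketch only - not Liferay's actual code.
# The mapping and the instance names are assumptions made for this example.

DEFAULT_INSTANCE = "default-instance"

# Hostnames that were explicitly registered for an instance.
VIRTUAL_HOSTS = {
    "spaceprogram.liferay.com": "second-instance",
}


def resolve_instance(host_header):
    """Return the instance that should serve a request with this Host header.

    Hostnames that are not registered fall back to the default instance.
    """
    # Drop an optional port and normalise case before the lookup.
    hostname = host_header.split(":")[0].strip().lower()
    return VIRTUAL_HOSTS.get(hostname, DEFAULT_INSTANCE)
```

The point to take away is the fallback in the last line: any hostname you did not plan for ends up at the default instance.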

This is called multi-tenancy. You may have multiple portals on a single application server. Great, isn't it?

Using Multiple Instances

Now assume you're providing services for multiple customers. You have your first customer's portal running at customer1.example.com in the first instance and you'd like to add customer2.example.com as a completely separated portal: Neither the content, nor the user administration should interfere with each other.

At first sight, you'd simply introduce another instance named customer2.example.com and populate it with the required data - done. Two customers, two instances.

That was too easy, wasn't it? Where's the shortcoming?

Well, you'll end up with two instances with different features - one is more special than the other.

Different features?

The first instance on Liferay is a special instance: It enables you to administer the application server (garbage collection, reindexing the search index, showing memory consumption), install new plugins, access the Marketplace, etc.

Compare this part of the Control Panel of the first instance

with the same part of the second instance: You'll easily see that all of the server administration, application installation etc. is only to be found on the very first instance. (I promise that I'm logged in as Administrator on both instances).

Unless your first customer really is more special than your second one (and is allowed to administer the infrastructure for all of them), you might want to use instances in a different way:

Make the first instance a purely administrative instance, e.g. administration.example.com. Then add all of your customers' sites as secondary (tertiary etc.) instances. You'll end up with n+1 instances. The default instance should only have a few administrative users, while the other instances have whatever those instances need.

Now your first instance's administrators can maintain the whole server, while your customers' administrators can't install or maintain server-side plugins - great, because the plugins are always available to all other instances, and the customers might not even know whom they share the server with. Imagine customer1 updating customer2's theme... (see comments below)

Commonalities and Differences

Remember, all instances still share the same server. If one of the instances is really busy, the performance of the others might go down with it. If one requires downtime for updating a component, the others will go down with it. If one of the customers needs a specific plugin to be deployed, all of the customers will get it.

On a firewall level: If one of the instances needs access to a particular backend system (of customer1), connections to it will originate "at Liferay", thus no firewall can detect which instance the connection is coming from.

Separating instances

What happens when - some time in the future - you want to separate the instances from each other? If you follow my suggestion to have an administrative instance as the first instance, life is (relatively) easy: Create a copy of your whole portal (all instances) and delete the customer instances that are no longer needed from each of the copies. You'll end up with two portal servers, each with its own administrative instance (the two being copies of each other).
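To make the bookkeeping of such a split explicit, here is a minimal Python sketch. It only plans which instances each copy keeps and which it deletes; it does not touch a real portal. The function split_portal is hypothetical, and the hostnames are the example names used in this article.

```python
# Illustrative sketch only: it plans the split, it does not perform it.

def split_portal(instances, admin, groups):
    """Plan one portal copy per group of customer instances.

    Every copy starts as a full clone of all instances. Each copy keeps the
    administrative instance plus its own customers and deletes the rest.
    """
    plans = []
    for keep in groups:
        kept = [admin] + [i for i in instances if i in keep and i != admin]
        deleted = [i for i in instances if i != admin and i not in keep]
        plans.append({"keep": kept, "delete": deleted})
    return plans


plans = split_portal(
    instances=[
        "administration.example.com",
        "customer1.example.com",
        "customer2.example.com",
    ],
    admin="administration.example.com",
    groups=[{"customer1.example.com"}, {"customer2.example.com"}],
)
```

Each entry in plans describes one of the resulting servers: the administrative instance survives in every copy, each customer instance in exactly one.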

Everything well? Or the wrong problem or the wrong solution?

Shiny, huh? But do instances solve all of your problems?

They can easily appear to solve them, and trick you into believing that they're a good solution. More often than not they aren't. Let's look at some shortcomings and issues that you should be aware of:

  • You'll have to administer each instance as if it were a new portal: User Management, Single Sign On, User Profiles, Groups, Roles, Templates, ADTs, everything. If you update any of these in one instance, it won't be updated in the others. This might be what you expect, or it might be doubling your work. Typically it's some of both.
  • All instances share the same plugins (and their versions). If your customers use plugins that contain business logic specifically for them, this business logic might accidentally be published on other customers' instances. Themes and Hooks are shared between all instances as well - don't select your customer1-theme for the sites of customer2. Check the comments: You can limit your theme to specific instances.
  • If one of your instances is really popular and draws a lot of traffic (enough to slow down the site), the other instances will also suffer in performance, and your customers might not be happy if you can't quickly serve their few web visitors.
  • If you need to cluster one instance, you must cluster all of them. Liferay's caches need to be dimensioned properly to serve the commonly requested content for all instances.
  • Some of Liferay's content types (e.g. ADTs, Web Content Templates) execute server side code. When these are used inappropriately, they might make it possible for customers to access other instances' data. You'll need to trust the individual Administrators with permissions to edit these content types to not do harm to others.
  • Instances always assume that you can predetermine the host names that users are using. You should not make your administrative site available to the internet as "the default" when you don't know the host. Thus there's some extra work that IMHO requires a webserver in front of the appserver. But that's good practice anyway.

Did I forget something?

Did this discourage you from using instances? It shouldn't - it should just help you make an informed decision about whether and how to use them.

Do you have to think about multi-tenancy when developing a new service using ServiceBuilder? Like, what if you remove the "companyId" column from your entities?
If you remove the companyId from ServiceBuilder entities, these entities will not work with multi-tenancy. They'll be the same over all, yes. As the technical name for "instance" is "company": well observed.
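The effect of the companyId column can be illustrated with a small sketch. This is plain Python, not ServiceBuilder-generated code; the Entry class and both finder functions are hypothetical and only mimic the idea of a row that does or does not get filtered by its owning instance.

```python
# Illustrative sketch only: plain Python, not ServiceBuilder-generated code.
from dataclasses import dataclass


@dataclass
class Entry:
    company_id: int  # the instance ("company") that owns this row
    title: str


ROWS = [
    Entry(company_id=1, title="Customer 1 news"),
    Entry(company_id=2, title="Customer 2 news"),
]


def find_by_company(rows, company_id):
    """Tenant-aware finder: returns only the rows of the given instance."""
    return [row for row in rows if row.company_id == company_id]


def find_all(rows):
    """Finder without a company filter: every instance sees every row."""
    return list(rows)
```

Without the column there is nothing to filter on, so only the second kind of finder remains possible and the data is the same for all instances.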
Hi Olaf, doesn't the property "company-limit" inside liferay-look-and-feel.xml for theme plugins solve the problem of "themes sharing"?
I don't know if this XML file could also be used for portlet and hook plugins...
There's something that I didn't stumble upon yet. I'll check it & try it out, then update. Thanks for the pointer
@Carlo: Thanks for the pointer. After doing it wrong the first time (leading to LPS-54387, which is invalid), I have this working. However, that's only for themes, not for any other plugin type. So: To the letter (as I was speaking explicitly about themes) I was wrong. At least the principle holds - just for every plugin type other than themes. Good catch, and I now know a new feature.
Very interesting indeed.
What about multiple instances versus sharding? As you suggest, you may use Liferay as a multi-tenant portal and create one instance per client (or tenant). Let's say that the services you sell go viral and you quickly end up with thousands of instances.
How do you manage your backup plan by tenant when your tenants' data clutters up the database, since you get only one lportal database? In that case, wouldn't it be more sensible to think about sharding? You would get one database per tenant. Would this be the best design when running Liferay as a multi-tenant platform?
Hi Patrick,
I don't like to advocate sharding. One reason is that the setup is not really intuitive, and it's not trivial to get a shard out of the collection. IMHO it does not solve the problem that you'd expect it to solve (e.g. ease separation) but just stores different instances' data in different shards. It's quite complicated to change the setup of the shards.
Thank you for your feedback.
What I meant is that you get one shard per tenant, i.e. tenants' data could then be easily retrieved and backed up. Otherwise, retrieving one tenant's data, should it be required by the tenant, sounds difficult if you get a lot of instances. You could do it with the CompanyId identifier unique for each instance, but painfully. Configuring sharding is not trivial but once this is done, maintenance and back-up by tenant would be easier. Please correct me if I am wrong, in a nutshell, I assume that creating multiple instances is great when you don't have to manage many tenants, but when the number of tenants increases exponentially, it may be a good idea to think of another convenient design, like sharding, even though this is a challenge.
That's the problem: Shards seem to solve problems, but I am not confident that they deliver on your or my expectations: Can you back them up sequentially or does the backup need to be atomic (of the admin- and a secondary shard)? How do you ever remove one from the list of shards when you want to separate it to another server? And there's more than just the db: Search Index, Document library etc. There are a lot of details in shards that can easily mess with your expectation. This personal preference might be the main reason why I've left sharding out of the article...
I'm new to Liferay, so perhaps I did not fully understand everything yet ;)
But if I'm correct, you need to create instances to completely separate the data?

In our case we have a setup with one instance with multiple sites (let's name them lrA, lrB), where you would be able to access lrB through lrA with the following URL: "http://lrA/web/lrB"
Having two instances (lri-A, lri-B) would prevent this, correct?
Hi Marc,
you're correct: Two instances would prevent you from being able to see http://lrA/web/lrB. However, I feel that instances in this case are an oversized power tool where you only need a screwdriver - you can also prevent this when a webserver (e.g. Apache httpd) configures the virtual hosts so that host "lrA" does not allow URLs "*/web/lrB/*" and "*/group/lrB/*" and vice versa. However, both lrA and lrB would share a user database and permissions (roles etc.) in this case. If this is absolutely undesirable, there's another argument for instances, but don't just use them lightly by default (because once you do, they're harder to get rid of, or to separate to different servers).
Sorry, as usual the answer is "it depends". I hope that the answer clarifies what it depends on.
Hi Olaf, Thanks for the quick response.

As we will host dozens of completely independent websites, I rather favor instances over sites.

Another thing I noticed is that you can also access another website through the URLs "http://lrA/sr/web/lrB" or "http://lrA/sr_RS/web/lrB", which makes the ACL even more complicated. Although I don't know if it's a misconfiguration on our side in the test setup.
Yep, those are the languages/localizations you want to use (if you want to enforce them through the URL). They're the reason why I mentioned "*/web/lrB/*" above and not "/web/lrB/*" (note the leading wildcard - and yes, it should rather be a regexp for empty or any character but slash).
It's fine to use instances (if you use the precautionary administrative instance I mention in the article), just don't do this before you absolutely know that you must. If these sites share the userbase, it might be easier to use a single instance. If you need to install different plugins per customer/website, it might be advisable to use different servers. If you need different administrative staff (that don't trust each other): Validate the level of trust that you have with what instances provide.
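The URL rule discussed in this thread ("*/web/lrB/*" with a leading wildcard that is either empty or a single path segment such as a locale) can be written as a regular expression. The following Python sketch only demonstrates the pattern itself; in practice you'd translate it into your webserver's own rule syntax. The helper names are hypothetical, and the site name lrB is taken from the example above.

```python
# Illustrative sketch only: shows the pattern, not a webserver configuration.
import re


def blocked_path_pattern(site):
    """Build a pattern for /web/<site> and /group/<site>.

    The optional prefix is empty or exactly one path segment without a
    slash, which covers locale prefixes such as /sr or /sr_RS.
    """
    return re.compile(
        r"^(?:/[^/]+)?/(?:web|group)/" + re.escape(site) + r"(?:/.*)?$"
    )


# Paths that virtual host "lrA" should refuse because they belong to lrB.
BLOCKED_ON_LRA = blocked_path_pattern("lrB")


def is_blocked(path):
    """Return True if the request path addresses the other site."""
    return BLOCKED_ON_LRA.match(path) is not None
```

Note that the pattern deliberately does not match lookalikes such as /web/lrBB, and that it only inspects the path, not the query string.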