The right tool for the job? Chapter 3: Session Replication

Liferay comes with so many features that it's hard to judge when a feature is a good solution for a given problem. I'd like to shine some light on some of these features and on common misconceptions about them, because it's easy to misuse them for purposes they're not well suited for, even when they give the impression that they are.
CC BY-ND 2.0 by S. Benno

Today it's all about Session Replication, commonly regarded as the expected setup in clusters.

What does Session Replication solve? Often, clusters are configured for "sticky sessions". When a user visits a site for the first time, the load balancer assigns a random server to serve this user's requests. On subsequent requests in the same session, the user stays on this single server, as it has all the necessary server-side state already initialized and ready to go.

Server side state?

Yes, I'm aware that stateless systems would be preferable and would eliminate the whole question of session replication, but that's not what we're dealing with in the world of tightly permission-controlled intranets that rely on more and more high-level libraries abstracting away all the implementation details.

When the server that serves your requests goes down, the load balancer transparently routes you to another server in the cluster. However, that server has never seen you. It can't do anything with your session id - and so you find yourself logged out and annoyed.
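To make the failover problem concrete, here is a minimal sketch of sticky routing. Everything in it is hypothetical (the `Node` and `StickyBalancer` classes are made up for illustration and are not part of Liferay or any real load balancer), and for simplicity it picks the first available node rather than a random one.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical cluster node: session state lives only in this node's memory.
final class Node {
    final String name;
    boolean up = true;
    // session id -> logged-in user
    final Map<String, String> sessions = new HashMap<>();

    Node(String name) {
        this.name = name;
    }
}

// Hypothetical sticky load balancer: pins each session id to one node.
final class StickyBalancer {
    private final List<Node> nodes;
    private final Map<String, Node> affinity = new HashMap<>();

    StickyBalancer(List<Node> nodes) {
        this.nodes = nodes;
    }

    // Returns the node serving this session; fails over if the pinned node is down.
    Node route(String sessionId) {
        Node pinned = affinity.get(sessionId);
        if (pinned == null || !pinned.up) {
            // Assumption: first available node, where a real balancer might pick randomly.
            pinned = nodes.stream()
                    .filter(n -> n.up)
                    .findFirst()
                    .orElseThrow(() -> new IllegalStateException("no node available"));
            affinity.put(sessionId, pinned);
        }
        return pinned;
    }
}
```

If the pinned node's `up` flag is set to `false`, the next `route` call returns a different node whose `sessions` map has no entry for that session id. That missing entry is exactly the "logged out and annoyed" moment.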

Session Replication to the rescue: Just configure your application server to persist all of the session information and distribute it among the cluster nodes - and you're set. Ain't technology great? It just works.

However,

if you're running a cluster because you need to balance high load (as opposed to having a second server for high availability of your otherwise bored server), session replication adds significant overhead to every single request served. If you're not ready to pay that price, you might want to look for different solutions.
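As a rough way to get a feeling for that overhead, you could serialize a stand-in for your session attributes and look at the size. This is only a sketch under assumptions: it uses plain Java serialization, which may differ from what your application server actually does, and `SessionCost` is a hypothetical helper, not an existing API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;

// Hypothetical helper: estimates how many bytes a session's attributes
// occupy when written with standard Java serialization.
final class SessionCost {
    static int serializedSize(HashMap<String, Serializable> attributes) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(attributes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }
}
```

A session holding only a user name comes out small, while one carrying a 64 KB byte array in an attribute comes out at more than 64 KB. Multiply whatever number you get by your request rate and node count for a first, very crude, estimate, then verify it with a real load test.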

Here's a checklist to go through to see whether you really need session replication:

  • How often do you expect a server to go down unplanned?
  • How many people do you expect to be in the middle of a complex transaction (that they'd have to restart) when that happens?
  • How many multi-step transactions do you have that could be impacted by such an outage?
  • Have you measured the impact of switching on/off session replication under load?

And here's a checklist of alternative solutions: systems and configurations that might do the trick for you and save you from configuring session replication:

  • Implement a Single Sign-On (SSO) system - this way a user can be logged in again automatically when being balanced to a new server. For single-step transactions that might be all you need.
  • Implement your multi-step transactions so that they are stateless - combine this with SSO and you might not even realize that you've been rebalanced.
  • Implement your multi-step transactions on the business layer instead of the UI layer. That way you're independent of session storage - combine with SSO to be able to transparently find the user's data. Anything below the UI layer naturally won't utilize session data.
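The last option could look roughly like the following sketch. All names (`OrderDraft`, `DraftStore`, `OrderService`) are hypothetical, and the in-memory map merely stands in for a database that all cluster nodes share; it assumes SSO (or something similar) reliably tells each node who the user is.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical draft of a multi-step transaction, owned by the business layer.
final class OrderDraft {
    final String userId;
    final int completedStep;
    final String payload;

    OrderDraft(String userId, int completedStep, String payload) {
        this.userId = userId;
        this.completedStep = completedStep;
        this.payload = payload;
    }
}

// Stand-in for a database table shared by all nodes (assumption: a real
// system would persist this instead of keeping it in memory).
final class DraftStore {
    private final Map<String, OrderDraft> byUser = new ConcurrentHashMap<>();

    void save(OrderDraft draft) {
        byUser.put(draft.userId, draft);
    }

    Optional<OrderDraft> find(String userId) {
        return Optional.ofNullable(byUser.get(userId));
    }
}

// Runs on every node; never touches the HTTP session.
final class OrderService {
    private final DraftStore store;

    OrderService(DraftStore store) {
        this.store = store;
    }

    // Records the next step for this user, continuing from wherever the draft stands.
    OrderDraft completeStep(String userId, String payload) {
        int next = store.find(userId).map(d -> d.completedStep + 1).orElse(1);
        OrderDraft draft = new OrderDraft(userId, next, payload);
        store.save(draft);
        return draft;
    }
}
```

Two `OrderService` instances sharing one `DraftStore` play the role of two cluster nodes: if step one is handled by the first and step two by the second, the second simply continues from the stored draft, with no session data involved.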

You see that SSO is the elephant in the room: By transparently keeping a user logged in, you've already relieved them of the main pain point. My personal expectation is that Session Replication is the wrong or an unnecessary configuration for 90% of implementations. It either solves the wrong problem or quietly charges a high and opaque price. That's not to say that you must not use it - just be aware that you are most likely not in the 10%.

And a disclaimer: The 90/10% numbers are a personal expectation - there's no science (that I'd be linking here) and they might well be inaccurate - in either direction.

You can find good reasons for implementing Session Replication. It just should not be the default choice, IMHO. I'd rather try harder to avoid it. And if it's unavoidable: Measure! Know the impact and the cost.