Mounting Multiple CMIS Repositories on Liferay 6.1
February 25, 2011 By Alexander Chow
Last year, there was a huge buzz about a new protocol called CMIS that had just been released at 1.0. Much like JCR, CMIS is a protocol that allows interoperability between the document repositories of different systems (there is a good blog post about different use cases that can be found here). The major advantage of CMIS over JCR is, well, that it is not bound to Java. So there are libraries for Python, PHP, .NET, etc. It also runs on top of standard web protocols like AtomPub or WebServices.
(The first section is just a bit of history. So if you just want to know what's happening in version 6.1, skip the first section.)
The Famed CMISHook
So, in Liferay 6.0, I was tasked with allowing Liferay to hook into CMIS as a means to store its document library data. If you have seen our JCRHook implementation, it is basically the same concept -- the CMISHook was used as a means to store our document library's low-level data. But then, if you went to your CMIS Repository, you would have all these numbers show up in your system which make no sense whatsoever to the average user. And if you changed anything in that repository, it would screw up Liferay because things were not synchronized.
Now WHY, some have asked, would you ever want something like this? Doesn't this defeat the whole purpose of CMIS? The answer to the second question is no. The first question is much like asking why we have a JCRHook -- or an S3Hook, for that matter. The main point is that, for some environments, you want to scale your systems in a way that a simple FileSystemHook just cannot handle. It is like asking if you want to store your files on a thumbdrive, network drive or your new Thunderbolt hard disk. This is what CMISHook did -- it gave users another option.
Liferay 6.1: Mounting CMIS Repositories
Of course, in Liferay, we were aware that the simple CMISHook was just a first step. The next step was to completely redesign the Document Library to support multiple repositories mounted for each document library portlet. Sergio in the Madrid office and I have been working on this redesign and now we can show a little of what it looks like. On a Mac, this is akin to mounting my iDisk or a network drive in my Finder. But the thing is, this is not simply CMIS -- the vision for 6.1 EE is to mount many vendor-specific repositories like SharePoint and Documentum. Instead of swapping your SATA drive for an SSD, you are given the option to run SATA + SSD + FireWire side by side. So in 6.1, multiple repositories can be mounted to one document library -- CMIS is just the first of these.
While this is still relatively new and only in trunk, let me show you how it works.
Step 1: Credentials
The first thing you need to sort out is credentials. In order to log into a CMIS repository, we basically need to pass the credentials you used in Liferay through to CMIS. So, you need to set your portal.properties so that Liferay stores your password in the session.
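For reference, the stock Liferay property that controls this is session.store.password. A minimal fragment, assuming you keep your overrides in portal-ext.properties rather than editing portal.properties directly:

```properties
# portal-ext.properties (assumed override file)
# Keep the user's password in the HTTP session so it can be passed through to CMIS.
session.store.password=true
```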
Next, you need to make sure that the way you log in to Liferay is the same as for your repository. For most, this means that you need the same screen name. So, in portal.properties, I authenticate by screen name.
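The stock Liferay property for the login type is company.security.auth.type, which defaults to the email address. A minimal fragment for screen-name login, again assuming an override in portal-ext.properties:

```properties
# portal-ext.properties (assumed override file)
# Authenticate by screen name so the Liferay login matches the repository login.
company.security.auth.type=screenName
```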
Of course, what this means is that if I log into, say, Nuxeo using "alex" and "secretpassword", then I have to log in to Liferay with those same credentials as well. Most people would have some kind of LDAP or something like it anyhow, so that should be fine. Without the same credentials, obviously, you will get a principal exception and your users will be complaining to you about why they can't see their data. You don't want that.
Step 2: Mounting Your Repository
OK, in the Document Library control panel, you will see an "Add Repository" action. After clicking that, you will be given a form that looks like you are adding a folder -- but instead, you are adding a new repository. (Incidentally, we made it so you can mount a repository in any folder in the Document Library -- the root level, or your sub-sub-subfolder. As long as it is not in another third-party repository, you are fine.)
In this example, the repository type is set to CMIS AtomPub, but you can use WebServices if you like (it just has a whole lot more parameters to fill in). For CMIS, you need to fill in every entry except the repositoryId, which is optional. If you do not enter a repositoryId, Liferay will just look up the first repository available with the given parameters and use that -- many systems only have one.
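The repositoryId fallback can be pictured with a small sketch. This is not Liferay's actual code: the class name RepositoryIdResolver and its resolve method are hypothetical, and the list of advertised ids stands in for whatever the CMIS server returns for the given parameters.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Liferay: models the rule that a blank
// repositoryId falls back to the first repository the server advertises.
final class RepositoryIdResolver {

    private RepositoryIdResolver() {
    }

    static Optional<String> resolve(String configuredId, List<String> advertisedIds) {
        // An id typed into the form wins and is used as given.
        if (configuredId != null && !configuredId.trim().isEmpty()) {
            return Optional.of(configuredId.trim());
        }
        // No id supplied: take the first advertised repository, if there is one.
        if (advertisedIds == null || advertisedIds.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(advertisedIds.get(0));
    }
}
```

With a server that exposes a single repository, leaving the field blank and typing that repository's id resolve to the same thing, which is why the field can usually be skipped.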
Step 3: Enjoy!
OK, after doing that, Liferay will try to talk to the other server and verify its connection. Assuming everything goes well, a new repository is added to your list of folders. Below, I have mounted the same Alfresco server -- once via AtomPub and again via Web Services.
Now, automatically, what you will notice is all the data that is stored at the Alfresco repository has been linked into Liferay. Just to verify, you can look at the files and folders in Liferay and compare them to Alfresco. Here's Liferay:
And here's Alfresco:
Obviously, any CRUD operation you do on Alfresco will be reflected on Liferay and vice versa.
For many, CMIS is perhaps the best thing since sliced bread. And in fact, as an interoperability protocol, it is pretty darn good. HOWEVER, all protocols come with shortcomings. Remember, CMIS is quite new (it is only at 1.0 right now) and has a lot of room for growth. A couple things to be aware of before you throw away that bread slicer you got for Christmas..
- CMIS does not give vendor-neutral specifications for many features found in Liferay or other repositories... like workflow. CMIS does not yet specify how a workflow is to be started, what its different stages are, etc. This is something that is under discussion for CMIS v2.0. So, if you noticed, I mentioned SharePoint and Documentum.. both of which are supposed to support CMIS. The reason we are building out vendor-specific repositories for 6.1 EE is that CMIS does not solve all the woes of integrating legacy systems into Liferay.
- Another fundamental item that is not supported by CMIS is a vendor-neutral way of managing metadata. There is no ad hoc metadata, tags/categories, etc. Nuxeo, for example, stores its metadata as part of the CMIS file's properties using Dublin Core notation. Alfresco, on the other hand, builds things in the "extended" space of CMIS properties and brings in its proprietary "aspects." There are proposals out there for having a feature as fundamental as metadata included in at least CMIS v1.1 (see here and here). But as of today, this does not exist. We are hoping to build out some of these vendor-specific attributes on top of CMIS, but there are so many vendors out there that have their own way of supporting the standard. Just Nuxeo and Alfresco alone have quite different implementations.
- Another thing you will notice is speed.. or the lack thereof. If every operation has to go to another server, get translated into AtomPub or WebServices, and travel over the wire to a server that has to translate it back to its native format... yeah, it will take a performance hit. I mean, within our code, I try to cache as much information as possible, but it is still noticeably slower on my system (of course, I am running multiple servers on my non-Thunderbolt-equipped MacBook Pro). It is like when I back up my wife's computer -- I always plug in and never do it over the WiFi.
All in all, CMIS is not a bad protocol. In fact, it is an excellent protocol. But, like all protocols, there is always a tension between the generic protocol and the genius of different vendors trying to solve different problems for their customers' needs.
For us, we have gotten quite a few requests to support legacy repositories. We have no problem with that. In fact, that is the whole point of a portal -- to be an aggregator of information from vastly different technologies. However, it required a complete overhaul of the backend to do it (the 6.1 document library API is VERY different.. but you can't tell, can you?). Hence, the addition of CMIS as a separate repository in our document library just adds to the greater ecosystem Liferay supports. Customers like that. And therefore, we do too.
Give her a spin and let me know how it goes. Thanks for reading.