
Expandos III (Liferay, NoSQL, and MongoDB)

Company Blogs  January 8, 2011  By Ray Augé Staff

Update: The expando-mongodb-hook plugin is now committed to SVN and available from trunk.

Over the past few months, the hype around NoSQL databases has really been heating up the tech news and blog feeds. There seems to be an overwhelming desire to find scalability solutions that don't seem to be addressed by an RDBMS. What is Liferay to do?

Could Liferay support some form of NoSQL integration? I think so, and I surely couldn't go long without doing something to draw attention to the fact that Liferay is a prime candidate as a viable platform for scaling dynamically via a NoSQL backend.

The most obvious way I could see to leverage a NoSQL solution was with perhaps the most dynamic aspect of the portal, Expandos (and by association Custom Fields).

In order to prove the concept of NoSQL with Liferay, we decided to write an adapter (using the Liferay Hook pattern) to build a backend for Expando on MongoDB. I had no real idea how long it would take to accomplish, but we decided to try. As it turns out, it was not too difficult. We now have a fully functional adapter that stores all of Liferay's dynamic Expando data in a highly scalable MongoDB backend. Note that Expandos still support all Liferay permissions, and Custom Fields are still indexed along with their entities anywhere they normally would be. This is a fantastic demonstration of just how extensible Liferay Portal really is.
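Conceptually, the hook swaps out the Expando service implementations using Liferay's service wrapper mechanism. Just to illustrate the pattern (the class name below is hypothetical, not the plugin's actual source), a wrapper registered from a hook looks roughly like this:

import com.liferay.portlet.expando.service.ExpandoValueLocalService;
import com.liferay.portlet.expando.service.ExpandoValueLocalServiceWrapper;

// Hypothetical sketch: a hook-registered wrapper that would override the
// ExpandoValue CRUD methods to read from and write to MongoDB collections
// instead of the ExpandoValue table.
public class MongoExpandoValueLocalService
	extends ExpandoValueLocalServiceWrapper {

	public MongoExpandoValueLocalService(
		ExpandoValueLocalService expandoValueLocalService) {

		super(expandoValueLocalService);
	}

	// Override addValue(...), getValue(...), deleteValues(...), etc. here,
	// delegating persistence to the MongoDB Java driver.

}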

I tested against the version of MongoDB that was readily available for Ubuntu 10.04 (1:1.2.2-1ubuntu1.1). I also tried to make sure to support cluster configurations, so check out the portlet.properties file in the plugin as well as the MongoDB driver javadocs for what to set up and how.
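For reference, here is a minimal sketch (not the plugin's actual code) of how a multi-host connection is typically built with the 2.x Java driver; the host names are placeholders, and the plugin reads its equivalents from portlet.properties:

import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

import com.mongodb.DB;
import com.mongodb.Mongo;
import com.mongodb.ServerAddress;

public class MongoConnectionSketch {

	public static void main(String[] args) throws UnknownHostException {
		// Seed list of mongod hosts (placeholders); giving the driver more
		// than one address lets it fail over between cluster members.
		List<ServerAddress> seeds = Arrays.asList(
			new ServerAddress("mongo1.example.com", 27017),
			new ServerAddress("mongo2.example.com", 27017));

		Mongo mongo = new Mongo(seeds);

		// Database names follow the lportal_<companyId> pattern shown in the
		// mongo shell output below.
		DB db = mongo.getDB("lportal_10135");

		System.out.println(db.getCollectionNames());
	}

}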

I did several small usage tests (none of which were load tests, since this was an informal proof of concept) to verify that everything was working the right way. I created several Custom Fields on several different entities and tested CRUD operations to make sure that the data was landing (as well as being updated and removed) where I wanted it: in MongoDB.
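The programmatic equivalent of those tests is roughly the following; a minimal sketch using the ExpandoBridge API, with the attribute name matching the test field used below (everything else is illustrative):

import com.liferay.portal.model.User;
import com.liferay.portal.service.UserLocalServiceUtil;
import com.liferay.portlet.expando.model.ExpandoBridge;

public class ExpandoCrudSketch {

	public static void exercise(long userId) throws Exception {
		User user = UserLocalServiceUtil.getUser(userId);

		ExpandoBridge expandoBridge = user.getExpandoBridge();

		// Create the custom field if it does not already exist
		if (!expandoBridge.hasAttribute("test")) {
			expandoBridge.addAttribute("test");
		}

		// Update: with the hook deployed, this value lands in MongoDB
		expandoBridge.setAttribute("test", "test value");

		// Read it back
		System.out.println(expandoBridge.getAttribute("test"));
	}

}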

Meanwhile, I was also using the MongoDB command-line client, mongo, to make sure that everything was working from that end. I added a custom field called test to the User entity, and for the first user in the system, I set the value to test value. Here is an example of what we see via mongo:

 

[rotty@rotty-desktop  expando-mongodb-hook]$ mongo
MongoDB shell version: 1.2.2
url: test
connecting to: test
type "exit" to exit
type "help" for help
> show dbs
admin
local
lportal_0
lportal_10135
> use lportal_10135
switched to db lportal_10135
> db.getCollectionNames()
[
	"com.liferay.portal.model.User#CUSTOM_FIELDS",
	"com.liferay.portlet.blogs.model.BlogsEntry#CUSTOM_FIELDS",
	"com.liferay.portlet.documentlibrary.model.DLFileEntry#CUSTOM_FIELDS",
	"system.indexes"
]
> db.getCollection("com.liferay.portal.model.User#CUSTOM_FIELDS").count()
1
> db.getCollection("com.liferay.portal.model.User#CUSTOM_FIELDS").find()
{ "_id" : ObjectId("4d28f318fcfcc08a7855ebe4"), "companyId" : 10135, "tableId" : 17205, "rowId" : 10173, "classNameId" : 10048, "classPK" : 10173, "valueId" : 17207, "test" : "test value" }
> 

So far so good! As you can see, the data is landing nicely in the MongoDB database.

 

While that was a good test, I also wanted to make sure that other use cases would work just as well. I decided to revive the First Expando Bank example to see how that would fare.

I first had to make a few small API changes in the Velocity template. The updated template is attached. See this article and the follow-up for more information on that topic.

After adding some accounts into the First Expando Bank app, the mongo console results looked like this:

 

> db.getCollectionNames()
[
	"AccountsTable#AccountsTable",
	"com.liferay.portal.model.User#CUSTOM_FIELDS",
	"com.liferay.portlet.blogs.model.BlogsEntry#CUSTOM_FIELDS",
	"com.liferay.portlet.documentlibrary.model.DLFileEntry#CUSTOM_FIELDS",
	"system.indexes"
]
> db.getCollection("AccountsTable#AccountsTable").count()
3
> db.getCollection("AccountsTable#AccountsTable").find()
{ "_id" : ObjectId("4d29292abda2c08a05e35e67"), "companyId" : 10135, "tableId" : 17320, "rowId" : 1294543146642, "classNameId" : 17313, "classPK" : 1294543146642, "valueId" : 17336, "balance" : 55, "firstName" : "Ray", "lastName" : "Auge", "modifiedDate" : "Sat Jan 08 2011 22:19:06 GMT-0500 (EST)" }
{ "_id" : ObjectId("4d292945bda2c08a06e35e67"), "companyId" : 10135, "tableId" : 17320, "rowId" : 1294543173086, "classNameId" : 17313, "classPK" : 1294543173086, "valueId" : 17337, "balance" : 120, "firstName" : "Daffy", "lastName" : "Duck", "modifiedDate" : "Sat Jan 08 2011 22:19:33 GMT-0500 (EST)" }
{ "_id" : ObjectId("4d292958bda2c08a07e35e67"), "companyId" : 10135, "tableId" : 17320, "rowId" : 1294543192848, "classNameId" : 17313, "classPK" : 1294543192848, "valueId" : 17338, "balance" : 300, "firstName" : "Mickey", "lastName" : "Mouse", "modifiedDate" : "Sat Jan 08 2011 22:19:52 GMT-0500 (EST)" }
> 

Excellent! It would appear that all our use cases are covered, from automatic Custom Fields via the UI to programmatic use in a CMS template.

I'd love to get your feedback about it! Please note that there is currently no rich way to perform queries (à la MongoDB). But with a little ingenuity we could probably make that possible.
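In the meantime, nothing stops you from querying the collections directly with the MongoDB Java driver, bypassing the Expando API (and therefore Liferay permissions). Here is a small sketch against the bank collection above, with a placeholder host:

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.Mongo;

public class RichQuerySketch {

	public static void main(String[] args) throws Exception {
		DB db = new Mongo("localhost").getDB("lportal_10135");

		DBCollection accounts = db.getCollection("AccountsTable#AccountsTable");

		// Find accounts with a balance greater than 100
		DBCursor cursor = accounts.find(
			new BasicDBObject("balance", new BasicDBObject("$gt", 100)));

		while (cursor.hasNext()) {
			System.out.println(cursor.next());
		}
	}

}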

Threads    Author    Date
Hi Ray, This is an awesome effort!!!  Shagul Khajamohideen  January 9, 2011 4:38 AM
Good to see you considering the NoSQL...  Robert Greene  January 9, 2011 6:16 AM
Nice! Thank you, Ray!  Jonas Yuan  January 9, 2011 9:33 AM
Very nice Ray! One other area where offering a...  Jorge Ferrer  January 9, 2011 11:27 AM
Brilliant job Ray!  Alexander Chow  January 9, 2011 2:28 PM
This is fantabulous, Ray. Your next step...  Ahmed Hasan  January 10, 2011 7:37 PM
even than support Bigtable I would vote for...  Artur Linhart  January 12, 2011 1:18 AM
This is what I have been searching for all the...  Murat ÇOLAK  February 12, 2011 1:11 PM
Good work. I'am currently using MongoDB and I...  Steffen Schuler  February 24, 2011 5:09 AM
Hey All, First, thanks for all the nice...  Ray Augé  February 24, 2011 6:24 AM
Note, if you would like to see integration with...  Ray Augé  February 24, 2011 6:32 AM
Thanks Ray, Sounds good to me. I'll update the...  Steffen Schuler  February 25, 2011 4:28 AM
Please do as I'm very interested in how it...  Ray Augé  February 25, 2011 8:56 AM
Hi Ray, so called polyglot persistence might...  Jakub Liska  June 23, 2011 3:47 AM
Thanks for this. I took a look and it's...  Ray Augé  July 4, 2011 7:32 AM
Hey Ray - just FYI I did some...  James Falkner  April 19, 2012 7:06 AM
Hi Ray, Just wanted to check with any body...  Venkat Koppavolu  November 23, 2012 9:16 AM
We actually have it in our minds to create...  Ray Augé  November 23, 2012 9:33 AM
.. However, there is no definitive date for...  Ray Augé  November 23, 2012 9:35 AM
Ray, Thanks for your reply... We are looking...  Venkat Koppavolu  November 24, 2012 5:41 PM
I highly doubt you'd succeed in doing that. The...  Ray Augé  November 24, 2012 5:48 PM
Ray, Thanks for quick reply.. I will keep all...  Venkat Koppavolu  November 24, 2012 5:56 PM
Hi Venkat , Have you succeeded in implmenting...  Karthic Kannan  January 8, 2014 12:00 AM
Ray , Are we still can not replace any relation...  Mohammed Shamshuddin  November 14, 2013 6:37 AM
Oops !. I have seen an error when I have tried...  Mohammed Shamshuddin  November 14, 2013 6:52 AM
Fixed! Re question: Not yet! It will take...  Ray Augé  November 14, 2013 8:02 AM
HI Venkat , have achieved in implementing...  Karthic Kannan  January 11, 2014 1:09 AM
I am trying to develop with (and possibly...  Rick Osborn  November 28, 2012 1:17 PM

Hi Ray,

This is an awesome effort!!!
Posted on 1/9/11 4:38 AM.
Good to see you considering the NoSQL alternatives for scale. One thing you might consider is to one-up the others by using an ODB. In general, they will do a much better job at handling content relations which span partitioning schemes. For example, the Versant object database is used by Eidos Media and Factiva (Dow Jones, Reuters) CMS systems, handling real-time content feeds from over 9000 sources (Wall Street Journal, Financial Times, etc.). Here is a video which shows how to build an application, make it distributed, parallel, and fault tolerant, and optimize integrated cache loading. A little boring at first because it's a detailed tutorial, but about 20 minutes in it gets really interesting.

http://www.blip.tv/file/3285543

Cheers,
-Robert
Posted on 1/9/11 6:16 AM.
Nice! Thank you, Ray!
Posted on 1/9/11 9:33 AM.
Very nice Ray!

One other area where offering a NoSQL database as an option would be useful is as storage for the logs of the new audit plugin in 6.0 EE.
Posted on 1/9/11 11:27 AM.
Brilliant job Ray!
Posted on 1/9/11 2:28 PM.
This is fantabulous, Ray.

Your next step would be to make this an integral part of Service Builder, i.e., make Service Builder inherently support any NoSQL store the way it supports Hibernate/JPA. One such option would be integration with Google BigTable.

All the best for all these endeavors!!

You guys are rocking.

Ahmed Hasan
Posted on 1/10/11 7:37 PM.
Even more than BigTable support, I would vote for Apache Hadoop/HTable, etc.
Posted on 1/12/11 1:18 AM in reply to Ahmed Hasan.
This is what I have been searching for all the time.

This is really awesome
Posted on 2/12/11 1:11 PM.
Good work. I'm currently using MongoDB and I love this thing! With Morphia on top, things feel a little bit like JPA 2, with its powerful annotations. Additionally, Morphia provides a nice way to build DAOs.

It would be great to integrate it with ServiceBuilder. Currently, I have no idea how this could be done. Do you have any hints?

In general, how could MongoDB be used as the default DB? Can you point us in some direction on where to look for the answers, or how you would do this?
Posted on 2/24/11 5:09 AM.
Hey All,

First, thanks for all the nice comments!

@Steffen, @Ahmed, regarding ServiceBuilder integration: I can't see this happening in the near future, simply due to the overwhelming number of other features already on our roadmap. BUT, on the other hand, I would not completely count it out, since SB is really so flexible internally. In the meantime, I would not hesitate to work with a BigTable-style NoSQL solution paired with SB as-is. I would do it by simply letting SB handle the entity modeling and service tier generation, and then I would implement the DAO myself and have my service tier call this custom DAO instead of the generated one. This would still save a significant amount of work and still provide all the features like generated web services, and allow integration with all the nice Liferay framework APIs like permissions and assets.
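As a rough sketch of that custom DAO idea (the Account entity and every name here are hypothetical; only the MongoDB driver calls are real), the hand-written DAO the service implementation would call instead of its generated persistence class might look like this:

import java.net.UnknownHostException;

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import com.mongodb.Mongo;

// Hypothetical DAO called by a Service Builder service implementation in
// place of its generated persistence class.
public class MongoAccountDao {

	public MongoAccountDao() throws UnknownHostException {
		_collection = new Mongo("localhost").getDB(
			"lportal_10135").getCollection("Account");
	}

	public void save(long accountId, String firstName, double balance) {
		DBObject document = new BasicDBObject("accountId", accountId).append(
			"firstName", firstName).append("balance", balance);

		// Upsert keyed on the Service Builder primary key
		_collection.update(
			new BasicDBObject("accountId", accountId), document, true, false);
	}

	public DBObject findByPrimaryKey(long accountId) {
		return _collection.findOne(new BasicDBObject("accountId", accountId));
	}

	private DBCollection _collection;

}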
Posted on 2/24/11 6:24 AM in reply to Steffen Schuler.
Note, if you would like to see integration with SB as an option sooner rather than later, I would suggest creating a feature request ticket in JIRA and voting it up.

The sad part is that there is no standard API (like Hibernate provides for SQL), so we would have to pick and choose one or two best-of-breed products to base it on.

I have also thought that what we really might need is to support REST endpoints as an SB DataSource. This would potentially allow us to connect to any type of solution supporting REST.
Posted on 2/24/11 6:32 AM in reply to Ray Augé.
Thanks Ray, sounds good to me. I'll report the results here and how easy or difficult it is to do this with Morphia's DAO support for MongoDB.
Posted on 2/25/11 4:28 AM in reply to Ray Augé.
Please do, as I'm very interested in how it turns out.

Also, it would make a great Liferay Live topic, showing how it can be done. I expect it might generate a whole lot of interest.
Posted on 2/25/11 8:56 AM in reply to Steffen Schuler.
Hi Ray, so-called polyglot persistence might help with the expando scaling issue. http://www.youtube.com/watch?v=fI4L6K0MBVE&feature=player_detailpage#t=199s

It's one of the hidden aces of the Spring Data project http://www.springsource.org/spring-data and it could basically substitute for the expando data model.
Posted on 6/23/11 3:47 AM in reply to Ray Augé.
Thanks for this. I took a look and it's definitely interesting.
Posted on 7/4/11 7:32 AM in reply to Jakub Liska.
Hey Ray - just FYI I did some scaling/performance measurements and presented my findings @ OSCON - others may be interested: http://www.oscon.com/oscon2011/public/schedule/detail/21536 (there is a link to the presentation with results)
Posted on 4/19/12 7:06 AM in reply to Ray Augé.
Hi Ray,
Just wanted to check whether anybody has implemented Cassandra with Liferay.
Our aim is to use Cassandra as the backend for a Liferay application.
Please share some thoughts on this to move forward.
Thanks,
Venkat
Posted on 11/23/12 9:16 AM in reply to James Falkner.
We actually have it in our minds to create Cassandra adapter(s) for dealing with several of our storage APIs, particularly:

com.liferay.portlet.documentlibrary.store.Store
com.liferay.portlet.dynamicdatamapping.storage.StorageEngine
com.liferay.portlet.expando.*
Posted on 11/23/12 9:33 AM in reply to Venkat Koppavolu.
... However, there is no definitive date for this. We would welcome any good community-based implementations. Otherwise, general use of Cassandra as a backend store for your custom data is a totally great idea. Go for it.
Posted on 11/23/12 9:35 AM in reply to Ray Augé.
Ray, thanks for your reply...

We are looking at using Cassandra for both custom and Liferay application data (all tables: roles, permissions, etc.), i.e., running Liferay Portal on Cassandra.

Could you please suggest some options by which we can achieve this functionality?

One way I have seen is overriding specific (Expando) service implementations through hooks.

How can we use Cassandra as the full Liferay database storage option? Please share some ideas.

Thanks,
Venkat
Posted on 11/24/12 5:41 PM in reply to Ray Augé.
I highly doubt you'd succeed in doing that. The portal relies quite heavily on relational database behaviors (such as being ACID). Also, there would be little gain in going to that effort in the first place. Most of the data in Liferay is in fact relational, and trying to map it to NoSQL for the sake of it would be useless in my mind.

However, I acknowledge that using a NoSQL solution for several of Liferay's data scenarios is appropriate, but not as a general replacement.
Posted on 11/24/12 5:48 PM in reply to Venkat Koppavolu.
Ray, thanks for the quick reply. I will keep all your suggestions and views in mind.
Posted on 11/24/12 5:56 PM in reply to Ray Augé.
I am trying to develop with (and possibly extend) the Mongo hook. If anyone has any technical resources on setting it up...
Posted on 11/28/12 1:17 PM.
Ray, can we still not replace the relational database with a NoSQL database to store Liferay's out-of-the-box tables and data? (Liferay's tables include Layout and its sub-tables, etc.)
Posted on 11/14/13 6:37 AM in reply to Ray Augé.
Oops! I saw an error when I tried posting a message, but after some time I see that my message got posted several times.
Posted on 11/14/13 6:52 AM in reply to Mohammed Shamshuddin.
Fixed!

Re question: Not yet! It will take lots of design consideration to make that possible. At least we have to isolate features more than they are now in order to even conceive of doing that. But there is hope (just not in the short term).
Posted on 11/14/13 8:02 AM in reply to Mohammed Shamshuddin.
Hi Venkat,

Have you succeeded in implementing Cassandra in Liferay? If so, can you please share the steps with us?
Posted on 1/8/14 12:00 AM in reply to Venkat Koppavolu.
Hi Venkat,

Have you achieved implementing Cassandra in Liferay?

If so, can you please share the steps with me, or give me some ideas or suggestions to move forward?
Posted on 1/11/14 1:09 AM in reply to Venkat Koppavolu.