Roman Hoyenko
Document storage and best practices
August 4, 2011 9:49 AM
Roman Hoyenko

Community Moderator

Rank: Liferay Master

Posts: 868

Join Date: October 8, 2007

I am going to store documents in Liferay, mostly pictures for the portal, plus some Word/Excel documents.

First, about our setup: I have clustered WebLogic, and both WebLogic servers share the same filesystem. We use Oracle as the database (I have a connection pool set up on Oracle, and we have 2 datasources that connect to 2 different servers for load balancing).

My question is about best practice for setting up Jackrabbit: where should I store the documents, on the filesystem or in the database? How should I handle the clustered WebLogic instances?
Does Jackrabbit manage its own transactions? Does it need XA?

I am asking because I just used the default settings, changed the connection to point to my DB pool, and I am now getting deadlocks, so I am trying to figure out what is wrong.
Hitoshi Ozawa
RE: Document storage and best practices
August 4, 2011 4:21 PM
Hitoshi Ozawa

Rank: Liferay Legend

Posts: 7949

Join Date: March 23, 2010

I thought you were clustering after reading your post on the deadlock.

Liferay doesn't do locking itself. I think the best practice is to use a SAN with file locking/versioning features.
Roman Hoyenko
RE: Document storage and best practices
August 4, 2011 4:52 PM

We use clustered WebLogic servers. The deadlocks happen in the database while selecting from / deleting from the DLFileRank table; as I understand it, this is a Liferay table.

I can't change the environment that we have, and I am not really sure what type of hardware or filesystem is used. I only know that the WebLogic servers share the same filesystem.

I am not even sure that Jackrabbit is configured correctly the way it is now. I found this wiki page, but it's for 5.3; do you think it is still relevant?
Is there any documentation on configuring Jackrabbit on the latest Liferay (for clustered WebLogic)?
Hitoshi Ozawa
RE: Document storage and best practices
August 4, 2011 5:11 PM

Sorry Roman, I haven't used WebLogic. Hopefully somebody else will be able to help you.
Roman Hoyenko
RE: Document storage and best practices
August 5, 2011 8:19 AM

It should not be that different; at least the Jackrabbit and Lucene configuration should be the same.

For future reference, if someone has similar questions, here are some links that I am looking at:
Clustering in Liferay:
http://www.liferay.com/community/wiki/-/wiki/Main/Clustering

Some specifics on Jackrabbit:
http://www.liferay.com/community/forums/-/message_boards/message/4019083

Jackrabbit docs about clustering:
http://wiki.apache.org/jackrabbit/Clustering#Sample_Cluster_Configurations
Hitoshi Ozawa
RE: Document storage and best practices
August 7, 2011 2:03 PM

Are you clustering JCR too?
Roman Hoyenko
RE: Document storage and best practices
August 8, 2011 8:41 AM

Well, for JCR I am thinking of using the DB to store documents. As I understand it, I don't need clustering for this.
Hitoshi Ozawa
RE: Document storage and best practices
August 8, 2011 2:22 PM

Just wanted to make sure, because the administration guide has a confusing section title, "Clustering Jackrabbit" (bottom of the page).

http://www.liferay.com/documentation/liferay-portal/6.0/administration/-/ai/distributed-cachi-4
Roman Hoyenko
RE: Document storage and best practices
August 25, 2011 12:57 PM

Just wanted to answer my own questions; maybe someone will find it useful. I hate threads that only have questions.

I ended up using
dl.hook.impl=com.liferay.documentlibrary.util.FileSystemHook

for the document library. It stores the files on the local filesystem. The document library also stores some of its data in the database, but that should work fine in the cluster. And there should not be problems for me in using the local filesystem in a cluster, since simultaneous updates are handled by the filesystem.

This way there is no need to use Jackrabbit at all, so everything becomes easier.
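
A minimal sketch of how this setting might sit in portal-ext.properties. The dl.hook.impl line is taken from the post itself. The dl.hook.file.system.root.dir property is, to the best of my knowledge, the Liferay 6.0 setting that controls where FileSystemHook writes its files, and the path shown is a hypothetical example, not something from the original setup. In a cluster, the assumption is that every node points at the same shared directory.

```properties
# Store document library files on the filesystem instead of in Jackrabbit
# (this line is the one quoted in the post).
dl.hook.impl=com.liferay.documentlibrary.util.FileSystemHook

# Assumption: hypothetical path on the filesystem shared by both WebLogic nodes.
# Each node in the cluster would need to use the same directory.
dl.hook.file.system.root.dir=/mnt/shared/liferay/document_library
```

Check the property name against the portal.properties file shipped with your Liferay version before relying on it.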
Hiran Chaudhuri
RE: Document storage and best practices
September 4, 2011 12:08 AM
Hiran Chaudhuri

Rank: Regular Member

Posts: 188

Join Date: September 1, 2010

Roman Hoyenko:
First, about our setup: I have clustered WebLogic, and both WebLogic servers share the same filesystem. We use Oracle as the database (I have a connection pool set up on Oracle, and we have 2 datasources that connect to 2 different servers for load balancing).

This sounds interesting. How do you perform load balancing by just setting up two datasources with each connecting to a different database?
It sounds more like sharding based on tables. Or did I miss something?

Roman Hoyenko:
My question is about best practice for setting up Jackrabbit: where should I store the documents, on the filesystem or in the database? How should I handle the clustered WebLogic instances?
Does Jackrabbit manage its own transactions? Does it need XA?

I am asking because I just used the default settings, changed the connection to point to my DB pool, and I am now getting deadlocks, so I am trying to figure out what is wrong.

[...]

I ended up using
dl.hook.impl=com.liferay.documentlibrary.util.FileSystemHook

for the document library. It stores the files on the local filesystem. The document library also stores some of its data in the database, but that should work fine in the cluster. And there should not be problems for me in using the local filesystem in a cluster, since simultaneous updates are handled by the filesystem.

From that I gather you tried to put the document library into a database.
As soon as several Liferay instances access the same document library, you get deadlock situations.

So there may be room for improvement in the document library's locking code to prevent deadlocks. In my view this is within Liferay, or, if Liferay just delegates to Jackrabbit, then Jackrabbit might have the same issue.