When a new intern onboards in the LAX office, their first step is to build Liferay from source. As a side-effect of that, usually the person that handles the intern onboarding will see that the reported
ant all time is in hours rather than in minutes, and then start asking people, "Is it normal for it take this long to build Liferay for the first time?"
It happens frequently enough that sometimes my hyperactive imagination supposes that a UI intern's entire first day must be spent cloning the Liferay GitHub repository and building it from source, and then going home.
In hindsight, this must be what the experience is like for a developer in the community looking to contribute back to Liferay as well. Though, rather than go home in a literal sense, they might stare at the source code they've downloaded and built (assuming they got that far) and think, "Wow, if it took this long just to get started, it must be really terrible to try to do more than that," and choose to go home in a less literal way, but with more dramatic flair.
One of the many community-related discussions we have been having internally is how we can make things better, both internally and externally, when it comes to working with Liferay at the source code level. Questions like, "How do we make it easier to compile Liferay?" or "How do we make it easier to debug Liferay?" After all, just how open source are you if it's an uphill battle to compile from source in order find out if things are already fixed in branch?
We don't have great answers to these problems yet, but I believe that we can, at a minimum, provide a little more transparency about what we are trying internally to make things better for ourselves. Sharing that information might give all of us a better path forward, because if nothing else, it lets us ask important questions about the pain points rather than bikeshed color ones.
Step 1: Upgrade Your Build System
Let's say I were to create a survey asking the question, "Which of the following numbers best describes your build time on master in minutes (please round up)?" and gave people a list of options, ranging from 5 minutes and going all the way up to two hours.
This question makes the unstated assumption that you are able to successfully build master from source, because none of the options is, "I can't build master from source." Granted, it may seem strange that I call that an "assumption", because why would you not be able to build an open source product from source?
If you've seen me at past Liferay North America Symposiums, and if you were really knowledgeable about Dell computers in the way that many people are knowledgeable about shoes or cars, you'd know that I've been sporting a Dell Latitude E6510 for a very long time.
It's a nice machine, sporting a mighty 8 GB of RAM. Since memory is one of the more common bottlenecks, this made it at least on-par with some of the machines I saw developers using when I visited Liferay clients as a consultant. However, to be completely honest, a machine with those specifications has no hope of building the current
master branch of Liferay without intimate knowledge of Liferay build process internals. Whenever I attempted to build master from source without customizing the build process, my computer was guaranteed to spontaneously reboot itself in the middle.
So why was this not really a problem for other Liferay developers?
Liferay has a policy of asking its developers to accept upgrades to their hardware once every two to three years. The idea is that if new hardware increases your productivity, it's such a low cost investment that it's always worthwhile to make. A handful of people resist upgrades (inertia, emotional attachment to Home and End keys, etc.), but since almost everyone chooses to upgrade, Liferay has an ivory tower problem, where much of Liferay has no idea what it's like to even start up Liferay on an older machine, not to even discuss what it's like to compile Liferay on those older machines.
- Liferay tries to do parallel builds, which consumes a lot of memory. To successfully build Liferay from source, a dedicated build system needs 8 GB of memory, while a developer machine with an IDE running needs at least 16 GB of memory.
- Liferay writes an additional X GB every time you build, a lot of it being just copies of JARs and
node_modules folders. While it will succeed on platter drives, if you care about build time, you'll want Liferay source code to live on a solid-state drive to handle the mass file creation.
So eventually, I ran into a separate problem which required a computer upgrade, I needed to run a virtual machine that itself wanted 4 GB, and so that combined with running Liferay alongside an IDE meant my machine wasn't up to handling my task. After upgrading, the experience of building Liferay is substantially different from how it used to be. While I have other problems like an oversensitive mousepad, building Liferay is no longer something that made me wonder what else could possibly go wrong.
If you weren't planning on upgrading your computer in the near future, it doesn't make sense to upgrade your computer just to build Liferay. Instead, consider spinning up a virtual machine in a cloud computing environment that has the minimum requirements, such as something in your company's internal cloud infrastructure, or even a spot instance on Amazon EC2. Then you can use those servers to perform the build and you can download the result to your local computer.
Step 2: Clone Central Repository
So let's assume you've got a computer or virtual machine satisfying the requirements listed above. The next step is to get the source code so you can use this machine to build Liferay from source.
The first step that interns get hung up on is cloning the Liferay repository. If you've ever tried to do that, you'll find that Liferay has violated one of the best practices of version control and we've committed a large folder full of binary files, .gradle. As a result of having this massive folder, GitHub sends us angry emails and, of course, cloning our repository takes hours.
How does Liferay make this better internally? Well, in the LAX office, the usual answer is to plug in the ethernet cable. Liferay invested heavily in fast internet, and so simply plugging in the ethernet cable makes the multi-hour process finish in 30 minutes.
However, it turns out that there is actually a better answer, even in the LAX office. Each office has a mirror that holds archives of various GitHub repositories, including
liferay/liferay-portal. We suspect the original being mirrored is maintained by Quality Assurance, because we have heard that keeping all of our thousands of automated testing servers in sync used to result in angry emails from GitHub. Since it's an internal mirror, this means that downloading X GB and unzipping it takes a few minutes, even over WiFi, and it's on the order of seconds if you plug in your ethernet cable.
So, in order to improve our internal processes, we've been trying to get the people who manage our new hires and new interns to recognize that such a mirror exists and to use it during their onboarding process to save a lot of time for new hires on their first day.
So what does this mean for you?
Essentially, if you plan to clone the code directly onto your computer for simplicity, you'll need to make sure that it's during a time where you won't shut down the computer for a few hours and when you don't need it for anything (maybe run it overnight), because it's a time-consuming process.
Alternately, have a remote server perform the clone, and then download an archive of the
.git folder to your local computer, similar to what Liferay is trying to do internally. This will free up your machine to do useful things, and even spinning up Amazon EC2 spot instances (like an m1.small) and bringing things down with either SCP or an S3 bucket as an intermediate point may be beneficial.
Step 3: Build Central Repository
The next step is your first build from source. This is done with a single command that theoretically handles everything. However, before you run this single command, you might need to do things to reduce the number of resources it consumes.
- Liferay issues a lot of requests to the NPM registry in parallel builds. You can cap this by checking
nodejs.npm.args, and taking the commented out line and adding it to your own
- Liferay includes a lot of extra things most people never need. You can remove these by checking
build.include.dirs and using its commented out value in your
build.USERNAME.properties, or adjusting it for your needs if you want more than what it tries by default.
- If you're on Windows, disable Windows Defender (or at least disable it on specific folders or drives). The ongoing scan drastically slows down Liferay builds.
After you've thought through all of the above, you're ready for the command itself, which requires that you download and install Apache Ant, and after knowing that this is what I'm asking you to download, you might also realize that this means that the entry point for everything is
So now you've built the latest Liferay source code, right?
Another trick question!
What's in the master branch of
liferay-portal is actually not the latest code. Liferay has started moving things into subrepositories, which you can see from the hundreds of strangely named repositories that have popped up under the Liferay GitHub account.
However, a lot of these repositories are just placeholders. These placeholders are in what's called "push" mode, where code from the
liferay-portal repository is pushed to the subrepository. However, a handful of them (five at the time of this writing) are actually active where they're in what's called "pull" mode, where code is pulled from the subrepository into the
liferay-portal repository on-demand. You know the difference by looking at the
.gitrepo file in each subrepository and checking the line describing the
However, because all of those files are actually also on the central repository, after you've cloned the
liferay-portal repository, you can use the files there to find out which subrepositories are active with
xargs magic run from the root of the repository.
git ls-files modules | grep -F .gitrepo | xargs grep -Fl 'mode = pull' | xargs grep -h 'remote = ' | cut -d'=' -f 2
I will dive into more detail on the subrepositories in a later entry when we talk about submitting fixes, but for now, they're not relevant to getting off the ground running other than an awareness that they exist, and an awareness that additional wrinkles exist in the fix submission process as a side-effect of their existence.
Step 4: Choose an IDE
At this point, you've built Liferay, and the next thing you might want to do is point an IDE with a debugger to the artifact you've built, so that you can see what it's doing after you start it up. However, if you point an IDE to the Liferay source code in order to load the source files, whether it's Netbeans, Eclipse, or IntelliJ, you'll notice that while Liferay has a lot of default configurations populated, but these default files are missing about 90% of Liferay's source folders.
If you're using Netbeans, given the people who have forked the Liferay Source Netbeans Project Builder overlap exactly with the team I know for sure uses Netbeans, this tool will help you hit the ground running. Since there are recent commits to the repository, I have confidence that the Netbeans users team actively maintains it, though I can't say with equal confidence how they'll react to the news that I'm telling other people about it.
If you're using Eclipse or Liferay IDE, then Jorge Diaz has you covered with his generate-modules-classpath script, which he has blogged about in the past, and his blog post explains its capabilities much clearly than I would be able to in a mini-section at the end of this getting started guide.
If you're using IntelliJ IDEA Ultimate, you can take advantage of the liferay-intellij project and leave any suggestions. It was originally written as a streams tutorial for Java 7 developers rather than as a tool, and I still try to keep it as a streams tutorial even as I make improvements to it, but I'm open to any improvement ideas that make people's lives easier in interacting with Liferay core code.
Step 5: Bend Liferay to Your Will
So now that everything is setup for your first build, and you're able to at least attach a debugger to Liferay, the next thing is to explain what you can do with this newly-discovered power.
However, that's going to require walking through a non-toy example for it to make sense, so I'll do that in my next post so that this one can stay as a "Getting Started" guide.