Pre-Upgrade Scripting (6.2 > DXP) - Pt 3 - Web Content Version Cleanup

We do a lot of distributed web content management in our Liferay installation, and many articles have undoubtedly been edited many times over a period of years. Let's take a look:

select count(*) from journalarticle;

returns 19,984 records. And, via the script console, we can see there are far fewer unique articles:

out.println(com.liferay.portlet.journal.service.JournalArticleResourceLocalServiceUtil.getJournalArticleResourcesCount())

which returns 3,861 articles, or roughly five versions per article on average. I think we can make some headway here, too. An "article" resource ties together all of an article's historical versions: every version row shares the same resourcePrimKey, which is also what is needed to fetch the latest version of a given article. To get an idea of how many versions might exist for a given article, use: select distinct version from journalarticle; We were shocked to see an article edited as many as 221 times!
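To see the version count per article, rather than the set of version numbers in use, a grouped query along these lines may be more direct. This is only a sketch: it assumes the journalarticle table exposes the resource key as a resourceprimkey column (true of the 6.2 schemas I have seen, but check yours), and the alias name versions is just illustrative.

```sql
-- One row per article resource, most heavily edited articles first.
-- Assumes journalarticle has a resourceprimkey column shared by all versions.
select resourceprimkey, count(*) as versions
from journalarticle
group by resourceprimkey
order by versions desc;
```

The number of rows returned should match the resource count from the script console, and the top rows show which articles stand to lose the most versions in a cleanup.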

Here is the script so far. It can process any number of web content resources, or *all* of them. Feel free to comment with potential improvements: