Category: Alfresco

Alfresco open source content management

Proprietary ECM solutions continue to aggravate resource availability issue

Alan Pelz-Sharpe noted in a recent post at CMSWatch that it is becoming harder and harder to find people with Documentum and, more generally, ECM skills in the marketplace. This is yet another datapoint in a trend that has been building over the last few years. (In a post last January, I asked Documentum to open up the WDK. The resource shortage was one reason.)

In my opinion, this problem is only going to get worse, at least until open source ECM solutions unseat the proprietary vendors. Developers in the ECM space are much more excited about investing their limited resources in open, standards-based platforms.

Becoming a guru in a proprietary solution within a rapidly commoditizing market may yield short-term gains but is a dead end in the long term. Ironically, it’s those short-term gains that make the problem worse: the people currently enjoying them are more likely to continue riding the wave than they are to move into an IT shop.

So I see open source ECM solutions as helping the ECM market with the resource availability problem in two ways: (1) by being built on the frameworks today’s developers are interested in learning and (2) by removing the primary barrier to entry (software license cost), thus exposing more developers and companies to ECM solutions.

Regarding the first point, compare a developer who’s built expertise around Documentum’s Web Development Kit (WDK), a JSF-like framework for building Documentum web apps, with one who’s invested in Alfresco, where the development model is based on JSF/Spring/Hibernate. The Alfresco developer has a better foundation of transferable skills, whether that’s to pure web application development or to other open source ECM solutions built on a similar stack. Developers spend time learning stuff they are interested in, and they pay attention to transferability.

As for the second point, freely available open source ECM solutions are more likely to find their way into the hands of developers (and the servers of enterprises, see this post) because there are no barriers to entry. This should result in a larger pool of resources experienced in working with ECM in general, as well as with specific open source solutions.

As a thought exercise, think about what the resource pool would look like if closed-source, proprietary vendors were the only game in town. The demand would have to be much greater and the solutions much more interesting to develop any sort of resource base interested in specializing. (SAP seems like a real-life example. SAP folks are expensive and hard to find, and I don’t run into too many up-and-coming developers begging to be sent to SAP training at this point.)

So if you are an enterprise struggling to staff your ECM “Center of Excellence” or maybe you can’t even keep the lights on in your server room, maybe it is time to take a long, hard look at open source ECM.

Solr-powered Alfresco

Have you checked out the Apache Solr project yet? It’s pretty cool. It’s essentially a search server (deployed as a web app into a servlet container) that sits on top of Lucene. Solr makes it super easy to get content into and out of Lucene via its HTTP and JSON APIs.

Recently, for a prospective Optaros client, we put together a little demo to show how Alfresco WCM could integrate with Solr to provide search and personalization for a web site managed within Alfresco. Here’s what we did at a high level:

  • Created an Alfresco web form and XSLT for the web content as usual.
  • Created an additional XSLT (or FreeMarker) template to convert the XML content to the Solr format. This gets configured as an additional presentation template associated with the web form.
  • Wrote a JSP to aggregate the Solr XML for all of the published content.
  • Wrote a servlet to call the JSP every X seconds. It takes the response and posts it to Solr. That’s how the Alfresco content gets into the index.
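
The last step can be sketched roughly as follows. This is not the demo code, just a minimal illustration of the shape of a Solr XML update message and of posting it over HTTP. The class name SolrAddXml, the field names, and the update URL are all hypothetical; the real field names depend on the Solr schema, and the update URL depends on how Solr is deployed.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper, not part of Alfresco or Solr.
final class SolrAddXml {
    private SolrAddXml() {}

    // Escape the characters that are unsafe in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Build a Solr <add> message containing one document.
    // Field names here are assumptions; they must match the Solr schema in use.
    static String addDoc(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("<field name=\"").append(escape(e.getKey())).append("\">")
              .append(escape(e.getValue()))
              .append("</field>");
        }
        return sb.append("</doc></add>").toString();
    }

    // POST an XML message to a Solr update URL and return the HTTP status code.
    // The URL is supplied by the caller, e.g. http://localhost:8983/solr/update (assumed).
    static int post(String updateUrl, String xml) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(updateUrl).toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        return conn.getResponseCode();
    }
}
```

In the demo, the XML came from the JSP rather than from a helper like addDoc, so only something like post would be needed there. Either way, a `<commit/>` message has to follow the adds before the new documents show up in search results.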

This setup allowed web content to get indexed by the Solr search engine upon its creation. Web site users (either using the web site in the virtualized sandbox or on the production web site) could then query the content.

The web site was a mix of static HTML and JSPs. The JSPs used custom taglibs to call “Solr Search” widgets in the right spot on the page. This was the first time I had used Alfresco’s virtualization to run a real web application (as opposed to static content). The preview release of 2.0 I was using seemed to have some significant caching issues. Hopefully those are resolved in the production release. Other than that, it was easy to see how technical and non-technical content managers could leverage Alfresco virtualization to collaborate on developing and managing a dynamic web site.

Before using this approach in production, I would need to think about the best way to handle deletes. In the demo, once content got into the index, it stayed there even after the associated content was removed from Alfresco. As far as Solr goes, it is easy to get the content deleted from the index: it’s a simple HTTP POST. The trick is where in Alfresco to put that call.
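
For reference, here is a minimal sketch of what that delete message looks like on the Solr side. The class name SolrDeleteXml and the example id are hypothetical, and the sketch assumes the index uses the content path as its unique key. It says nothing about where in Alfresco the call should live, which is still the open question.

```java
// Hypothetical helper, not part of Alfresco or Solr.
final class SolrDeleteXml {
    private SolrDeleteXml() {}

    // Escape the characters that are unsafe in XML text.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Build the message that removes one document by its unique key.
    static String deleteById(String id) {
        return "<delete><id>" + escape(id) + "</id></delete>";
    }

    // A commit has to follow before the delete is visible to searches.
    static String commit() {
        return "<commit/>";
    }
}
```

Both messages get posted to the same update URL used for adding content, so SolrDeleteXml.deleteById("/news/a.xml") followed by SolrDeleteXml.commit() would take that (made-up) item out of the index.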

Alfresco Meet-up in Boston on March 12

Russ Danner, who works at the Christian Science Monitor, is organizing an Alfresco Meet-up in Boston. I’ll be speaking on Alfresco as a platform for application development.

If you are in the area, it sounds like several people are planning to attend, so it should be a good opportunity to find out what everyone else is doing with Alfresco. The Christian Science Monitor has been using Alfresco (and the open source portal Liferay) for quite a while.

Below is the post Russ sent out to the CMPros list.

The Christian Science Monitor would like to invite you to attend an Alfresco Enterprise Content Management Meet-up in Boston:

Time: 11am – 3pm (Lunch provided with RSVP)

Place: 200 Massachusetts Ave
The First Church of Christ, Scientist
Boston, MA 02115

Parking: Available on site $5.00
To Attend: Please send RSVP to dannerr at csps dot com.

Over the last year the interest and community around the Alfresco product has really begun to blossom. Those of us working with and evaluating Alfresco in the Boston / New England area are very fortunate in that there are quite a number of customers, community members, employees and service integrators at very close proximity to one another. Some of the greatest benefits to innovation found in open source software stem directly from the community that supports it. Our local community is rich in talent, mindshare, and passion – let’s capitalize on this! While we expect most in attendance will be local, all are welcome to attend (Please forward this on to others.)

The Christian Science Monitor would like to invite you to join us and others in the community for an Alfresco meet-up. This meeting will be an informal opportunity to meet one another, discuss our projects and progress and learn more about Alfresco and our community. We will have a few (non-sales) presentations and ample opportunity for attendees to get to know one another.

Opening
Introductions
1st Presentation (30 minutes)
(Rivet Logic – Overview of Alfresco Architecture)

Lunch (45 minutes – no presentations)

2nd Presentation (30 minutes)
(There is a possibility we will have someone from Alfresco for this)

3rd Presentation (30 minutes)
(Optaros – Leveraging Alfresco as a Platform for ECM/Development)

Round table discussion (45 minutes)
Closing

Alfresco 2.0 Released

Alfresco 2.0 Community is now available for download. Alfresco has done a lot of work since the last time I posted about their work-in-progress. In that post, I said that some open issues about their WCM functionality included: (1) how workflow would be integrated, (2) how versioning and rollback would work, and (3) how deployment would be handled.

Two of those open issues, workflow and versioning/rollback, were addressed in subsequent preview/beta releases. In 2.0, WCM workflow leverages the jBPM engine, but you don’t have to fire up the jBPM Process Designer to implement Alfresco WCM workflows. Instead, two default workflows are provided out of the box: one for serial flows and one for parallel flows. When you configure the workflow for a web form, you specify the people who need to review the content before it is published. With a serial workflow, they’ll get the task one after another; with a parallel workflow, they’ll get it simultaneously.

In most WCM implementations, it isn’t practical for 100% of the content to be templated. But non-templated content needs to be reviewed as well. Alfresco handles this by allowing you to configure workflows for non-templated content. A cool feature is that you can use a regular expression to route the content appropriately. For example, you might want images to route to one group of people but PDFs or other editorial content to route to another group.
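
To illustrate the idea of regex-based routing (not Alfresco’s actual configuration or API), here is a small sketch in plain Java. The ReviewRouter class, the patterns, and the group names are all made up for the example; in Alfresco the patterns are entered through the web project setup rather than written in code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration of routing content to reviewers by file name pattern.
final class ReviewRouter {
    // Insertion order matters: the first matching rule wins.
    private final Map<Pattern, String> rules = new LinkedHashMap<>();

    // Register a rule mapping a file name pattern to a reviewer group.
    ReviewRouter rule(String regex, String group) {
        rules.put(Pattern.compile(regex, Pattern.CASE_INSENSITIVE), group);
        return this;
    }

    // Return the group for the first rule whose pattern matches the whole path,
    // or the fallback group when nothing matches.
    String route(String path, String fallback) {
        for (Map.Entry<Pattern, String> e : rules.entrySet()) {
            if (e.getKey().matcher(path).matches()) {
                return e.getValue();
            }
        }
        return fallback;
    }
}
```

With a rule like `.*\.(png|jpe?g|gif)` pointing at a design group and `.*\.pdf` pointing at an editorial group, an uploaded logo and an uploaded PDF end up in front of different reviewers, and anything else falls through to a default.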

Virtualization comes into play with regard to workflow. Alfresco’s virtualization is leveraged when reviewers preview the web site so they can see what the web site will look like in the context of the changes being proposed.

In the versioning and rollback department, snapshots of the entire site are taken every time changes are promoted to staging. You can roll back to any of those snapshots with a single click.

Alfresco’s done a good job implementing web forms to be reusable across multiple web projects within the repository, even when different web sites may choose to use the web forms differently. For example, when you set up a web form you can define the default output path, zero or more presentation templates, and a default workflow. The cool thing is that when you instantiate a web project, you can pick from the available web forms. When you do, you can change the default options. This feature is going to save a lot of template development and maintenance time, especially for companies managing multiple online properties.

I haven’t yet figured out how well deployment is addressed in the GA release of 2.0. I know the plan is to support deployment between a core Alfresco repository and a “read-only” Alfresco repository as well as to the file system. The early and beta releases of 2.0 didn’t include this functionality, so I’ll be interested to see what made it into GA.

There are several other changes that have been made since the preview release, and not all 2.0 features are WCM-specific. (Many will be glad to know a tree view of the repository is now available.) Check the roadmap at the Alfresco Wiki for a detailed list of what’s new.

Note that if you’re downloading Community 2.0 and you want both the core repository and WCM, you’ll need to follow the 2.0 download link as well as the WCM link. The WCM download includes the virtualization server, some config info, and a sample form.

Book review: Alfresco Enterprise Content Management

I’ve recently finished the new Alfresco book. The bottom line on this book is this: End-users evaluating or learning to use Alfresco may find the book helpful. Most chapters are aimed at teaching people how to work with the web client. For this audience, the book does a pretty good job of presenting a logical progression through the product. Although most of the information in the book is readily available through the Alfresco wiki, forums, demos, and documentation, the book pulls it together in a hassle-free, portable format. Technical users, however, will be disappointed. If you want technical depth this book isn’t for you. Before you shell out the $60 you really need to consider what kind of information you are looking for. You may ultimately be better off surfing blogs, forums, and wikis.

When I originally saw the announcement for the book, I was excited. Shelf space is one datapoint that can be used to measure technology adoption and maturity. More specifically, I was hoping the book would be a one-stop shop for business users as well as technical users, sort of an Alfresco Unleashed. But as the preface clearly states, “This book is not targeted at developers who want to change the core code structure of Alfresco.” I don’t necessarily want to change the “core code structure,” but I do want to implement, customize, and extend Alfresco for my clients. In that respect my expectations for the book turned out to be way too high. For now, at least, from a technical publication standpoint, Alfresco has yet to be “Unleashed” or “Exposed”.

Workflow is an example where the book could have gone into much more depth, particularly with the introduction of the jBPM integration in release 1.4. But the chapter on workflow focuses almost exclusively on the simple folder-based workflow functionality. Although there is a section on advanced workflow using jBPM, it only skims the surface by listing the steps one goes through to define an advanced workflow. It does this at such a high level that it provides little value other than to inform the reader that there’s such a thing as advanced workflow.

Other topics that got little or no attention include details on Alfresco’s approach to content storage and the underlying relational database schema, LDAP integration/synchronization, scheduling jobs, and the Alfresco SDK. The book is written for Alfresco 1.4 so the new WCM offering included in 2.0 is completely unaddressed.

Aside from the depth issue, the book could use a bit more organization and a more thorough editing job to be more relevant to the intended audience and to resolve some of the more glaring typos and grammatical problems.

Here’s an example of one aspect of the organizational problem. The book is primarily aimed at end users and power users of Alfresco who need to work with the web client (out of a total of 13 chapters, the first 10 are squarely focused on that use case). But Shariff mixes in details on configuration tasks that end users would never do, such as defining content types and tweaking the web client configuration XML. An alternative approach would have been to keep those first 10 chapters clean of any tasks requiring configuration changes and make them purely about web client how-to. Later chapters could have then gone into configuration.

I did find some useful tidbits in the book. The sections on dashlets and FreeMarker in Chapter 11, “Customizing the User Interface”, were useful to me, as was Chapter 13 on imaging.

Writing technical books on Enterprise Content Management is tough. There are many different use cases, some of which differ greatly in what’s deemed critical functionality, often depending on industry and type of organization. Plus there are usually many potential approaches for customizing and extending the underlying platforms. Multiply that by the speed of change we’re currently experiencing in the ECM market and it’s a wonder anyone writes on the topic at all. In fact, I sometimes wonder how relevant traditional technical publications are for such dynamic topics, but I digress.

Shariff should be commended for being the first to publish a real book on Alfresco. I know others will follow. Hopefully, right now, somewhere someone’s frantically cranking out chapters and code snippets. To you I say, keep at it! Shariff may have been first to the punch, but Alfresco is still waiting to be Unleashed!

Working through jBPM Process Designer Upgrade

I’m digging into the embedded jBPM engine in Alfresco. I was using jBPM Process Designer 3.0.8 running in Eclipse 3.1.2, neither of which was the latest release. Anticipating headaches, I decided to upgrade before moving on.

The Eclipse 3.2.1 install was a breeze, including the addition of the Web Standard Tools and J2EE tools, which used to be slightly painful. Now it is all automated through the update manager. Same with the Subclipse plug-in.

There was a problem with the jBPM 3.0.12 install, though. After expanding the zip and restarting, I was getting the following “java tooling” exception:

java.lang.NullPointerException
    at org.jbpm.ui.util.JbpmClasspathContainer.getJarNames(Unknown Source)
    at org.jbpm.ui.util.JbpmClasspathContainer.createJbpmLibraryEntries(Unknown Source)
    at org.jbpm.ui.util.JbpmClasspathContainer.getClasspathEntries(Unknown Source)

This post had the answer. One of my old jBPM projects in the workspace was pointing to the old jBPM runtime. I think I could have edited the .classpath file to fix it but it was easier just to move my processdefinition.xml file, blow away the project, create a new one, and drop my processdefinition.xml file back in.