Category: Alfresco

Alfresco open source content management

Early look at Alfresco WCM

I’ve been using my recent travel time to look at Alfresco’s Web Content Management (WCM) Preview Release 1 (download). The Preview Release does not include enough functionality for an apples-to-apples comparison against existing competitors, but it does give you a feel for the templating functionality and virtualization.

At a high level, the content authoring/publication workflow goes like this:

  1. A form is defined using XSD and uploaded to the Content Forms folder in the Data Dictionary.
  2. Forms are optionally associated with one or more XSL stylesheets that transform the content into any desired output format. (The XSD and XSL files are conveniently stored in the same folder.)
  3. Content authors use a form to submit new content which is saved in the repository as XML. When it is saved, the XML is transformed into one format for each XSL defined for that form.
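The save-then-transform step can be sketched with the XSLT support that ships with the JDK. This is only an illustration of the idea, not Alfresco's code: the press release XML, its element names, and both stylesheets are invented for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class FormRenditions {

    // Hypothetical XML as a form submission might be saved; element names are invented.
    static final String PRESS_RELEASE_XML =
        "<press-release><title>Hello World</title><body>First release.</body></press-release>";

    // Hypothetical stylesheet that renders the content as an HTML fragment.
    static final String HTML_XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\"/>"
        + "<xsl:template match=\"/press-release\">"
        + "<h1><xsl:value-of select=\"title\"/></h1>"
        + "<p><xsl:value-of select=\"body\"/></p>"
        + "</xsl:template></xsl:stylesheet>";

    // Hypothetical stylesheet that renders only the title as plain text.
    static final String TEXT_XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/press-release\">"
        + "<xsl:value-of select=\"title\"/>"
        + "</xsl:template></xsl:stylesheet>";

    // Applies one stylesheet to one XML document and returns the result as a string.
    static String transform(String xml, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    // Produces one rendition per stylesheet, keyed by a format label.
    static Map<String, String> renderAll(String xml, Map<String, String> stylesheets) {
        Map<String, String> renditions = new LinkedHashMap<>();
        for (Map.Entry<String, String> sheet : stylesheets.entrySet()) {
            renditions.put(sheet.getKey(), transform(xml, sheet.getValue()));
        }
        return renditions;
    }

    public static void main(String[] args) {
        Map<String, String> sheets = new LinkedHashMap<>();
        sheets.put("html", HTML_XSL);
        sheets.put("txt", TEXT_XSL);
        renderAll(PRESS_RELEASE_XML, sheets)
            .forEach((format, output) -> System.out.println(format + ": " + output));
    }
}
```

The point of the sketch is that the XML is the single source of truth and every output format is derived from it, so adding a new format means adding a stylesheet rather than touching the content.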

Creating XML content through templates

The templating engine is based on Chiba, an open source XForms implementation. Alfresco’s implementation exceeds what I’ve seen other vendors do with Chiba. The UI is consistent with the rest of the Alfresco web client and the form widget performance seemed quite snappy.

Alfresco’s WCM review guide uses the Hello World of the ECM market–Press Release–as its primary example. We can all relate to a Press Release, which is why everyone uses it, but it has its limitations. I’m eager to see some advanced form functionality such as wizard-like behavior, complex validation, or form fields that are shown or hidden based on logic. But, again, this is only a preview. The example does include a run-time lookup to a JSP to get the options for a radio button, so that’s good. Ultimately it will take some real-world examples (hopefully shared with the community) to show the full power of XForms.

Alfresco has said that the forms piece will be usable by non-WCM solutions built on the Alfresco platform, which is definitely important. Form submission is low-hanging fruit for document management implementations. When I tried to create some form content outside of a web site folder, however, it didn’t work. I didn’t have time to figure out why. I’ll check on it again in a subsequent release.

Providing an isolated working environment through virtualization

The Virtualization Server is where we get our first glimpse of the Interwoven DNA Alfresco acquired to help drive the development of the WCM offering. Those who have worked with Interwoven TeamSite in the past will be instantly familiar with the concept of Virtualization and the Alfresco Sandbox. For those who missed out, a Sandbox is like your own personal snapshot of the website. It gives a content worker (I’ll use “worker” as a generic term to cover “Author”, “Publisher”, “Manager”, or any of the other terms that might have specific connotations to a particular WCM implementation) a way of seeing their changes before they are integrated into the site. Those changes might be to static pages, graphics, or even dynamic elements such as JSPs. When multiple content workers are making changes to a site simultaneously, the sandbox model can be very efficient because everyone works in their own pristine environment until they are ready to integrate in the “Staging” area by promoting their changes.
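One way to picture the sandbox model is as a private layer of changes sitting on top of a shared staging area. The sketch below is a toy model under that assumption, not a description of how Alfresco or TeamSite store content; the class and method names are made up.

```java
import java.util.HashMap;
import java.util.Map;

final class SandboxModel {

    // The shared area that holds the integrated site (path -> content).
    static final class Staging {
        final Map<String, String> files = new HashMap<>();
    }

    // One worker's private view: local changes layered over staging.
    static final class Sandbox {
        private final Staging staging;
        private final Map<String, String> changes = new HashMap<>();

        Sandbox(Staging staging) {
            this.staging = staging;
        }

        // Edits stay local until promoted.
        void write(String path, String content) {
            changes.put(path, content);
        }

        // Reads see the worker's own change first, then fall back to staging.
        String read(String path) {
            return changes.containsKey(path) ? changes.get(path) : staging.files.get(path);
        }

        // Promotion copies local changes into staging, where other sandboxes can see them.
        void promote() {
            staging.files.putAll(changes);
            changes.clear();
        }
    }

    public static void main(String[] args) {
        Staging staging = new Staging();
        staging.files.put("/index.html", "v1");
        Sandbox alice = new Sandbox(staging);
        Sandbox bob = new Sandbox(staging);

        alice.write("/index.html", "v2");
        System.out.println("alice sees " + alice.read("/index.html"));
        System.out.println("bob sees " + bob.read("/index.html"));

        alice.promote();
        System.out.println("after promote, bob sees " + bob.read("/index.html"));
    }
}
```

Until a worker promotes, nobody else's preview is affected, which is what makes simultaneous editing practical.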

The evaluation bundle comes with a separate instance of Tomcat that acts as the Virtualization Server. If you look in the work directory, you’ll see a web application for each user of each web site–these are essentially the sandboxes.

The first thing you’ll notice when you preview a piece of content is the URL. Out-of-the-box, Alfresco uses a public domain that their name server resolves to your loopback address. The sandbox webapp then takes it from there. This means that when you are running the virtualization server you’ll need either: (1) An internet connection or (2) Your own name server.

Because I frequently give disconnected demos I went with option 2. The high-level steps to get this working on my Ubuntu-based laptop were:

  1. Install bind9 (Follow steps here).
  2. Set up a zone file for my domain with a wildcard entry that resolves to my IP address (Follow rough steps here).
  3. Configure networking to use the new, local name server. In Ubuntu you can use the Networking GUI or update /etc/resolv.conf.
  4. Configure Alfresco to use my local domain for virtualization instead of the public domain by updating shared/classes/alfresco/extension/web-client-config-custom.xml (Read the Virtualization Configuration page on the Alfresco Wiki here).
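Once the local name server is in place, a quick check that the wildcard entry behaves as intended can save some head-scratching. The sketch below resolves a few random hostnames under a domain and compares the result with the expected address. The domain `virt.example.test` and the default address are placeholders, not values from Alfresco or from my setup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.UUID;

final class WildcardDnsCheck {

    // True when the host resolves and its address matches the expected one.
    static boolean resolvesTo(String host, String expectedIp) {
        try {
            return InetAddress.getByName(host).getHostAddress().equals(expectedIp);
        } catch (UnknownHostException e) {
            return false;
        }
    }

    // Builds a random hostname under the domain; a wildcard entry should match any of them.
    static String randomHostUnder(String domain) {
        return "check-" + UUID.randomUUID().toString().substring(0, 8) + "." + domain;
    }

    public static void main(String[] args) {
        // Placeholder defaults; pass your own domain and IP address as arguments.
        String domain = args.length > 0 ? args[0] : "virt.example.test";
        String expectedIp = args.length > 1 ? args[1] : "127.0.0.1";
        for (int i = 0; i < 3; i++) {
            String host = randomHostUnder(domain);
            String verdict = resolvesTo(host, expectedIp) ? "ok" : "did not resolve as expected";
            System.out.println(host + " -> " + verdict);
        }
    }
}
```

If every random name resolves to the same address, the wildcard is working and the resolver is really using the local name server.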

What’s Next?

This preview achieved its desired goal–it gave us a tantalizing glimpse into the state of Alfresco’s WCM efforts. If you are implementing WCM in the next six months it is worth the time to download and play with the preview release.

The next big things to look for in subsequent preview releases are:

  • Integration of the Alfresco 1.4 jBPM Engine with WCM
  • Versioning and rollback
  • Content deployment

When those are in place and working, early adopters may have enough functionality to implement, and we’ll be able to do valid head-to-head comparisons with Alfresco’s competitors.

Getting started with Alfresco Advanced Workflow

If you are ready to take a look at the new Advanced Workflow features in Alfresco 1.4, here’s what you need to do…

Download and expand the Alfresco 1.4 Preview. If you already have Alfresco 1.3 on your machine I’d recommend moving your extensions directory and your Alfresco web root someplace safe. You should also create a new, separate database and user for the preview (e.g., alfresco14) unless you are also testing out the 1.3-to-1.4 upgrade experience.

Using the Alfresco 1.4 Preview WAR distribution is probably the way to go if you’ve already got 1.3 set up. You’ll obviously need to tweak the db_setup and db_remove scripts with the 1.4-specific user, database, and password. I used a copy of my 1.3 extension files, modified to point to the 1.4-specific database and a new data directory. That kept my LDAP authentication configuration intact. Your steps will vary here depending on your distribution and your local install preferences.

Once you get 1.4 configured like you want it, fire up Alfresco. You’ll notice that the default page is the “Alfresco Dashboard”. If you are familiar with Documentum, this is essentially the inbox. Tasks waiting for a user to take action are displayed here. You can get back here at any time by clicking the “My Alfresco” link in the top-most navigation bar.

Now find (or import) a document and click on the actions dropdown for the document. One of the actions listed should be “Start Workflow”. There are two sample workflows available out-of-the-box. Try them both out to get a feel for some of the functionality that’s available. Note that in the preview release, you must set a due date for the task. If you leave it blank you’ll get an error. (This is already in JIRA as a bug and will be fixed by the time of the full release).

In the preview release the admin console (the link moved from “More Actions” to a little icon at the top of the page) contains a link to the jBPM console. This is useful for debugging workflows. According to the forums, the final release will still make the console available, but it may not be included as a link on the admin console. The jBPM console will make more sense to you after you’ve followed the rest of the recommendations in this post.

Once you get the gist of the end-user experience it is time to take a peek under the covers.

Assuming you don’t already know the details of jBPM here are the steps to follow…

  1. Download and expand the jBoss jBPM Starter Kit.
  2. Install the jBPM Graphical Process Designer plugin for Eclipse. It’s included as part of the starter kit. It lets you define the workflows graphically and then deploy them to the jBPM run-time without ever leaving Eclipse. (For playing around with jBPM on its own you can use the run-time included in the jBPM Starter Kit. For deploying processes to Alfresco, the jBPM run-time is the Alfresco repository).
  3. Next, go through the Getting Started with jBPM Guide, which, strangely enough, is not included in the Starter Kit. This provides a good overview of the core jBPM functionality and provides context you might find helpful as you get started with jBPM within Alfresco.
  4. Now look through the user guide that’s included with the Starter Kit. I’d focus on chapters 9, 10, and 11. The rest is good information but isn’t critical for now.
  5. Finally, go through the Workflow Administration page on the Alfresco wiki to learn how to build your own Alfresco workflows with jBPM.

As you’ll see on the wiki page, any advanced workflow you create will probably require defining an extension to the content model as well as creating property sheet definitions so that the metadata you want your users to set during the workflow will be presented as part of the Alfresco UI. That means if you haven’t already, you’ll definitely need to understand the Alfresco extension mechanism and custom content models which are documented pretty well on the wiki.

Alfresco still has some work to do before 1.4 ships but the preview release should give you more than enough functionality to figure out how you’ll implement your complex business processes within Alfresco 1.4 when it goes GA. While the configuration and deployment is a bit more involved than it is with Documentum Workflow Manager/Business Process Manager, I think you’ll find Alfresco’s embedded jBPM to be a much more flexible, extensible, and powerful workflow engine overall. I’ll expand on that thought in a future post.

Alfresco 1.3 podcast

John Newton and Ian Howells discuss Alfresco 1.3 in this podcast. If you’ve already played with 1.3 (or maybe you’re already on to the 1.4 pre-release) there’s nothing new here. If you’re new to Alfresco it might give you a taste for how the platform is evolving.

Thoughts on workflow and jBPM

I just got back from jBPM training at JBoss. The class itself needs a bit of streamlining but the instructors are aware of that and are working to improve the offering. The technology, though, is very cool. It really got me thinking about how my own workflow apps have evolved over the years and made me even more excited about the up-coming Alfresco 1.4 release (see Roadmap) which will include jBPM as its workflow engine.

Just about every application I’ve developed over the last thirteen years has had some type of workflow. In the early Lotus Notes days, Notes equalled “workflow”. What that meant to most Notes applications was that there was a field on a Notes document that kept track of the document’s status. Security on the document (or even fields on the document) would change based on that status field. Moving from state-to-state was handled by any number of bits of code in actions (macros) or form events. Every aspect of the business process was essentially diffused into every nook-and-cranny of the Notes database. There was no concept of a central definition of the business process–the process was everywhere. Ironically, Lotus Notes–the de facto standard for “workflow” applications–didn’t have a built-in workflow engine as we would identify one today.

Custom techniques for handling workflow evolved over time. In a custom content management application I was involved with, for example, we developed our own XML-based state engine to handle workflow which was a vast improvement over simple field-based state management. It had its own API you could call to move documents through the process and supported simple events like sending an email notification. At about that time, third-parties began offering add-on products for Lotus Notes and Domino that let you define, manage and execute a process in a similar fashion. Although loosely-coupled with the app, you were still tied to the underlying Notes/Domino platform.
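The difference between a status field and a state engine is easier to see in code. The sketch below is not the XML-based engine described above (that code isn't shown here); it is a minimal in-memory stand-in with made-up state and action names. It shows the two properties that mattered: the whole process is defined in one place, and moving a document goes through a single API call that can also trigger events such as notifications.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SimpleStateEngine {

    // from-state -> (action -> to-state): the central definition of the process.
    private final Map<String, Map<String, String>> transitions = new HashMap<>();

    // Stand-in for an event such as sending an email notification.
    private final List<String> notifications = new ArrayList<>();

    // Registers one allowed move; returns this so definitions can be chained.
    SimpleStateEngine allow(String from, String action, String to) {
        transitions.computeIfAbsent(from, key -> new HashMap<>()).put(action, to);
        return this;
    }

    // Moves a document from its current state, rejecting anything the definition does not allow.
    String fire(String currentState, String action) {
        Map<String, String> allowed = transitions.get(currentState);
        if (allowed == null || !allowed.containsKey(action)) {
            throw new IllegalStateException(
                "'" + action + "' is not allowed in state '" + currentState + "'");
        }
        String next = allowed.get(action);
        notifications.add(currentState + " -> " + next);
        return next;
    }

    List<String> notifications() {
        return notifications;
    }

    public static void main(String[] args) {
        // Hypothetical review process.
        SimpleStateEngine engine = new SimpleStateEngine()
            .allow("Draft", "submit", "In Review")
            .allow("In Review", "approve", "Published")
            .allow("In Review", "reject", "Draft");

        String state = engine.fire("Draft", "submit");
        state = engine.fire(state, "approve");
        System.out.println("final state: " + state);
        System.out.println("notifications: " + engine.notifications());
    }
}
```

With the field-based approach, each of those `allow` lines would instead be an `if` buried in a different action or form event.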

Then I began working with Documentum. Documentum’s Workflow Manager (and, later, Business Process Manager) uses a graphical tool to define process templates which are then instantiated and executed in a run-time (Tomcat) that runs within the context of the overall Documentum content server. It worked pretty well for every project I ever used it with, but it has a few short-comings:

  • Although the marketing hype is that any business analyst can create and manage the workflow, this is true only for the simplest of processes. Every workflow I have ever worked on is complex enough to require automated activities (tasks written with code). In Documentum’s workflow tools automated tasks are not transparent to the analyst. So working with workflows is inherently a “development-oriented” task. Expecting that a non-technical business analyst can change a workflow without significant knowledge of the underlying application just isn’t realistic.
  • The process definition is proprietary. Your two choices for creating and managing process definitions in Documentum are to use Documentum’s graphical tools or to write your own custom code against the Documentum API. On a related note, the definition isn’t human readable without the tool–you can’t just pull up a Documentum workflow template with a text editor to see what’s going on.
  • The concepts of lifecycles and processes are separate. To me, the state a document is in (Documentum calls this a “lifecycle”) and the process a document goes through (“workflow”) are so interdependent they should be modeled together. In Documentum these are two separate concepts. I suppose it can be simplifying–if you only need states you can just use lifecycles without workflows. But I always found it a bit clumsy when dealing with the interactions between the two.
  • The process definitions are tightly-coupled with the Documentum platform. The obvious problem with this is that processes defined within Documentum are not portable to other content management systems or workflow engines. Among my clients, I also saw this as a source of confusion–when should they use the “embedded” workflow within Documentum versus another workflow product?
  • The framework is limited and cannot be easily extended. Documentum’s workflow is purpose-built for moving documents around, not data. That’s great if you are working with files but there are many applications that need process functionality that aren’t document specific. If you’re working with the Documentum workflow engine and you just want to route some raw data around your only real choice is to put it in a “content-less object” and route that around. And because the framework is proprietary you couldn’t fix this if you wanted to.

Now I’m digging in to jBPM and I’m excited about what I’m finding. Loosely-coupling the workflow engine with the content management system, basing the process definition on open standards, making the implementation open and extensible, and providing a run-time that requires nothing more than a servlet container and a relational database creates a robust, flexible workflow engine that addresses many of the shortcomings of embedded, proprietary solutions. jBPM is one example of such a solution but there are others in the open source world.

jBPM process definitions can be defined graphically using an Eclipse plug-in. Because the process definition is expressed in XML, you also have the option of writing these by hand, programmatically, or with any tool that can output XML.
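To make the “programmatically” option concrete, here is a rough sketch that builds a small process definition with the JDK’s DOM API and prints it as XML. The element names (`process-definition`, `start-state`, `task-node`, `transition`, `end-state`) are written from memory of jBPM’s process definition language and should be checked against the jBPM user guide; the process itself is hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class ProcessDefinitionBuilder {

    // Builds a hypothetical review process and returns it as an XML string.
    static String buildReviewProcess() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element process = doc.createElement("process-definition");
            process.setAttribute("name", "review");
            doc.appendChild(process);

            Element start = named(doc, process, "start-state", "start");
            transition(doc, start, "submit", "review");

            Element review = named(doc, process, "task-node", "review");
            transition(doc, review, "approve", "approved");
            transition(doc, review, "reject", "rejected");

            named(doc, process, "end-state", "approved");
            named(doc, process, "end-state", "rejected");

            // Serialize the DOM tree to text with an identity transform.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            serializer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build process definition", e);
        }
    }

    // Appends a child element carrying a name attribute.
    private static Element named(Document doc, Element parent, String tag, String name) {
        Element element = doc.createElement(tag);
        element.setAttribute("name", name);
        parent.appendChild(element);
        return element;
    }

    // Appends a named transition pointing at another node.
    private static void transition(Document doc, Element from, String name, String to) {
        Element element = doc.createElement("transition");
        element.setAttribute("name", name);
        element.setAttribute("to", to);
        from.appendChild(element);
    }

    public static void main(String[] args) {
        System.out.println(buildReviewProcess());
    }
}
```

Because the output is plain text, it can be diffed, reviewed, and kept in version control like any other source file.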

Don’t like the out-of-the-box jBPM implementation for a “split” or a “join”? No problem. Override their implementation with your own logic. In fact, adding your own logic to the process is usually as simple as pointing the event handler for a node or transition to a POJO that implements a simple, often single-method, interface.
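The shape of that extension point looks roughly like the sketch below. To keep it self-contained, it does not use jBPM’s own classes: `ProcessContext` and `NodeHandler` are hypothetical stand-ins for the context object and handler interface the engine provides, and the variable names are made up.

```java
import java.util.HashMap;
import java.util.Map;

final class HandlerSketch {

    // Hypothetical stand-in for the context an engine hands to custom code.
    static final class ProcessContext {
        final Map<String, Object> variables = new HashMap<>();
    }

    // Hypothetical single-method handler interface (not jBPM's own type).
    interface NodeHandler {
        void execute(ProcessContext context);
    }

    // A plain Java class carrying the custom logic for one node or transition.
    static final class FlagLargeDocuments implements NodeHandler {
        @Override
        public void execute(ProcessContext context) {
            Object size = context.variables.get("sizeInKb");
            boolean large = size instanceof Integer && (Integer) size > 1024;
            context.variables.put("needsExtraReview", large);
        }
    }

    // Stand-in for the engine invoking whichever handler the definition points to.
    static void enterNode(NodeHandler handler, ProcessContext context) {
        handler.execute(context);
    }

    public static void main(String[] args) {
        ProcessContext context = new ProcessContext();
        context.variables.put("sizeInKb", 2048);
        enterNode(new FlagLargeDocuments(), context);
        System.out.println("needsExtraReview = " + context.variables.get("needsExtraReview"));
    }
}
```

The handler knows nothing about the process graph; the definition decides where it runs, which is what keeps custom logic from leaking into every corner of the application.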

Any application can take advantage of the engine. Integration is possible by directly talking to the jBPM API or through less tightly-coupled methods such as JMS and web services.

If you are a Documentum customer about to implement a process-centric application, should you ditch Business Process Manager and go with something like jBPM? I’m not ready to draw that conclusion. What gets me excited, though, is knowing that I can implement robust workflow in any application I build by leveraging an open tool like jBPM. And when open source projects like Alfresco incorporate it into their solutions, I don’t feel like I’m giving up anything when compared to proprietary competitors. In the case of Alfresco with jBPM compared to something like Documentum and its proprietary, embedded workflow engine, it actually feels like I’m gaining functionality.

To learn more about jBPM, check out the jBPM Wiki. Specifically, the Getting Started Guide has everything you should need to start learning.

Alfresco web client customization

I implemented a simple Alfresco web client customization over the weekend. At Optaros, we’ve got two Alfresco repositories–one in North America and one in Europe. Alfresco doesn’t yet offer federated repositories, but we needed some way to make it easier for folks to jump between the two, and give at least a rough feeling of there being one repository, not two.

So, I added a new section to the “Shelf”. The shelf is a little piece of collapsible real estate in the Alfresco web client UI that contains things like recently viewed spaces (i.e., folders) and bookmarks. The new Shelf Item I added is called “Repositories”. It is essentially a list of links that point to all of the Alfresco repositories in your environment. Users can then click the repository name to open the home space for that repository.

Obviously, it doesn’t implement single sign-on, but at least people can jump between the repositories quickly. And, we should be able to leverage the config in the future to do things like federated search.

This type of customization is a decent way to learn about Alfresco UI customization because it is pretty constrained in scope and yet involves a good cross-section of Alfresco UI config elements like components, tag libs, the configuration service, and actions.

I originally wanted to extend the Alfresco JSPs, but as it turns out, the shelf is implemented in a “parts” JSP that is included in just about every other JSP. The maintenance pain of overriding the config for every out-of-the-box JSP is worse than the potential pain of simply overwriting their JSP with my customized JSP so I chose the latter path. Still, everything else follows their customization model.

According to the forum and this JIRA post, they are aware of the problem that those explicitly-included JSP files cause. No word on when it will be fixed.

I’ll write up the details for how I did the customization and post them here when I get a chance.

A few Alfresco tips

Here are a few Alfresco tips. Nothing earth-shattering here–these issues are documented on the wiki or in the forums. I came across them while getting the 1.3.0 Alfresco WAR running on Tomcat 5.5.17 with MySQL 4.12 on Ubuntu 5.10 (“Breezy Badger”).

Use extension configuration files to override default repository settings. Refer to this wiki page to learn how to do this. If you are running the WAR file only, you can get some sample configuration files from either the source distribution or the Alfresco-Tomcat bundle. In the bundle, they are in /tomcat/shared/classes/alfresco/extension.

If you move your data directory, drop and re-create your alfresco database. There’s probably a way to avoid this if you need to relocate a production data directory, but if you are just working with test data, dropping and re-creating may be the quickest solution. The data directory stores Lucene full-text indexes as well as user account information. You can specify a location for your data directory using the dir.root property in an extension file.

Add “useServerPrepStmts=false” to your JDBC connection string. If you are using MySQL and you get a Hibernate exception like “Could not execute JDBC batch update…Incorrect arguments to mysql_stmt_execute”, try making this change. You can tweak the default JDBC connection string by overriding the db.url configuration property.
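The only fiddly part of that change is the separator: the first parameter on a JDBC URL follows a `?`, and any later ones follow an `&`. A small helper makes the rule explicit. The host and database name below are placeholders, and the property is written with the MySQL Connector/J spelling, `useServerPrepStmts`.

```java
final class JdbcUrlTweak {

    // Appends name=value to a JDBC URL, choosing '?' or '&' based on what is already there.
    static String withParam(String url, String name, String value) {
        String separator = url.contains("?") ? "&" : "?";
        return url + separator + name + "=" + value;
    }

    public static void main(String[] args) {
        // Placeholder connection string; substitute your own host and database.
        String base = "jdbc:mysql://localhost/alfresco";
        System.out.println("db.url=" + withParam(base, "useServerPrepStmts", "false"));
    }
}
```

The resulting value is what would go into the db.url override in your extension properties file.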

Alfresco promises better portal integration

Recently, John Newton sent me an email thanking me for my post on Alfresco’s JBoss Portal integration. He said they are looking at providing additional Alfresco portlets in up-coming releases. Being able to use Alfresco as a replacement JCR repository for JBoss Portal is also in the works. Apparently the Liferay-Alfresco bundle is configured in this way but I haven’t had a chance to take a peek yet.