Month: August 2006

Thoughts on workflow and jBPM

I just got back from jBPM training at JBoss. The class itself needs a bit of streamlining, but the instructors are aware of that and are working to improve the offering. The technology, though, is very cool. It really got me thinking about how my own workflow apps have evolved over the years and made me even more excited about the upcoming Alfresco 1.4 release (see Roadmap), which will include jBPM as its workflow engine.

Just about every application I’ve developed over the last thirteen years has had some type of workflow. In the early Lotus Notes days, Notes equalled “workflow”. For most Notes applications, that meant there was a field on a Notes document that kept track of the document’s status. Security on the document (or even on individual fields) would change based on that status field. Moving from state to state was handled by any number of bits of code in actions (macros) or form events. Every aspect of the business process was essentially diffused into every nook and cranny of the Notes database. There was no concept of a central definition of the business process–the process was everywhere. Ironically, Lotus Notes–the de facto standard for “workflow” applications–didn’t have a built-in workflow engine as we would identify one today.

Custom techniques for handling workflow evolved over time. In a custom content management application I was involved with, for example, we developed our own XML-based state engine to handle workflow, which was a vast improvement over simple field-based state management. It had its own API you could call to move documents through the process and supported simple events like sending an email notification. At about that time, third parties began offering add-on products for Lotus Notes and Domino that let you define, manage and execute a process in a similar fashion. Although these products were loosely coupled with the app, you were still tied to the underlying Notes/Domino platform.
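To make the difference between field-based status tracking and a central state engine concrete, here is a minimal sketch. It is not the engine described above: that one read its definition from XML, while this one builds the transition table in code to stay self-contained. Every name here (ProcessDefinition, StateEngine, Doc, the states and actions) is hypothetical, and the "notification" is just a string appended to a list rather than a real email.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical central process definition: every legal move lives in one table.
class ProcessDefinition {
    // current state -> (action name -> next state)
    private final Map<String, Map<String, String>> transitions = new HashMap<>();

    ProcessDefinition allow(String from, String action, String to) {
        transitions.computeIfAbsent(from, k -> new HashMap<>()).put(action, to);
        return this;
    }

    String next(String from, String action) {
        Map<String, String> out = transitions.get(from);
        if (out == null || !out.containsKey(action)) {
            throw new IllegalStateException(
                "No transition '" + action + "' from state '" + from + "'");
        }
        return out.get(action);
    }
}

// Hypothetical document: the status field is still there, but only the engine changes it.
class Doc {
    String status;
    final List<String> notifications = new ArrayList<>();

    Doc(String status) {
        this.status = status;
    }
}

// The single API that callers use to move a document through the process.
class StateEngine {
    private final ProcessDefinition definition;

    StateEngine(ProcessDefinition definition) {
        this.definition = definition;
    }

    void advance(Doc doc, String action) {
        String from = doc.status;
        doc.status = definition.next(from, action);
        // Stand-in for a simple event such as sending an email notification.
        doc.notifications.add(from + " -> " + doc.status);
    }
}

class StateEngineDemo {
    public static void main(String[] args) {
        ProcessDefinition def = new ProcessDefinition()
            .allow("draft", "submit", "in-review")
            .allow("in-review", "approve", "published")
            .allow("in-review", "reject", "draft");

        StateEngine engine = new StateEngine(def);
        Doc doc = new Doc("draft");
        engine.advance(doc, "submit");
        engine.advance(doc, "approve");
        System.out.println(doc.status);        // prints "published"
        System.out.println(doc.notifications); // prints "[draft -> in-review, in-review -> published]"
    }
}
```

The point of the sketch is that the process can be read in one place (the `allow` calls) instead of being scattered across form events and macros.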

Then I began working with Documentum. Documentum’s Workflow Manager (and, later, Business Process Manager) uses a graphical tool to define process templates, which are then instantiated and executed in a run-time (Tomcat) that runs within the context of the overall Documentum content server. It worked pretty well on every project I ever used it for, but it has a few shortcomings:

  • Although the marketing hype is that any business analyst can create and manage the workflow, this is true only for the simplest of processes. Every workflow I have ever worked on is complex enough to require automated activities (tasks written with code). In Documentum’s workflow tools automated tasks are not transparent to the analyst. So working with workflows is inherently a “development-oriented” task. Expecting that a non-technical business analyst can change a workflow without significant knowledge of the underlying application just isn’t realistic.
  • The process definition is proprietary. Your two choices for creating and managing process definitions in Documentum are to use Documentum’s graphical tools or to write your own custom code against the Documentum API. On a related note, the definition isn’t human readable without the tool–you can’t just pull up a Documentum workflow template with a text editor to see what’s going on.
  • The concepts of lifecycles and processes are separate. To me, the state a document is in (Documentum calls this a “lifecycle”) and the process a document goes through (“workflow”) are so interdependent they should be modeled together. In Documentum these are two separate concepts. I suppose it can be simplifying–if you only need states you can just use lifecycles without workflows. But I always found it a bit clumsy when dealing with the interactions between the two.
  • The process definitions are tightly coupled with the Documentum platform. The obvious problem with this is that processes defined within Documentum are not portable to other content management systems or workflow engines. Among my clients, I also saw this as a source of confusion–when should they use the “embedded” workflow within Documentum versus another workflow product?
  • The framework is limited and cannot be easily extended. Documentum’s workflow is purpose-built for moving documents around, not data. That’s great if you are working with files, but there are many applications that need process functionality that isn’t document specific. If you’re working with the Documentum workflow engine and you just want to route some raw data around, your only real choice is to put it in a “content-less object” and route that around. And because the framework is proprietary, you can’t fix this even if you want to.

Now I’m digging into jBPM and I’m excited about what I’m finding. Loosely coupling the workflow engine with the content management system, basing the process definition on open standards, making the implementation open and extensible, and providing a run-time that requires nothing more than a servlet container and a relational database creates a robust, flexible workflow engine that addresses many of the shortcomings of embedded, proprietary solutions. jBPM is one example of such a solution, but there are others in the open source world.

jBPM process definitions can be defined graphically using an Eclipse plug-in. Because the process definition is expressed in XML, you also have the option of writing these by hand, programmatically, or with any tool that can output XML.

Don’t like the out-of-the-box jBPM implementation for a “split” or a “join”? No problem. Override their implementation with your own logic. In fact, adding your own logic to the process is usually as simple as pointing the event handler for a node or transition to a POJO that implements a simple, often single-method, interface.
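As a rough illustration of the "POJO behind a single-method interface" idea, here is a self-contained sketch. It deliberately does not use the jBPM classes: NodeHandler, Node, FlagLargeOrders and the context map are all hypothetical stand-ins so the example compiles on its own. In jBPM 3 the corresponding interface is, to my knowledge, org.jbpm.graph.def.ActionHandler with a single execute method; check the jBPM documentation for the exact signature before relying on it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an engine's single-method handler interface.
interface NodeHandler {
    void execute(Map<String, Object> context);
}

// A plain Java object carrying the custom logic; it knows nothing about the engine.
class FlagLargeOrders implements NodeHandler {
    @Override
    public void execute(Map<String, Object> context) {
        int amount = (Integer) context.get("amount");
        context.put("needsApproval", amount > 1000);
    }
}

// Toy process node: the engine only knows it has handlers to call on entry.
class Node {
    final String name;
    private final List<NodeHandler> onEnter = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }

    Node onEnter(NodeHandler handler) {
        onEnter.add(handler);
        return this;
    }

    void enter(Map<String, Object> context) {
        for (NodeHandler handler : onEnter) {
            handler.execute(context);
        }
    }
}

class HandlerDemo {
    public static void main(String[] args) {
        // Point the node's "enter" event at the custom handler.
        Node review = new Node("review").onEnter(new FlagLargeOrders());

        Map<String, Object> context = new HashMap<>();
        context.put("amount", 2500);
        review.enter(context);

        System.out.println(context.get("needsApproval")); // prints "true"
    }
}
```

Swapping behavior means swapping the handler object; the node and the rest of the process stay untouched.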

Any application can take advantage of the engine. Integration is possible by directly talking to the jBPM API or through less tightly-coupled methods such as JMS and web services.

If you are a Documentum customer about to implement a process-centric application, should you ditch Business Process Manager and go with something like jBPM? I’m not ready to draw that conclusion. What gets me excited, though, is knowing that I can implement robust workflow in any application I build by leveraging an open tool like jBPM. And when open source projects like Alfresco incorporate it into their solutions, I don’t feel like I’m giving up anything when compared to proprietary competitors. In the case of Alfresco with jBPM compared to something like Documentum and its proprietary, embedded workflow engine, it actually feels like I’m gaining functionality.

To learn more about jBPM, check out the jBPM Wiki. Specifically, the Getting Started Guide has everything you should need to start learning.

Alfresco web client customization

I implemented a simple Alfresco web client customization over the weekend. At Optaros, we’ve got two Alfresco repositories–one in North America and one in Europe. Alfresco doesn’t yet offer federated repositories, but we needed some way to make it easier for folks to jump between the two, and give at least a rough feeling of there being one repository, not two.

So, I added a new section to the “Shelf”. The shelf is a little piece of collapsible real estate in the Alfresco web client UI that contains things like recently viewed spaces (i.e., folders) and bookmarks. The new Shelf Item I added is called “Repositories”. It is essentially a list of links that point to all of the Alfresco repositories in your environment. Users can then click the repository name to open the home space for that repository.

Obviously, it doesn’t implement single sign-on, but at least people can jump between the repositories quickly. And, we should be able to leverage the config in the future to do things like federated search.

This type of customization is a decent way to learn about Alfresco UI customization because it is pretty constrained in scope and yet involves a good cross-section of Alfresco UI config elements like components, tag libs, the configuration service, and actions.

I originally wanted to extend the Alfresco JSPs, but as it turns out, the shelf is implemented in a “parts” JSP that is included in just about every other JSP. The maintenance pain of overriding the config for every out-of-the-box JSP is worse than the potential pain of simply overwriting their JSP with my customized JSP so I chose the latter path. Still, everything else follows their customization model.

According to the forum and this JIRA post, they are aware of the problem that those explicitly-included JSP files cause. No word on when it will be fixed.

I’ll write up the details for how I did the customization and post them here when I get a chance.

A few Alfresco tips

Here are a few Alfresco tips. Nothing earth-shattering here–these issues are documented on the wiki or in the forums. I came across them while getting the 1.3.0 Alfresco WAR running on Tomcat 5.5.17 with MySQL 4.12 on Ubuntu 5.10 (“Breezy Badger”).

Use extension configuration files to override default repository settings. Refer to this wiki page to learn how to do this. If you are running the WAR file only, you can get some sample configuration files from either the source distribution or the Alfresco-Tomcat bundle. In the bundle, they are in /tomcat/shared/classes/alfresco/extension.

If you move your data directory, drop and re-create your alfresco database. There’s probably a way to avoid this if you need to relocate a production data directory, but if you are just working with test data, dropping and re-creating may be the quickest solution. The data directory stores Lucene full-text indexes as well as user account information. You can specify a location for your data directory using the dir.root property in an extension file.

Add “useServerPrepStmts=false” to your JDBC connection string. If you are using MySQL and you get a Hibernate exception like “Could not execute JDBC batch update…Incorrect arguments to mysql_stmt_execute”, try making this change. You can tweak the default JDBC connection string by overriding the db.url configuration property.
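For reference, the MySQL Connector/J property is spelled useServerPrepStmts. The small sketch below shows what the resulting connection string looks like. The helper class and the base URL (host "localhost", database "alfresco") are assumptions for illustration only; use whatever your own db.url value is.

```java
// Hypothetical helper: appends one query parameter to a JDBC URL.
class JdbcUrls {
    static String withParam(String url, String name, String value) {
        // Use '?' for the first parameter and '&' for any later ones.
        String separator = url.contains("?") ? "&" : "?";
        return url + separator + name + "=" + value;
    }

    public static void main(String[] args) {
        // Assumed base URL; substitute your own host and database name.
        String base = "jdbc:mysql://localhost/alfresco";
        System.out.println(withParam(base, "useServerPrepStmts", "false"));
        // prints "jdbc:mysql://localhost/alfresco?useServerPrepStmts=false"
    }
}
```

The printed string is the kind of value you would then set for db.url in your extension configuration.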