Month: August 2009

Understanding the differences between Alfresco’s repository implementations

People new to Alfresco are often unaware of the existence of two different repository implementations within the product. One, which I’ll call the “DM Store”, is the classic store, the one that’s been used by Alfresco since the beginning. The other, the “WCM Store” or, as it is often referred to in API-speak, the “AVM Store”, was born with the addition of the Alfresco WCM product offering. Whether you are doing document management or web content management, you use the same Explorer client, but under the covers, your content lives in two very different types of repositories.

The Alfresco story on why a second repository implementation was created is that the engineers writing WCM didn’t believe the DM store was capable of providing the kind of support for versioning, branching, and layering functionality they needed (hence the AVM acronym, which stands for Advanced Versioning Manager), so they created an entirely new repository implementation to support WCM.

Why does this matter, apart from being a possible topic of conversation at your next get-together (“Healthcare is easy to fix. Do you think Alfresco will ever unify their two repository implementations?”)? It matters because the “two sides” of Alfresco are not equivalent in terms of functionality and depending on what you need to do, you may find yourself performing unnatural acts to work around the disparity.

Many projects will be completely unaffected by the differences between Alfresco DM and Alfresco WCM. But it is important to know what these differences are when you first begin to plan your solution to avoid uncomfortable conversations between you and your customer when you realize you’ve made a bad assumption.

I’ll assume you know the high-level capabilities of both Alfresco DM and Alfresco WCM. Obviously there are some things one product can do that the other can’t that are by design (sandboxes and virtualization in WCM, for example). What’s more important to understand are the subtle (and sometimes not-so-subtle) differences between the two. Here’s the list and a table that summarizes, if you are into the whole brevity thing:

Content Modeling. Alfresco DM uses a proprietary XML-based description of the content model while Alfresco WCM uses XML Schema. On the surface this isn’t a big deal, but it does mean if your repository contains a mix of DM- and WCM-stored data, you won’t have a single model that defines it all and you could possibly have duplication between the two.

Custom Content Types. In Alfresco DM, when you create content, you tell Alfresco what its content type is. If you’ve extended the out-of-the-box model, you can have any number of business-specific content types with your own custom metadata. In Alfresco WCM, custom content types are not supported. In WCM, your content type is your web form. Interestingly, although the “Type” dropdown is shown in the “Create Web Content” dialog, and it will contain custom content types you’ve defined using the Alfresco DM model, your selection will not be honored. All AVM content is created as an instance of the “avmplaincontent” content type no matter what you select. However, although you must do it through an API call, you can apply custom aspects to AVM content.

User Interface Configuration. Alfresco DM uses a proprietary XML-based configuration file to define the “property sheets” that display metadata in the Alfresco Explorer client for a given content type or aspect. Alfresco WCM uses the embedded Chiba XForms engine to inspect the XML Schema (XSD) and automatically create a web form that will produce data that conforms to the XSD. XSD annotations can be used to influence the presentation of the form fields. One outcome of this is that it is much easier to localize things like property labels in Alfresco DM than it is in Alfresco WCM.

User Interface Extension. If you need to change how the Alfresco Explorer client behaves, there are some things you can do through XML, but advanced customizations will require JavaServer Faces (JSF) development. Alfresco DM and WCM both use the same Explorer client so this applies to both (see “Alfresco User Interface: What are my options?”). However, if you need to change how the web form engine works, you may need to write new Chiba XForms widgets. For instance, Optaros developed a web form used to describe points and regions on Google Maps. That kind of thing requires you to understand how to extend Chiba.

Structured (XML) data entry. Data entered in an Alfresco WCM web form is saved as XML that conforms to the XSD you’ve defined. There is no similar facility for capturing data as XML available within Alfresco DM. At one point the Community code line had “ECM Forms” which was essentially WCM web forms for the DM side of the house, but that’s disappeared in the latest Community release. On the DM side, when you edit metadata you are editing object properties whose values get stored in the database, not as XML.

Transformations. You can use either Freemarker or XSLT to transform Alfresco web form XML into other formats. That transformation is defined as part of the web form configuration which you do within the Explorer client. In Alfresco DM, transformations are more about binary file transformations (DOC to PDF or GIF to PNG, for example). If you want to do Freemarker or XSLT transformations on XML content stored in Alfresco DM, you’ll need to write that yourself (an Action would do the trick). If you want to do DM-style transformations on binary files in Alfresco WCM, that’s not out-of-the-box. You’ll have to do that using the API.
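To give a sense of what “write that yourself” might look like, here is a minimal sketch of the XSLT half using only the standard JAXP classes that ship with the JDK. It does not touch any Alfresco API: the class name XmlContentTransformer, the sample XML, and the sample stylesheet are all made up for illustration. In a real Action you would presumably read the XML from the node’s content and write the result back, rather than passing strings around.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of Alfresco: applies an XSLT stylesheet
// to an XML document using the JDK's built-in JAXP implementation.
public class XmlContentTransformer {

    // Transforms the given XML with the given stylesheet and returns the result as a string.
    public static String transform(String xml, String xslt) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrap the checked exception so callers are not forced to handle it.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Sample data standing in for XML content stored in the repository.
        String xml = "<article><title>Hello</title></article>";
        String xslt =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/article\">"
            + "<xsl:value-of select=\"title\"/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        System.out.println(transform(xml, xslt));
    }
}
```

The transformation logic is kept separate from any repository plumbing so the same method could sit behind an Action, a script, or a unit test.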

Rule actions. Alfresco DM allows you to configure rules on folders to trigger actions (out-of-the-box or custom) to operate against newly added, updated, or deleted documents. Alfresco WCM does not support rule actions at all.

Auditing. Alfresco DM has a granular auditing sub-system. You can configure it to audit just about anything you want. Anything except WCM. You can audit web project creation, but not changes to individual web assets within a web project. At least not out-of-the-box.

Object-level permissions. In Alfresco DM you can assign users and groups to roles at the folder and file level. In Alfresco WCM, the UI will only let you go as low as the web project level. The API supports more granular security but you have to implement that yourself with custom code.

Search. Everything in Alfresco DM is full-text indexed and searchable. In Alfresco WCM, only the Staging Sandbox of each web project is indexed. You can do a search from your user sandbox but you’re really searching the Staging Sandbox. If you have any content you’ve created in your user sandbox that you have not yet committed to Staging, web project search won’t find it. Another limitation is that you cannot search across web projects. That search box that’s visible in the far upper right-hand corner of the Alfresco Explorer client is the Alfresco DM search–it won’t find anything in any of your web projects.

Advanced Workflow. Alfresco DM and Alfresco WCM use the same JBoss jBPM workflow engine so there’s no functional difference between what you can do with workflow on either side. The only catch is that in Alfresco DM, all deployed workflows show up in the “Start Advanced Workflow” dialog whereas in WCM, you have to tell Alfresco which deployed workflows are okay to use for WCM. That’s covered in the Alfresco Developer Guide and on the wiki.

File protocols. CIFS and FTP are the only two file protocols supported by both Alfresco DM and Alfresco WCM. Similar protocols supported by Alfresco DM such as WebDAV, inbound SMTP, and IMAP, are not supported by Alfresco WCM.

Deployment. Some people use Alfresco DM to manage content that is published to the web because they don’t need the additional features WCM offers, or they have some other reason to export content to another server. Unfortunately, Alfresco DM does not yet offer a deployment component like the one in Alfresco WCM. If you want to export content from Alfresco DM to some other destination in a systematic way, you’ll have to roll your own solution.

As John and Paul said, “It’s getting better”

Some of these differences will become less drastic in coming releases. For example, Alfresco is implementing a new form service that will be used to define the content model and user interface across the entire product line, so that helps. The WCM deployment functionality is also being refactored and will ultimately work for both DM and WCM. And at every community event Alfresco talks about “repository unification” as a goal for the future, although the timeline is light-years away in terms of software releases.

As I said, depending on what you’re doing these differences may not affect you at all. Just make sure you don’t assume that a given feature is available everywhere, and make sure you’ve made a conscious decision about what content to put in which repository (DM or WCM) based on your requirements.

Review: Liferay Portal 5.2 Systems Development, by Jonas X. Yuan

I’ve just finished Jonas X. Yuan’s book, Liferay Portal 5.2 Systems Development and I thought I’d share a few thoughts.

First, I should probably get this out of the way: Jonas works for Cignex, which, from time to time, competes for business with my firm, Optaros. Okay, back to the book…

My overall impression of this book is that it essentially documents the work Jonas and his team did for one of their clients. While it is great that their project was broad enough to generate enough material to be compiled into a book, I felt like I was reading “here’s what we did on our project” instead of “let me teach you how to do Liferay development”.

When I read a technical book, I like to read about concepts and how I might apply those in different situations, and then dive into a realistic application of that concept. This book definitely covers realistic examples–the screenshots are lifted right out of the solution Jonas and his team built for their client. And I like that the example is fairly consistent throughout the book. But I found it very light on context and concepts. That left me feeling a bit disoriented as Jonas jumped from detail to detail with very little being done to set the scene. A simple explanation of “Why are we doing this?” would have been a big help.

Another thing that made this a tough read for me is that there are many grammatical issues with the text. If this were in one or two places, you could rightly accuse me of being a hard-grading grandson of an English teacher (which I am). Unfortunately the problem isn’t limited to one or two places–there’s one on nearly every page. I don’t blame Jonas for this; I blame the editor. Is the pressure to publish on schedule so great that there is no time to perform even rudimentary grammar checks for things like missing articles?

If you can get past the style, there are good takeaways in the book. You’ll learn:

  • The difference between building customizations in ext versus plugins
  • How to use ServiceBuilder
  • How to build portlets using Struts and Tiles
  • How to extend the Journal CMS with structures and templates
  • How to build and customize themes and layout templates

There’s a chapter on Liferay’s Social Office and how it works behind the scenes, including details on Inter-Portlet Communication. Jonas has also included a chapter on moving content between multiple environments (Staging/Production) which is an area where portals are often less than optimal.

There is a lot of code included in the book and available for download. Several of the code snippets in the book need to be debugged before they will run properly, but most are easily worked through. The book suggests working with the Liferay source from HEAD, but I had to use the 5.2.3 tag to get the ServiceBuilder stuff to work correctly.

While this book isn’t for everyone, I’m glad Jonas wrote it. Liferay is a complex piece of software and the community needs all the documentation help it can get.

Summer grilling tips for your CMS vendor

I like this post from Jon Marks at JonOnTech. It’s about questions you should be asking your CMS vendor that you might not have thought to ask. The first five are especially good (see his post for the explanation of each question and the rest of the list):

  1. Who was the last vendor to beat you in the last round of a selection exercise? Why do you think they won?
  2. If, in a few years time, we decided to move away from your product, how would I go about migrating all my content into a new system?
  3. How many active developers do you have on your developer forums?
  4. All of these are important, but please rate these in order of your priority: a) Product Features b) Performance and Stability c) Usability d) Security
  5. How much would I expect to pay a contractor developer that is skilled with your CMS, and are they easy to find?

I am consistently disappointed with how companies evaluate and choose software vendors. Part of the problem is when companies use RFP processes that handle software purchases the same way that factory equipment purchases are handled, but that’s another post (see Making RFP’s More Effective).

The other part of the problem is the questions that never get asked during the vendor pitch. To Jon’s list, I would add:

  1. How long and how many resources did it take to build this demo? You’re looking for closeness of fit, effort to customize, and skillsets involved.
  2. What are the top three technical resources my team should have at the ready during the implementation? You’re looking for availability and helpfulness of documentation. How much of it is vendor-produced versus community-produced? It’s not necessarily bad if the majority of the resources are community-produced–it’s just a data point.
  3. If it makes sense for the kind of software, ask whether they use their own software in-house. If they don’t, that’s certainly a data point. If they do, ask, as an end-user, what are your top three headaches when using the software? This is sort of a “what is your biggest area for improvement” kind of question–watch out for a turn-your-weakness-into-a-positive kind of answer (“The software is just too powerful!”). Every piece of software has idiosyncrasies. They should be able to name a few.
  4. Tell us about the last implementation that just went completely sideways for reasons attributable to the technology, not to project mismanagement, political, or other issues. Obviously, the vendor scores points for honesty on this one, but it’s also interesting to hear how much/little the vendor was involved in salvaging the deal (if it was able to be salvaged).
  5. What is your maintenance renewal rate? I’ve never heard this one asked, but I would think this would be a very telling stat. Customers have all sorts of reasons for not renewing maintenance, but the obvious one is that they feel like the vendor isn’t giving them enough support value for the expense. For commercial open source vendors, support may be their sole source of revenue (excluding professional services, hosting, etc.), so for them you’d think this would be a very high number, otherwise, what’s the point?

By the way, giving your vendors a good grilling isn’t limited to software companies. Picking a services firm also deserves a good set of probing questions, but that’s also another post.

What about you? Got any good questions to ask CMS or other software vendors?

Yet another reason to love Open Source Content Management

Man, I don’t miss delivering solutions on top of Documentum. After reading Laurence Hart’s post on Documentum Developer Edition, I’m reminded how much I take for granted working exclusively in the open source content management world.

Laurence’s post was intended to discuss the ins and outs of Documentum’s efforts to make it easier for developers, and, as usual, he’s done a good job of that. But it also underscores the benefits enjoyed by those who work in open source land. In case you don’t know how good you’ve got it, my open source brothers and sisters, check it out:

Developers working with closed source ECM vendors have to pay to get the software

As Laurence points out,

“There are lots of independent consultants out there that have trouble keeping-up with the technology because they can’t afford to become partners for the requisite fee.”

If you are a developer looking to go deep on closed source software, you have no choice but to pay. There’s no other way to get access to the software. Sometimes you can’t even get access to the documentation or the bug database without a paid-up partner account (or a client that lets you use theirs).

[UPDATE: Jerry Silver, from EMC, points out that the Documentum Developer Edition is a free download. My original post made it sound like you had to be part of the partner program to obtain the download.]

With open source, the barrier to entry is much lower. You pay nothing to get the software. It’s all about the time and energy you put into learning the product and implementing cool solutions.

To be fair, commercial open source vendors often charge partner fees as well, but the bottom line is that it costs nothing to get started with the code.

Developers working with closed source ECM vendors struggle with giant developer footprints

I feel sorry for Laurence’s laptop:

“The complete Development install calls for 3GB of RAM (after a 1.7+GB download).  That is no small thing for a development laptop.  It needs to be on a newer machine.  If you can move the database service to a different box, that will make your life easier.”

Oh dear. A 1.7GB download for a developer setup? Am I downloading a VM image or a content management server? Let’s look at Alfresco for a comparison. Assuming you are starting from scratch, and assuming you are going to go full-on with the Alfresco platform, your total download is right around 300MB. That includes:

  • Alfresco SDK
  • Alfresco WAR
  • Alfresco WCM (Deployment listener and add-on to core repo)
  • Apache Tomcat
  • Sun JDK
  • MySQL (Server and connector)

All of which runs comfortably in 2GB of RAM and won’t even cause your fan to kick on in 4GB.

Developers working with closed source ECM vendors have less choice

Optaros consultants are now split fairly evenly in their choice of OS across Windows, Mac OS X, and some flavor of Linux. Some people prefer MySQL and some prefer PostgreSQL. Mostly we use Eclipse for Java development but everyone’s got a preference. I use Tomcat for everything locally while others like JBoss. The point is, developers want to use their tools the way they want to. It’s not a stubbornness thing; it’s an efficiency thing.

Within my CMS I want the same flexibility. I want to tweak settings. I want to name my database what I want. I want the flexibility to deploy across as many (or as few) nodes as I need to. From Laurence’s post, it sounds like Documentum clearly falls down here.

Developers working with closed source ECM vendors can’t see the code

It’s obvious, I know. For developers that work with open source it is extremely natural to use the CMS source code when debugging or for reference. You don’t even think about it–it’s just there and you use it. Imagine the frustration of someone who works with a closed source CMS who has to routinely decompile classes to figure out what’s going on. That truly sucks. What good is a “Developer Edition” that doesn’t come with source code?

Partner defections from closed source are on the rise

I’ve seen recent announcements from multiple partners who were previously exclusive to closed source vendors but are now adding open source to their partner list. This is a reflection of increasing demand by customers who are realizing the business value of open source, especially in tough economic times, as well as partners’ desire to make up for sagging demand in the proprietary world. But could it also be that more firms are realizing how much more productive and pleasant it is to work with open source content management?

Help your employer/client see the light

Open source ECM technologies like Alfresco, Drupal, Liferay, Lucene, and many others, are now at or beyond their closed source equivalents. If you are a developer who’s sick of the shackles closed source CMS places on you, why not suggest exploring open source alternatives?