Month: September 2008

Alfresco plus Drupal thoughts

I’ve had several discussions with Optaros clients and internal team members lately around Drupal and Alfresco integration. Particularly around this topic, I usually try to listen more than I talk. I want to make sure I understand where the value is for this kind of integration rather than simply geeking out on yet another “stupid CMS trick”. I thought maybe I’d bounce a summary of these thoughts off of you.

The key is to leverage the strengths of each. If you don’t have a problem that requires this particular combination of strengths, assembling a solution from these two components isn’t going to be of any value at all. What are some of the key strengths relevant to this discussion?

Drupal has:

  • A front-end presentation framework. (I would add that it is written in PHP, a relatively widespread language that’s easy to pick up.)
  • A very large library of modules, most of them focused on building community-centric web sites.
  • A lightweight footprint, requiring only a web server, MySQL, and PHP. (Yes, I know it is possible to run Drupal on other databases, but not every module will.)

Alfresco has:

  • Robust workflow via the embedded JBoss jBPM engine.
  • Smart management of file-based objects (files go on the file system, metadata goes in the database, and an API that abstracts the separation).
  • A plethora of file-based protocols and APIs for getting content into and out of the repository, including a framework to easily expose content and business logic via REST.

Siloed community solutions are best implemented in Drupal alone. Why complicate your life with a separate repository? It adds no value in that situation. Similarly, straight document management (and even team-based collaboration) can be addressed with the Alfresco repository and the standard Alfresco web client (or, soon, with Alfresco’s new Share client).

I think where Drupal-Alfresco makes the most sense is in cases where there is a significant amount of file-based content that requires “basic content services” such as workflow, versioning, security, check-in/check-out, but needs to be shared in the context of a community.

Alfresco becomes even more useful when there are multiple communities that need this content because you can start to leverage the “content-as-a-service” idea to make the content available to any number of front-end sites (where those sites might or might not be Drupal-based).

Suppose that rather than one community, you have ten. Each community will have community-specific content, but there may also be a set of content that needs to be leveraged across many communities. Some of the things you might be concerned about include:

  • Some content might need editorial review and approval before it goes live, but not everything.
  • Not all content comes from internal sources (pushed out from the repository). Some might originate from one of the communities as user-generated content. That content might need editorial review before it is made available on other community sites.
  • Communities need flexibility in how/if they expose cross-community content.
  • Tags and other metadata values need to be consistent between an end point and the repository (and therefore, across all end points).
  • Search needs to be properly scoped (does it include community content only, community plus shared content in the repo, multiple selected communities, or all communities?).
  • Some clients may not be able to control the technology used on these community endpoints.

In these scenarios, Alfresco acts as your core repository and Drupal provides the front-end presentation layer. When you look at it this way, Drupal really becomes equivalent (in terms of where it sits in the architecture and the role it plays) to traditional portals like Liferay or JBoss Portal.

Content sitting in Drupal is harder for other systems to get to than when it sits in Alfresco. There are Drupal modules that make it easier to syndicate out, but Alfresco is purpose-built to expose content in this way. Once it is in Alfresco, content can be routed through Alfresco workflows, and then approved to be made available to one or more front-end Drupal sites. Content could come from a Drupal site, get persisted to Alfresco, routed around for editorial review, and then be made available. It really opens up a lot of possibilities.
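
To make that flow a bit more concrete, here is a minimal sketch of the lifecycle in Python. It is only an illustration, not how Alfresco workflows are actually defined: the `State` values, the `ContentItem` fields, and the rule that an item is always visible on the site it came from are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    SUBMITTED = "submitted"   # persisted to the repository, awaiting review
    APPROVED = "approved"     # passed editorial review
    REJECTED = "rejected"     # failed editorial review


@dataclass
class ContentItem:
    title: str
    origin_site: str                       # community site the item came from
    state: State = State.SUBMITTED
    published_to: set = field(default_factory=set)


def review(item: ContentItem, approved: bool) -> None:
    """Record the outcome of editorial review for a submitted item."""
    if item.state is not State.SUBMITTED:
        raise ValueError("only submitted items can be reviewed")
    item.state = State.APPROVED if approved else State.REJECTED


def publish(item: ContentItem, sites: list) -> None:
    """Make an approved item available to one or more front-end sites."""
    if item.state is not State.APPROVED:
        raise ValueError("item must be approved before it is published")
    item.published_to.update(sites)


def visible_on(item: ContentItem, site: str) -> bool:
    """Assumed rule: visible on the origin site, elsewhere only once published."""
    return site == item.origin_site or site in item.published_to
```

A user-generated item from a hypothetical "hiking" community would start out visible only there, and would show up on a "cycling" site only after `review(item, True)` followed by `publish(item, ["cycling"])`.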

Not all Drupal modules need to persist their data back to Alfresco. Things like comments and ratings will likely never need to be treated as real content. Instead of trying to persist everything you would either modify select modules to integrate with Alfresco or create new ones that work with Alfresco. For example, you might want to have Drupal stick file uploads in Alfresco instead of the local file system. Or, it might make sense to have a “send to alfresco” button visible to certain roles that would send the current node to Alfresco.

It doesn’t all have to be Drupal getting and posting to Alfresco. There might be cases where you need some Drupal data from within Alfresco. Maybe you are in Alfresco and you want to tag objects using the same set of tags Drupal knows about, for example. Or maybe you want to do a mass import of Drupal objects into the Alfresco repository.

I’ve got a little test module that uses Alfresco’s REST API (including the new CMIS URLs) to retrieve content from Alfresco and show it in a Drupal block. I can talk about it in a separate post.
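
In the meantime, here is a rough sketch of the general idea, written in Python rather than as a Drupal module. It assumes the repository returns an Atom feed (which is what an AtomPub-style binding produces) and turns the entries into an HTML list suitable for a block. The function names are made up for illustration, the feed URL is left to the caller, and authentication is skipped entirely.

```python
import xml.etree.ElementTree as ET
from html import escape
from urllib.request import urlopen

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace in ElementTree notation


def fetch_feed(url: str) -> bytes:
    """Fetch a feed from a caller-supplied URL (no authentication handled)."""
    with urlopen(url) as response:
        return response.read()


def feed_to_items(feed_xml) -> list:
    """Return (title, href) pairs for every entry in an Atom feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="(untitled)")
        # An entry can carry several links; this sketch just takes the first.
        link = entry.find(ATOM + "link")
        href = link.get("href", "") if link is not None else ""
        items.append((title, href))
    return items


def render_block(items: list) -> str:
    """Render the items as a simple HTML list, escaping titles and URLs."""
    lines = ["<ul>"]
    for title, href in items:
        lines.append(
            '  <li><a href="%s">%s</a></li>' % (escape(href, quote=True), escape(title))
        )
    lines.append("</ul>")
    return "\n".join(lines)
```

Something like `render_block(feed_to_items(fetch_feed(url)))` would then produce the markup for the block, with `url` pointing at whatever feed your repository exposes.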

Dogs and Cats: EMC, Microsoft, IBM, & Alfresco release CMIS

EMC, Microsoft, IBM, and Alfresco are announcing today a new specification for interoperability between content management systems (EMC press release, Alfresco press release). The specification is called CMIS which stands for Content Management Interoperability Services (full specification). Other major players involved with the spec include OpenText, Oracle, and SAP.

What the spec outlines is essentially an abstraction layer between content-centric applications and back-end repositories. The abstraction layer is implemented as a set of services (SOAP-based and REST-based). The services are primarily focused on CRUD functions but they also include a SQL-like query language. In his blog post on CMIS, John Newton says that CMIS will become “the SQL of content management”.
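
To give a feel for the SQL-like part, here is a small Python sketch that builds a query string. One big caveat: the type and property names (`cmis:document`, `cmis:name`, `cmis:objectId`) and the backslash escaping follow the CMIS 1.0 standard as it was later finalized, so the draft being announced here may spell things differently. The helper names are made up for the example.

```python
def cmis_string_literal(value: str) -> str:
    """Quote a string for a CMIS query, backslash-escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def documents_named_like(pattern: str) -> str:
    """Build a query for documents whose name matches a LIKE pattern."""
    return (
        "SELECT cmis:objectId, cmis:name "
        "FROM cmis:document "
        "WHERE cmis:name LIKE " + cmis_string_literal(pattern)
    )


if __name__ == "__main__":
    print(documents_named_like("%report%"))
```

The point is less the helper itself than the shape of the query: content types play the role of tables and metadata properties play the role of columns, which is why the SQL comparison comes up.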

This means that, theoretically, a content-centric application can be written that will work with any back-end CMS that implements CMIS. If it sounds like JCR that’s because the two share the same goal, but CMIS is broader because it is language-independent. (Not to be a buzz kill but think about how many applications you’ve seen where the underlying JCR repository could truly be swapped out at no “cost”. It is too early to tell whether that will get any better with CMIS).

In what is really a “dogs and cats living together” moment, think about this: The new Alfresco Share client (or any Surf-based web application for that matter) can now be used as the front-end for any CMIS-compliant repository like SharePoint or Documentum. (Maybe that’d be a nice “bridge” to have in place while you’re migrating off of those legacy repositories!)

In my post back in June (Slinging some ideas around RESTful content) I mentioned Apache Jackrabbit, Apache Sling, and how there ought to be a standard, REST-based API for working with content repositories. I wondered if Alfresco’s inclusion of Abdera, Apache’s implementation of the Atom Publishing Protocol, in the Labs code line signaled Alfresco’s move in that direction. Well, CMIS is that standard. And if you look at Alfresco’s Draft CMIS Implementation, you’ll see that Abdera is, in fact, in the mix.

Back during my Documentum days, a spec was mentioned called iECM that I thought Documentum and maybe AIIM were working on together. But then it seemed like it kind of died. The goals (and some of the details) sound eerily similar to CMIS. Could the popularity of more modern content management APIs like Alfresco’s web scripts and Apache Sling have spurred the legacy vendors into actually doing something about interoperability for real? (I just saw in John Newton’s blog post that iECM did spawn CMIS, but it doesn’t speak to the motivation of the other vendors.)

You can try out Alfresco’s CMIS implementation by downloading the latest Labs 3.0 B build.

Hear, hear! Asay says most analysts are laggards

I really liked Matt Asay’s post on industry analysts. His primary point is that firms like Gartner and Forrester really aren’t that forward-looking. Matt says,

“…They tell an enterprise buyer from whom she should have purchased software and hardware a few years ago, not where she should invest IT dollars tomorrow. As an example, despite the massive influx of open-source vendors in the enterprise, Gartner persists in believing that open source is years away from making a dent in the enterprise, and you’ll rarely find an open-source vendor in a Gartner Magic Quadrant.”

Gartner’s portals and collaboration summit I went to in Vegas last year (Day 1, Day 2, Day 3) really frustrated me because when open source was mentioned, it was spun primarily as a risky investment not worthy of the enterprise. My initial reaction was that firms like Gartner are threatened by open source because most open source projects will never pay the kind of money Gartner demands to have their software “reviewed” and included in the mystical quadrant. Matt’s post goes to a more fundamental issue which is that they just don’t get it.