Alfresco plus Drupal thoughts

I’ve had several discussions with Optaros clients and internal team members lately around Drupal and Alfresco integration. Particularly around this topic, I usually try to listen more than I talk. I want to make sure I understand where the value is for this kind of integration rather than simply geeking out on yet another “stupid CMS trick”. I thought maybe I’d bounce a summary of these thoughts off of you.

The key is to leverage the strengths of each. If you don’t have a problem that requires this particular combination of strengths, assembling a solution from these two components isn’t going to be of any value at all. What are some of the key strengths relevant to this discussion?

Drupal has:

  • A front-end presentation framework. (I would add that it is written in PHP–a relatively widespread language that’s easy to pick up).
  • A very large library of modules, most of them focused on building community-centric web sites.
  • A lightweight footprint, requiring only a web server, MySQL, and PHP. (Yes, I know it is possible to run Drupal on other databases but not every module will).

Alfresco has:

  • Robust workflow via the embedded JBoss jBPM engine.
  • Smart management of file-based objects (files go on the file system, metadata goes in the database, and an API that abstracts the separation).
  • A plethora of file-based protocols and API’s for getting content into and out of the repository, including a framework to easily expose content and business logic via REST.

Silo’d community solutions are best implemented in Drupal alone. Why complicate your life with a separate repository? It adds no value in that situation. Similarly, straight document management (and even team-based collaboration) really can be addressed with the Alfresco repository and the standard Alfresco web client (or, soon, with Alfresco’s new Share client).

I think where Drupal-Alfresco makes the most sense is in cases where there is a significant amount of file-based content that requires “basic content services” such as workflow, versioning, security, check-in/check-out, but needs to be shared in the context of a community.

Alfresco becomes even more useful when there are multiple communities that need this content because you can start to leverage the “content-as-a-service” idea to make the content available to any number of front-end sites (where those sites might or might not be Drupal-based).

Suppose rather than one community, you have ten. Each community will have community-specific content but there may also be a set of content that needs to be leveraged across many communities. A subset of things you might be concerned about include:

  • Some content might need editorial review and approval before it goes live, but not everything.
  • Not all content comes from internal sources (pushed out from the repository). Some might originate from one of the communities as user-generated content. That content might need editorial review before it is made available on other community sites.
  • Communities need flexibility in how/if they expose cross-community content.
  • Tags and other metadata value need to be consistent between an end point and the repository (and therefore, across all end points).
  • Search needs to be properly scoped (does it include community content only, community plus shared content in the repo, multiple selected communities, or all communities)
  • Some clients may not be able to control the technology used on these community endpoints.

In these scenarios, Alfresco acts as your core repository and Drupal provides the front-end presentation layer. When you look at it this way, Drupal really becomes equivalent (in terms of where it sits in the architecture and the role it plays) to traditional portals like Liferay or JBoss Portal.
Content sitting in Drupal is harder for other systems to get to than when it sits in Alfresco. There are Drupal modules that make it easier to syndicate out but Alfresco’s purpose-built to expose content in this way. Once it is in Alfresco, content can be routed through Alfresco workflows, and then approved to be made available to one or more front-end Drupal sites. Content could come from a Drupal site, get persisted to Alfresco, routed around for editorial review, and then be made available. It really opens up a lot of possibilities.

Not all Drupal modules need to persist their data back to Alfresco. Things like comments and ratings will likely never need to be treated as real content. Instead of trying to persist everything you would either modify select modules to integrate with Alfresco or create new ones that work with Alfresco. For example, you might want to have Drupal stick file uploads in Alfresco instead of the local file system. Or, it might make sense to have a “send to alfresco” button visible to certain roles that would send the current node to Alfresco.

It doesn’t all have to be Drupal getting and posting to Alfresco. There might be cases where you need some Drupal data from within Alfresco. Maybe you are in Alfresco and you want to tag objects using the same set of tags Drupal knows about, for example. Or maybe you want to do a mass import of Drupal objects into the Alfresco repository.

I’ve got a little test module that uses Alfresco’s REST API (including the new CMIS URLs) to retrieve content from Alfresco and show it in a Drupal block. I can talk about it in a separate post.