Alfresco plus Drupal thoughts

I’ve had several discussions with Optaros clients and internal team members lately around Drupal and Alfresco integration. Particularly around this topic, I usually try to listen more than I talk. I want to make sure I understand where the value is for this kind of integration rather than simply geeking out on yet another “stupid CMS trick”. I thought maybe I’d bounce a summary of these thoughts off of you.

The key is to leverage the strengths of each. If you don’t have a problem that requires this particular combination of strengths, assembling a solution from these two components isn’t going to be of any value at all. What are some of the key strengths relevant to this discussion?

Drupal has:

  • A front-end presentation framework. (I would add that it is written in PHP, a relatively widespread language that’s easy to pick up.)
  • A very large library of modules, most of them focused on building community-centric web sites.
  • A lightweight footprint, requiring only a web server, MySQL, and PHP. (Yes, I know it is possible to run Drupal on other databases, but not every module supports them.)

Alfresco has:

  • Robust workflow via the embedded JBoss jBPM engine.
  • Smart management of file-based objects (files go on the file system, metadata goes in the database, and an API abstracts the separation).
  • A plethora of file-based protocols and APIs for getting content into and out of the repository, including a framework to easily expose content and business logic via REST.

Siloed community solutions are best implemented in Drupal alone. Why complicate your life with a separate repository? It adds no value in that situation. Similarly, straight document management (and even team-based collaboration) really can be addressed with the Alfresco repository and the standard Alfresco web client (or, soon, with Alfresco’s new Share client).

I think where Drupal-Alfresco makes the most sense is in cases where there is a significant amount of file-based content that requires “basic content services” such as workflow, versioning, security, check-in/check-out, but needs to be shared in the context of a community.

Alfresco becomes even more useful when there are multiple communities that need this content because you can start to leverage the “content-as-a-service” idea to make the content available to any number of front-end sites (where those sites might or might not be Drupal-based).

Suppose rather than one community, you have ten. Each community will have community-specific content, but there may also be a set of content that needs to be leveraged across many communities. Some of the things you might be concerned about include:

  • Some content might need editorial review and approval before it goes live, but not everything.
  • Not all content comes from internal sources (pushed out from the repository). Some might originate from one of the communities as user-generated content. That content might need editorial review before it is made available on other community sites.
  • Communities need flexibility in how/if they expose cross-community content.
  • Tags and other metadata values need to be consistent between an end point and the repository (and therefore, across all end points).
  • Search needs to be properly scoped (does it include community content only, community plus shared content in the repo, multiple selected communities, or all communities?).
  • Some clients may not be able to control the technology used on these community endpoints.

In these scenarios, Alfresco acts as your core repository and Drupal provides the front-end presentation layer. When you look at it this way, Drupal really becomes equivalent (in terms of where it sits in the architecture and the role it plays) to traditional portals like Liferay or JBoss Portal.

Content sitting in Drupal is harder for other systems to get to than when it sits in Alfresco. There are Drupal modules that make it easier to syndicate out, but Alfresco is purpose-built to expose content in this way. Once it is in Alfresco, content can be routed through Alfresco workflows, and then approved to be made available to one or more front-end Drupal sites. Content could come from a Drupal site, get persisted to Alfresco, routed around for editorial review, and then be made available. It really opens up a lot of possibilities.
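
To make that flow concrete, here is a small in-memory sketch in Python. It is only an illustration of the routing idea, not Alfresco’s jBPM workflow API or anything Drupal ships with; the class, method, and site names (ContentItem, submit, approve, visible_on, "community-a" and so on) are all made up for the example. It assumes an item is always visible on the site it came from, and shows up elsewhere only once it has been approved for those sites.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in review"
    APPROVED = "approved"


@dataclass
class ContentItem:
    """Hypothetical stand-in for a piece of content held in the repository."""
    title: str
    origin: str                  # community site (or "repository") it came from
    needs_review: bool = True    # not everything requires editorial review
    status: Status = Status.DRAFT
    targets: set = field(default_factory=set)  # other sites it may appear on

    def submit(self, targets):
        """Persist to the repository and request publication on `targets`."""
        self.targets = set(targets)
        # Content that needs no review is approved straight away.
        self.status = Status.IN_REVIEW if self.needs_review else Status.APPROVED

    def approve(self):
        """Editorial sign-off; only valid for items waiting in review."""
        if self.status is not Status.IN_REVIEW:
            raise ValueError("only items in review can be approved")
        self.status = Status.APPROVED

    def visible_on(self, site):
        """Origin site always sees its own content; others need approval."""
        if site == self.origin:
            return True
        return self.status is Status.APPROVED and site in self.targets


# A user-generated post from one community, offered to another.
post = ContentItem("Member how-to", origin="community-a")
post.submit(targets={"community-b"})
print(post.visible_on("community-b"))  # still in review at this point
post.approve()
print(post.visible_on("community-b"))
```

In a real integration the status and target list would live in the repository as metadata, and each front-end site would ask the repository what it is allowed to show.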

Not all Drupal modules need to persist their data back to Alfresco. Things like comments and ratings will likely never need to be treated as real content. Instead of trying to persist everything, you would either modify select modules to integrate with Alfresco or create new ones that work with Alfresco. For example, you might want to have Drupal stick file uploads in Alfresco instead of the local file system. Or, it might make sense to have a “Send to Alfresco” button visible to certain roles that would send the current node to Alfresco.

It doesn’t all have to be Drupal getting and posting to Alfresco. There might be cases where you need some Drupal data from within Alfresco. Maybe you are in Alfresco and you want to tag objects using the same set of tags Drupal knows about, for example. Or maybe you want to do a mass import of Drupal objects into the Alfresco repository.

I’ve got a little test module that uses Alfresco’s REST API (including the new CMIS URLs) to retrieve content from Alfresco and show it in a Drupal block. I can talk about it in a separate post.
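
For a rough idea of what a folder-listing block like that involves, here is a Python sketch. It is a stand-in for illustration, not the module’s code (a Drupal module would be PHP). It assumes the repository returns a folder’s children as an Atom feed and that each entry’s content element carries the download URL in its src attribute; the sample feed, the example.com URLs, and the function name folder_listing_html are invented for the example, and a real CMIS response would carry extra extension elements that are simply ignored here.

```python
import xml.etree.ElementTree as ET
from html import escape

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up sample of what a folder listing feed might look like.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Press Releases</title>
  <entry>
    <title>Q1 results.pdf</title>
    <content type="application/pdf"
             src="http://alfresco.example.com/content/q1.pdf"/>
  </entry>
  <entry>
    <title>Logo &amp; branding.zip</title>
    <content type="application/zip"
             src="http://alfresco.example.com/content/logo.zip"/>
  </entry>
</feed>
"""


def folder_listing_html(feed_xml):
    """Turn an Atom feed of folder children into an HTML unordered list."""
    root = ET.fromstring(feed_xml)
    items = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="(untitled)",
                               namespaces=ATOM_NS)
        content = entry.find("atom:content", ATOM_NS)
        src = content.get("src") if content is not None else None
        if src:
            # Link the object's title to its download URL.
            items.append('<li><a href="%s">%s</a></li>'
                         % (escape(src, quote=True), escape(title)))
        else:
            # No download URL (a sub-folder, say): show the title only.
            items.append("<li>%s</li>" % escape(title))
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


print(folder_listing_html(SAMPLE_FEED))
```

Fetching the feed (an authenticated HTTP GET against the repository URL) is left out on purpose, since the exact URL and credentials depend on the installation.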

8 comments

  1. eigentor says:

    Interesting idea. Somebody would have to try. Since it will be quite hard to really get the systems together, a starting point would be to have shared user data. Furthermore there must be a way Drupal can integrate the files in the Alfresco file structure.

    Don’t know much about Alfresco. I just read an extensive review of it, so I believe I understand the basic concept 😉

    Some Glue modules – here you go to try out.

    Another thing is to find out if people need this kind of thing, if there is an intersection of customers. As Drupal is only slowly making its way into the boardroom, and Alfresco will not be world-dominating in the Micro$oft-dominated enterprise world either, both might benefit from each other.

    But this is quite some work if done seriously and would need some really committed maintainers.

  2. Matthew Cone says:

    Wow! I’m looking forward to hearing more about the test module. Sounds like something a lot of us could use!

  3. Jacob Singh says:

    We built an Alfresco integration of sorts for http://amnesty.org last year. It wasn’t particularly fun or clean enough to contribute to the community. Our partners at Amnesty who were doing the Alfresco bit decided not to use most of the standard Alfresco APIs for their own business reasons, instead opting to perform the integration Alfresco -> Drupal via an Atom feed over a REST interface.

    For Drupal -> Alfresco we may have written something contributable, however I think that also involved some custom code on the Alfresco side.

    Bottom line is that it is a complex issue, in large part because of the node-centric vision of Drupal. Alfresco’s document and revision handling is far more complex, and mapping may be difficult. Also, most orgs implementing Alfresco (like AI) have HUGE repositories of data, and trying to port them to nodes / keep taxonomy in sync sounds like a nightmare. But that means no views, comments, ratings, search, etc…

    For now, the best and simplest method is probably using the FeedAPI in drupal and creating those stub nodes, but that’s not very effective. I think in the long run, it will require a node bridge for it to be of any value to most organizations to go with a drupal front end.

  4. Rob loren says:

    Hi. Interesting article. We’re considering this exact setup. I would love to see your test module code, if you are willing to share. Another route we are considering is KnowledgeTree, as there’s a module for that in Drupal. What do you think?

  5. Chris Fuller says:

    Jeff,

    I think you’re exactly right about persistence of data. Even more than workflow (to some degree) this seems to me to be the key issue. Drupal can provide a great presentation layer architecture to complement Alfresco’s ECM capabilities by adding a number of social tools that can be loosely coupled to Alfresco’s data store, but Drupal doesn’t excel at file management, let alone true DAM.

    As we’ve discussed, the way that Achieve has pursued a combination of these systems is in the context of a social networking tool for “behind the firewall” at enterprise organizations, combining Drupal’s rich user profiles, groups and tagging with Alfresco’s file management, exposed through web scripts. I think a lot more can be done with these two systems, though, and I’m particularly interested in the idea of persisting Drupal’s taxonomy functionality across both applications, since this seems to be another area where the systems are complementary.

    I’m looking forward to continuing this conversation and seeing what other ideas come up.

  6. jpotts says:

    antonella,

    The module is very basic. It was really just an excuse to spend some time in Drupal and try out the new CMIS stuff in Alfresco.

    Here’s an overview of how it works:
    – When you configure the module, you specify credentials and a URL for the Alfresco repository.
    – The module currently has one block which is a folder listing block. When you configure the block you give it a path in your Alfresco repository.
    – The block then renders itself by invoking Alfresco’s CMIS REST API for retrieving the contents of a folder. CMIS queries return Atom feeds with extensions. The block parses the XML and creates an unordered list of the objects it found and links them using the objects’ download URL.

    Jeff
