Tag: Drupal

Quick look at Acquia Reservoir, a Headless Drupal Distribution

Drupal is a very popular open source Web Content Management system. One of its key characteristics is that it owns both the back-end repository where content is stored and the front-end where content is rendered. In CMS parlance this is typically called a “coupled” CMS because the front-end and the back-end are coupled together.

Historically, the coupled nature of Drupal was a benefit most of the time because it facilitated a fast time-to-market. In many cases, customers could just install Drupal, define their content types, install or develop a theme, and they had a web site up-and-running that made it easy for non-technical content editors to manage the content of that web site.

But architectural styles have shifted to “API-first” designs and Single Page Applications (SPAs) written in client-side frameworks like Angular and React, and many clients now find themselves distributing content to multiple channels beyond the web. In that world, having a CMS that wants to own the front-end becomes more of a burden than a benefit, hence the rise of the “headless” or “de-coupled” CMS. Multiple SaaS vendors have sprung up over the last few years, creating a Content-as-a-Service market which I’ve blogged about before.

Drupal has been able to expose its content and other operations via a RESTful API for quite a while. But in those early days it was not quite as simple as it could be. If you had a team, for example, that just wanted to model some content types, give their editors a nice interface for managing instances of those types, and then write a front-end that fetches that content via JSON, you still had to know a fair amount about Drupal to get everything working.

Last summer, Acquia, a company headed up by Drupal founder Dries Buytaert that provides enterprise support for Drupal, released a new distribution of Drupal called Reservoir that implements the “headless CMS” use case. Reservoir is Drupal, but most of the pieces that concern the front-end have been removed. Reservoir also ships with a JSON API module that exposes your content in a standard way.

I was curious to see how well this worked so I grabbed the Reservoir Docker image and fired it up.

The first thing I did was create a few content types. Article is a demo type provided out-of-the-box. I added Job Posting and Team Member, two types you’d find on just about any corporate web site.

My Team Member type is simple. It has a Body field, which is HTML text, and a Headshot field, which is an image. My Job Posting type has a plain text Body field, a Date field for when the job was posted, and a Status field which has a constrained list of values (Open and Closed).
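To make the two content models concrete from a front-end point of view, here is a minimal sketch of them as plain Python types. The field names come from the description above; the `title` attribute and all of the class and attribute names are my own assumptions for illustration, not anything Reservoir generates.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class JobStatus(Enum):
    """The constrained list of values for the Status field."""
    OPEN = "Open"
    CLOSED = "Closed"


@dataclass
class TeamMember:
    title: str          # assumption: each node carries a title
    body_html: str      # Body field: HTML text
    headshot_url: str   # Headshot field: an image


@dataclass
class JobPosting:
    title: str          # assumption: each node carries a title
    body: str           # Body field: plain text
    posted: date        # Date field: when the job was posted
    status: JobStatus   # Status field: Open or Closed


# Hypothetical sample data; JobStatus("Open") looks the member up by value.
posting = JobPosting("Content Engineer", "We are hiring.",
                     date(2018, 1, 15), JobStatus("Open"))
```

Using an enum for Status means a value outside the constrained list fails loudly (`JobStatus("Pending")` raises `ValueError`) instead of slipping through to the page.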

With my types in place I started creating content…

Something that jumped out at me here was that there is no way to search, filter, or sort content. That’s not going to work very well as the number of content items grows. I can hear my Drupal friends saying, “There’s a module for that!”, but that seems like something that should be out-of-the-box.

Next, I jumped over to the API tab and saw that there are RESTful endpoints for each of my content types that allow me to fetch a list of nodes of a given type, specific nodes, and the relationships a node has to other nodes in the repository. POST, PATCH, and DELETE methods are also supported, so this is not just a read-only API.

Reservoir uses OAuth to secure the API, so to actually test it out, I grabbed the “Demo app” client UUID, then went into Postman and did a POST against the /oauth/token endpoint. That returned an access token and a refresh token. I grabbed the access token and stuck it in the authorization header for future requests.
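For anyone who would rather script the token exchange than click through Postman, the same steps can be sketched in Python with only the standard library. This is a sketch under assumptions: I am assuming the OAuth 2.0 password grant with a client ID, username, and password as form fields, and the base URL and credentials shown are placeholders. The function only builds the request; passing it to `urllib.request.urlopen` would actually send it.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(base_url, client_id, username, password):
    """Build (but do not send) the POST against the /oauth/token endpoint."""
    form = urllib.parse.urlencode({
        "grant_type": "password",   # assumption: password grant is enabled
        "client_id": client_id,     # e.g. the "Demo app" client UUID
        "username": username,
        "password": password,
    }).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/oauth/token", data=form, method="POST"
    )


def bearer_headers(token_response_body):
    """Turn the token endpoint's JSON body into an Authorization header."""
    payload = json.loads(token_response_body)
    return {"Authorization": "Bearer " + payload["access_token"]}


# Placeholder values for illustration only.
request = build_token_request("http://localhost:8080", "demo-app-uuid",
                              "admin", "secret")
```

The headers returned by `bearer_headers` are what gets attached to every later API request, which mirrors pasting the access token into Postman's authorization header.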

Here’s an example response for a specific “team member” object.

My first observation is that the JSON is pretty verbose for such a simple object. If I were to use this today I’d probably write a Spring Boot app that simplifies the API responses further. As a front-end developer, I’d really prefer for the JSON that comes back to be much more succinct. The front-end may not need to know about the node’s revision history, for example.
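I mentioned Spring Boot, but the idea is language-neutral, so here is a rough Python sketch of what that simplification step might look like. It assumes the response follows the general JSON API shape (a `data` object with `id`, `type`, and `attributes`); the sample document and attribute names below are made up for illustration and are not the actual Reservoir response.

```python
def simplify_resource(document, keep=("title", "body")):
    """Reduce a JSON API resource document to id, type, and chosen attributes."""
    resource = document["data"]
    attributes = resource.get("attributes", {})
    slim = {"id": resource["id"], "type": resource["type"]}
    for name in keep:
        if name not in attributes:
            continue
        value = attributes[name]
        # Assumption: rich-text fields arrive as {"value": ..., "format": ...};
        # the front-end only needs the value.
        if isinstance(value, dict) and "value" in value:
            value = value["value"]
        slim[name] = value
    return slim


# Hypothetical sample document, not copied from Reservoir.
sample = {
    "data": {
        "type": "node--team_member",
        "id": "hypothetical-uuid",
        "attributes": {
            "title": "Jane Doe",
            "body": {"value": "<p>Bio</p>", "format": "basic_html"},
            "revision_timestamp": 1515000000,
        },
    }
}

slim = simplify_resource(sample)
```

Anything not named in `keep`, such as revision details, simply never reaches the front-end.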

Another reason I might want my front-end to call a simplified API layer rather than call Drupal directly is to aggregate multiple calls. For example, in the response above, you’ll notice that the team member’s headshot is returned as part of a relationship. You can’t get the URL to the headshot from the Team Member JSON.

If you follow the field_headshot “related” link, you’ll get the JSON object representing the headshot:

The related headshot JSON shown above has the actual URL to the headshot image. It’s not the end of the world to have to make two HTTP calls for every team member, but as a front-end developer, I’d prefer to get a team member object that has exactly what I need in a single response.
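A thin aggregation layer could hide those two calls from the front-end. The sketch below is one way to do it, assuming the team member document exposes the headshot under `relationships.field_headshot.links.related` and that the headshot resource carries a `url` attribute; both attribute paths are assumptions on my part. The HTTP client is passed in as `fetch_json` so the logic can be tried without a running server.

```python
def team_member_with_headshot(fetch_json, member_url):
    """Fetch a team member, follow its headshot link, and merge the results."""
    member = fetch_json(member_url)["data"]
    related = member["relationships"]["field_headshot"]["links"]["related"]
    # JSON API allows a link to be a plain string or an object with "href".
    if isinstance(related, dict):
        related = related["href"]
    headshot = fetch_json(related)["data"]
    return {
        "id": member["id"],
        "title": member["attributes"].get("title"),
        "headshot_url": headshot["attributes"].get("url"),  # assumed attribute name
    }


# Hypothetical canned responses standing in for the two HTTP calls.
fake_responses = {
    "/member/1": {"data": {
        "id": "1",
        "attributes": {"title": "Jane Doe"},
        "relationships": {"field_headshot": {"links": {"related": "/member/1/headshot"}}},
    }},
    "/member/1/headshot": {"data": {
        "id": "9",
        "attributes": {"url": "/files/jane.png"},
    }},
}

combined = team_member_with_headshot(fake_responses.__getitem__, "/member/1")
```

The front-end then makes one request to this layer and gets back a single object with exactly the fields it needs.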

One of the things that might help improve this is support for GraphQL. Reservoir says it plans to support GraphQL, but in the version that ships on the Docker image, if you try to enable it, you get a message that it is still under development. There is a GraphQL Drupal module so I’m sure this is coming to Reservoir soon.

Many of my clients are predominantly Java shops–they are often reluctant to adopt technology that would require new additions to their toolchain, like PHP. And they don’t always have an interest in hiring or developing Drupal talent. Containers running highly-specialized Drupal distributions, like Reservoir, could eventually make both of these concerns less of an issue.

In addition to Acquia Reservoir, there is another de-coupled Drupal Distribution called Contenta, so if you like the idea of running headless Drupal, you might take a look at both and see which is a better fit.

Five of your favorite Alfresco-related presentations

If views of my presentations on SlideShare are any indication, a whole lot of you are interested in integrating Drupal and Alfresco. Despite the fact that the presentation is four years old, it consistently makes the “most viewed” list out of my uploads. If you are considering Drupal but need something a bit more document-centric to serve up your files as part of that Drupal site, take a look:

With over 12,000 views, it is safe to say there is definitely something to the combination of Alfresco and Drupal.

Another apparent classic is:

Which is kind of scary given its age and brevity. I think the popularity of this is due to the seemingly inexhaustible demand for “getting started” resources for new Alfresco developers.

This one has similar info, but with more details, and is probably a better choice for developers trying to get an extremely high-level overview:

The CMIS API is now the preferred way to interact with the Alfresco repository remotely, and many people use this presentation to get a quick overview:

In fact, I’ll have a CMIS powerhouse panel on Tech Talk Live tomorrow (July 10, 2013). So if you are just getting started with CMIS, please join us.

If you like CMIS but you don’t want to fool around with your own server, you can use Alfresco in the Cloud. This deck gives a CMIS overview and discusses the Alfresco API at a high-level with links to sample code and screencasts:

Thanks to everyone who has made use of these presentations!

Screencast: Drupal Open Atrium with Alfresco CMIS

UPDATE: Screencast now lives here:

I recorded a quick screencast of a simple integration we did to show Open Atrium leveraging Alfresco as a formal document repository via CMIS. This leverages the CMIS Alfresco module we developed and released on Drupal.org.

As I point out in the screencast, there’s not much to the integration from a technical standpoint. Open Atrium is Drupal and the CMIS module already has a CMIS repository browser. So, all we had to do was expose the module as a “feature”, which is something Open Atrium uses to bundle modules together that create a given chunk of functionality.

Readers familiar with Alfresco Share will instantly recognize the Open Atrium concepts. Instead of “sites” Atrium uses “groups”. Instead of “pages” or “tools”, Atrium uses “features”. The overall purpose, self-provisioned team-based collaboration, is the same and many of the tools/features are the same (blog, calendar, member directory). I’m not advocating using one over the other–as usual, what works best for you depends on a lot of factors. I just thought Atrium provided a nice way to show yet another example of Drupal and Alfresco together (post).

Drupal + Alfresco webinar slides available

People want intranets that are fun and easy to use, full of compelling content relevant to their job, and enabled with social and community features to help them discover connections with other teams, projects, and colleagues. IT wants something that’s lightweight and flexible enough to respond to the needs of the business that won’t cost a fortune.

That’s why Drupal + Alfresco is a great combination for intranets, like the one Optaros built for Activision, and why we had a record-breaking turnout for the Drupal + Alfresco webinar Chris Fuller and I did today. Thanks to everyone who came and asked good questions. I’ve posted the slides. Alfresco recorded the webinar so they’ll make it available soon, I’m sure. When that happens, I’ll update the post with a link. Until then, enjoy the slides.

[UPDATE: Fixed the slideshare link (thanks, David!) and added the links to the webinar recording below]

1. Streaming recording link:
https://alfresco.webex.com/alfresco/lsr.php?AT=pb&SP=TC&rID=42774837&act=pb&rKey=b44130d69cc9ec5f

2. Download recording link:
https://alfresco.webex.com/alfresco/ldr.php?AT=dw&SP=TC&rID=42774837&act=pf&rKey=c50049ac82e1220a

Yet another reason to love Open Source Content Management

Man, I don’t miss delivering solutions on top of Documentum. After reading Laurence Hart’s post on Documentum Developer Edition, I’m reminded how much I take for granted working exclusively in the open source content management world.

Laurence’s post was intended to discuss the ins and outs of Documentum’s efforts to make it easier for developers, and, as usual, he’s done a good job of that. But it also underscores the benefits enjoyed by those who work in open source land. In case you don’t know how good you’ve got it, my open source brothers and sisters, check it out:

Developers working with closed source ECM vendors have to pay to get the software

As Laurence points out,

“There are lots of independent consultants out there that have trouble keeping-up with the technology because they can’t afford to become partners for the requisite fee.”

If you are a developer looking to go deep on closed source software, you have no choice but to pay. There’s no other way to get access to the software. Sometimes you can’t even get access to the documentation or the bug database without a paid-up partner account (or a client that lets you use theirs).

[UPDATE: Jerry Silver, from EMC, points out that the Documentum Developer Edition is a free download. My original post made it sound like you had to be part of the partner program to obtain the download.]

With open source, the barrier to entry is much lower. You pay nothing to get the software. It’s all about the time and energy you put into learning the product and implementing cool solutions.

To be fair, commercial open source vendors often charge partner fees as well, but the bottom line is that it costs nothing to get started with the code.

Developers working with closed source ECM vendors struggle with giant developer footprints

I feel sorry for Laurence’s laptop:

“The complete Development install calls for 3GB of RAM (after a 1.7+GB download).  That is no small thing for a development laptop.  It needs to be on a newer machine.  If you can move the database service to a different box, that will make your life easier.”

Oh dear. A 1.7GB download for a developer setup? Am I downloading a VM image or a content management server? Let’s look at Alfresco for a comparison. Assuming you are starting from scratch, and assuming you are going to go full-on with the Alfresco platform, your total download is right around 300MB. That includes:

  • Alfresco SDK
  • Alfresco WAR
  • Alfresco WCM (Deployment listener and add-on to core repo)
  • Apache Tomcat
  • Sun JDK
  • MySQL (Server and connector)

All of which runs comfortably in 2GB of RAM and won’t even cause your fan to kick on in 4GB.

Developers working with closed source ECM vendors have less choice

Optaros consultants are now split fairly evenly in their choice of OS across Windows, Mac OS X, and some flavor of Linux. Some people prefer MySQL and some prefer PostgreSQL. Mostly we use Eclipse for Java development but everyone’s got a preference. I use Tomcat for everything locally while others like JBoss. The point is, developers want to use their tools the way they want to. It’s not a stubbornness thing, it’s an efficiency thing.

Within my CMS I want the same flexibility. I want to tweak settings. I want to name my database what I want. I want the flexibility to deploy across as many (or as few) nodes as I need to. From Laurence’s post, it sounds like Documentum clearly falls down here.

Developers working with closed source ECM vendors can’t see the code

It’s obvious, I know. For developers that work with open source it is extremely natural to use the CMS source code when debugging or for reference. You don’t even think about it–it’s just there and you use it. Imagine the frustration of someone who works with closed source CMS who has to routinely decompile classes to figure out what’s going on. That truly sucks. What good is a “Developer Edition” that doesn’t come with source code?

Partner defections from closed source are on the rise

I’ve seen recent announcements from multiple partners who were previously exclusive to closed source vendors but are now adding open source to their partner list. This reflects increasing demand from customers who are realizing the business value of open source, especially in tough economic times, as well as partners’ desire to make up for sagging demand in the proprietary world. But could it also be that more firms are realizing how much more productive and pleasant it is to work with open source content management?

Help your employer/client see the light

Open source ECM technologies like Alfresco, Drupal, Liferay, Lucene, and many others, are now at or beyond their closed source equivalents. If you are a developer who’s sick of the shackles closed source CMS places on you, why not suggest exploring open source alternatives?

ECM vendors have their heads in the cloud, can you see through the fog?

The hype around cloud computing has reached a fever pitch so it is natural that ECM vendors try to take advantage of that as much as they can. Some examples from the open source ECM world:

  • Alfresco always seems to be partnering with one cloud vendor or another. I went to a brief session on Alfresco, GoGrid, and ParaScale earlier this year. (As an aside, those GoGrid cycling socks, which I thought were a strange giveaway at the time, are awesome).
  • At the end of last year eZ Publish announced a partnership with Mamut to provide eZ as SaaS.
  • Just last week Nuxeo announced a cloud edition of its product.

Clearly, ECM vendors are busy figuring out how to take advantage of the cloud. But what does it mean for ECM to be “in the cloud”? When might it work for you?

Cirrus, Stratus, or Cumulonimbus

The first thing you need to realize is that when people say “cloud” they often mean very different things. Generally, there are three types of clouds: Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS), and Infrastructure-as-a-Service (IaaS).

Software-as-a-Service (SaaS) is the same model that’s been around for years but has lately taken advantage of the cloud moniker. Google Apps and Salesforce.com are the big SaaS players but there are SaaS offerings for all kinds of business applications, including content management.

The allure of SaaS ECM is the same as that of SaaS in general:

  • Lower up-front costs
  • Someone else gets to worry about running and scaling the infrastructure
  • Depending on the vendor, you may only have to pay for what you use

The challenges of SaaS ECM include things like:

  • The ability to do heavy customization and complex workflows
  • Ease of integration with other systems
  • Client perceptions (and real issues) around data security
  • Data portability/vendor lock-in

Open Source CM vendors Nuxeo and eZ Systems have SaaS offerings as do proprietary vendors such as SpringCM, CrownPeak, Clickability, and PaperThin, to name a few. Beyond just general-purpose document and content management, I think you’ll also see vendors build verticalized SaaS offerings on top of hosted content management technology.

The next type of cloud is Platform-as-a-Service (PaaS). The two best examples of PaaS are Google App Engine (GAE) and Salesforce.com’s force.com platform. With PaaS, you provide the code and the PaaS provider does the rest. Of course this means your code has to follow certain standards and is often subject to limitations, but the beauty is that you get a completely custom solution without worrying about any of the infrastructure.

I like GAE. For certain applications, the benefits of instantaneous, global scale far outweigh the limitations of the platform. But I don’t expect ECM vendors that would do well in SaaS or IaaS clouds to do much with PaaS. You can’t take an Alfresco or a Drupal and run it on a PaaS cloud. I do think we will see PaaS-native content management systems. For example, I’ve seen apps in the Salesforce.com AppExchange that are basically tools for building a web site that’s tightly integrated with Salesforce.com. I think you’ll also see solutions that leverage a PaaS for certain components or sub-systems.

The third type of cloud is Infrastructure-as-a-Service (IaaS). An IaaS cloud is about providing virtual servers on-demand. Examples include things like Amazon’s EC2, Rackspace Cloud, and GoGrid. With these services you can instantly provision as many servers as you need. What you do with them is up to you. When you’re done, you turn them off. Specifics vary but you are essentially billed for CPU time.

The way people leverage IaaS differs. Some people will provision a server and install their ECM software of choice and stop there. Other than dealing with different file storage approaches of various IaaS vendors, this is really no different than running your own virtual servers. So when someone says they are running XYZ CMS “in the cloud” and it turns out to be a single node on a virtual machine, I can barely stifle a yawn. It’s fast and convenient to set up, yes, but technically it’s pretty boring.

The more interesting way to use ECM in an IaaS cloud is to leverage the ability of the infrastructure to scale on-demand. That’s the real value of “the cloud” after all. For example, at Optaros we run an IaaS-hosted solution called OView that syndicates content and content-centric applications to web sites. When a client places that content or app on Yahoo’s home page we get a huge spike in traffic. We run the solution on Amazon EC2 images and we use RightScale to dynamically provision additional nodes when traffic warrants.

The degree to which a specific ECM vendor can operate in a dynamically-scaled infrastructure varies greatly. Simply “running in the cloud” is easy. Scaling your ECM infrastructure automagically is harder.
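To give a feel for what “dynamically provision additional nodes when traffic warrants” boils down to, here is a deliberately simple sketch of a scaling rule. It is not RightScale’s API and not the rule we actually run; the function name, the per-node capacity figure, and the node limits are all assumptions for illustration.

```python
import math


def nodes_needed(requests_per_second, capacity_per_node, min_nodes=1, max_nodes=20):
    """Return how many nodes to run for the current traffic level."""
    if capacity_per_node <= 0:
        raise ValueError("capacity_per_node must be positive")
    # Enough nodes to cover the load, rounded up to a whole node...
    wanted = math.ceil(requests_per_second / capacity_per_node)
    # ...but never below the floor or above the budget ceiling.
    return max(min_nodes, min(max_nodes, wanted))


quiet_day = nodes_needed(50, capacity_per_node=100)
traffic_spike = nodes_needed(950, capacity_per_node=100)
```

The arithmetic is the easy part. The hard part is whether your ECM software can actually add and remove nodes that quickly, which is where vendors differ.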

What do you really need?

If the list of SaaS benefits has a lot of appeal to you and the challenges and potential limitations aren’t much of a bother, SaaS ECM might be worth evaluating. This will most likely be a better fit for clients with limited IT resources and simple to moderate requirements around ECM.

On the IaaS front, if it is just an issue of externally-hosting your ECM infrastructure, make sure the cloud is what you want. The best use case for the cloud is when demand is temporary or unpredictable with huge spikes. I would argue that for your core ECM infrastructure demand is neither temporary nor unpredictable.

If “scale” is your issue, I would challenge you to think about exactly what needs to be scaled. If it is just content delivery of static content, maybe you could get by with a CDN. If your content management system can separate authoring from dynamic delivery of content, maybe only the dynamic content delivery mechanism needs to be able to scale quickly.

You might have certain processes (large-scale video transcoding, for example, or other types of periodic batch processing) that you could leverage the cloud for without cloud-enabling your entire ECM infrastructure. Acquia’s hosted spam filtering service, Mollom, and their newly-released hosted-search offering are two examples where only specific pieces of your infrastructure are off-loaded to the cloud.

If it turns out that you need to scale the whole ball of wax, fine, it can be done, but have a good reason.

ECM in the cloud is, um, cloudy

The cloud as a style of computing is exciting. The cloud as a “feature” is potentially confusing. ECM vendors are going to do what they can to have it somewhere “on the box”. But it’s not something you can simply check off. The next time you hear an ECM vendor say, “cloud-ready”, ask them what they mean. Then figure out whether or not that has any relevance at all to your real requirements.

Is the cloud on your horizon? Let me know if/how the cloud relates to your ECM strategy.

Drupal, Django, and Alfresco in Chicago

I’ll be in Chicago tomorrow for the Alfresco Meetup. I’ll be speaking during the Barcamp on Alfresco and Drupal integration with CMIS (module, screencast). I’ll also have the Alfresco-Django integration running on my laptop. I may not have time to show Alfresco-Django during my slot, but I’ll be happy to stick around and do informal demos and talk about either integration if you’re interested because I’d like your feedback on it.

Alfresco User Interface: What are my options?

People often need to build a custom user interface on top of the Alfresco repository and I see a lot of people asking general questions about how to do it. There are lots of options to consider. Here are four options for creating a user interface on top of Alfresco, at a high level:

Option 1: Use your favorite programming language and/or framework to talk to Alfresco via REST or Web Services. PHP? Python? Java? Flex? Whatever, it’s up to you. The REST API is nice because if you can’t find a URL that does what you need it to out-of-the-box, you can always roll-your-own with the web script framework. This option offers the most flexibility and creative freedom, but of course you might end up building constructs or components that you might have gotten “for free” from a higher-level framework. Optaros’ streamlined web client, DoCASU, built on Ext-JS, is one freely-available example of a custom UI on top of Alfresco but there are others.
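As a rough illustration of Option 1, here is what preparing a call to a web script might look like from Python using only the standard library. Web scripts are served under `/alfresco/service/`; the script path `sample/hello`, the host, and the credentials below are placeholders I made up, and HTTP Basic authentication is assumed. The function only builds the request; `urllib.request.urlopen(request)` would send it.

```python
import base64
import urllib.request


def build_web_script_request(host, script_path, username, password):
    """Build (but do not send) an authenticated GET for an Alfresco web script."""
    url = host.rstrip("/") + "/alfresco/service/" + script_path.lstrip("/")
    # Assumption: the web script accepts HTTP Basic authentication.
    credentials = base64.b64encode(
        f"{username}:{password}".encode("utf-8")
    ).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + credentials)
    return request


# Placeholder host, script path, and credentials for illustration only.
request = build_web_script_request("http://localhost:8080", "/sample/hello",
                                   "admin", "admin")
```

Swap in whatever HTTP library your language or framework prefers; the point is that nothing Alfresco-specific is required on the client side.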

Option 2: Use Alfresco’s Surf framework. Alfresco’s Surf framework is just that–it’s a framework. Don’t confuse it with Alfresco Share which is a team-centric collaboration client built on top of Surf. And, don’t assume that just because a piece of functionality is in Share it is available to you in the lower-level Surf framework. You may have to do some extra work to get some of the cool stuff in Share to work in your pure Surf app. Also realize that Surf is brand new and still maturing. You’ll be quickly disappointed if you hold it to the same standard as a more widely-used, well-established framework like Seam or Django. Surf is a good option for quick, Alfresco-centric solutions, especially if you think you might want to leverage Alfresco’s browser-based site assembly tool, Web Studio, at some point in the future. (See Do-it-yourself Alfresco Surf Code Camp).

Option 3: Customize the Alfresco “Explorer” web client. There are varying degrees to which you can customize the web client. On one end of the spectrum you’ve got Freemarker “presentation templates” followed closely by XML configuration. On the other end of the spectrum you’ve got more elaborate enhancements you can make using JavaServer Faces (JSF). Customizing the Alfresco Explorer web client should only be considered if you can keep your enhancements to an absolute minimum because:

  1. Alfresco is moving away from JSF in favor of Surf-based clients. The Explorer client will continue to be around, but I wouldn’t expect major efforts to be focused on that client going forward.
  2. JSF-based customizations of the web client can be time-consuming and potentially complex, particularly if you are new to JSF.
  3. For most solutions, you’ll get more customer satisfaction bang out of your coding buck by building a purpose-built, eye-catching, UI designed with your specific use cases in mind than you will by starting with the general-purpose web client and extending from there.

Option 4: Use a portal, community, or WCM platform. This includes PHP-based projects like Drupal (Drupal CMIS Screencast) or Joomla as well as Java-based projects like Liferay and JBoss Portal. This is a good option if you have requirements that match up well with the built-in (or easily added-on) capabilities of those platforms.

It’s worth talking about Java portal servers specifically. I think people are struggling a bit to find The Best Way to integrate Alfresco with a portal. Of course there probably is no single approach that will fit every situation but I think Alfresco (with help from the community) could do more to provide best practices.

Here are the options you have when integrating with a portal:

Portal Option 1: Configure Alfresco to be the replacement JSR-170 repository for the portal. This option seems like more trouble than it is worth. If all you need is what you can get out of JSR-170, you might as well use the already-integrated Jackrabbit repository that most open source portals ship with these days unless you have good reasons not to. I’m open to having my mind changed on this one, but it seems like if you want to use Alfresco and a portal, you’ve got bigger plans that are probably going to require custom portlets anyway.

Portal Option 2: Run Alfresco and the portal in the same JVM (post). This is NOT recommended if you need to scale beyond a small departmental solution and, really, I think with the de-coupling of the web script engine we should consider this one deprecated at this point.

Portal Option 3: Run the Alfresco web script engine and the portal in the same JVM. Like the previous option, this gives you the ability to write web scripts that are wrapped in a portlet but it cuts down on the size of the web app significantly and it frees up your portal to scale independently of the Alfresco repository tier. It’s a fast development cycle once you get it set up. But I haven’t seen great instructions for setting it up yet. Alfresco should document this on their wiki if they are going to support this pattern.

Portal Option 4: Write your own portlets that make service calls. This is the “cleanest” approach because it treats Alfresco like any other back-end you might want to integrate with from the portal. You write custom portlets and have them talk to Alfresco via REST or SOAP. You’ll have to decide how you want to handle authentication with Alfresco.

What about CMIS?

CMIS fits under the “Option 1: Use your favorite programming language” and “Portal Option 4: Write your own portlets” categories. You can make CMIS calls to Alfresco using both REST and SOAP from your own custom code, portlet or otherwise. The nice thing about CMIS is that you can use it to abstract the underlying repository so that (in theory) your front-end code will work with different CMIS-compliant back-ends. Just realize that CMIS isn’t a fully-ratified standard yet and although a CMIS implementation is in the Enterprise version of Alfresco, it isn’t clear to me whether or not you’d be supported if you had a problem. (The last response I saw on this specific question was a Peter Monks tweet saying, “I don’t think so”).

The CMIS standard should be approved by the end-of-the-year and if Alfresco’s past performance is an indicator of the future, they’ll be the first to market with a production-ready, fully-supported CMIS implementation based on the final spec.

Pick your poison

Those are the options as I see them. Each one has trade-offs. Some may become more or less attractive over time as languages, frameworks, and the state of the art evolve. Ultimately, you’re going to have to evaluate which one fits your situation the best. You may have a hard time making a decision, but you have to admit that having to choose from several options is a nice problem to have.

Screencast: Alfresco-Drupal CMIS Integration Demo

UPDATE: Screencast now lives here:

I’ve created a new screencast that shows the Alfresco-Drupal CMIS integration in action over at Optaros Labs. The screencast shows content moving back-and-forth between Alfresco and Drupal, content that lives in Alfresco being displayed in a Drupal site, and a CMIS CQL query being executed against the Alfresco repository from Drupal.
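If you have not seen a CMIS query before, it looks a lot like SQL. The sketch below builds one in Python as a plain string. It is not the query from the screencast: the property names (`cmis:objectId`, `cmis:name`) and type name (`cmis:document`) follow the CMIS 1.0 specification and may differ in earlier draft implementations, and the backslash escaping of quotes is my reading of that specification, so treat both as assumptions.

```python
def cmis_name_query(name):
    """Build a CMIS query that finds documents with an exact name."""
    # Assumption: quotes and backslashes inside a string literal are
    # escaped with a backslash, per CMIS 1.0.
    escaped = name.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "SELECT cmis:objectId, cmis:name FROM cmis:document "
        f"WHERE cmis:name = '{escaped}'"
    )


# Hypothetical file name for illustration.
query = cmis_name_query("report.pdf")
```

The resulting string is what a CMIS client would submit to the repository’s query service.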

The Drupal CMIS module and the CMIS Alfresco module are available at Drupal.org.