Category: Content Management

Enterprise Content Management (ECM), Web Content Management (WCM), Document Management (DM). Whatever you call it, this category covers market happenings and lessons learned.

Using Alfresco’s PHP API

Alfresco is starting to incorporate more and more PHP into the platform. This is exciting for us at Optaros because we’ve built some great PHP-based solutions for clients and we’ve had a lot of success with the productivity PHP brings to front-end developers. But the PHP world lacks an enterprise-ready, highly scalable, robust content management repository.

Now hold on! Hear me out. Projects like Drupal and Joomla are great for certain types of requirements. There are many large, extremely high-volume sites built with Drupal. What I’m talking about is something more generic. Something that makes zero assumptions about the presentation. I’m talking about a back-end repository for rich content plus the back-end “library services” typically associated with document management systems such as workflow, versioning, security, extensible data model, etc. Think “Zope” without the ZODB that can run in any servlet container.

Alfresco has had a PHP API for a while but until today I hadn’t had a chance to do anything with it. On the plane home from Boston today I decided to take it for a spin (details at the bottom of this post).

Using PHP with Alfresco

If you want to write PHP applications that leverage content in Alfresco you can do that today with the PHP API. The PHP API leverages Alfresco Web Services. I have not yet confirmed whether you can do everything with the PHP API that you can with the Java-based Alfresco Web Services API, but theoretically you should be able to. And if you want to write your own Alfresco-centric web services, you can. You could deploy them in the same Axis container as Alfresco’s and then hit them with PHP.

If you want to manage a web site using Alfresco WCM that uses PHP you can do that today. But when you use the Alfresco Virtualization Server your PHP scripts won’t run because the Virtualization Server is a customized version of Apache Tomcat and it doesn’t know how to run PHP. Alfresco may address that in the near future.

If Alfresco expands the use of PHP within the product, Alfresco may become more than just an ECM suite–it could become a powerful platform for Next Generation Internet applications. Until then, the next time your PHP application needs a loosely-coupled content repository, consider standing up Alfresco and hitting it with the PHP API.

Trying out the Alfresco PHP API

Once you get all of the pieces together it isn’t that hard to run the Alfresco PHP samples. But it’s not extremely well documented so I’ve included some pointers here. My stack is Ubuntu Dapper Drake, MySQL, Alfresco 2.0 Enterprise, PHP 5.2.1.

  1. Download PHP 5 from http://www.php.net. Follow the directions for building, configuring, and making PHP. When you configure PHP, make sure to include the --enable-soap flag. You’ll also need to uncomment the extension=php_soap.dll line in your php.ini file.
  2. Use Pear to install Pear::SOAP and its dependencies. (UPDATE: Per Lukas’ comment below, this step is not needed.)
  3. The Alfresco PHP API on the community download site is still at version 1.2. The latest and greatest Alfresco PHP API can be found in the source tree under “modules”.
  4. Follow the instructions in modules/php-sdk/source/php/remote/installation.txt to set up the samples. (See below for how my files were set up).
  • Created a directory called /usr/local/lib/alfresco.
  • Created an alias and a Directory entry in httpd.conf that points to /usr/local/lib/alfresco per the installation.txt file.
  • Copied the contents of modules/php-sdk/source/php/remote to /usr/local/lib/alfresco.
  • Added /usr/local/lib/alfresco to my include_path in my php.ini file.

If your Alfresco repository is running and you’ve restarted Apache with the updated httpd.conf and php.ini, you should be good to go. On my system, I can check /var/log/apache/error.log to troubleshoot. If everything is okay, when you point your browser to http://localhost/alfresco/Examples/SimpleBrowse you will see a list of the spaces and content in your Alfresco store. You can navigate the list, open documents, etc.

The other example is the QueryExecuter example. It performs a full-text query and presents a node list of the query results.

Both of these are very simple examples but by looking at the code, the PHP API source under the Alfresco directory, and the Alfresco Web Services SDK, you should be able to see how you can incorporate Alfresco into your PHP-based application.

SalesForce.com targets Documentum with new ECM functionality

According to CMSWatch, SalesForce.com is going to begin offering ECM capability. The post says the SalesForce.com CEO is boasting that customers can get rid of their Documentum implementations. For some clients who may be under-utilizing Documentum, that could be true. But I imagine SalesForce.com is going to have a lot of work to do to get its acquired Koral software to the functional level required to supplant most large Documentum installations.

Alfresco’s John Newton comments on Dave DeWalt’s departure

John Newton, Alfresco CTO and Chairman, comments on his blog about the recent departure of Documentum CEO Dave DeWalt. John worked with Dave when he was at Documentum.

As a side note, I’ve met Dave on a couple of occasions. One was a surreal experience in which Dave, Joe Tucci, one of my clients and I threw beads at conference goers from a mini mardi gras float (post). I was impressed with his desire to make a connection with as many conference attendees as he could. Rather than being ushered in and out of a large conference hall never to be seen again, Dave walked amongst the crowd and genuinely listened to feedback from his customers.

I also know he had developed sort of a cult following amongst Documentum employees. From a culture perspective, his departure from EMC seems like a huge loss.

Excellent turnout for 1st Alfresco Meet-up

Over thirty people showed up for the first-ever Alfresco Meet-up today at the Christian Science Monitor in Boston. Russ Danner, Christian Science Monitor developer, opened up the day with a short overview on Alfresco. I then led a short discussion on Alfresco as a platform which included talking about what Alfresco should and shouldn’t be used for. (Here’s my deck. I’ve made small tweaks for clarification purposes.) Sumer Jabri, a consultant from D.C.-based Rivet Logic, drilled down on the Alfresco architecture. Sumer did a great job.

As is often the case, the most valuable tidbits came up during the roundtable discussion at the end of the day. Russ led the discussion but several others, including both integrators (Optaros, Eyestreet, Rivet Logic) and end-users (Shimano, MIT, Harvard, Kaplan), participated. The liveliest topic was why Alfresco WCM was rolled out without a practical static content deployment approach. Other topics included Liferay-Alfresco integration, JCR, and plans for future Meet-Ups.

Multiple participants said they were interested in seeing more case studies from production implementations. Everyone agreed the session was valuable. Thanks again, Russ, for putting it together.

Proprietary ECM solutions continue to aggravate resource availability issue

Alan Pelz-Sharpe noted in a recent post at CMSWatch that it is becoming harder and harder to find people with Documentum and, more generally, ECM skills in the marketplace. This is yet another datapoint in a trend that has been building over the last few years. (In a post last January, I asked Documentum to open up the WDK. The resource shortage was one reason.)

In my opinion, this problem is only going to get worse, at least until open source ECM solutions unseat the proprietary vendors. Developers in the ECM space are much more excited about investing their limited resources in open, standards-based platforms.

Becoming a guru in a proprietary solution within a rapidly commoditizing market may yield short-term gains but is a dead-end in the long-term. Ironically, it’s those short-term gains that make the problem worse–the people currently enjoying those short-term gains are more likely to continue riding the wave than they are to move into an IT shop.

So I see open source ECM solutions as helping the ECM market with the resource availability problem in two ways: (1) by being built on the frameworks today’s developers are interested in learning and (2) by removing the primary barrier to entry (software license cost), thus exposing more developers and companies to ECM solutions.

Regarding the first point, compare a developer who’s built expertise around Documentum’s Web Development Kit (WDK), a JSF-like framework for building Documentum web apps, with one who’s invested in Alfresco, where the development model is based on JSF/Spring/Hibernate. The Alfresco developer has a better foundation of transferable skills, whether that’s to pure web application development or to other open source ECM solutions built on a similar stack. Developers spend time learning stuff they are interested in and they pay attention to transferability.

As for the second point, freely-available open source ECM solutions are more likely to find their way into the hands of developers (and the servers of enterprises, see this post) because there are no barriers to entry. This should result in a larger pool of resources experienced with working with ECM, in general, as well as specific open source solutions.

As a thought exercise, think about what the resource pool would look like if closed-source, proprietary vendors were the only game in town. The demand would have to be much greater and the solutions much more interesting to develop any sort of resource base interested in specializing. (SAP seems like a real-life example. SAP folks are expensive and hard to find, and I don’t run into too many up-and-coming developers begging to be sent to SAP training at this point.)

So if you are an enterprise struggling to staff your ECM “Center of Excellence” or maybe you can’t even keep the lights on in your server room, maybe it is time to take a long, hard look at open source ECM.

Solr-powered Alfresco

Have you checked out the Apache Solr project yet? It’s pretty cool. It’s essentially a search server (deployed as a web app into a servlet container) that sits on top of Lucene. Solr makes it super easy to get content into and out of Lucene via its HTTP and JSON APIs.

Recently, for a prospective Optaros client, we put together a little demo to show how Alfresco WCM could integrate with Solr to provide search and personalization for a web site managed within Alfresco. Here’s what we did at a high level:

  • Created an Alfresco web form and XSLT for the web content as usual.
  • Created an additional XSLT (or Freemarker) template to convert the XML content to the Solr format. This gets configured as an additional presentation template associated with the web form.
  • Wrote a JSP to aggregate the Solr XML for all of the published content.
  • Wrote a servlet to call the JSP every X seconds. It takes the response and posts it to Solr. That’s how the Alfresco content gets into the index.
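
The indexing half of the steps above can be sketched outside of Alfresco. The demo itself used XSLT, a JSP, and a servlet; the Python below is only a minimal illustration of the same idea, which is to wrap each published item in Solr’s XML update format (an add element containing doc and field elements) and POST it to the update handler. The field names (id, title, body), the sample item, and the update URL are all assumptions for illustration and would need to match your own Solr schema and install.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed location of the Solr update handler; adjust for your install.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_xml(items):
    """Build a Solr <add> message from a list of field dictionaries."""
    add = ET.Element("add")
    for item in items:
        doc = ET.SubElement(add, "doc")
        for name, value in item.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def post_to_solr(xml_message, url=SOLR_UPDATE_URL):
    """POST an XML update message to Solr and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=xml_message.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Hypothetical published item; the field names must exist in your Solr schema.
items = [
    {
        "id": "/www/press/release-1.xml",
        "title": "Sample press release",
        "body": "Hello world",
    }
]
add_xml = build_add_xml(items)
```

Calling post_to_solr(add_xml) would send the message to a running Solr instance. Added documents become searchable only after a commit message is also posted to the same handler.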

This setup allowed web content to get indexed by the Solr search engine upon its creation. Web site users (either using the web site in the virtualized sandbox or on the production web site) could then query the content.

The web site was a mix of static HTML and JSPs. The JSPs used custom taglibs to call “Solr Search” widgets in the right spot on the page. This was the first time I had used Alfresco’s virtualization to run a real web application (as opposed to static content). The preview release of 2.0 I was using seemed to have some significant caching issues. Hopefully those are resolved in the production release. Other than that, it was easy to see how technical and non-technical content managers could leverage Alfresco virtualization to collaborate on developing and managing a dynamic web site.

Before using this approach in production, I would need to think about the best way to handle deletes. In the demo, once content got into the index, it didn’t come out if the associated content was removed from Alfresco. As far as Solr goes, it is easy to get the content deleted from the index: it’s a simple HTTP POST. The trick is where in Alfresco to put that call.
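
One possible way to handle deletes, sketched here in Python purely as an assumption and not something the demo did, is to sidestep the “where in Alfresco” question: have the aggregator remember which IDs it has already indexed, compare them against the IDs currently published, and post a Solr delete message for each ID that has disappeared. The function names and sample IDs are hypothetical; the delete message itself follows Solr’s XML update format (a delete element wrapping an id element).

```python
import xml.etree.ElementTree as ET


def stale_ids(indexed_ids, published_ids):
    """Return IDs that are in the index but no longer published."""
    return sorted(set(indexed_ids) - set(published_ids))


def build_delete_xml(doc_id):
    """Build a Solr delete-by-id message for a single document."""
    delete = ET.Element("delete")
    ET.SubElement(delete, "id").text = doc_id
    return ET.tostring(delete, encoding="unicode")


# Hypothetical state: three items were indexed, one has since been removed.
indexed = ["/www/a.xml", "/www/b.xml", "/www/c.xml"]
published = ["/www/a.xml", "/www/c.xml"]

delete_messages = [build_delete_xml(doc_id) for doc_id in stale_ids(indexed, published)]
```

Each message would be posted to the Solr update handler the same way content is added, followed by a commit. The trade-off is that removed content stays searchable until the next polling cycle.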

Alfresco Meet-up in Boston on March 12

Russ Danner, who works at the Christian Science Monitor, is organizing an Alfresco Meet-up in Boston. I’ll be speaking on Alfresco as a platform for application development.

If you are in the area, it should be a good opportunity to find out what everyone else is doing with Alfresco; it sounds like several people are planning to attend. The Christian Science Monitor has been using Alfresco (and the open source portal Liferay) for quite a while.

Below is the post Russ sent out to the CMPros list.

The Christian Science Monitor would like to invite you to attend an Alfresco Enterprise Content Management Meet-up in Boston:

Time: 11am – 3pm (Lunch provided with RSVP)

Place: 200 Massachusetts Ave
The First Church of Christ, Scientist
Boston, MA 02115

Parking: Available on site $5.00
To Attend: Please send RSVP to dannerr at csps dot com.

Over the last year the interest and community around the Alfresco product has really begun to blossom. Those of us working with and evaluating Alfresco in the Boston / New England area are very fortunate in that there are quite a number of customers, community members, employees and service integrators at very close proximity to one another. Some of the greatest benefits to innovation found in open source software stem directly from the community that supports it. Our local community is rich in talent, mindshare, and passion – let’s capitalize on this! While we expect most in attendance will be local, all are welcome to attend (Please forward this on to others.)

The Christian Science Monitor would like to invite you to join us and others in the community for an Alfresco meet-up. This meeting will be an informal opportunity to meet one another, discuss our projects and progress and learn more about Alfresco and our community. We will have a few (non-sales) presentations and ample opportunity for attendees to get to know one another.

Opening
Introductions
1st Presentation (30 minutes)
(Rivet Logic – Overview of Alfresco Architecture)

Lunch (45 minutes – no presentations)

2nd Presentation (30 minutes)
(There is a possibility we will have someone from Alfresco for this)

3rd Presentation (30 minutes)
(Optaros – Leveraging Alfresco as a Platform for ECM/Development)

Round table discussion (45 minutes)
Closing

Alfresco 2.0 Released

Alfresco 2.0 Community is now available for download. Alfresco has done a lot of work since the last time I posted about their work-in-progress. In that post, I said that some open issues about their WCM functionality included: (1) how workflow would be integrated, (2) how versioning and rollback would work, and (3) how deployment would be handled.

Two of those open issues, workflow and versioning/rollback, were addressed in subsequent preview/beta releases. In 2.0, WCM workflow leverages the JBPM engine, but you don’t have to fire up the JBPM Process Designer to implement Alfresco WCM workflows. Instead, there are two default workflows provided out-of-the-box: one for serial flows and one for parallel flows. When you configure the workflow for a web form you specify the people that need to review the content before it is published. If you specified a serial workflow, they’ll get the task one after another; if you specified parallel, they’ll all get it simultaneously.

In most WCM implementations, it isn’t practical for 100% of the content to be templated. But non-templated content needs to be reviewed as well. Alfresco handles this by allowing you to configure a workflow for non-templated content. A cool feature is that you can use a regular expression to route the content appropriately. For example, you might want images to route to one group of people but PDFs or other editorial content to route to another group.
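
To make the regular-expression routing idea concrete, here is a minimal Python sketch. It does not use Alfresco’s configuration or API; the patterns, group names, and the route function are all hypothetical, and it only illustrates how matching a file path against an ordered list of patterns can pick a review group.

```python
import re

# Hypothetical routing rules: (pattern, reviewer group). The first match wins.
ROUTING_RULES = [
    (re.compile(r".*\.(png|gif|jpe?g)$", re.IGNORECASE), "design-reviewers"),
    (re.compile(r".*\.pdf$", re.IGNORECASE), "editorial-reviewers"),
]

# Hypothetical fallback group for anything no rule matches.
DEFAULT_GROUP = "web-team"


def route(path):
    """Return the reviewer group for a non-templated content path."""
    for pattern, group in ROUTING_RULES:
        if pattern.match(path):
            return group
    return DEFAULT_GROUP
```

With these rules, route("/images/logo.PNG") goes to the design group, route("/docs/report.pdf") goes to the editorial group, and anything else falls through to the default.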

Virtualization comes into play with regard to workflow. Alfresco’s virtualization is leveraged when reviewers preview the web site so they can see what the web site will look like in the context of the changes being proposed.

In the versioning and rollback department, snapshots of the entire site are taken every time changes are promoted to staging. You can roll back to any of those snapshots with a single click.

Alfresco’s done a good job implementing web forms to be reusable across multiple web projects within the repository, even when different web sites may choose to use the web forms differently. For example, when you set up a web form you can define the default output path, zero or more presentation templates, and a default workflow. The cool thing is that when you instantiate a web project, you can pick from the available web forms. When you do, you can change the default options. This feature is going to save a lot of template development and maintenance time, especially for companies managing multiple online properties.

I haven’t yet figured out how well deployment is addressed in the GA release of 2.0. I know the plan is to support deployment between a core Alfresco repository and a “read-only” Alfresco repository as well as to the file system. The early and beta releases of 2.0 didn’t include this functionality so it will be interesting to see how deployment is addressed in the GA release.

There are several other changes that have been made since the preview release. And not all 2.0 features are WCM-specific. (Many will be glad to know a tree view of the repository is now available). Check the roadmap at the Alfresco Wiki for a detailed list of what’s new.

Note that if you’re downloading Community 2.0 and you want both the core repository as well as WCM you’ll need to follow the 2.0 download link as well as the WCM link. The WCM download includes the virtualization server, some config info, and a sample form.

Book review: Alfresco Enterprise Content Management

I’ve recently finished the new Alfresco book. The bottom line on this book is this: end users evaluating or learning to use Alfresco may find the book helpful. Most chapters are aimed at teaching people how to work with the web client. For this audience, the book does a pretty good job of presenting a logical progression through the product. Although most of the information in the book is readily available through the Alfresco wiki, forums, demos, and documentation, the book pulls it together in a hassle-free, portable format. Technical users, however, will be disappointed. If you want technical depth this book isn’t for you. Before you shell out the $60 you really need to consider what kind of information you are looking for. You may ultimately be better off surfing blogs, forums, and wikis.

When I originally saw the announcement for the book, I was excited. Shelf space is one datapoint that can be used to measure technology adoption and maturity. More specifically, I was hoping the book would be a one-stop shop for business users as well as technical users, sort of an Alfresco Unleashed. But as the preface clearly states, “This book is not targeted at developers who want to change the core code structure of Alfresco.” I don’t necessarily want to change the “core code structure,” but I do want to implement, customize, and extend Alfresco for my clients. In that respect my expectations for the book turned out to be way too high. For now, at least, from a technical publication standpoint, Alfresco has yet to be “Unleashed” or “Exposed”.

Workflow is an example where the book could have gone into much more depth, particularly with the introduction of the JBPM integration in release 1.4. But the chapter on workflow focuses almost exclusively on the simple folder-based workflow functionality. Although there is a section on advanced workflow using JBPM, it only skims the surface by providing the steps one goes through to define an advanced workflow. It does this at such a high level that it provides little value other than to inform the reader that there’s such a thing as advanced workflow.

Other topics that got little or no attention include details on Alfresco’s approach to content storage and the underlying relational database schema, LDAP integration/synchronization, scheduling jobs, and the Alfresco SDK. The book is written for Alfresco 1.4 so the new WCM offering included in 2.0 is completely unaddressed.

Aside from the depth issue, the book could use a bit more organization and a more thorough editing job to be more relevant to the intended audience and to resolve some of the more glaring typos and grammatical problems.

Here’s an example of one aspect of the organizational problem. The book is primarily aimed at end users and power users of Alfresco who need to work with the web client (out of a total of 13 chapters, the first 10 are squarely focused on that use case). But Shariff mixes in details on configuration tasks that end users would never do, such as defining content types and tweaking the web client configuration XML. An alternative approach would have been to keep those first 10 chapters clean of any tasks requiring configuration changes and make them purely about web client how-to. Later chapters could have then gone into configuration.

I did find some useful tidbits in the book. The sections on dashlets and Freemarker in Chapter 11, “Customizing the User Interface,” were useful to me, as was Chapter 13 on imaging.

Writing technical books on Enterprise Content Management is tough. There are many different use cases, some of which differ greatly in what’s deemed critical functionality, often depending on industry and type of organization. Plus there are usually many potential approaches for customizing and extending the underlying platforms. Multiply that by the speed of change we’re currently experiencing in the ECM market and it’s a wonder anyone writes on the topic at all. In fact, I sometimes wonder how relevant traditional technical publications are for such dynamic topics, but I digress.

Shariff should be commended for being the first to publish a real book on Alfresco. I know others will follow. Hopefully, right now, somewhere someone’s frantically cranking out chapters and code snippets. To you I say, keep at it! Shariff may have been first to the punch, but Alfresco is still waiting to be Unleashed!