Category: Content Management

Enterprise Content Management (ECM), Web Content Management (WCM), Document Management (DM). Whatever you call it this category covers market happenings and lessons learned.

Curl up with a good web script

Curl is a useful tool for all sorts of things. One specific example of when it comes in handy is when you are developing Alfresco web scripts. On a Surf project, for example, you might divide into a “Surf tier” team and a “Repository tier” team. Once you’ve agreed on the interface, including both the URLs and the format of the data that goes back-and-forth between the tiers, the two teams can start cranking out code in parallel.

If you’re on the repo team, you need a way to test your API, and you probably don’t have a UI to test it with (that’s what the other team’s working on). There are lots of solutions to this but curl is really handy and it runs everywhere (on Windows, use Cygwin).

This post isn’t intended to be a full reference or how-to for curl, and obviously, you can use curl for a lot of tasks that involve HTTP, not just Alfresco web scripts. Here are some quick examples of using curl with Alfresco web scripts to get you going.

Get a ticket

It’s highly likely that your web script will require authentication. So the first thing you do is call the login web script to get a ticket.

curl -v "http://localhost:8080/alfresco/service/api/login?u=admin&pw=somepassword"

Alfresco will respond with something like:

<?xml version="1.0" encoding="UTF-8"?>
<ticket>TICKET_e46107058fdd2760441b44481a22e7498e7dbf66</ticket>

Now you can take that ticket and append it to your subsequent web script calls.

Any web script you’ve got that accepts GET can be tested using the same simple syntax.

Post JSON to your custom web script

If all you had were GETs you’d probably just test them in your browser. POSTs, PUTs and DELETEs require a little more doing to test. You’re going to want to test those web scripts so that when the front-end team has their stuff ready, it all comes together without a lot of fuss.

So let’s say you’ve got a web script that the front-end will be POSTing JSON to. To test it out, create a file with some test JSON, then post it to the web script using curl, like this:

curl -v -X POST "http://localhost:8080/alfresco/service/someco/someScript?alf_ticket=TICKET_e46107058fdd2760441b44481a22e7498e7dbf66" -H "Content-Type: application/json" -d @/Users/jpotts/test.json

By the way, did you know that starting with 3.0, if you name your controller with “.json” before the “.js” the JSON will be sitting in a root variable called “json”?  So in this case instead of naming my controller “someScript.post.js” I’d name it “someScript.post.json.js” and then in my JavaScript, I can just eval the “json” variable that got created for me automatically and start working with the object,  like this:

var postedObject = eval('(' + json + ')');
logger.log("Customer name:" + postedObject.customerName);

Run a CMIS query

With 3.0 Alfresco added an implementation of the proposed CMIS spec to the product. CMIS gives you a Web Services API, a RESTful API, and a SQL-like query language. Once you figure out the syntax, it’s easy to post CMIS queries to the repository. You can wrap the CMIS query in XML:

<cmis:query xmlns:cmis="http://www.cmis.org/2008/05" >
<cmis:statement><![CDATA[select * from cm_content where cm_name like '%Foo%']]></cmis:statement>
</cmis:query>

Then post it using the same syntax as you saw previously, but with a different Content-Type in the header, like this:

curl -v -X POST "http://localhost:8080/alfresco/service/api/query?alf_ticket=TICKET_e46107058fdd2760441b44481a22e7498e7dbf66" -H "Content-Type: application/cmisquery+xml" -d @/Users/jpotts/cmis-query.xml

Alfresco will respond with ATOM, but it’s a little verbose so I won’t take up space here to show you the result. Also, I noticed this bombed when I ran it against 3.1 Enterprise but I haven’t drilled down on why yet.

Create a new object using CMIS ATOM

Issuing a GET against a CMIS URL returns ATOM. But CMIS URLs can also accept POSTed ATOM to do things like create new objects. For example, to create a new content object you would first create the ATOM XML:

<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:cmis="http://www.cmis.org/2008/05">
<title>Test Plain Text Content</title>
<summary>Plain text content created via CMIS POST</summary>
<content type="text/plain">SGVyZSBpcyBzb21lIHBsYWluIHRleHQgY29udGVudC4K</content>
<cmis:object>
<cmis:properties>
<cmis:propertyString cmis:name="ObjectTypeId"><cmis:value>document</cmis:value></cmis:propertyString>
</cmis:properties>
</cmis:object>
</entry>

Note that the content has to be Base64 encoded. In this case, the content is plain text that reads, “Here is some plain text content.” One way to encode it is to use OpenSSL like “openssl base64 -in <infile> -out <outfile>”. The exact syntax of ATOM XML with CMIS is the subject for another post.

Once you’ve got the XML ready to go, post it using the same syntax shown previously, with a different Content-Type in the header:

curl -v -X POST "http://localhost:8080/alfresco/service/api/node/workspace/SpacesStore/18fd9821-42a5-4c6a-86d3-3f252679cf7d/children?alf_ticket=TICKET_e46107058fdd2760441b44481a22e7498e7dbf66" -H "Content-Type: application/atom+xml" -d @/Users/jpotts/testCreate.atom.xml

The node reference in the URL above is a reference to the folder in which this new child will be created. There’s also a similar URL that uses the path instead of a node ref if that’s more your thing.

Refreshing Web Scripts from Ant

One of the things you do quite frequently when you develop web scripts is tell Alfresco to refresh its list of web scripts. There are lots of ways to automate this, but one is to create an Ant task that uses curl to invoke the web script refresh URL. This lets you deploy your changes and tell Alfresco to refresh the list in one step (and makes sure you and your teammates never forget to do the refresh).

<target name="deploy-webscripts" depends="deploy" description="Refreshes the list of webscripts">
<exec executable="curl">
<arg value="-d"/>
<arg value="reset=on"/>
<arg value="http://${alfresco.web.url}/service/index"/>
</exec>
</target>

In this example, the “deploy” ant task this task depends on is responsible for copying the web scripts to the appropriate place in the exploded Alfresco WAR. (Thanks to my colleague Eric Shea (http://www.eshea.net/2009/01/30/alfresco-dev-survivors-kit-part-1/) for this tip).

So there you go. It’s not Earth-shattering but it might give you a productivity boost if you don’t already have curl or an alternative already in your bag of tricks.

Screencast: Alfresco-Drupal CMIS Integration Demo

UPDATE: Screencast now lives here:

I’ve created a new screencast that shows the Alfresco-Drupal CMIS integration in action over at Optaros Labs. The screencast shows content moving back-and-forth between Alfresco and Drupal, content being displayed in a Drupal site that lives in Alfresco, and a CMIS CQL query being executed against the Alfresco repository from Drupal.

The Drupal CMIS module and the CMIS Alfresco module are available at Drupal.org.

CMIS, ECM Interoperability, and Services-Oriented Content Management

The emerging Content Management Interoperability Services (CMIS) standard was the main buzz at the AIIM conference in Philadelphia this week (followed closely by social networking and Enterprise 2.0). Why does CMIS garner so much attention? Because as EMC’s David Choy said, “Unless they are starting from scratch, everybody implementing ECM has an interoperability problem.”

The impact of the problem depends on who you are and what you’re trying to do. If you use one ECM repository for email archiving and another to manage the content on your web site, it doesn’t really matter that the two repositories aren’t interoperable. But what if you have a portfolio of web sites and asset repositories across a variety of platforms and you need to share content across all of them? Then it’s a problem.

Customers in this situation have all kinds of issues they have to deal with, but several come down to a few basic needs having to do with interoperability:

  • It needs to be easy for front-ends to store and retrieve content across multiple repositories (different platforms)
  • It needs to be easy to move content between repositories
  • It needs to be easy to find interesting and relevant content (which I may then want to access from the front-end or send to some other repository)

This multi-site/multi-repository problem is common, and at Optaros, we think we can help address it with tools and services that are, in many cases, driven by CMIS.

Front-ends

I’ve met with clients who’ve invested a lot of time and effort in a custom front-end for Vignette, for example. Over time, they’ve added additional systems to manage assets but the front-end is forever tied to Vignette. Sure, they could introduce a services layer that abstracts the repository, but each time they add a new kind of repository, that’s more development they have to do. CMIS provides a standard way of interfacing with all CMIS-compliant repositories, so the amount of repository-specific coding you have to do is reduced.

At Optaros, we’re developing CMIS adapters or integrations and releasing them as open source so that popular front-ends can get to content more easily. (See Drupal CMIS). We think there’s probably also a case to be made for a CMIS-based “content bus” that front-ends and compliant repositories could plug into.

Replication

What if you want to move content between repositories? I’ve talked to folks who have Drupal web sites, but they’d like to take some of the user-generated content that comes from Drupal and treat it more formally–like maybe they want to route it through an internal workflow, tag it, and then make it available to some of the other sites in their portfolio that might or might not be Drupal. CMIS gives us a common way to export and import content (through ATOM XML). Throw a transformation in the middle to handle schema differences and you’ve got yourself a CMIS-based replication engine that can move content between different kinds of repositories.

We’ve built a basic synchronization between Alfresco and Drupal as part of the Drupal CMIS and Alfresco modules, but we think this kind of functionality should be separated out and treated more generically–like a replication server that could sit between any number of CMIS-compliant endpoints.

Feeds & Search

How do you know when new content you might be interested in shows up in one of the numerous repositories you have in your environment? One answer is search. Initially, I was thinking a federated search server based on OpenSearch would be worthwhile but I think at this point, most people are just indexing everything rather than taking a federated approach.

Search is good when you are actively looking for specific content, but people (and systems) need to find content they are interested in as it becomes available. CMIS returns data as ATOM XML. That means you can use the same RSS/ATOM feed aggregation tools and techniques you use to track news to track content updates. This isn’t new–you’ve always been able to bolt on RSS feeds to your content repository by using the repository’s API to query for content and return it formatted as a feed, but CMIS gives us this “for free”.

So as CMIS becomes more widely adopted, we may see a big increase in the amount of ATOM/RSS flowing through the Enterprise as people begin to use feeds to discover new content. An Enterprise RSS server would seem like a good tool to leverage to aggregate the feeds coming from various content repositories across the content domain. It could also be used for other non-CMIS related feeds which will also surely increase as Enterprise 2.0 adoption spreads.

Most of the feed aggregation functionality available today is embedded within broader platforms (RSS portlets, for example). At Optaros, we think that, like replication, feed aggregation should be split out into a dedicated service. Think “Google Reader for the Enterprise”, essentially. There are proprietary systems out that that do this but no open source alternatives that are more than just personal feed aggregators (at least that I could find) so we may have to develop this.

Services-Oriented Content Management

These individual services–CMIS adapters, Replication, and Feeds–are part of a Services-Oriented approach to Content Management. The services are interrelated, and there are others I haven’t discussed, but the idea is that this type of approach can make a multi-silo’d content domain much more manageable and useful. Some of it depends on CMIS and some of it doesn’t. These ideas are still being hammered out so if you have interest in any of it, please let me know.

It’ll be interesting to see what happens as CMIS moves through additional iterations and ultimately becomes ratified hopefully by the end of the year.

For more info, see:

Alfresco 3.1 Enterprise is a significant release for WCM users

I don’t typically write a post every time a vendor releases a new version of software but Alfresco 3.1 Enterprise, which became available for download on Tuesday, is a significant release. Users of 2.2, who had previously been asked not to upgrade to 3.0, should now upgrade to either 2.2 SP3 or 3.1. All of the fixes in 2.2 SP3 are included in the 3.1 release. If you are currently on 2.2 SP3, the biggest reason to move to 3.1 would be to take advantage of Alfresco Share. Conservative users may decide to wait until 3.1.1.

One big change with 3.1 is that modified items are now merged with the Staging sandbox asynchronously. That means when submitting modified items, users are immediately returned to the sandbox list and will still see the modified items until they are committed. In a cursory test of this, I was surprised at how long it took for those changes to leave my Modified Items list. Maybe the polling interval is configurable.

This change affects the out-of-the-box WCM submission workflow. So if you created your own custom WCM submission workflow, you’re going to need to make some changes. The required changes are documented in the release notes, or you can take a look at the new submission workflow to get the gist.

Another new feature of Alfresco 3.1 is the REST API for WCM. A lot of URLs you probably already created on your own are included. The new API includes things like adding users to web projects, creating user sandboxes, creating, updating, and deleting assets, submitting assets, and more. Even if you already built these yourself, you should take a look to see if these meet your needs. Why continue to maintain your own custom code?

The 3.1 release also marks a shift in Alfresco’s “two flavors” approach. According to John Newton (post), Alfresco is looking for ways to entice large Enterprise users to migrate from Labs to Enterprise. So they’ve created functionality that they feel only appeals to large Enterprises and are making it available only to people paying for subscriptions. This includes things like monitoring (JMX, Hyperic plug-in), proprietary database extensions, and clustering.

Newton says 100% of the source code will still be available for both releases and that fixes made in Enterprise will be made available in the next Labs release (although he didn’t say how long Labs releases will lag behind Enterprise).

Other noteworthy 3.1 fixes or enhancements include:

  • No one (not even admin) can write to a Staging sandbox in WCM.
  • Share now includes a “Links” component (which means I don’t have to finish coding the “Bookmarks” component we started but never finished). There are numerous other Share enhancements around Calendar, rich text editing, and previewing.
  • Actions can now have AND/OR conditions and can trigger on property values.
  • A new group called ALFRESCO_ADMINISTRATORS has been created that makes it much easier to designate administrator users other than admin.

See the release notes for a full list of Jira tickets addressed by the release.

Let’s celebrate open source freedom in Philly at AIIM

I’ll be honest. AIIM is kind of a beat-down for me. All year long my world is pretty much all open source, all the time. Except for a few days each year when I go to AIIM and I’m literally engulfed by the super-booths of Mystical Quadrant leaders.

That’s why, if you want to find me, I’ll be hanging with the Alfresco crew. I like to think of it as an open source oasis of sorts. I’m going to demo the Alfresco-Drupal CMIS integration at 1:15 on Tuesday in the Alfresco booth but I might be around at other times in-between sessions as well.

By Tuesday I’ll be focusing on the positive. Some of the CMIS and Enterprise 2.0 talks sound interesting. Nuxeo has a booth so I’ll probably drop in on them. And I’m looking forward to meeting up with ecmarchitect.com readers, so please do say hello. If AIIM’s buying any beer we can use it to drink a toast to open source!

Webinar: Developer’s Intro to Alfresco Part 3: Web Scripts & Surf

Part 3 of the “Developer’s Introduction to Alfresco” webinars is today at 12 Eastern (GMT – 5). I’ll be talking about web scripts, Surf, and CMIS. The format will be the same as the first two parts: you’ll watch a pre-recorded presentation for about 30 minutes, then I’ll do at least a 30 minute live Q&A session.

Also, I know I’ve been slow in getting this posted, but the Part Two presentation and Q&A recordings are available at Alfresco’s on-demand events page.

Webinar: Developer’s Intro to Alfresco Part Two

I’m doing Part Two of a three-part “Developer’s Introduction to Alfresco” webinar tomorrow at 11:00 Central/12:00 Eastern/16:00 GMT. We’ll be covering custom content modeling and actions.

Last week’s Part One webinar was well-attended and we had a great Q&A session following so I’m looking forward to another good one, particularly as we start to get a little more technical.

Event logistics are at Alfresco’s events page.

Reminder: DFW Alfresco Meet-up is Monday

Don’t forget to sign-up for the first ever DFW Alfresco Meet-up. It’s happening Monday, 3/9 at Ackerman McQueen over in Las Colinas. Plan to arrive around 5:30 and we’ll start our first topic at 6:00. We’ll hear about Ackerman McQueen’s recent Alfresco WCM-based project as well as the portal implementation built on Alfresco DM and Django (a Python-based framework) from the folks over at Neiman Marcus.

We’re letting Optaros pick up the tab on food and drinks so if you’re doing an Alfresco project right now or considering it, you need to join us. Come share what you’ve learned with others and maybe leave with a few new ideas as well.

Address and directions are on the sign-up page.

Grasping Thumbnails in Alfresco 3

With Alfresco 3 (both Labs and Enterprise), Alfresco added a new thumbnail service. It isn’t documented too well yet so I thought I’d write up a quick example.

What is it

The Thumbnail Service is used to create alternate renditions of objects. Typically, those alternate renditions are small images called “thumbnails”. You can see a working application of the thumbnail service if you take a look at Alfresco Share’s document library. When you upload a document, the thumbnail service is invoked, and a small image is shown next to each item in the list.

Where thumbnails live

Like everything else in Alfresco, thumbnails are stored as nodes. Nodes are instances of cm:thumbnail and are stored as children of the object they represent. (You can see this for yourself by looking at the thumbnailed object in the node browser). Objects can have any number of thumbnails. This lets you have thumbnails of different sizes and mime types, for example.

Once Alfresco generates a thumbnail for an object, the object will have the cm:thumbnailed aspect applied to it so it is easy to find or filter objects based on whether or not they have at least one thumbnail.

Thumbnail definitions & thumbnail names

Every thumbnail has a thumbnail definition. The thumbnail definition keeps track of things like the mime type, transformation options, placeholder path, and thumbnail name. The thumbnail name uniquely identifies the thumbnail definition in the thumbnail registry. When you want to generate or display a thumbnail for an object, you must specify the name. For example, given a thumbnail definition named “scImageThumbnail”, you could use JavaScript to create a thumbnail for an object by calling the “createThumbnail” method on a ScriptNode like this:

document.createThumbnail("scImageThumbnail", true);

The first argument is the name of the thumbnail definition. The second argument says the thumbnail should be generated asynchronously.

Registering thumbnail definitions

The thumbnail registry needs to know about your thumbnail definitions. The out-of-the-box thumbnails are registered in the thumbnail-service-context.xml file. I don’t see a clean way to extend that without repeating the definitions, so in my example, I wrote a bean that calls the Thumbnail Registry and registers the custom thumbnail definitions provided in the Spring context file:


public class ThumbnailRegistryBootstrap {
  private ThumbnailService thumbnailService;
  private List<ThumbnailDefinition> thumbnailDefinitions;
  private Logger logger = Logger.getLogger(ThumbnailRegistryBootstrap.class);

public void init() {
  ThumbnailRegistry thumbnailRegistry = thumbnailService.getThumbnailRegistry();
    for (ThumbnailDefinition thumbDef : thumbnailDefinitions) {
      logger.info("Adding thumbnail definition:" + thumbDef.getName());
      thumbnailRegistry.addThumbnailDefinition(thumbDef);
    }
}

public void setThumbnailService(ThumbnailService thumbnailService) {
  this.thumbnailService = thumbnailService;
}

public void setThumbnailDefinitions(
  List<ThumbnailDefinition> thumbnailDefinitions) {
  this.thumbnailDefinitions = thumbnailDefinitions;
}

}

So this class will add all of my thumbnail definitions to the thumbnail registry. The class and the definitions are configured in a Spring context file. The config for a single thumbnail called “scImageThumbnail” which is a PNG 100 pixels high and retains the original aspect ratio of the image would be:


<bean id="someco.thumbnailRegistry"
 class="com.someco.thumbnails.ThumbnailRegistryBootstrap"
 depends-on="ThumbnailService"
 init-method="init">
  <property name="thumbnailService" ref="ThumbnailService" />
  <property name="thumbnailDefinitions">
    <list>
      <bean class="org.alfresco.repo.thumbnail.ThumbnailDefinition">
        <property name="name" value="scImageThumbnail" />
        <property name="mimetype" value="image/png"/>
        <property name="transformationOptions">
          <bean  class="org.alfresco.repo.content.transform.magick.ImageTransformationOptions">
            <property name="resizeOptions">
              <bean class="org.alfresco.repo.content.transform.magick.ImageResizeOptions">
              <property name="height" value="100"/>
              <property name="maintainAspectRatio" value="true"/>
              <property name="resizeToThumbnail" value="true" />
            </bean>
          </property>
        </bean>
     </property>
     <property name="placeHolderResourcePath" value="alfresco/extension/thumbnail/thumbnail_placeholder_scImageThumbnail.png" />
   </bean>
 </list>
 </property>
</bean>

The placeholder is a graphic that the thumbnail service can return as the thumbnail if the thumbnail for a given node has not been generated. In my example I just copied one of the out-of-the-box placeholders and renamed it but you could use anything you want there.

Example

I built a simple example to show how this works. Here is a screencast that shows it running or you can download the source and build it yourself.

In the example, a simple form is presented to allow a file to be uploaded. The form posts to a web script which creates a new node using the file provided. The form GET and POST web scripts are essentially the “helloworldform” web scripts from the Alfresco Developer Guide.

The “image list” is a simple GET web script that queries the folder where the images are uploaded to and writes out a list of image tags. The interesting thing to note here is the URL that’s used:

${url.serviceContext}/api/node/workspace/SpacesStore/${image.id}/content/thumbnails/scImageThumbnail?ph=true&c=queue

That URL is an out-of-the-box web script that returns the specified thumbnail for a given node reference. In my example I’m using the “ph” and “c” arguments. The “ph” argument says whether or not the placeholder image should be returned if the thumbnail does not exist. The “c” argument says that if a thumbnail doesn’t exist, queue a request for thumbnail creation. (Note that the descriptor says the queue create argument is “qc” but if you look at the controller source you’ll see it is actually just “c”. I’ll check to see if there’s a Jira on that).

When you add a new image and then go to the image list you’ll see the placeholder graphic. Behind the scenes, a thumbnail creation request has been queued. If you refresh the page, the thumbnail should show up because Alfresco has had a chance to generate it. If you wanted to queue the request when the node is created, you could either create a rule on the folder that holds the images, or you could add a call to “createThumbnail” in the upload POST web script controller, as shown earlier. (I’ve got an example of that commented out in the source).

That’s it

Hopefully this has given you some insight into the new thumbnail service in Alfresco. If you want to play with it yourself, you can download the source for the example and build it with Ant (make sure you set build.properties to match your environment first) by running “ant deploy”. Make sure you’ve got ImageMagick installed on your Alfresco server–the thumbnail service depends on it. You’ll also need the SDK to compile the registry bootstrap class. If you want to see what the thumbnail service is actually doing you’ll need the Alfresco source. None of the thumbnail source is included in the source code that currently accompanies the SDK.