Category: General

General thoughts that defy categorization.

Review: Professional Alfresco, Wrox

I recently finished reading Professional Alfresco, the new Alfresco book written by some of the Alfresco engineers and John Newton, Alfresco’s CTO. Before I share my thoughts, a few disclaimers: First, I wrote my own book on Alfresco called the Alfresco Developer Guide (Packt, 2008). Second, Wrox provided the book to me free of charge in the hopes that I’d write something about it here. Third, I have a strategic relationship with Alfresco, part of which includes bringing each other business.

Okay, with that out of the way, let’s get to it. Professional Alfresco is a new book written about Alfresco 3.2 by a team of authors with a unique insider perspective: all are Alfresco employees. It’s not an end-user focused book nor is it strictly for developers–it’s actually aimed at several different audiences:

  • Part 1, “Getting to Know Alfresco”, is aimed at IT managers or other folks who might be evaluating Alfresco. It covers the business benefits of Alfresco and provides a high-level overview of the platform.
  • Part 2, “Getting Technical”, looks at the platform’s components and services at a closer level. These chapters are directed at Technical Architects or anyone who’s trying to figure out the technical capabilities of Alfresco.
  • Parts 3 & 4 are aimed squarely at Developers. More specifically, these 6 chapters cover Web Scripts (primarily JavaScript, but a Java example is given) and Alfresco Share. The last 3 chapters of Part 4 provide a step-by-step example of building a Knowledgebase application by customizing Alfresco Share, including a few (brief) pages on the new Form Service. A lot of people are working on Share customization projects these days so many will find this a welcome set of material.

While the majority of the technical how-to in the book is focused on Web Scripts and Share, I particularly liked the chapter on Advanced Workflows. It did a good job of explaining what you can do with the jBPM engine without getting too far into the weeds. The section on the Authentication subsystem, LDAP config, and chaining was also very good, particularly as the subsystem setup is a fairly recent development that not everyone is familiar with.

While I did find a lot to like about the book, there were a few things to pick on. First, if you read the book front-to-back, you’ll notice a significant amount of repetition. I suppose that could be a good thing when one of the three audiences the book is written for picks up the book and goes directly to their area of interest. I wonder, though, if some of it was due to having so many authors collaborating on the writing project. The repetition left me feeling like I was really slogging through the material rather than cutting to the chase.

I thought the content modeling chapter was thorough, but I had to wonder why the author chose to step us through the modelSchema.xsd file instead of providing example content model XML. It’s good to know the content model schema is there if I need it, but I think examples of what I can do with the XML are far more illustrative than walking through the schema.

The form service and Share/Surf aren’t covered in nearly enough detail. Other aspects of the platform simply aren’t addressed at all. I think some of that may be because of timing. The form service, for example, continues to evolve with version 3.3, and when you undertake a project this big, you have to draw a line somewhere. Plus, the focus on web scripts and Share is aligned with where Alfresco is focused right now.

The organization of the book is good. It follows a logical progression through the platform. And I like that the end-to-end Knowledgebase example is placed at the end as a sort of capstone applying the concepts learned earlier in the book. If you’re looking for a tutorial-style book though, you may be frustrated by the amount of theory up-front. It’s just not that kind of book. One side note on organization, Chapter 17 is a bit of an odd duck. It’s got interesting content–the chapter discusses various patterns of Alfresco implementation and integration with other systems. I just thought it was weird that it was at the end of the book instead of in one of the first two parts. Not a huge deal, and I’m glad they included it, even if its placement makes it seem like an after-thought.

Overall, Professional Alfresco is a good book appropriate to several different types of readers. Even though there were several authors that wrote it, other than the repetition issue I noted, I didn’t feel like the transitions between authors were very noticeable–the editors did a great job stitching everything together and making it seem like one voice.

The bottom line is that if you are evaluating Alfresco and are trying to understand the architecture of the platform, or if you are a developer focused on web scripts and Share, you’ll find this book to be a valuable resource.

New Tutorial: Getting Started with CMIS

I’ve written a new tutorial on the proposed Content Management Interoperability Services (CMIS) standard called, “Getting Started with CMIS“. The tutorial first takes you through an overview of the specification. Then, I do several examples. The examples start out using curl to make GET, PUT, POST, and DELETE calls against Alfresco to perform CRUD functions on folders, documents, and relationships in the repository. If you’ve been dabbling with CMIS and you’ve struggled to find examples, particularly of POSTs, here you go.

I used Alfresco Community built from head, but yesterday, Alfresco pushed a new Community release that supports CMIS 1.0 Committee Draft 04 so you can download that, use the hosted Alfresco CMIS repository, or spin up an EC2 image (once Luis gets it updated with the new Community release). If you don’t want to use Alfresco you should be able to use any CMIS repository that supports 1.0cd04. I tried some, but not all, of the command-line examples against the Apache Chemistry test server.

Once you’ve felt both the joy and the pain of talking directly to the CMIS AtomPub Binding, I take you through some very short examples using JavaScript and Java. For Java I show Apache Abdera, Apache Chemistry, and the Apache Chemistry TCK.

For the Chemistry TCK stuff, I’m using Alfresco’s CMIS Maven Toolkit which Gabriele Columbro and Richard McKnight put together. That inspired me to do my examples with Maven as well (plus, it’s practical–the Abdera and Chemistry clients have a lot of dependencies, and using Maven meant I didn’t have to chase any of those down).

So take a look at the tutorial, try out the examples with your favorite CMIS 1.0 repo, and let me know what you think. If you like it, pass it along to a friend. As with past tutorials, I’ve released it under Creative Commons Attribution-Share Alike.

[Updated to correct typo with Gabriele’s name. Sorry, Gab!]

Summer grilling tips for your CMS vendor

I like this post from Jon Marks at JonOnTech. It’s about questions you should be asking your CMS vendor you might not have thought to ask. The first five are especially good (see his post for the explanation of each question and the rest of the list):

  1. Who was the last vendor to beat you in the last round of a selection exercise? Why do you think they won?
  2. If, in a few years time, we decided to move away from your product, how would I go about migrating all my content into a new system?
  3. How many active developers do you have on your developer forums?
  4. All of these are important, but please rate these in order of your priority: a) Product Features b) Performance and Stability c) Usability d) Security
  5. How much would I expect to pay a contractor developer that is skilled with your CMS, and are they easy to find?

I am consistently disappointed with how companies evaluate and choose software vendors. Part of the problem is when companies use RFP processes that handle software purchases the same way that factory equipment purchases are handled, but that’s another post (see Making RFP’s More Effective).

The other part of the problem is the questions that never get asked during the vendor pitch. To Jon’s list, I would add:

  1. How long and how many resources did it take to build this demo? You’re looking for closeness of fit, effort to customize, and skillsets involved.
  2. What are the top three technical resources my team should have at the ready during the implementation? You’re looking for availability and helpfulness of documentation. How much of it is vendor-produced versus community-produced? It’s not necessarily bad if the majority of the resources are community-produced–it’s just a data point.
  3. If it makes sense depending on the kind of software, ask do you use your own software in-house. If they don’t, that’s certainly a data point. If they do, ask, as an end-user, what are your top-three headaches when using the software? This is sort of a “what is your biggest area for improvement” kind of question–watch out for turn-your-weakness-into-a-positive kind of answer (“The software is just too powerful!”). Every piece of software has idiosyncrasies. They should be able to name a few.
  4. Tell us about the last implementation that just went completely sideways for reasons attributable to the technology, not to project mis-management, political, or other issues. Obviously, the vendor scores points for honesty on this one, but it’s also interesting to hear how much/little the vendor was involved in salvaging the deal (if it was able to be salvaged).
  5. What is your maintenance renewal rate? I’ve never heard this one asked, but I would think this would be a very telling stat. Customers have all sorts of reasons for not renewing maintenance, but the obvious one is that they feel like the vendor isn’t giving them enough support value for the expense. For commercial open source vendors, support may be their sole source of revenue (excluding professional services, hosting, etc.), so for them you’d think this would be a very high number, otherwise, what’s the point?

By the way, giving your vendors a good grilling isn’t limited to software companies. Picking a services firm also deserves a good set of probing questions, but that’s also another post.

What about you? Got any good questions to ask CMS or other software vendors?

Google App Engine Now Supports Java

I’ve been playing with the newly-released Java support in Google App Engine and it is pretty cool. You can do more than I expected you could:

  • The Google App Engine Eclipse plug-in gives you a template project and associated config files, Ant build scripts, a deployment tool, and a local run-time environment that acts like GAE (user service, data store, limitations imposed by the platform).
  • You’ve got full persistence and query capability via JDO. You pretty much just model your entities as POJO’s, then you annotate the fields in those classes as “persistent” and you’re good to go. You do JDOQL to query your objects. Queries will only return the first 1000 results.
  • You can run cron jobs. A cron job wakes up on a schedule and invokes a URL you specify.
  • Servlets and JSPs are supported but you can also use things like Struts and Spring (See Will it Work in Google App Engine?).
  • You can take advantage Google’s User service, which means anyone with a Google account can sign-in to your app without creating a new account.
  • You can take advantage of Memcache if you need it (JCache).
  • You can fetch URLs via the URL Fetch service or java.net.URLConnection.
  • You can send mail via JavaMail.
  • You can use their Image service to resize, rotate, flip, and crop images.
  • Both JDK 5 and JDK 6 are supported.

There are some limits:

  • Execution of requests is limited to 30 seconds and that includes URLs invoked by cron jobs.
  • You can’t write to the file system. If you need to write out files, I assume you’d use S3 or something.
  • You can’t open sockets.
  • Each developer can create up to 10 applications and apps can’t be deleted so don’t fill up on Hello Worlds.
  • You can run an app that has up to 500 MB of storage and serves 5 million page views per month at no cost.

The beauty, obviously, is that as a developer, you get to focus on the code and let Google worry about scaling. For many applications, this Platform-as-a-Service (PaaS) will be preferred over Infrastructure-as-a-Service (IaaS). In an IaaS setup, you can use solutions like RightScale to automatically provision new nodes to handle spikes in demand, but you still have to set that up. Plus, you’ve got the additional cost and headache of installing, configuring, and maintaining the application server and database software (and making sure it is set up to work when new nodes are auto-provisioned). With the app engine, scaling globally is pretty simple: Step 1 – Write (Good) Code; Step 2 – Deploy Code to GAE.

Grasping Thumbnails in Alfresco 3

With Alfresco 3 (both Labs and Enterprise), Alfresco added a new thumbnail service. It isn’t documented too well yet so I thought I’d write up a quick example.

What is it

The Thumbnail Service is used to create alternate renditions of objects. Typically, those alternate renditions are small images called “thumbnails”. You can see a working application of the thumbnail service if you take a look at Alfresco Share’s document library. When you upload a document, the thumbnail service is invoked, and a small image is shown next to each item in the list.

Where thumbnails live

Like everything else in Alfresco, thumbnails are stored as nodes. Nodes are instances of cm:thumbnail and are stored as children of the object they represent. (You can see this for yourself by looking at the thumbnailed object in the node browser). Objects can have any number of thumbnails. This lets you have thumbnails of different sizes and mime types, for example.

Once Alfresco generates a thumbnail for an object, the object will have the cm:thumbnailed aspect applied to it so it is easy to find or filter objects based on whether or not they have at least one thumbnail.

Thumbnail definitions & thumbnail names

Every thumbnail has a thumbnail definition. The thumbnail definition keeps track of things like the mime type, transformation options, placeholder path, and thumbnail name. The thumbnail name uniquely identifies the thumbnail definition in the thumbnail registry. When you want to generate or display a thumbnail for an object, you must specify the name. For example, given a thumbnail definition named “scImageThumbnail”, you could use JavaScript to create a thumbnail for an object by calling the “createThumbnail” method on a ScriptNode like this:

document.createThumbnail("scImageThumbnail", true);

The first argument is the name of the thumbnail definition. The second argument says the thumbnail should be generated asynchronously.

Registering thumbnail definitions

The thumbnail registry needs to know about your thumbnail definitions. The out-of-the-box thumbnails are registered in the thumbnail-service-context.xml file. I don’t see a clean way to extend that without repeating the definitions, so in my example, I wrote a bean that calls the Thumbnail Registry and registers the custom thumbnail definitions provided in the Spring context file:


public class ThumbnailRegistryBootstrap {
  private ThumbnailService thumbnailService;
  private List<ThumbnailDefinition> thumbnailDefinitions;
  private Logger logger = Logger.getLogger(ThumbnailRegistryBootstrap.class);

public void init() {
  ThumbnailRegistry thumbnailRegistry = thumbnailService.getThumbnailRegistry();
    for (ThumbnailDefinition thumbDef : thumbnailDefinitions) {
      logger.info("Adding thumbnail definition:" + thumbDef.getName());
      thumbnailRegistry.addThumbnailDefinition(thumbDef);
    }
}

public void setThumbnailService(ThumbnailService thumbnailService) {
  this.thumbnailService = thumbnailService;
}

public void setThumbnailDefinitions(
  List<ThumbnailDefinition> thumbnailDefinitions) {
  this.thumbnailDefinitions = thumbnailDefinitions;
}

}

So this class will add all of my thumbnail definitions to the thumbnail registry. The class and the definitions are configured in a Spring context file. The config for a single thumbnail called “scImageThumbnail” which is a PNG 100 pixels high and retains the original aspect ratio of the image would be:


<bean id="someco.thumbnailRegistry"
 class="com.someco.thumbnails.ThumbnailRegistryBootstrap"
 depends-on="ThumbnailService"
 init-method="init">
  <property name="thumbnailService" ref="ThumbnailService" />
  <property name="thumbnailDefinitions">
    <list>
      <bean class="org.alfresco.repo.thumbnail.ThumbnailDefinition">
        <property name="name" value="scImageThumbnail" />
        <property name="mimetype" value="image/png"/>
        <property name="transformationOptions">
          <bean  class="org.alfresco.repo.content.transform.magick.ImageTransformationOptions">
            <property name="resizeOptions">
              <bean class="org.alfresco.repo.content.transform.magick.ImageResizeOptions">
              <property name="height" value="100"/>
              <property name="maintainAspectRatio" value="true"/>
              <property name="resizeToThumbnail" value="true" />
            </bean>
          </property>
        </bean>
     </property>
     <property name="placeHolderResourcePath" value="alfresco/extension/thumbnail/thumbnail_placeholder_scImageThumbnail.png" />
   </bean>
 </list>
 </property>
</bean>

The placeholder is a graphic that the thumbnail service can return as the thumbnail if the thumbnail for a given node has not been generated. In my example I just copied one of the out-of-the-box placeholders and renamed it but you could use anything you want there.

Example

I built a simple example to show how this works. Here is a screencast that shows it running or you can download the source and build it yourself.

In the example, a simple form is presented to allow a file to be uploaded. The form posts to a web script which creates a new node using the file provided. The form GET and POST web scripts are essentially the “helloworldform” web scripts from the Alfresco Developer Guide.

The “image list” is a simple GET web script that queries the folder where the images are uploaded to and writes out a list of image tags. The interesting thing to note here is the URL that’s used:

${url.serviceContext}/api/node/workspace/SpacesStore/${image.id}/content/thumbnails/scImageThumbnail?ph=true&c=queue

That URL is an out-of-the-box web script that returns the specified thumbnail for a given node reference. In my example I’m using the “ph” and “c” arguments. The “ph” argument says whether or not the placeholder image should be returned if the thumbnail does not exist. The “c” argument says that if a thumbnail doesn’t exist, queue a request for thumbnail creation. (Note that the descriptor says the queue create argument is “qc” but if you look at the controller source you’ll see it is actually just “c”. I’ll check to see if there’s a Jira on that).

When you add a new image and then go to the image list you’ll see the placeholder graphic. Behind the scenes, a thumbnail creation request has been queued. If you refresh the page, the thumbnail should show up because Alfresco has had a chance to generate it. If you wanted to queue the request when the node is created, you could either create a rule on the folder that holds the images, or you could add a call to “createThumbnail” in the upload POST web script controller, as shown earlier. (I’ve got an example of that commented out in the source).

That’s it

Hopefully this has given you some insight into the new thumbnail service in Alfresco. If you want to play with it yourself, you can download the source for the example and build it with Ant (make sure you set build.properties to match your environment first) by running “ant deploy”. Make sure you’ve got ImageMagick installed on your Alfresco server–the thumbnail service depends on it. You’ll also need the SDK to compile the registry bootstrap class. If you want to see what the thumbnail service is actually doing you’ll need the Alfresco source. None of the thumbnail source is included in the source code that currently accompanies the SDK.

And we’re back

According to my hosting provider, the raid controller on the box that hosts this site failed at some point yesterday. I guess my bare bones hosting subscription doesn’t include failing me over to another box because I was down from the point of the failure until new parts could be overnighted and installed. Anyway, we’re back up now with a new appreciation for how many 9’s $9.95 buys you.

Alfresco plus Drupal thoughts

I’ve had several discussions with Optaros clients and internal team members lately around Drupal and Alfresco integration. Particularly around this topic, I usually try to listen more than I talk. I want to make sure I understand where the value is for this kind of integration rather than simply geeking out on yet another “stupid CMS trick”. I thought maybe I’d bounce a summary of these thoughts off of you.

The key is to leverage the strengths of each. If you don’t have a problem that requires this particular combination of strengths, assembling a solution from these two components isn’t going to be of any value at all. What are some of the key strengths relevant to this discussion?

Drupal has:

  • A front-end presentation framework. (I would add that it is written in PHP–a relatively widespread language that’s easy to pick up).
  • A very large library of modules, most of them focused on building community-centric web sites.
  • A lightweight footprint, requiring only a web server, MySQL, and PHP. (Yes, I know it is possible to run Drupal on other databases but not every module will).

Alfresco has:

  • Robust workflow via the embedded JBoss jBPM engine.
  • Smart management of file-based objects (files go on the file system, metadata goes in the database, and an API that abstracts the separation).
  • A plethora of file-based protocols and API’s for getting content into and out of the repository, including a framework to easily expose content and business logic via REST.

Silo’d community solutions are best implemented in Drupal alone. Why complicate your life with a separate repository? It adds no value in that situation. Similarly, straight document management (and even team-based collaboration) really can be addressed with the Alfresco repository and the standard Alfresco web client (or, soon, with Alfresco’s new Share client).

I think where Drupal-Alfresco makes the most sense is in cases where there is a significant amount of file-based content that requires “basic content services” such as workflow, versioning, security, check-in/check-out, but needs to be shared in the context of a community.

Alfresco becomes even more useful when there are multiple communities that need this content because you can start to leverage the “content-as-a-service” idea to make the content available to any number of front-end sites (where those sites might or might not be Drupal-based).

Suppose rather than one community, you have ten. Each community will have community-specific content but there may also be a set of content that needs to be leveraged across many communities. A subset of things you might be concerned about include:

  • Some content might need editorial review and approval before it goes live, but not everything.
  • Not all content comes from internal sources (pushed out from the repository). Some might originate from one of the communities as user-generated content. That content might need editorial review before it is made available on other community sites.
  • Communities need flexibility in how/if they expose cross-community content.
  • Tags and other metadata value need to be consistent between an end point and the repository (and therefore, across all end points).
  • Search needs to be properly scoped (does it include community content only, community plus shared content in the repo, multiple selected communities, or all communities)
  • Some clients may not be able to control the technology used on these community endpoints.

In these scenarios, Alfresco acts as your core repository and Drupal provides the front-end presentation layer. When you look at it this way, Drupal really becomes equivalent (in terms of where it sits in the architecture and the role it plays) to traditional portals like Liferay or JBoss Portal.
Content sitting in Drupal is harder for other systems to get to than when it sits in Alfresco. There are Drupal modules that make it easier to syndicate out but Alfresco’s purpose-built to expose content in this way. Once it is in Alfresco, content can be routed through Alfresco workflows, and then approved to be made available to one or more front-end Drupal sites. Content could come from a Drupal site, get persisted to Alfresco, routed around for editorial review, and then be made available. It really opens up a lot of possibilities.

Not all Drupal modules need to persist their data back to Alfresco. Things like comments and ratings will likely never need to be treated as real content. Instead of trying to persist everything you would either modify select modules to integrate with Alfresco or create new ones that work with Alfresco. For example, you might want to have Drupal stick file uploads in Alfresco instead of the local file system. Or, it might make sense to have a “send to alfresco” button visible to certain roles that would send the current node to Alfresco.

It doesn’t all have to be Drupal getting and posting to Alfresco. There might be cases where you need some Drupal data from within Alfresco. Maybe you are in Alfresco and you want to tag objects using the same set of tags Drupal knows about, for example. Or maybe you want to do a mass import of Drupal objects into the Alfresco repository.

I’ve got a little test module that uses Alfresco’s REST API (including the new CMIS URLs) to retrieve content from Alfresco and show it in a Drupal block. I can talk about it in a separate post.

A perfect vacation policy might be to have no policy at all

Sounds like Matt Asay is shopping around for vacation policies. Matt points to an associate’s policy of stopping vacation accrual at six weeks as a way of encouraging employees to use their vacation.

At Navigator, we had a vacation policy everyone loved–we had no policy. We expected everyone to behave like adults and take the vacation they needed when they needed it. Some folks took a week or two a year while others took three or four. But no one had to worry about accruals or carryovers or time banks or any other silly thing. If you were sick, you stayed home. If you were wrapping up a project and needed some down time, you took the time off. Even as the company grew and we began to hire younger recruits, I never saw any instance of abuse.
Of course the “no policy” policy didn’t do anything to encourage people to take vacations. And, on termination, no one wrote you a check for the unused accrued vacation. But we all loved the flexibility and the freedom. It was certainly a factor people considered when they looked at alternative employers. You might not ever take the time off, but even a four-week vacation policy seemed restrictive in comparison.

Stop the end-of-year use-it-or-lose-it madness and jettison your vacation policy altogether.