Tag: CMIS

My New Book: CMIS and Apache Chemistry in Action

I’ve been working on another book project along with Jay Brown from IBM and Florian Mueller from SAP. It’s called “CMIS and Apache Chemistry in Action” and we intend for it to be the definitive guide to CMIS. I’m having a great time working with Jay and Florian, both of whom have been heavily involved with the CMIS spec and Apache Chemistry from the beginning.

The book is being published by Manning and should be out in April of 2013. Today it has just been made available through the Manning Early Access Program (MEAP). We’re excited about having the book on MEAP because it means you can buy the book today and get the chapters as they are written. This gets drafts of the chapters in your hands quickly so you can apply what you learn to your projects immediately, but, even more importantly, gives you a chance to give us feedback that we can incorporate into the book.

I hope that everyone who wants to write content-centric applications on top of repositories like Alfresco, FileNet, SharePoint, Documentum, and so on, will benefit from the book, whether you are writing those apps in Java, Groovy, Python, PHP, JavaScript, C#, or Objective-C. The book starts out with an intro to CMIS and then moves through a real world example–a CMIS-based music mash-up application–built step-by-step. Once we’ve sufficiently covered the client-side stuff we move on to the server-side for those that need to know how to implement their own CMIS repositories.

The book covers the current 1.0 specification as well as the forthcoming 1.1 version of the specification.

We’ve still got a lot to write, but it feels great to reach the MEAP milestone. I look forward to hearing feedback from all of you as we continue to knock out chapters this Winter.

If you want to buy the book (MEAP or print, when it is available), you can use this code to get 37% off: 12cmisal.

Webinar: Getting Started with CMIS

If you are brand new to CMIS or have heard about it but aren’t sure how to get started, you might want to join me in a free webinar on Thursday, January 26 at 15:00 GMT. I’m going to give a brief intro to the Content Management Interoperability Services (CMIS) standard and then I’m going to jump right in to examples that leverage Apache Chemistry OpenCMIS (Java), Apache Chemistry cmislib (Python), and Groovy (via the OpenCMIS Workbench).

UPDATED on 1/26 to fix webinar link (thanks, Alessandro). See comments for a link to webinar recording and slides.

Apache Chemistry cmislib 0.4 incubating now available

The Apache Chemistry development team is pleased to announce that the 0.4 incubating release of cmislib, the Python client API for CMIS, is now available for download. You may have to use one of the backup servers until the mirrors fully update. Alternatively, you can use easy_install to install cmislib by typing “easy_install cmislib”.

This release has various fixes and enhancements that the community has contributed since cmislib joined the Apache Chemistry project with its 0.3 release. If you are using Alfresco, you might be interested in an enhancement in cmislib 0.4 that makes it possible to use ticket-based authentication instead of basic auth.
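To make the difference from basic auth concrete, here is a minimal sketch of the ticket idea, independent of cmislib. It assumes a login response whose body is a single ticket element, and it passes the ticket to later requests as an alf_ticket query parameter. The sample ticket value, the localhost URL, and both helper names are made up for illustration. How cmislib 0.4 itself accepts a ticket is not shown here, so check its documentation for that.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def parse_ticket(login_response_xml):
    """Pull the ticket string out of a login response body (assumed shape)."""
    return ET.fromstring(login_response_xml).text.strip()


def with_ticket(url, ticket):
    """Return url with an alf_ticket query parameter appended."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query)
    query.append(("alf_ticket", ticket))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical response body and server URL, for illustration only.
sample_response = "<ticket>TICKET_abc123</ticket>"
ticket = parse_ticket(sample_response)
print(with_ticket("http://localhost:8080/alfresco/s/cmis", ticket))
```

The point of the ticket is that the password is sent once, at login, and subsequent requests carry only the short-lived ticket.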

For those who haven’t used it, cmislib makes it easy to work with CMIS-compliant repositories from Python.

Deconstructing DeckShare: A brief look at Alfresco Web Quick Start

Last Fall, just before the Developer Conference in New York City, Alfresco approached Metaversant with a small project–they needed a web site to share presentations from the DevCon sessions and other events. There are several generic slide-sharing sites out there–Alfresco and I have both used SlideShare for that sort of thing pretty extensively. But Alfresco was looking for something they could have complete control over, plus they were looking to exercise their new Web Quick Start offering. So I rounded up a sucker–I mean a collaborator–named Michael McCarthy from over at Tribloom, and we knocked it out.

Although it makes good marketing sense for Alfresco to use its own offering for the site, I do think the “private presentation-sharing” use case is also generally applicable to many other businesses out there. Technology companies, of course, but also any company with a large sales force or even more modest extranet needs could benefit from a solution like this. SlideShare is great when what you want to share is public. When presentations need to be securely shared, many companies use expansive portals to share sales collateral, marketing presentations, or company communications, but those often come with a very low signal-to-noise ratio. The solution we built–we call it “DeckShare”–is laser-focused on one thing: Making it easy for content consumers to find the presentations they need, quickly.

DeckShare is built on top of Alfresco’s new Web Content Management (WCM) offering called Web Quick Start. Honestly, Web Quick Start, as the name implies, provides a good starting point for building a dynamic web site on top of Alfresco, and the sample site was so close to what a basic slide-sharing site needed, we didn’t have to do a whole lot of work. Even so, if you want to do slide-sharing on top of Alfresco, you can save even more time by starting with DeckShare.

I thought it might be cool to walk you through how we got from the Web Quick Start sample app to DeckShare as a way of getting you familiar with Web Quick Start and to help you understand how DeckShare works in case you want to use it on your project.

Some of this write-up is also included in the “About” page within the DeckShare site. Alfresco is currently branding DeckShare to meet their needs, and I have no idea whether the About page will still be there when the site goes live, so I may or may not be repeating myself somewhat.

What is DeckShare?

Alfresco’s DeckShare implementation isn’t live yet, so let me give you the nickel tour. DeckShare is a solution for quickly creating a self-hosted presentation-sharing site on top of Alfresco without having to write any code. DeckShare lets non-technical users manage and categorize presentations that are then consumed by end-users. End-users can find presentations by browsing one or more hierarchies (“Topics”, “Events”, “Audience”, for example), performing a full-text search, or by browsing the list of the latest and “featured” presentations. Content managers can associate presentations with “related” presentations and can link supporting files with a presentation.

In this solution, Content Managers are a different set of users than Content Consumers. As it stands, the solution doesn’t accommodate user-contributed presentations, unlike SlideShare. Obviously, it could be customized to do that.

Here’s what it looks like when a Content Manager edits the metadata for a particular presentation:

DeckShare Manage Details Screenshot

Those familiar with Alfresco can tell from the screenshot that Content Managers use Alfresco Share as the “content administration” user interface. Optionally, you could turn on Alfresco’s Web Editor Framework and allow Content Managers to edit the site in-context, but that will require some minor tweaking if you go that route. Using Share for content administration works really well. What started out as a team collaboration web app has essentially turned into Alfresco’s uber client.

Now let’s take a look at the Content Consumer user interface:

DeckShare Home Screenshot

If you’ve seen either of the sample Web Quick Start applications, the layout will look familiar. The home page includes a featured content carousel, a recently added document list, and a document category tree. We’re using a different carousel widget than the Web Quick Start sample apps and the category tree is also something we’ve added.

The set of categories is almost completely arbitrary and can be managed by DeckShare administrators. In fact, if you are running DeckShare on top of an existing Alfresco repository, you can specify the subset of categories you want to show in the category tree. When you click a specific category, the resulting page looks like this:

DeckShare Category Page Screenshot

The category page shows only the presentations for a specific category and features the same category tree browser that appears on the home page.

Ultimately, an end-user will either download a presentation or they’ll click on the details page link to learn more. The details page, shown below, shows high-level metadata about the presentation, supporting files that accompany the presentation, related presentations, and a flash-based document preview. The flash-based preview allows end-users to browse the presentation without downloading the entire file.

DeckShare Presentation Details Screenshot

That’s really it from a Content Consumer perspective. The site also has full-text search and that page is very similar to the category list page.

From a Content Manager’s point of view, DeckShare, like all Web Quick Start sites, is built on concepts familiar to anyone who has built or managed a web site: A site consists of collections of assets that get categorized and tagged and then presented in various ways across one or more pages. The look-and-feel of the end-user web site can be completely customized to match client branding needs. And, both the metadata model and category hierarchy can be extended to support specific client requirements.

Assets are stored in Alfresco’s Document Management repository and managed through the Alfresco Share User Interface. Alternatively, Content Managers have several other options for getting presentations into the repository, including FTP, drag-and-drop via Windows Explorer or Mac Finder, emailing content into the repository, drag-and-drop from within Microsoft Outlook or Lotus Notes, or saving directly from Microsoft Office as if the repository were a Microsoft SharePoint Server. Regardless of how the files arrive, Alfresco automatically takes care of creating thumbnails and PDF renditions of the presentations.

What is Web Quick Start?

Web Quick Start is a sample web application built with Spring, Spring Surf, and Apache Chemistry’s OpenCMIS library. It sits on top of the Web Quick Start API (Java), some presentation tier services, some repository tier services, and an extended content model.

Assets are stored in Alfresco’s Document Management repository and managed through the Alfresco Share User Interface. That means you can use all of the familiar building blocks present in the repository such as custom types and aspects to model your data, behaviors, and web scripts.

The presentation tier uses Alfresco Surf to lay out pages and to define regions on those pages. Regions get their content from presentation tier web scripts. What’s different with Web Quick Start as opposed to previous Surf-based web application examples is the use of the Apache Chemistry OpenCMIS library. Instead of using Surf’s object dispatcher to load and persist objects, the Web Quick Start API uses OpenCMIS to make CMIS requests between the front-end and the repository tier. There are some places where the Web Quick Start API uses non-CMIS web scripts, so it is not a pure CMIS implementation, however.

The Web Quick Start API is exposed to the Alfresco JavaScript API and FreeMarker API on the presentation tier, so everything you’ve already learned about Spring Surf applies immediately when you build the front-end. However, if you’ve decided on another framework, you can still use the Web Quick Start API, the services, the content model, and Share for editing content. For example, Metaversant recently worked with a client that chose Spring 3 and Apache Tiles for the front-end because that was their standard, but they used Web Quick Start for everything else from the API back.

Web Quick Start sites have a flexible deployment configuration which can be boiled down to “single-server” or “multi-server”. In the single-server approach, the same Alfresco server is used for content authoring and content serving. In the multi-server approach, Alfresco’s transfer service is used to move content from the “authoring” or “editorial” web server to the “Live” server. It doesn’t have to be one hop: you could throw in one or more “QA” servers along the way if you want to.

High-level steps

Web Quick Start is meant as a starter application. It’s functional out-of-the-box, but you’re expected to use as much (or as little) of it as you need. Here are the high-level steps we took to reshape the starter app into DeckShare.

Step 1: Extend the content model

Web Quick Start has a simple content model that provides for articles, images, and user feedback. For DeckShare, we added two new aspects to our own model. One is used to associate a presentation with zero or more related presentations. The other is used to associate a presentation with zero or more supporting files (like source code downloads). At some point we may also add an aspect to track fine-grained event metadata such as session time, speaker, etc. Extending the model with your own aspects is pretty easy–it’s some XML to define the model and then some XML to expose the model to the Share interface.

Step 2: Set up site sections and collections

Web Quick Start has a default folder structure that lives in the Document Library of the Share site being used to manage the content. Within that there is one folder for editorial content and one for live content. Then, it breaks down into “section” folders for site assets (publications, in this case) and collections. Each section will typically map to a section of a site and will therefore show up in the site navigation, but you can exclude sections from the navigation.

Every section folder has a set of collections. A collection is an arbitrary grouping of site assets. Collections can either be static or query-based. For a static collection, Content Managers choose the assets that belong in that collection with standard Share “picker” components. For a query-based collection, Content Managers can use either CMIS Query Language or Lucene to define queries that identify the assets that are included in the collection. Either way, from the Web Quick Start API, a developer just says, “Give me this collection and let me iterate over the assets in it” to produce a list of the assets in a collection.

There are two sample data sets that ship with Web Quick Start. One is for a Finance example site and the other is for a Government site. We started with the Finance site (the choice was based on luck–there’s really not much of a difference between the two other than images and content). Once we imported the sample site, we deleted all of the sample content and then tweaked the collection definitions. The “latest” collection, for example, contains the most recently-added presentations. The “featured” collection is static out-of-the-box, but we wanted it to be dynamic. So, we simply edited the collection metadata to add a query that returns all of the presentations that have been categorized as “Featured”. Alfresco runs the query periodically so that Content Consumers don’t take the performance hit when the page is rendered.
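The two kinds of collection, and the idea of refreshing a query-based one on a schedule rather than on every page view, can be sketched roughly as follows. This is an illustration only, not the Web Quick Start API: the class names, the fake_search function, and the file names are all hypothetical.

```python
class StaticCollection:
    """A collection whose assets are picked by hand."""

    def __init__(self, name, assets):
        self.name = name
        self._assets = list(assets)

    def assets(self):
        return list(self._assets)


class QueryCollection:
    """A collection defined by a stored query whose results are cached."""

    def __init__(self, name, query, run_query):
        self.name = name
        self.query = query
        self._run_query = run_query  # callable: query string -> iterable of assets
        self._cached = []

    def refresh(self):
        """Re-run the query. Meant to be called periodically, not per page view."""
        self._cached = list(self._run_query(self.query))

    def assets(self):
        # Page rendering only ever reads the cached results.
        return list(self._cached)


def fake_search(query):
    # Stand-in for a repository search; a real one would execute the query.
    return ["roadmap.ppt", "keynote.ppt"]


featured = QueryCollection(
    "featured", "SELECT * FROM cmis:document WHERE CONTAINS('featured')", fake_search
)
featured.refresh()
handpicked = StaticCollection("handpicked", ["intro.ppt"])

# The caller iterates both kinds of collection the same way.
for collection in (featured, handpicked):
    for asset in collection.assets():
        print(collection.name, asset)
```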

The carousel works similarly. We wanted Content Managers to be able to specify which presentations appear in the carousel simply by applying the “Carousel” category to the content from within Share, so the carousel collection looks for content that has that category applied.

The other tweaks we made to the sample data set include telling Web Quick Start which rendition should be used for the various thumbnails in the site. We added a new rendition definition for the images shown in the carousel and a new rendition for the thumbnails used everywhere else. Web Quick Start has its own rendition definitions, but the thumbnails aren’t set to maintain the aspect ratio when they are resized and that looks a little weird for thumbnail images of presentations, hence the need for our own.
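To see why maintaining the aspect ratio matters, here is a small sketch of the arithmetic. It is not how Alfresco's rendition service is configured. It only shows how a resize can fit an image inside a bounding box without stretching it, and the function name and sizes are made up for illustration.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) to fit inside the box while preserving aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Use the smaller of the two scale factors so neither dimension overflows.
    scale = min(max_width / width, max_height / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 4:3 slide image squeezed into a square thumbnail box.
print(fit_within(1024, 768, 160, 160))
```

Forcing the same slide to exactly 160 by 160 would stretch it vertically, which is the weird look described above.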

Step 3: Customize page layout

Once we had some sample data in place and our collections defined, it was time to start laying out the pages. As luck would have it, the requirements matched up fairly closely to the layout of the sample financial site. Surf pages and templates are used for layout, but you don’t have to be a Surf Guru to make changes. Surf templates are just FreeMarker, after all. Other than minor re-arranging, our template modifications were limited to adding additional regions to existing templates.

Once you define your page layouts, you use metadata on the section folders to tell Web Quick Start which page layout to use for a given piece of content. The mapping is type-based. For a given section, you might say, “All instances of ws:indexPage in this section should use the ‘list’ template” which is expressed like this:

ws:indexPage=list

The template mapping is hierarchical. That means for a given piece of content, Alfresco will look at that content’s section for the template mapping but if it isn’t specified, it will look in that section’s parent, and so on. So if we specify:

cmis:document=publicationpage1

in a high-level section folder, all instances of cmis:document in child sections will be laid out using publicationpage1 unless it is overridden by a child section.
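The lookup described above, where Alfresco checks the content's own section first and then walks up through parent sections, can be sketched like this. It is a simplified model rather than Web Quick Start code: the Section class, the section names, and the “publicationpage2” template are hypothetical, and only the “type=template” mapping format comes from the examples above.

```python
def parse_mappings(lines):
    """Turn 'type=template' lines into a dict."""
    mappings = {}
    for line in lines:
        type_name, _, template = line.partition("=")
        mappings[type_name.strip()] = template.strip()
    return mappings


class Section:
    def __init__(self, name, parent=None, template_mappings=()):
        self.name = name
        self.parent = parent
        self.templates = parse_mappings(template_mappings)


def resolve_template(section, type_name):
    """Walk from the section up through its parents until a mapping is found."""
    while section is not None:
        if type_name in section.templates:
            return section.templates[type_name]
        section = section.parent
    return None


root = Section("root", template_mappings=["cmis:document=publicationpage1"])
events = Section("events", parent=root, template_mappings=["ws:indexPage=list"])
special = Section("special", parent=events,
                  template_mappings=["cmis:document=publicationpage2"])

print(resolve_template(events, "cmis:document"))   # inherited from root
print(resolve_template(special, "cmis:document"))  # overridden in the child
```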

Note: If you are familiar with Surf, don’t be confused by the use of the word “template” here. It’s used by Web Quick Start in a generic sense. In the example above, “list” and “publicationpage1” are actually page data objects in Surf. For the Metaversant client that used Spring 3 and Tiles, for example, the template mapping specified which Tile definition to use.

Step 4: Add custom components

We spent most of our time on components. In Surf, components are implemented as web scripts. A web script implements the Model-View-Controller (MVC) pattern. In this case, controllers are JavaScript. Here’s what one of the controllers looks like:


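// Fetch the requested collection for the current section and put it on the model.
model.articles = collectionService.getCollection(context.properties.section.id, args.collection);

// Pass along the optional linkPage argument as a query string for the view.
if (args.linkPage != null)
{
	model.linkParam = '?view=' + args.linkPage;
}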

This one is simply grabbing a collection of articles and a parameter and sticking them on the model for the FreeMarker-based view to pick up. Behind the scenes, the Web Quick Start API is making RESTful calls to the repository to retrieve (and cache) objects. This is a typical controller–in this app, most of the controllers are less than 10 lines long.

Web Quick Start already has components for different types of content lists. So doing something like the carousel was relatively easy: The collection service returns the “carousel” collection and a FreeMarker template dumps the collection metadata into a list that the YUI carousel control can grok.

We had a couple of places where we had to write our own components. One was the Categories component and another was the Related Presentations component. The Categories component is rendered using a YUI tree control. It gets its data by invoking a Surf-tier web script which, in turn, invokes a repository-tier web script that returns a list of category metadata as JSON. YUI then styles the data as a tree.
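As a rough picture of the last step, turning a flat list of category metadata into something a tree control can render, here is a sketch in Python. The JSON shape (id, name, parentId) is an assumption made for illustration, not the actual format returned by the DeckShare web script, and the real component does this work in the browser with YUI.

```python
import json


def build_tree(categories):
    """Nest a flat list of {id, name, parentId} dicts into a list of root nodes."""
    nodes = {c["id"]: {"label": c["name"], "children": []} for c in categories}
    roots = []
    for c in categories:
        node = nodes[c["id"]]
        parent = nodes.get(c.get("parentId"))
        if parent is None:
            roots.append(node)  # no known parent: treat as a top-level category
        else:
            parent["children"].append(node)
    return roots


# Hypothetical payload, shaped the way a category web script might return it.
payload = """[
  {"id": "1", "name": "Topics", "parentId": null},
  {"id": "2", "name": "CMIS", "parentId": "1"},
  {"id": "3", "name": "Events", "parentId": null}
]"""

print(json.dumps(build_tree(json.loads(payload)), indent=2))
```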

The Related Presentations component similarly invokes a repository-tier web script, but its JSON contains only node references. It then asks the Web Quick Start API to convert those into CMIS objects. That allows us to take advantage of the cache if the object has been loaded before, and it means we can re-use the Web Quick Start view templates that already know how to format lists of CMIS objects.
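The cache-then-load pattern behind that conversion looks roughly like the sketch below. The function name, the dict-based cache, and the loader callable are hypothetical stand-ins. In Web Quick Start the API handles the caching and the CMIS calls for you.

```python
def resolve_related(node_refs, cache, load_object):
    """Turn node references into objects, loading each one at most once."""
    resolved = []
    for ref in node_refs:
        if ref not in cache:
            cache[ref] = load_object(ref)  # only hit the repository on a cache miss
        resolved.append(cache[ref])
    return resolved


def fake_loader(ref):
    # Stand-in for a CMIS lookup by object ID.
    print("loading", ref)
    return {"id": ref}


cache = {}
refs = ["workspace://SpacesStore/a", "workspace://SpacesStore/b"]
resolve_related(refs, cache, fake_loader)       # loads both
resolve_related(refs[:1], cache, fake_loader)   # served from the cache
```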

But wait, there’s more

Web Quick Start also has a user feedback mechanism that you can use for comments, “contact us” forms, and, at some point, ratings, but none of that was a requirement for this solution at the time we built it. The capability is there, so we’ll probably expose it at some point. Refer to the wiki for more information on the user feedback functionality.

There’s also a workflow that is used to submit content for review. Once approved, the content is queued up for publishing to the live site. The workflow is the same jBPM engine you are probably already familiar with so the process can be modified to fit your needs.

Although it is a new product and there are still a few hitches here and there, I was happy with Web Quick Start and the time it saved for building a site like this. I’m looking forward to seeing its continued evolution.

DeckShare is a simple site. There is a lot more that could be done with it and, based on client demand (or contributions from others, hint, hint), new features will be added over time. I think it’s a good example of what you can do quickly with Web Quick Start.

Hopefully, this write-up gave you some ideas you can use on your own projects. If you want to dive deeper into the Web Quick Start web application, check out the Web Quick Start Developer Guide on the Alfresco wiki. Feel free to grab the DeckShare source from Google Code and use it on your own projects.

If you’re interested in major customizations to DeckShare or you have a project that might be a fit for Alfresco Web Quick Start, let me know.

Book Review: Alfresco 3 Web Services

I’ve just finished reading Alfresco 3 Web Services, by Ugo Cei and Piergiorgio Lucidi, which Packt sent me to read and comment on.

Alfresco 3 Web Services is meant for developers with a very specific focus: Remotely talking to Alfresco. That might mean communicating via SOAP-based Web Services, RESTful web services (Web Scripts), or either protocol through CMIS. Whichever you choose, Ugo and Piergiorgio have you covered. I really like that the authors set out to write such a focused piece and stuck to it; it allows them to go deep on their topic and keeps the chapter length and total book length digestible.

The book is written very clearly and follows a logical progression. It starts out with SOAP-based services (first 5 chapters), including a chapter on .NET, and then moves on to web scripts (3 chapters). After that, the authors discuss AtomPub, which is interesting, but not critical, background for the remaining 5 chapters of the book, which focus on CMIS.

The discussion on web scripts covers both Java and JavaScript controllers, but JavaScript definitely gets more attention. Something developers new to the platform will find helpful is the chapter on FreeMarker. Most of Alfresco’s documentation and the books that are out there (including mine) relegate the FreeMarker API to appendices or just show examples and assume you’ll look up details later when you need it. Because it is typically such a key component of web scripts and other aspects of Alfresco, it was a good call to include it in the book.

The book comes with several small examples and a couple of tie-it-together examples. None of it runs out-of-the-box without some tweaking, which isn’t a surprise given how rapidly things are changing in this area, particularly with regard to CMIS. At the time the book was written, OpenCMIS, the Java API for CMIS available as part of Apache Chemistry, had not been officially released. As I write this, the Chemistry team is about to release their second tagged build. The differences aren’t significant enough to cause confusion; most readers will be able to fiddle with the import statements and the other changes required to get the sample code running.

I thought there was a little too much time devoted to SOAP-based Web Services, but that’s just personal preference and the fact that nearly all of my clients over the last four years have gone the RESTful route. The authors note that others may have the same preference and they make it easy to skip over those chapters if the reader wants to.

Although the chapters logically progress and build on each other, the code samples don’t–for the most part, each chapter’s code samples are self-contained. For example, I thought it was kind of strange that the code built in Chapter 13 for the CMIS Web Services binding examples isn’t used in Chapter 14 to build the CMIS Wiki example.

Many Alfresco implementation teams are divided into at least back-end and front-end teams and when the project is large enough, you can definitely have people focused on the middle. This book is perfect for that middle-ware team or for anyone who’s got a handle on the back- and front-ends and just needs to learn how to stitch them together. Nice job, Ugo and Piergiorgio.

Updated Python CMIS library released

I’ve tagged and released a new version of cmislib, the Python CMIS client library. What’s cool about this release is that it is the first one known to work with more than one CMIS provider. Yay for interoperability! The beauty of CMIS, realized! Okay, it wasn’t that beautiful, it’s still “0.1”, and there are known issues. But I can now say the library works with both Alfresco and IBM FileNet and that’s a Good Thing.

IBM was a big help with this. Al Brown, one of the CMIS spec leads, turned one of his colleagues, Jay Brown, onto cmislib. Jay called me up and asked, “If I give you access to a FileNet P8 server, can you test cmislib against it?” I was on it faster than you could say, “unittest.main()”.

I think the effort was valuable for all sides. Our little “mini plugfest” turned up issues in my client as well as both CMIS providers. Jay worked hard to chase down everything on the FileNet side. Dave Caruana chased a few down on the Alfresco side as well. Thanks to everyone for the team effort.

Anyway, give the new cmislib release a try and give me your feedback. If you want a feel for how easy it can be to work with CMIS repositories using the cmislib API, check out the documentation or dive right in. Installation is as easy as “easy_install cmislib” (easy_install instructions).

Next up is Nuxeo. Can the open source ECM vendor achieve cmislib Unit Test Greatness faster than Big Blue? We shall see!

cmislib: A CMIS client library for Python

I’ve started a new project on Google Code called cmislib. It is an interoperable client library for CMIS in Python that uses the RESTful AtomPub Binding of a CMIS provider to perform CRUD and query functions on the repository.

I created it for a couple of reasons. First, it’s been bugging me that, unlike our Drupal Alfresco integration, our Django Alfresco integration does not use CMIS. After talking it over with one of our clients we decided it would make more sense to create a more general purpose CMIS API for Python that Django (and any other Python app) could leverage, rather than build CMIS support directly into the Django Alfresco integration.

Second, around the time I was putting together the Getting Started with CMIS tutorial, it struck me that there needed to be an API that didn’t have a lot of dependencies and was very easy to use. Otherwise, it’s too easy to get lost in the weeds and miss the whole point of CMIS: Easily working with rich content repositories, regardless of the underlying implementation.

Even if you’ve never worked with Python before, it is super easy to get started with cmislib. The install takes just three steps and the API should feel very natural to anyone who’s worked with a content repository before. Check it out.

Install

  1. If you don’t have Python installed already, do so. I’ve only tested on Python 2.6 so unless you’re looking to help test, stick with that.
  2. If you don’t have setuptools installed already, do so. It’s a nice tool to use for installing Python packages.
  3. Once setuptools is installed, type easy_install cmislib

That’s all there is to it. Now you’re ready to connect to your favorite CMIS-compliant repository.

Examples

There’s nothing in cmislib that is specific to any particular vendor. Once you give it your CMIS provider’s service URL and some credentials, it figures out where to go from there. But I haven’t tested with anything other than Alfresco yet, and this thing is still hot out of the oven. If you want to help test it against other CMIS 1.0cd04 repositories I’d love the help.

Anyway, let’s look at some examples using Alfresco’s public CMIS repository.

  1. From the command-line, start the Python shell by typing python then hit enter.
      Python 2.6.3 (r263:75183, Oct 22 2009, 20:01:16)
      [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
      Type "help", "copyright", "credits" or "license" for more information.
      >>>
  2. Import the CmisClient and Repository classes:
      >>> from cmislib.model import CmisClient, Repository
  3. Point the CmisClient at the repository’s service URL:
      >>> client = CmisClient('http://cmis.alfresco.com/s/cmis', 'admin', 'admin')
  4. Get the default repository for the service:
      >>> repo = client.getDefaultRepository()
      >>> repo.getRepositoryId()
      u'83beb297-a6fa-4ac5-844b-98c871c0eea9'
  5. Get the repository’s properties. This for-loop spits out everything cmislib knows about the repo.
      >>> repo.getRepositoryName()
      u'Main Repository'
      >>> info = repo.getRepositoryInfo()
      >>> for k,v in info.items():
      ...     print "%s:%s" % (k,v)
      ...
      cmisSpecificationTitle:Version 1.0 Committee Draft 04
      cmisVersionSupported:1.0
      repositoryDescription:None
      productVersion:3.2.0 (r2 2440)
      rootFolderId:workspace://SpacesStore/aa1ecedf-9551-49c5-831a-0502bb43f348
      repositoryId:83beb297-a6fa-4ac5-844b-98c871c0eea9
      repositoryName:Main Repository
      vendorName:Alfresco
      productName:Alfresco Repository (Community)

Once you’ve got the Repository object you can start working with folders.

  1. Create a new folder in the root. You should name yours something unique.
      >>> root = repo.getRootFolder()
      >>> someFolder = root.createFolder('someFolder')
      >>> someFolder.getObjectId()
      u'workspace://SpacesStore/91f344ef-84e7-43d8-b379-959c0be7e8fc'
  2. Then, you can create some content:
      >>> someFile = open('test.txt', 'r')
      >>> someDoc = someFolder.createDocument('Test Document', contentFile=someFile)
  3. And, if you want, you can dump the properties of the newly-created document (this is a partial list):
      >>> props = someDoc.getProperties()
      >>> for k,v in props.items():
      ...     print '%s:%s' % (k,v)
      ...
      cmis:contentStreamMimeType:text/plain
      cmis:creationDate:2009-12-18T10:59:26.667-06:00
      cmis:baseTypeId:cmis:document
      cmis:isLatestMajorVersion:false
      cmis:isImmutable:false
      cmis:isMajorVersion:false
      cmis:objectId:workspace://SpacesStore/2cf36ad5-92b0-4731-94a4-9f3fef25b479
  4. You can also use cmislib to run CMIS queries. Let’s find the doc we just created with a full-text search. (Note that I’m currently seeing a problem with Alfresco in which the CMIS service returns one less result than what’s really there):
      >>> results = repo.query("select * from cmis:document where contains('test')")
      >>> for result in results:
      ...     print result.getName()
      ...
      Test Document2
      example test script.js
  5. Alternatively, you can also get objects by their object ID or their path, like this:
      >>> someDoc = repo.getObjectByPath('/someFolder/Test Document')
      >>> someDoc.getObjectId()
      u'workspace://SpacesStore/2cf36ad5-92b0-4731-94a4-9f3fef25b479'
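If you would rather script those steps than type them into the shell, they can be folded into a small helper. This is a sketch that uses only the cmislib calls shown in the session above. The upload_document name and its parameters are made up, and the commented usage assumes the same public repository and credentials as the examples.

```python
def upload_document(repo, folder_name, doc_name, file_path):
    """Create a folder under the root, add one document to it, return its ID.

    `repo` is expected to behave like the cmislib Repository object used above.
    """
    root = repo.getRootFolder()
    folder = root.createFolder(folder_name)
    # Open the file only for the duration of the upload.
    with open(file_path, 'r') as content:
        doc = folder.createDocument(doc_name, contentFile=content)
    return doc.getObjectId()


# Usage, assuming cmislib is installed and the repository is reachable:
#
#   from cmislib.model import CmisClient
#   client = CmisClient('http://cmis.alfresco.com/s/cmis', 'admin', 'admin')
#   repo = client.getDefaultRepository()
#   upload_document(repo, 'someFolder', 'Test Document', 'test.txt')
```

Because the helper only depends on the methods it calls, it can also be exercised against a stub object in a unit test without a live repository.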

Set Python loose on your CMIS repository

These are just a few examples meant to give you a feel for the API. There are several other things you can do with cmislib. The package comes with documentation so look there for more info. If you find any problems and you want to pitch in, you can check out the source from Google Code and create issues there as well.

Give this a try and let me know what you think.

[UPDATE: I had the wrong URL for the Alfresco-hosted CMIS service. It’s fixed now.]

New Tutorial: Getting Started with CMIS

I’ve written a new tutorial on the proposed Content Management Interoperability Services (CMIS) standard called “Getting Started with CMIS”. The tutorial first takes you through an overview of the specification. Then, I do several examples. The examples start out using curl to make GET, PUT, POST, and DELETE calls against Alfresco to perform CRUD functions on folders, documents, and relationships in the repository. If you’ve been dabbling with CMIS and you’ve struggled to find examples, particularly of POSTs, here you go.
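To give a flavor of what one of those POSTs carries, here is a sketch that builds the Atom entry for creating a folder. It is not taken from the tutorial. The namespaces are the ones published with CMIS 1.0, so a repository on an earlier committee draft may expect different ones, and the folder_entry helper and the folder name are made up for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
CMIS_NS = "http://docs.oasis-open.org/ns/cmis/core/200908/"
CMISRA_NS = "http://docs.oasis-open.org/ns/cmis/restatom/200908/"

# Use readable prefixes instead of the generated ns0, ns1, ...
ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("cmis", CMIS_NS)
ET.register_namespace("cmisra", CMISRA_NS)


def folder_entry(name, type_id="cmis:folder"):
    """Build the Atom entry body for creating a folder called `name`."""
    entry = ET.Element("{%s}entry" % ATOM_NS)
    ET.SubElement(entry, "{%s}title" % ATOM_NS).text = name
    obj = ET.SubElement(entry, "{%s}object" % CMISRA_NS)
    props = ET.SubElement(obj, "{%s}properties" % CMIS_NS)
    # The object type tells the repository what kind of object to create.
    prop = ET.SubElement(
        props,
        "{%s}propertyId" % CMIS_NS,
        {"propertyDefinitionId": "cmis:objectTypeId"},
    )
    ET.SubElement(prop, "{%s}value" % CMIS_NS).text = type_id
    return ET.tostring(entry, encoding="utf-8")


print(folder_entry("someFolder").decode("utf-8"))
```

A body like this would be POSTed to the children collection of the parent folder with a Content-Type of application/atom+xml;type=entry. Reads, updates, and deletes follow the same pattern with GET, PUT, and DELETE.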

I used Alfresco Community built from head, but yesterday, Alfresco pushed a new Community release that supports CMIS 1.0 Committee Draft 04 so you can download that, use the hosted Alfresco CMIS repository, or spin up an EC2 image (once Luis gets it updated with the new Community release). If you don’t want to use Alfresco you should be able to use any CMIS repository that supports 1.0cd04. I tried some, but not all, of the command-line examples against the Apache Chemistry test server.

Once you’ve felt both the joy and the pain of talking directly to the CMIS AtomPub Binding, I take you through some very short examples using JavaScript and Java. For Java I show Apache Abdera, Apache Chemistry, and the Apache Chemistry TCK.

For the Chemistry TCK stuff, I’m using Alfresco’s CMIS Maven Toolkit which Gabriele Columbro and Richard McKnight put together. That inspired me to do my examples with Maven as well (plus, it’s practical–the Abdera and Chemistry clients have a lot of dependencies, and using Maven meant I didn’t have to chase any of those down).

So take a look at the tutorial, try out the examples with your favorite CMIS 1.0 repo, and let me know what you think. If you like it, pass it along to a friend. As with past tutorials, I’ve released it under Creative Commons Attribution-Share Alike.

[Updated to correct typo with Gabriele’s name. Sorry, Gab!]

Top Five Alfresco Roadmap Takeaways

Now that the last of the Alfresco Fall meetups has concluded in the US, I thought I’d summarize my takeaways. Overall I thought the events were really good. The informative sessions were well-attended. Everyone I talked to was glad they came and left with multiple useful takeaways.

Everyone has their own criteria for usefulness–for these events my personal set of highlights tends to focus on the roadmap. So here are my top five roadmap takeaways from the Washington, D.C., Atlanta, and LA meetups.

1. Repository unification strategy revealed

Now we know what Alfresco plans to do to resolve the “multiple repository” issue. In a nutshell: Alfresco will add functionality to the DM repository until it is on par with the AVM (See “What are the differences…“). What then? The AVM will continue to be supported, but if I were placing bets, I would not count on further AVM development past that point.

This makes a lot of sense to me. We do a lot of “WCM” for people using the Alfresco DM repository, especially when Alfresco is really being leveraged as a core repository. It also makes sense with Alfresco’s focus on CMIS (see next takeaway) because you can’t get to the AVM through CMIS.

2. CMIS, CMIS, CMIS

Clearly, CMIS is an important standard for Alfresco. (In fact, one small worry I have is that Alfresco seems to need CMIS more than any of the other players behind the standard, but I digress). Alfresco wants to be the go-to CMIS repository and believes that CMIS will be the primary way front-ends interact with rich content repositories. They’ve been on top of things by including early (read “unsupported”) implementations of the draft CMIS specification in both the Community and Enterprise releases, but there are a number of other CMIS-related items on the roadmap:

  • When the CMIS standard is out of public review, Alfresco will release a “CMIS runtime”. Details are sketchy, but my hunch is that Alfresco might be headed toward a Jackrabbit/Day CRX model where Alfresco’s CMIS runtime would be like a freely-available reference CMIS repository (Alfresco stripped of functionality not required to be CMIS compliant) and the full Alfresco repository would continue as we know it today. All speculation on my part.
  • Today deployments are either FSR (Alfresco-to-file system) or ASR (Alfresco AVM to Alfresco AVM). The latter case is used when you have a front-end that queries Alfresco for its content but you want to move that load off of your primary authoring server. In 3.2, the deployment service has gotten more general, so it’s one deployment system with multiple extensible endpoint options (file system, Alfresco AVM, CouchDB, Drupal, etc.). Alfresco will soon add AVM-to-CMIS deployment. That means you can deploy from AVM to the DM repository. Does it mean you can deploy to any CMIS repository? Not sure. If not, that might be a worthwhile extension.
  • One drawback to using DM for WCM currently is that there is not a good deployment system to move your content out of DM. It’s basically rsync or roll-your-own. On the roadmap is the ability to deploy from DM instead of AVM. This is one of the features the DM needs to get it functionally equivalent to what you get with the AVM. I wouldn’t expect it until 4.0.

3. Shift in focus to developers

Alfresco WCM has always been a decoupled system. When you install Alfresco WCM you don’t get a working web site out-of-the-box. You have to build it first using whatever technology you want, and then let Alfresco manage it. So, unlike most open source CMSs, it’s never been end-user focused in the sense of, “I’m a non-technical person and I want a web site, so I’m going to install Alfresco WCM”. Don’t expect that to change any time soon. Even Web Studio, which may not ever make it to an Enterprise release, is aimed at making Surf developers productive, not your Marketing team.

Alfresco is realizing that many people discard the Alfresco UI and build something custom, whether for document management, web content management, or some other content-centric use case. To make that easier, Alfresco is going to roll out development tools like Eclipse plug-ins, Maven compatibility, and Spring Roo integration (Uzi’s Spring Roo Screencast, Getting Started with Spring Roo).

Alfresco has also announced that web scripts, Web Studio, and the Surf framework will be licensed under Apache and there were allusions to “making Surf part of Spring” or “using Surf as a Tiles replacement”. I haven’t seen or heard much from the Spring folks on this and I noticed these topics were softened between DC and LA, but that could have just been based on who was doing the speaking (see “What do you think of Alfresco’s multi-event approach?”).

Essentially what’s going on here is that Alfresco wants all of your future content-centric apps and even web sites to be “CMIS applications”, and Alfresco believes it can provide the best, most productive development platform for writing CMIS apps.

4. Stuff that may never happen but would be cool if it did

This is a grab bag of things that are being considered for the roadmap, but are far enough out to be uncertain. Regardless of if/when, these are sometimes useful data points for where the product is headed directionally.

  • Native XML support. Right now Alfresco can manage XML files, obviously, but, unlike a native XML database like eXist or MarkLogic, the granularity stops with the file. Presumably, native XML support would allow XML validation, XPath and XQuery expressions running against XML file content, and better XSLT support.
  • Apache Solr. I think the goal here is to get better advanced search capability such as support for faceted search, which is something Solr knows how to do.
  • Repository sharding. This would be the ability to partition the repository along some (arbitrary?) dimension. Sharding is attractive to people who have very, very large repositories and want to distribute the data load across multiple physical repositories, yet retain the ability to treat the federation as one logical repo.

5. Timeline

Talk to Alfresco if you need this to be precise, but here’s the general idea of the timeline through 4.0 based on the slides I saw:

  • 3.2 Enterprise 12/2009
  • CMIS 1.0 Release Spring 2010
  • 3.3 Enterprise 1H 2010
  • 4.0 Enterprise 12/2010 (more likely 2011)

Thanks, Alfresco, and everyone who attended

Lastly, thanks to Nancy Garrity and the rest of the team that put these events together. I enjoyed presenting on Alfresco-Drupal in Atlanta and giving the Alfresco Best Practices talk (Alfresco Content Community login required).

I always enjoy the informal networking that happens at these events. There’s such a diverse group of experience levels, use cases, and businesses–it makes for interesting conversations. And, as usual, thanks to the book and blog readers who approached me. It always makes me happy to hear that something on your project was better for having read something I wrote. It was good meeting you all and I’m looking forward to the next get-together.

Django + Alfresco was a winning combination for retailer’s intranet

Last week I spent some time with one of our clients talking about what it’s been like to live with their Intranet platform based on Django and Alfresco. The conversation got me really excited about what they’ve been able to do since the original implementation and where they are heading.

The client is a well-known, high-end retailer based in Dallas. About a year ago they engaged Optaros to replatform their intranet from a legacy Java portal product to something more agile. They had seen Alfresco and liked it as a core repository, but needed something for the presentation tier (See “Alfresco User Interface: What are my options?“).

The Optaros team worked with the client to consider many options, including open source Java portal servers. The client felt like they needed something lighter and more flexible than a portal server. They were willing to do a lot of the presentation work themselves in exchange for complete design freedom, but they still wanted enough of a framework to be highly productive. The winning solution turned out to be Django.

Python? No problem.

I was initially worried that introducing a Python-based framework into a Java shop was going to be a problem but they weren’t married to Java. Our team got them up-to-speed quickly and they never looked back. It also helped that the client’s intranet sites were very communication-centric which matched up well with Django’s newspaper heritage.

Here’s how they use the solution in a nutshell:

  • Content owners use Alfresco Explorer to upload HTML chunks, office documents, and images, set metadata, and submit content for review. This triggers any number of rules that automatically process the changed content (e.g., creating thumbnails, extracting metadata, converting images to a consistent type, creating PDFs from office documents).
  • Content owners and reviewers can use Alfresco’s “custom views” to preview the content chunk in the context of the front-end site.
  • Site designers lay out site pages and create components using the Django template system, CSS, jQuery, and other front-end libraries.
  • Content publishers use the Django administration UI to map areas on the site to categories, folders, and objects in the Alfresco repository–Alfresco has no idea where or how the chunks are being used. This means the repository tier is truly decoupled from the presentation tier, allowing the client to reuse content across multiple areas of the site and across multiple sites within the enterprise.
  • Designers leverage a Django tag library to create dynamic areas of a page (e.g., when the page is rendered, retrieve all of the content chunks in this particular category from the repository). Django calls Alfresco web scripts to get and post data. The web scripts respond with serialized Django XML which Django caches and then deserializes into Django objects that the front-end can work with.
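The fetch, cache, and deserialize flow in that last bullet can be sketched roughly as follows. This is not the client’s code: the function names, the in-memory dict cache, the injected `fetch` callable, and the sample `cms.chunk` model are all assumptions made for illustration, and the parsing is done with the standard library instead of Django so the sketch runs on its own. In a real Django project I would expect `django.core.serializers.deserialize("xml", data)` and Django’s cache framework to do this work.

```python
import xml.etree.ElementTree as ET

# Hypothetical web script response in Django's XML serialization format.
SAMPLE_XML = """<django-objects version="1.0">
  <object pk="1" model="cms.chunk">
    <field type="CharField" name="title">Welcome</field>
    <field type="TextField" name="body">Hello from the repository</field>
  </object>
</django-objects>"""

# Simple stand-in for a real cache: raw XML keyed by web script URL.
_cache = {}


def parse_django_xml(xml_text):
    """Turn serialized Django XML into a list of plain dicts."""
    objects = []
    for obj in ET.fromstring(xml_text).findall("object"):
        fields = {f.get("name"): f.text for f in obj.findall("field")}
        objects.append(
            {"model": obj.get("model"), "pk": obj.get("pk"), "fields": fields}
        )
    return objects


def get_chunks(url, fetch):
    """Return the objects behind `url`, calling `fetch(url)` only on a miss.

    `fetch` is any callable that returns the XML text for a URL, for
    example a thin wrapper around an HTTP GET to a web script.
    """
    if url not in _cache:
        _cache[url] = fetch(url)
    return parse_django_xml(_cache[url])
```

Injecting `fetch` keeps the repository call out of the sketch, which is also handy for trying the caching behavior without a running Alfresco.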

Separate concerns, play to strengths

The thing to notice about the Alfresco piece is how it sticks to core Alfresco capabilities: Metadata, rules, search, basic workflows, transformers/extractors, presentation templates, web scripts, DM repository. This is straight out of the Alfresco best practices playbook and aligns the client well with Alfresco product direction. A nice enhancement would be to refactor the Django-Alfresco integration to use CMIS which is something we are considering for the open source version of the integration (Screencast, Code).

Agile intranet, happy team

Since the initial rollout, the client has been able to make changes and roll out new sites quickly and easily thanks to the productivity inherent in the Django framework and the clean separation between the front-end app and the repository. Unexpected benefits the client mentioned were how fast they can add new features to the administrative UI (a core admin UI gets built for you automatically by Django) and the ease with which the development team can stand up a new environment.

The language the client team used to describe their work since the rollout summed it up best. They were using words like “beautiful” and “a real pleasure to work with”. When was the last time you heard those sentiments expressed about a WCM implementation?