Category: CMIS

Content Management Interoperability Services (CMIS) is an OASIS standard that attempts to make rich content repositories work together better. The spec provides for both SOAP and RESTful bindings as well as a SQL-like query language.

cmislib: A CMIS client library for Python

I’ve started a new project on Google Code called cmislib. It is an interoperable client library for CMIS in Python that uses the Restful AtomPub Binding of a CMIS provider to perform CRUD and query functions on the repository.

I created it for a couple of reasons. First, it’s been bugging me that, unlike our Drupal Alfresco integration, our Django Alfresco integration does not use CMIS. After talking it over with one of our clients we decided it would make more sense to create a more general purpose CMIS API for Python that Django (and any other Python app) could leverage, rather than build CMIS support directly into the Django Alfresco integration.

Second, around the time I was putting together the Getting Started with CMIS tutorial, it struck me that there needed to be an API that didn’t have a lot of dependencies and was very easy to use. Otherwise, it’s too easy to get lost in the weeds and miss the whole point of CMIS: Easily working with rich content repositories, regardless of the underlying implementation.

Even if you’ve never worked with Python before, it is super easy to get started with cmislib. The install is less than 3 steps and the API should feel very natural to anyone that’s worked with a content repository before. Check it out.

Install

  1. If you don’t have Python installed already, do so. I’ve only tested on Python 2.6 so unless you’re looking to help test, stick with that.
  2. If you don’t have setuptools installed already, do so. It’s a nice tool to use for installing Python packages.
  3. Once setuptools is installed, type easy_install cmislib

That’s all there is to it. Now you’re ready to connect to your favorite CMIS-compliant repository.

Examples

There’s nothing in cmislib that is specific to any particular vendor. Once you give it your CMIS provider’s service URL and some credentials, it figures out where to go from there. But I haven’t tested with anything other than Alfresco yet, and this thing is still hot out of the oven. If you want to help test it against other CMIS 1.0cd04 repositories I’d love the help.

Anyway, let’s look at some examples using Alfresco’s public CMIS repository.

  1. From the command-line, start the Python shell by typing python then hit enter.
  2. Python 2.6.3 (r263:75183, Oct 22 2009, 20:01:16)
    GCC 4.2.1 (Apple Inc. build 5646)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
  3. Import the CmisClient and Repository classes:
  4. >>> from cmislib.model import CmisClient, Repository
  5. Point the CmisClient at the repository’s service URL
  6. >>> client = CmisClient('http://cmis.alfresco.com/s/cmis', 'admin', 'admin')
  7. Get the default repository for the service
  8. >>> repo = client.getDefaultRepository()
    >>> repo.getRepositoryId()
    u'83beb297-a6fa-4ac5-844b-98c871c0eea9'
  9. Get the repository’s properties. This for-loop spits out everything cmislib knows about the repo.
  10. >>> repo.getRepositoryName()
        u'Main Repository'
    >>> info = repo.getRepositoryInfo()
    >>> for k,v in info.items():
        ...     print "%s:%s" % (k,v)
        ...
        cmisSpecificationTitle:Version 1.0 Committee Draft 04
        cmisVersionSupported:1.0
        repositoryDescription:None
        productVersion:3.2.0 (r2 2440)
        rootFolderId:workspace://SpacesStore/aa1ecedf-9551-49c5-831a-0502bb43f348
        repositoryId:83beb297-a6fa-4ac5-844b-98c871c0eea9
        repositoryName:Main Repository
        vendorName:Alfresco
        productName:Alfresco Repository (Community)

Once you’ve got the Repository object you can start working with folders.

  1. Create a new folder in the root. You should name yours something unique.
  2. >>> root = repo.getRootFolder()
    >>> someFolder = root.createFolder('someFolder')
    >>> someFolder.getObjectId()
    u'workspace://SpacesStore/91f344ef-84e7-43d8-b379-959c0be7e8fc'
  3. Then, you can create some content:
  4. >>> someFile = open('test.txt', 'r')
    >>> someDoc = someFolder.createDocument('Test Document', contentFile=someFile)
  5. And, if you want, you can dump the properties of the newly-created document (this is a partial list):
  6. >>> props = someDoc.getProperties()
    >>> for k,v in props.items():
    ...     print '%s:%s' % (k,v)
    ...
    cmis:contentStreamMimeType:text/plain
    cmis:creationDate:2009-12-18T10:59:26.667-06:00
    cmis:baseTypeId:cmis:document
    cmis:isLatestMajorVersion:false
    cmis:isImmutable:false
    cmis:isMajorVersion:false
    cmis:objectId:workspace://SpacesStore/2cf36ad5-92b0-4731-94a4-9f3fef25b479
  7. You can also use cmislib to run CMIS queries. Let’s find the doc we just created with a full-text search. (Note that I’m currently seeing a problem with Alfresco in which the CMIS service returns one less result than what’s really there):
  8. >>> results = repo.query("select * from cmis:document where contains('test')")
    >>> for result in results:
    ...     print result.getName()
    ...
    Test Document2
    example test script.js
  9. Alternatively, you can also get objects by their object ID or their path, like this:
  10. >>> someDoc = repo.getObjectByPath('/someFolder/Test Document')
    >>> someDoc.getObjectId()
    u'workspace://SpacesStore/2cf36ad5-92b0-4731-94a4-9f3fef25b479'

Set Python loose on your CMIS repository

These are just a few examples meant to give you a feel for the API. There are several other things you can do with cmislib. The package comes with documentation so look there for more info. If you find any problems and you want to pitch in, you can check out the source from Google Code and create issues there as well.

Give this a try and let me know what you think.

[UPDATE: I had the wrong URL for the Alfresco-hosted CMIS service. It’s fixed now.]

New Tutorial: Getting Started with CMIS

I’ve written a new tutorial on the proposed Content Management Interoperability Services (CMIS) standard called, “Getting Started with CMIS“. The tutorial first takes you through an overview of the specification. Then, I do several examples. The examples start out using curl to make GET, PUT, POST, and DELETE calls against Alfresco to perform CRUD functions on folders, documents, and relationships in the repository. If you’ve been dabbling with CMIS and you’ve struggled to find examples, particularly of POSTs, here you go.

I used Alfresco Community built from head, but yesterday, Alfresco pushed a new Community release that supports CMIS 1.0 Committee Draft 04 so you can download that, use the hosted Alfresco CMIS repository, or spin up an EC2 image (once Luis gets it updated with the new Community release). If you don’t want to use Alfresco you should be able to use any CMIS repository that supports 1.0cd04. I tried some, but not all, of the command-line examples against the Apache Chemistry test server.

Once you’ve felt both the joy and the pain of talking directly to the CMIS AtomPub Binding, I take you through some very short examples using JavaScript and Java. For Java I show Apache Abdera, Apache Chemistry, and the Apache Chemistry TCK.

For the Chemistry TCK stuff, I’m using Alfresco’s CMIS Maven Toolkit which Gabriele Columbro and Richard McKnight put together. That inspired me to do my examples with Maven as well (plus, it’s practical–the Abdera and Chemistry clients have a lot of dependencies, and using Maven meant I didn’t have to chase any of those down).

So take a look at the tutorial, try out the examples with your favorite CMIS 1.0 repo, and let me know what you think. If you like it, pass it along to a friend. As with past tutorials, I’ve released it under Creative Commons Attribution-Share Alike.

[Updated to correct typo with Gabriele’s name. Sorry, Gab!]

Top Five Alfresco Roadmap Takeaways

Now that the last of the Alfresco Fall meetups has concluded in the US, I thought I’d summarize my takeaways. Overall I thought the events were really good. The informative sessions were well-attended. Everyone I talked to was glad they came and left with multiple useful takeaways.

Everyone has their own criteria for usefulness–for these events my personal set of highlights tend to focus on the roadmap. So here are my top five roadmap takeaways from the Washington, D.C., Atlanta, and LA meetups.

1. Repository unification strategy revealed

Now we know what Alfresco plans to do to resolve the “multiple repository” issue. In a nutshell: Alfresco will add functionality to the DM repository until it is on par with the AVM (See “What are the differences…“). What then? The AVM will continue to be supported, but if I were placing bets, I would not count on further AVM development past that point.

This makes a lot of sense to me. We do a lot of “WCM” for people using the Alfresco DM repository, especially when Alfresco is really being leveraged as a core repository. It also makes sense with Alfresco’s focus on CMIS (see next takeaway) because you can’t get to the AVM through CMIS.

2. CMIS, CMIS, CMIS

Clearly, CMIS is an important standard for Alfresco. (In fact, one small worry I have is that Alfresco seems to need CMIS more than any of the other players behind the standard, but I digress). Alfresco wants to be the go-to CMIS repository and believes that CMIS will be the primary way front-ends interact with rich content repositories. They’ve been on top of things by including early (read “unsupported”) implementations of the draft CMIS specification in both the Community and Enterprise releases, but there a number of other CMIS-related items on the roadmap:

  • When the CMIS standard is out of public review, Alfresco will release a “CMIS runtime”. Details are sketchy, but my hunch is that Alfresco might be headed toward a Jackrabbit/Day CRX model where Alfresco’s CMIS runtime would be like a freely-available reference CMIS repository (Alfresco stripped of functionality not required to be CMIS compliant) and the full Alfresco repository would continue as we know it today. All speculation on my part.
  • Today deployments are either FSR (Alfresco-to-file system) or ASR (Alfresco AVM to Alfresco AVM). The latter case is used when you have a front-end that queries Alfresco for its content but you want to move that load off of your primary authoring server. In 3.2, the deployment service has gotten more general, so it’s one deployment system with multiple extensible endpoint options (file system, Alfresco AVM, CouchDB, Drupal, etc.). Alfresco will soon add AVM-to-CMIS deployment. That means you can deploy from AVM to the DM repository. Does it mean you can deploy to any CMIS repository? Not sure. If not, that might be a worthwhile extension.
  • One drawback to using DM for WCM currently is that there is not a good deployment system to move your content out of DM. It’s basically rsync or roll-your-own. On the roadmap is the ability to deploy from DM instead of AVM. This is one of the features the DM needs to get it functionally equivalent to what you get with the AVM. I wouldn’t expect it until 4.0.

3. Shift in focus to developers

Alfresco WCM has always been a decoupled system. When you install Alfresco WCM you don’t get a working web site out-of-the-box. You have to build it first using whatever technology you want, and then let Alfresco manage it. So, unlike most open source CMS’, it’s never been end-user focused in the sense of, “I’m a non-technical person and I want a web site, so I’m going to install Alfresco WCM”. Don’t expect that to change any time soon. Even Web Studio, which may not ever make it to an Enterprise release, is aimed at making Surf developers productive, not your Marketing team.

Alfresco is realizing that many people discard the Alfresco UI and build something custom, whether for document management, web content management, or some other content-centric use case. To make that easier, Alfresco is going to rollout development tools like Eclipse plug-ins, Maven compatibility, and Spring Roo integration (Uzi’s Spring Roo Screencast, Getting Started with Spring Roo ).

Alfresco has also announced that web scripts, web studio, and the Surf framework will be licensed under Apache and there were allusions to “making Surf part of Spring” or “using Surf as a Tiles replacement”. I haven’t seen or heard much from the Spring folks on this and I noticed these topics were softened between DC and LA, but that could have just been based on who was doing the speaking (see “What do you think of Alfresco’s multi-event approach?“).

Essentially what’s going on here is that Alfresco wants all of your future content-centric apps and even web sites to be “CMIS applications”, and Alfresco believes it can provide the best, most productive development platform for writing CMIS apps.

4. Stuff that may never happen but would be cool if it did

This is a grab bag of things that are being considered for the roadmap, but are far enough out to be uncertain. Regardless of if/when, these are sometimes a useful data point for where the product is headed directionally.

  • Native XML support. Right now Alfresco can manage XML files, obviously, but, unlike a native XML database like eXist or MarkLogic, the granularity stops with the file. Presumably, native XML support would allow XML validation, XPath and XQuery expressions running against XML file content, and better XSLT support.
  • Apache Solr. I think the goal here is to get better advanced search capability such as support for faceted search, which is something Solr knows how to do.
  • Repository sharding. This would be the ability to partition the repository along some (arbitrary?) dimension. Sharding is attractive to people who have very, very large repositories and want to distribute the data load across multiple physical repositories, yet retain the ability to treat the federation as one logical repo.

5. Timeline

Talk to Alfresco if you need this to be precise, but here’s the general idea of the timeline through 4.0 based on the slides I saw:

  • 3.2 Enterprise 12/2009
  • CMIS 1.0 Release Spring 2010
  • 3.3 Enterprise 1H 2010
  • 4.0 Enterprise 12/2010 (more likely 2011)

Thanks, Alfresco, and everyone who attended

Lastly, thanks to Nancy Garrity and the rest of the team that put these events together. I enjoyed presenting on Alfresco-Drupal in Atlanta and giving the Alfresco Best Practices talk (Alfresco Content Community login required).

I always enjoy the informal networking that happens at these events. There’s such a diverse group of experience levels, use cases, and businesses–it makes for interesting conversations. And, as usual, thanks to the book and blog readers who approached me. It always makes me happy to hear that something on your project was better for having read something I wrote. It was good meeting you all and I’m looking forward to the next get-together.

Drupal + Alfresco webinar slides available

People want intranets that are fun and easy to use, full of compelling content relevant to their job, and enabled with social and community features to help them discover connections with other teams, projects, and colleagues. IT wants something that’s lightweight and flexible enough to respond to the needs of the business that won’t cost a fortune.

That’s why Drupal + Alfresco is a great combination for things like intranets like the one Optaros built for Activision and why we had a record-breaking turnout for the Drupal + Alfresco webinar Chris Fuller and I did today. Thanks to everyone who came and asked good questions. I’ve posted the slides. Alfresco recorded the webinar so they’ll make it available soon, I’m sure. When that happens, I’ll update the post with a link. Until then, enjoy the slides.

[UPDATE: Fixed the slideshare link (thanks, David!) and added the links to the webinar recording below]

1. Streaming recording link:
https://alfresco.webex.com/alfresco/lsr.php?AT=pb&SP=TC&rID=42774837&act=pb&rKey=b44130d69cc9ec5f

2. Download recording link:
https://alfresco.webex.com/alfresco/ldr.php?AT=dw&SP=TC&rID=42774837&act=pf&rKey=c50049ac82e1220a

Apache Chemistry: A Proposed Reference CMIS Implementation

Kas Thomas, over at CMSWatch, says a new Apache project is in the works. Chemistry is a reference CMIS implementation that has committers from Day, Alfresco, and Nuxeo. The goal is to provide a vendor-neutral reference implementation and compatibility tests around the proposed CMIS specification.

It looks like this will be a standard set of REST APIs and SOAP services that implement the CMIS spec and hook into a back-end repository. Not surprisingly, the first back-end the Chemistry team plans to support is Apache Jackrabbit, Apache’s reference implementation of the JCR API.