I’ll be in Chicago tomorrow for the Alfresco Meetup. I’ll be speaking during the Barcamp on Alfresco and Drupal integration with CMIS (module, screencast). I’ll also have the Alfresco-Django integration running on my laptop. I may not have time to show Alfresco-Django during my slot, but I’ll be happy to stick around afterward to do informal demos and talk about either integration. I’d like your feedback on both.
Kas Thomas, over at CMSWatch, says a new Apache project is in the works. Chemistry is a reference CMIS implementation that has committers from Day, Alfresco, and Nuxeo. The goal is to provide a vendor-neutral reference implementation and compatibility tests around the proposed CMIS specification.
It looks like this will be a standard set of REST APIs and SOAP services that implement the CMIS spec and hook into a back-end repository. Not surprisingly, the first back-end the Chemistry team plans to support is Apache Jackrabbit, Apache’s reference implementation of the JCR API.
Alfresco Share is a team-centric collaboration tool. It’s really cool and our clients have been reacting very positively to it. When customers see the AJAXalicious UI, a common reaction is to want to take the next 5 projects on their list and “do them on Share”.
In cases where the functional requirements closely resemble team collaboration, that can be a great choice. In others, it’s an abuse of the tool. As with a lot of things in software and life, just because you can doesn’t mean you should. (Remind me to tell you the story about building a tennis court reservation system in Lotus Notes some time.)
Anyway, let’s assume you’ve got a set of requirements that reasonably resembles team-based collaboration, but some of Share’s tools (wiki, blog, document library, calendar, and recently, bookmarks) don’t work exactly the way you need them to. I’m not talking about adding new, self-contained custom components. This is specifically about customizing the out-of-the-box Share components. With that in mind, here are five areas where even simple Share customization efforts could take longer than you might think.
In its current incarnation, if you have custom metadata you want to display when looking at document detail, that’s code you have to write. Alfresco’s Mike Hatfield said, via Twitter, that the 3.2 Forms Service will make this better, so that’s good. If your Share sites contain simple documents that use only out-of-the-box metadata, this won’t be an issue for you.
Currently, in Share, there are a couple of places where the jBPM workflow engine is used. First, when you invite someone to a site, that kicks off a workflow. Second, you can “assign” an advanced workflow to a document in the document library.
The first issue is that the workflow submission dialog includes only the two out-of-the-box, document-centric workflows, “Ad hoc” and “Review and Approve”. It won’t show any custom workflows you’ve deployed. The workflow modal is not inspecting the workflow UI configuration like the web client does, so even if you got your workflows to show up in that list, the form wouldn’t have the custom workflow metadata you need to launch your custom workflow properly.
When you log in to Share, you’ll see a “My Tasks” dashlet. This gives you hope that maybe that dashlet could manage tasks for any workflow. Unfortunately, it only works with the “invite user” workflow and the two document-centric, OOTB workflows.
Long story short, Share isn’t set up to work with custom workflows out-of-the-box. If you’ve got custom workflows that need to work in the Share context, you’ll need to write your own dialogs for launching the workflow and your own component for managing workflow tasks.
YUI Bubbling Events
Share makes heavy use of YUI Bubbling Events. This results in a great end-user experience–the Share components communicate with each other and refresh themselves via AJAX without page refreshes. But it does mean there’s a bit of a learning curve when following the same pattern to implement your customizations if your team has never worked with the bubbling library before. It can get kind of thick in places.
Incidentally, all of the YUI stuff is part of Share, not Surf, which is the framework used to build Share. If you’re building your own Surf app you’ll need to grab the YUI libraries (or any other libraries you want to use) yourself. It’s the same for the Flash pieces (multi-document upload, document preview). It keeps Surf light, but if you want to incorporate that kind of functionality into your Surf app, some assembly will be required.
The main CSS file for Share is in the themes directory. But changing that will only affect the global dashboard header and the site dashboard header. If you want to change the theme for everything in Share, including individual tools, you have to change each tool’s CSS file. Those CSS files live in the “modules” directory. It would be nice if it were easier to implement site-wide or global themes.
Adding your own Components/Tools
The impact of these issues is lessened somewhat if you are adding your own components or tools instead of customizing what’s already there. It’s easy to write your own dashlets that show up on the global dashboard or the site dashboard. And with a little work, you can write dashlets that talk to each other using YUI Bubbling Events, just like the OOTB dashlets. The area for improvement is in skinning, configuring, and extending the out-of-the-box tools.
Share Your Thoughts on Share
There’s no doubt that Share is a cool application for team-based collaboration. I didn’t expect it to be configurable to the Nth degree right away, and we may be pushing the limits of its intended use. I’m curious to hear from others who have been tweaking the app: Have you worked through these issues? Are there other examples of specific extension points Alfresco could address to make your lives easier?
People often need to build a custom user interface on top of the Alfresco repository and I see a lot of people asking general questions about how to do it. There are lots of options to consider. Here are four options for creating a user interface on top of Alfresco, at a high level:
Option 1: Use your favorite programming language and/or framework to talk to Alfresco via REST or Web Services. PHP? Python? Java? Flex? Whatever, it’s up to you. The REST API is nice because if you can’t find a URL that does what you need out-of-the-box, you can always roll your own with the web script framework. This option offers the most flexibility and creative freedom, but of course you might end up building constructs or components that you would have gotten “for free” from a higher-level framework. Optaros’ streamlined web client, DoCASU, built on Ext-JS, is one freely-available example of a custom UI on top of Alfresco, but there are others.
Option 2: Use Alfresco’s Surf framework. Alfresco’s Surf framework is just that–it’s a framework. Don’t confuse it with Alfresco Share which is a team-centric collaboration client built on top of Surf. And, don’t assume that just because a piece of functionality is in Share it is available to you in the lower-level Surf framework. You may have to do some extra work to get some of the cool stuff in Share to work in your pure Surf app. Also realize that Surf is brand new and still maturing. You’ll be quickly disappointed if you hold it to the same standard as a more widely-used, well-established framework like Seam or Django. Surf is a good option for quick, Alfresco-centric solutions, especially if you think you might want to leverage Alfresco’s browser-based site assembly tool, Web Studio, at some point in the future. (See Do-it-yourself Alfresco Surf Code Camp).
Option 3: Customize the Alfresco “Explorer” web client. There are varying degrees to which you can customize the web client. On one end of the spectrum you’ve got Freemarker “presentation templates” followed closely by XML configuration. On the other end of the spectrum you’ve got more elaborate enhancements you can make using JavaServer Faces (JSF). Customizing the Alfresco Explorer web client should only be considered if you can keep your enhancements to an absolute minimum because:
- Alfresco is moving away from JSF in favor of Surf-based clients. The Explorer client will continue to be around, but I wouldn’t expect major efforts to be focused on that client going forward.
- JSF-based customizations of the web client can be time-consuming and potentially complex, particularly if you are new to JSF.
- For most solutions, you’ll get more customer satisfaction bang out of your coding buck by building a purpose-built, eye-catching, UI designed with your specific use cases in mind than you will by starting with the general-purpose web client and extending from there.
Option 4: Use a portal, community, or WCM platform. This includes PHP-based projects like Drupal (Drupal CMIS Screencast) or Joomla as well as Java-based projects like Liferay and JBoss Portal. This is a good option if you have requirements that match up well with the built-in (or easily added-on) capabilities of those platforms.
It’s worth talking about Java portal servers specifically. I think people are struggling a bit to find The Best Way to integrate Alfresco with a portal. Of course there probably is no single approach that will fit every situation but I think Alfresco (with help from the community) could do more to provide best practices.
Here are the options you have when integrating with a portal:
Portal Option 1: Configure Alfresco to be the replacement JSR-170 repository for the portal. This option seems like more trouble than it is worth. If all you need is what you can get out of JSR-170, you might as well use the already-integrated Jackrabbit repository that most open source portals ship with these days unless you have good reasons not to. I’m open to having my mind changed on this one, but it seems like if you want to use Alfresco and a portal, you’ve got bigger plans that are probably going to require custom portlets anyway.
Portal Option 2: Run Alfresco and the portal in the same JVM (post). This is NOT recommended if you need to scale beyond a small departmental solution and, really, I think with the de-coupling of the web script engine we should consider this one deprecated at this point.
Portal Option 3: Run the Alfresco web script engine and the portal in the same JVM. Like the previous option, this gives you the ability to write web scripts that are wrapped in a portlet but it cuts down on the size of the web app significantly and it frees up your portal to scale independently of the Alfresco repository tier. It’s a fast development cycle once you get it set up. But I haven’t seen great instructions for setting it up yet. Alfresco should document this on their wiki if they are going to support this pattern.
Portal Option 4: Write your own portlets that make service calls. This is the “cleanest” approach because it treats Alfresco like any other back-end you might want to integrate with from the portal. You write custom portlets and have them talk to Alfresco via REST or SOAP. You’ll have to decide how you want to handle authentication with Alfresco.
What about CMIS?
CMIS fits under the “Option 1: Use your favorite programming language” and “Portal Option 4: Write your own portlets” categories. You can make CMIS calls to Alfresco using both REST and SOAP from your own custom code, portlet or otherwise. The nice thing about CMIS is that you can use it to abstract the underlying repository so that (in theory) your front-end code will work with different CMIS-compliant back-ends. Just realize that CMIS isn’t a fully-ratified standard yet and although a CMIS implementation is in the Enterprise version of Alfresco, it isn’t clear to me whether or not you’d be supported if you had a problem. (The last response I saw on this specific question was a Peter Monks tweet saying, “I don’t think so”).
The CMIS standard should be approved by the end of the year, and if Alfresco’s past performance is an indicator of the future, they’ll be the first to market with a production-ready, fully-supported CMIS implementation based on the final spec.
Pick your poison
Those are the options as I see them. Each one has trade-offs. Some may become more or less attractive over time as languages, frameworks, and the state of the art evolve. Ultimately, you’re going to have to evaluate which one fits your situation the best. You may have a hard time making a decision, but you have to admit that having to choose from several options is a nice problem to have.
Curl is a useful tool for all sorts of things. One specific example of when it comes in handy is when you are developing Alfresco web scripts. On a Surf project, for example, you might divide into a “Surf tier” team and a “Repository tier” team. Once you’ve agreed on the interface, including both the URLs and the format of the data that goes back-and-forth between the tiers, the two teams can start cranking out code in parallel.
If you’re on the repo team, you need a way to test your API, and you probably don’t have a UI to test it with (that’s what the other team’s working on). There are lots of solutions to this but curl is really handy and it runs everywhere (on Windows, use Cygwin).
This post isn’t intended to be a full reference or how-to for curl, and obviously, you can use curl for a lot of tasks that involve HTTP, not just Alfresco web scripts. Here are some quick examples of using curl with Alfresco web scripts to get you going.
Get a ticket
It’s highly likely that your web script will require authentication. So the first thing you do is call the login web script to get a ticket.
curl -v "http://localhost:8080/alfresco/service/api/login?u=admin&pw=somepassword"
Alfresco will respond with something like:
<?xml version="1.0" encoding="UTF-8"?>
<ticket>TICKET_e46107058fdd2760441b44481a22e7498e7dbf66</ticket>
Now you can take that ticket and append it to your subsequent web script calls.
Any web script you’ve got that accepts GET can be tested using the same simple syntax.
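If you’d rather script the ticket handling than copy and paste it, here is a minimal Python sketch. It assumes the login response carries the ticket as the text of a root ticket element, and the helper names (extract_ticket, with_ticket) are made up for illustration. Nothing here talks to the server; it only parses a response you already have and builds the URL for the next call.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def extract_ticket(login_response_xml: str) -> str:
    """Pull the ticket out of the login web script's XML response.

    Assumption: the ticket is the text of the root <ticket> element.
    """
    # Encode to bytes so the XML declaration's encoding is honored.
    root = ET.fromstring(login_response_xml.encode("utf-8"))
    return (root.text or "").strip()


def with_ticket(url: str, ticket: str) -> str:
    """Return the URL with alf_ticket=<ticket> appended to its query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("alf_ticket", ticket))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Hypothetical response body; a real ticket is much longer.
    sample = '<?xml version="1.0" encoding="UTF-8"?>\n<ticket>TICKET_abc123</ticket>'
    ticket = extract_ticket(sample)
    print(with_ticket("http://localhost:8080/alfresco/service/someco/someScript", ticket))
```

The resulting URL can be passed straight to curl or to any HTTP client.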
Post JSON to your custom web script
If all you had were GETs you’d probably just test them in your browser. POSTs, PUTs and DELETEs require a little more doing to test. You’re going to want to test those web scripts so that when the front-end team has their stuff ready, it all comes together without a lot of fuss.
So let’s say you’ve got a web script that the front-end will be POSTing JSON to. To test it out, create a file with some test JSON, then post it to the web script using curl, like this:
curl -v -X POST "http://localhost:8080/alfresco/service/someco/someScript?alf_ticket=TICKET_e46107058fdd2760441b44481a22e7498e7dbf66" -H "Content-Type: application/json" -d @/Users/jpotts/test.json
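Inside the web script’s JavaScript controller, the posted JSON can then be evaluated and inspected (here, json is assumed to hold the posted JSON string):
var postedObject = eval('(' + json + ')');
logger.log("Customer name:" + postedObject.customerName);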
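The same POST can be put together from a script, which is handy once you have more than a couple of test payloads. This is only a sketch using Python’s standard library: the build_json_post helper, the YOUR_TICKET placeholder, and the sample customer name are all made up, and the actual send is left commented out so the code runs without a server.

```python
import json
import urllib.request


def build_json_post(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying payload as JSON."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# YOUR_TICKET is a placeholder for the ticket returned by the login call.
request = build_json_post(
    "http://localhost:8080/alfresco/service/someco/someScript?alf_ticket=YOUR_TICKET",
    {"customerName": "Example Customer"},
)

# Uncomment to actually send it to a running repository:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to check the headers and body before anything hits the repository.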
Run a CMIS query
With 3.0, Alfresco added an implementation of the proposed CMIS spec to the product. CMIS gives you a Web Services API, a RESTful API, and a SQL-like query language. Once you figure out the syntax, it’s easy to post CMIS queries to the repository. You can wrap the CMIS query in XML:
<cmis:query xmlns:cmis="http://www.cmis.org/2008/05" >
<cmis:statement><![CDATA[select * from cm_content where cm_name like '%Foo%']]></cmis:statement>
</cmis:query>
Then post it using the same syntax as you saw previously, but with a different Content-Type in the header, like this:
curl -v -X POST "http://localhost:8080/alfresco/service/api/query?alf_ticket=TICKET_e46107058fdd2760441b44481a22e7498e7dbf66" -H "Content-Type: application/cmisquery+xml" -d @/Users/jpotts/cmis-query.xml
Alfresco will respond with ATOM, but it’s a little verbose so I won’t take up space here to show you the result. Also, I noticed this bombed when I ran it against 3.1 Enterprise but I haven’t drilled down on why yet.
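If you want to generate the query document instead of hand-editing a file, something like the following Python sketch works. It assumes the envelope only needs the cmis:query and cmis:statement elements from the example above (the draft spec may allow more), and build_cmis_query is a hypothetical helper name. It escapes the statement rather than using CDATA, which is equivalent as far as an XML parser is concerned.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Namespace copied from the query example in this post (a draft-era value).
CMIS_NS = "http://www.cmis.org/2008/05"


def build_cmis_query(statement: str) -> bytes:
    """Wrap a CMIS query statement in a cmis:query XML document."""
    xml = (
        f'<cmis:query xmlns:cmis="{CMIS_NS}">'
        f"<cmis:statement>{escape(statement)}</cmis:statement>"
        "</cmis:query>"
    )
    return xml.encode("utf-8")


query_doc = build_cmis_query("select * from cm_content where cm_name like '%Foo%'")

# Parse it back to confirm the document is well-formed before posting it.
parsed = ET.fromstring(query_doc)
print(parsed.find(f"{{{CMIS_NS}}}statement").text)
```

The bytes returned by build_cmis_query can be written to a file for curl’s -d @file option or sent directly as the body of a POST with the application/cmisquery+xml content type.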
Create a new object using CMIS ATOM
Issuing a GET against a CMIS URL returns ATOM. But CMIS URLs can also accept POSTed ATOM to do things like create new objects. For example, to create a new content object you would first create the ATOM XML:
<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:cmis="http://www.cmis.org/2008/05">
<title>Test Plain Text Content</title>
<summary>Plain text content created via CMIS POST</summary>
Note that the content has to be Base64 encoded. In this case, the content is plain text that reads, “Here is some plain text content.” One way to encode it is to use OpenSSL like “openssl base64 -in <infile> -out <outfile>”. The exact syntax of ATOM XML with CMIS is the subject for another post.
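If OpenSSL isn’t handy, the encoding step is a couple of lines in most languages. Here is a small Python sketch; encode_content is just an illustrative name. One difference to be aware of: encoding a file with openssl includes any trailing newline in the file, while this encodes exactly the string you pass in.

```python
import base64


def encode_content(text: str) -> str:
    """Base64-encode text (as UTF-8) for embedding in the ATOM entry."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


encoded = encode_content("Here is some plain text content.")
print(encoded)

# Decoding gives back the original text, which is a quick sanity check.
assert base64.b64decode(encoded).decode("utf-8") == "Here is some plain text content."
```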
Once you’ve got the XML ready to go, post it using the same syntax shown previously, with a different Content-Type in the header:
curl -v -X POST "http://localhost:8080/alfresco/service/api/node/workspace/SpacesStore/18fd9821-42a5-4c6a-86d3-3f252679cf7d/children?alf_ticket=TICKET_e46107058fdd2760441b44481a22e7498e7dbf66" -H "Content-Type: application/atom+xml" -d @/Users/jpotts/testCreate.atom.xml
The node reference in the URL above is a reference to the folder in which this new child will be created. There’s also a similar URL that uses the path instead of a node ref if that’s more your thing.
Refreshing Web Scripts from Ant
One of the things you do quite frequently when you develop web scripts is tell Alfresco to refresh its list of web scripts. There are lots of ways to automate this, but one is to create an Ant task that uses curl to invoke the web script refresh URL. This lets you deploy your changes and tell Alfresco to refresh the list in one step (and makes sure you and your teammates never forget to do the refresh).
<target name="deploy-webscripts" depends="deploy" description="Refreshes the list of webscripts">
In this example, the “deploy” Ant target that this target depends on is responsible for copying the web scripts to the appropriate place in the exploded Alfresco WAR. (Thanks to my colleague Eric Shea (http://www.eshea.net/2009/01/30/alfresco-dev-survivors-kit-part-1/) for this tip.)
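For teams that don’t build with Ant, the same refresh can be triggered from a small script. The Python sketch below rests on two assumptions that you should verify against your own install: that the refresh is a POST of reset=on to the web script index page at /service/index, and that basic authentication with an admin account is accepted. The helper name and the admin/admin credentials are placeholders. The request is built but not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_refresh_request(base_url: str, user: str, password: str) -> urllib.request.Request:
    """Build a POST that asks Alfresco to refresh its list of web scripts.

    Assumptions: the index page lives at <base_url>/service/index and
    accepts the form field reset=on with HTTP basic authentication.
    """
    body = urllib.parse.urlencode({"reset": "on"}).encode("ascii")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{base_url}/service/index",
        data=body,
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


# Placeholder host and credentials for a local development install.
refresh = build_refresh_request("http://localhost:8080/alfresco", "admin", "admin")

# Uncomment to send it to a running server:
# with urllib.request.urlopen(refresh) as response:
#     print(response.status)
```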
So there you go. It’s not Earth-shattering, but it might give you a productivity boost if you don’t already have curl or an alternative in your bag of tricks.
I’ve been playing with the newly-released Java support in Google App Engine and it is pretty cool. You can do more than I expected you could:
- The Google App Engine Eclipse plug-in gives you a template project and associated config files, Ant build scripts, a deployment tool, and a local run-time environment that acts like GAE (user service, data store, limitations imposed by the platform).
- You’ve got full persistence and query capability via JDO. You pretty much just model your entities as POJOs, then you annotate the fields in those classes as “persistent” and you’re good to go. You use JDOQL to query your objects. Queries will only return the first 1000 results.
- You can run cron jobs. A cron job wakes up on a schedule and invokes a URL you specify.
- Servlets and JSPs are supported but you can also use things like Struts and Spring (See Will it Work in Google App Engine?).
- You can take advantage of Google’s User service, which means anyone with a Google account can sign in to your app without creating a new account.
- You can take advantage of Memcache if you need it (JCache).
- You can fetch URLs via the URL Fetch service or java.net.URLConnection.
- You can send mail via JavaMail.
- You can use their Image service to resize, rotate, flip, and crop images.
- Both JDK 5 and JDK 6 are supported.
There are some limits:
- Execution of requests is limited to 30 seconds and that includes URLs invoked by cron jobs.
- You can’t write to the file system. If you need to write out files, I assume you’d use S3 or something.
- You can’t open sockets.
- Each developer can create up to 10 applications and apps can’t be deleted so don’t fill up on Hello Worlds.
- You can run an app that has up to 500 MB of storage and serves 5 million page views per month at no cost.
The beauty, obviously, is that as a developer, you get to focus on the code and let Google worry about scaling. For many applications, this Platform-as-a-Service (PaaS) will be preferred over Infrastructure-as-a-Service (IaaS). In an IaaS setup, you can use solutions like RightScale to automatically provision new nodes to handle spikes in demand, but you still have to set that up. Plus, you’ve got the additional cost and headache of installing, configuring, and maintaining the application server and database software (and making sure it is set up to work when new nodes are auto-provisioned). With the app engine, scaling globally is pretty simple: Step 1 – Write (Good) Code; Step 2 – Deploy Code to GAE.
UPDATE: Screencast now lives here:
I’ve created a new screencast that shows the Alfresco-Drupal CMIS integration in action over at Optaros Labs. The screencast shows content moving back-and-forth between Alfresco and Drupal, content that lives in Alfresco being displayed in a Drupal site, and a CMIS query being executed against the Alfresco repository from Drupal.
The emerging Content Management Interoperability Services (CMIS) standard was the main buzz at the AIIM conference in Philadelphia this week (followed closely by social networking and Enterprise 2.0). Why does CMIS garner so much attention? Because as EMC’s David Choy said, “Unless they are starting from scratch, everybody implementing ECM has an interoperability problem.”
The impact of the problem depends on who you are and what you’re trying to do. If you use one ECM repository for email archiving and another to manage the content on your web site, it doesn’t really matter that the two repositories aren’t interoperable. But what if you have a portfolio of web sites and asset repositories across a variety of platforms and you need to share content across all of them? Then it’s a problem.
Customers in this situation have all kinds of issues they have to deal with, but several come down to a few basic needs having to do with interoperability:
- It needs to be easy for front-ends to store and retrieve content across multiple repositories (different platforms)
- It needs to be easy to move content between repositories
- It needs to be easy to find interesting and relevant content (which I may then want to access from the front-end or send to some other repository)
This multi-site/multi-repository problem is common, and at Optaros, we think we can help address it with tools and services that are, in many cases, driven by CMIS.
I’ve met with clients who’ve invested a lot of time and effort in a custom front-end for Vignette, for example. Over time, they’ve added additional systems to manage assets but the front-end is forever tied to Vignette. Sure, they could introduce a services layer that abstracts the repository, but each time they add a new kind of repository, that’s more development they have to do. CMIS provides a standard way of interfacing with all CMIS-compliant repositories, so the amount of repository-specific coding you have to do is reduced.
At Optaros, we’re developing CMIS adapters or integrations and releasing them as open source so that popular front-ends can get to content more easily. (See Drupal CMIS). We think there’s probably also a case to be made for a CMIS-based “content bus” that front-ends and compliant repositories could plug into.
What if you want to move content between repositories? I’ve talked to folks who have Drupal web sites, but they’d like to take some of the user-generated content that comes from Drupal and treat it more formally–like maybe they want to route it through an internal workflow, tag it, and then make it available to some of the other sites in their portfolio that might or might not be Drupal. CMIS gives us a common way to export and import content (through ATOM XML). Throw a transformation in the middle to handle schema differences and you’ve got yourself a CMIS-based replication engine that can move content between different kinds of repositories.
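To make the “transformation in the middle” idea a little more concrete, here is a toy Python sketch of the mapping step only. Everything in it is hypothetical: the field names on both sides, the transform function, and the idea that a flat property map is enough. A real replication engine would also need to handle types, multi-valued properties, and binary content.

```python
def transform(source_props: dict, field_map: dict, defaults: dict = None) -> dict:
    """Rename properties from a source schema to a target schema.

    field_map maps source property names to target property names.
    Properties that are not in the map are dropped; defaults fill in
    target properties the source does not provide.
    """
    target = dict(defaults or {})
    for source_name, target_name in field_map.items():
        if source_name in source_props:
            target[target_name] = source_props[source_name]
    return target


# Hypothetical Drupal-side node and a hypothetical mapping to repository properties.
drupal_node = {"title": "Reader review", "body": "Great product.", "uid": 42}
field_map = {"title": "name", "body": "content"}

print(transform(drupal_node, field_map, defaults={"source": "drupal"}))
```

Keeping the mapping as data rather than code means each pair of endpoints only needs its own field map, not its own transformer.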
We’ve built a basic synchronization between Alfresco and Drupal as part of the Drupal CMIS and Alfresco modules, but we think this kind of functionality should be separated out and treated more generically–like a replication server that could sit between any number of CMIS-compliant endpoints.
Feeds & Search
How do you know when new content you might be interested in shows up in one of the numerous repositories you have in your environment? One answer is search. Initially, I was thinking a federated search server based on OpenSearch would be worthwhile but I think at this point, most people are just indexing everything rather than taking a federated approach.
Search is good when you are actively looking for specific content, but people (and systems) need to find content they are interested in as it becomes available. CMIS returns data as ATOM XML. That means you can use the same RSS/ATOM feed aggregation tools and techniques you use to track news to track content updates. This isn’t new–you’ve always been able to bolt on RSS feeds to your content repository by using the repository’s API to query for content and return it formatted as a feed, but CMIS gives us this “for free”.
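Because the payload is plain ATOM, even a few lines of standard-library Python can pull out what changed. The sketch below parses a made-up feed (the titles and timestamps are sample data, not real repository output) and lists each entry’s title and updated time. A real CMIS feed carries additional CMIS-specific elements that this ignores.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Made-up sample feed standing in for a repository response.
SAMPLE_FEED = """<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Folder children</title>
  <entry>
    <title>Test Plain Text Content</title>
    <updated>2009-04-01T12:00:00Z</updated>
  </entry>
  <entry>
    <title>Another Document</title>
    <updated>2009-04-02T08:30:00Z</updated>
  </entry>
</feed>
"""


def list_entries(feed_xml: str) -> list:
    """Return (title, updated) pairs for every entry in an ATOM feed."""
    root = ET.fromstring(feed_xml.encode("utf-8"))
    return [
        (
            entry.findtext(f"{ATOM}title", default=""),
            entry.findtext(f"{ATOM}updated", default=""),
        )
        for entry in root.iter(f"{ATOM}entry")
    ]


for title, updated in list_entries(SAMPLE_FEED):
    print(updated, title)
```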
So as CMIS becomes more widely adopted, we may see a big increase in the amount of ATOM/RSS flowing through the Enterprise as people begin to use feeds to discover new content. An Enterprise RSS server would seem like a good tool to leverage to aggregate the feeds coming from various content repositories across the content domain. It could also be used for other non-CMIS related feeds which will also surely increase as Enterprise 2.0 adoption spreads.
Most of the feed aggregation functionality available today is embedded within broader platforms (RSS portlets, for example). At Optaros, we think that, like replication, feed aggregation should be split out into a dedicated service. Think “Google Reader for the Enterprise”, essentially. There are proprietary systems out there that do this, but no open source alternatives that are more than just personal feed aggregators (at least that I could find), so we may have to develop this.
Services-Oriented Content Management
These individual services–CMIS adapters, Replication, and Feeds–are part of a Services-Oriented approach to Content Management. The services are interrelated, and there are others I haven’t discussed, but the idea is that this type of approach can make a multi-silo’d content domain much more manageable and useful. Some of it depends on CMIS and some of it doesn’t. These ideas are still being hammered out so if you have interest in any of it, please let me know.
It’ll be interesting to see what happens as CMIS moves through additional iterations and ultimately becomes ratified, hopefully by the end of the year.
For more info, see:
- Optaros’ Services-Oriented Content Management solution brief which summarizes some of these thoughts.
- John Newton’s AIIM presentation on CMIS
- David Choy’s AIIM presentation on CMIS (Presentation hadn’t been uploaded at the time of this writing)
- EMC’s CMIS forum
- AIIM’s “Federator” CMIS Demo
I don’t typically write a post every time a vendor releases a new version of software but Alfresco 3.1 Enterprise, which became available for download on Tuesday, is a significant release. Users of 2.2, who had previously been asked not to upgrade to 3.0, should now upgrade to either 2.2 SP3 or 3.1. All of the fixes in 2.2 SP3 are included in the 3.1 release. If you are currently on 2.2 SP3, the biggest reason to move to 3.1 would be to take advantage of Alfresco Share. Conservative users may decide to wait until 3.1.1.
One big change with 3.1 is that modified items are now merged with the Staging sandbox asynchronously. That means when submitting modified items, users are immediately returned to the sandbox list and will still see the modified items until they are committed. In a cursory test of this, I was surprised at how long it took for those changes to leave my Modified Items list. Maybe the polling interval is configurable.
This change affects the out-of-the-box WCM submission workflow. So if you created your own custom WCM submission workflow, you’re going to need to make some changes. The required changes are documented in the release notes, or you can take a look at the new submission workflow to get the gist.
Another new feature of Alfresco 3.1 is the REST API for WCM. A lot of URLs you probably already created on your own are included. The new API includes things like adding users to web projects, creating user sandboxes, creating, updating, and deleting assets, submitting assets, and more. Even if you already built these yourself, you should take a look to see if these meet your needs. Why continue to maintain your own custom code?
The 3.1 release also marks a shift in Alfresco’s “two flavors” approach. According to John Newton (post), Alfresco is looking for ways to entice large Enterprise users to migrate from Labs to Enterprise. So they’ve created functionality that they feel only appeals to large Enterprises and are making it available only to people paying for subscriptions. This includes things like monitoring (JMX, Hyperic plug-in), proprietary database extensions, and clustering.
Newton says 100% of the source code will still be available for both releases and that fixes made in Enterprise will be made available in the next Labs release (although he didn’t say how long Labs releases will lag behind Enterprise).
Other noteworthy 3.1 fixes or enhancements include:
- No one (not even admin) can write to a Staging sandbox in WCM.
- Share now includes a “Links” component (which means I don’t have to finish coding the “Bookmarks” component we started but never finished). There are numerous other Share enhancements around Calendar, rich text editing, and previewing.
- Actions can now have AND/OR conditions and can trigger on property values.
- A new group called ALFRESCO_ADMINISTRATORS has been created that makes it much easier to designate administrator users other than admin.
See the release notes for a full list of Jira tickets addressed by the release.