Category: Alfresco

Alfresco open source content management

Book Review: Alfresco 3 Records Management by Dick Weisinger

Packt Publishing sent me a copy of Dick Weisinger’s new book, Alfresco 3 Records Management, to read and review. Now, before I tell you what I thought of the book I have to say that I don’t perceive a huge demand for Alfresco’s Records Management offering and I haven’t heard them talk about it much lately. I don’t know if my perception matches reality and, if it does, why we don’t hear about it more often. An open source, DoD-certified, freely-available Records Management product definitely sounds unique and compelling. It could be such a focused niche that there’s a lot of activity around it that I’m simply not aware of.

Still, I was curious to learn more about the topic and Alfresco’s offering, so I gave it a go.

Let me start by saying that Dick does a fabulous job of defining a topic area and a target audience and sticking to that. The book is 457 pages without the online appendices but it feels concise. I think that’s because it is well organized, clearly written, and flows logically and seamlessly from one chapter to the next.

The book is written for both Records Managers and Software Developers. You’d think that having such disparate audiences would be a problem, but Dick handles it very well. Every chapter starts out with an end-user description of the Records Management functionality and then when everything’s been covered, shifts to a “How Does it Work?” section that dives into the technical details behind the functionality covered in the first half. Too often, books written for both technical and non-technical audiences munge their material together in such a way that is frustrating to both audiences. In this case, the content is separated cleanly so it works very well. If you are a Records Manager, and you have no desire to peek under the hood, you can read this book and easily skip the “How Does it Work?” section in each chapter without being confused or distracted.

But here’s the other thing that’s going on, which I thought was really cool. Even though the book is about Records Management, Dick’s managed to write a Share customization primer. As I was reading, I was thinking an alternate title for the book could be, “Learning Share Development by Deconstructing the Alfresco Records Management Application” (if you’re not into the whole brevity thing).

Alfresco’s Records Management add-on is actually just a set of repository tier and Share tier customizations. So, if you learn how the Records Management app is built, you learn how to build other Share-based solutions. In the book, each chapter’s “How Does it Work?” section covers a different example of Share functionality and how it works behind the scenes. So, someone who’s interested in learning about Share customizations can read this book from that perspective, and essentially use Records Management as one big sample application.

For example, in Chapter 5 the Records Manager learns how to set up a file plan. In the same chapter, a developer learns how the YUI data table that renders the file plan actually works. In Chapter 9 the Records Manager learns about holds/freezes, retention, and reviews while the developer learns how to configure scheduled jobs and how Share UI actions work. Every chapter works this way–I’m just picking out a couple of examples.

Don’t get me wrong. I’m not saying that you can go from zero to Badass Share Developer with this book alone. That’s not going to happen. But as Dick says in the summary of the last chapter, it’s a good start. The technical sections give you a pointer to the pieces that make up a particular area of functionality. That’s useful if you want to change how Records Management works but you can also use it as an example for adding similar functionality to your own Share-based app.

One thing that frustrates developers trying to customize Share (and Records Management) is the question, “Given what I’m looking at in the browser, where does the code reside that makes it work?”. The book shows the technical reader how to go from Share page to template to component to Spring Surf web script which is something you do over and over when you are first learning how Share is put together. Being pointed in the right direction and being shown the general pattern that the Share developers followed is really valuable.

So, if you’re thinking about implementing Records Management, and you need to know whether or not Alfresco’s offering will fit the bill, this would be a great book for you to read during your evaluation or after the selection has been made and you need to learn how to install and configure the product. If you are the technical person on the implementation team, the book will give you the end-user context as well as a peek under the hood. If you need to customize the product and you’ve never done Share customization work before, you’ll learn how Spring Surf works and, more importantly, given a piece of functionality, you’ll know where to look when you need to change it.

Well done, Dick!

Collaborative content creation with Amazon Mechanical Turk

Amazon’s Mechanical Turk has been intriguing to me since I first heard about it. I think it is because the idea of essentially having a workflow with tasks that can be handled by any one of potentially hundreds of thousands of people has mind-blowing potential.

If you’re not familiar, Mechanical Turk (MTurk) is essentially a marketplace that matches up work requests (called HITs) with human workers (called “Turkers”). The work requests are typically very short tasks that require human intelligence like identifying, labeling, and categorizing images or transcribing audio. Amazon is the middleman that matches up HITs with Turkers. From a coding standpoint, your app makes calls to Amazon’s Web Services API to submit requests and to respond to completed work. Turkers monitor the available HITs, select the ones that look interesting to them and then complete the tasks for which they are paid, usually pennies per task.

Sorting through images or performing other simple tasks is one thing, but what about more complex tasks, like, say, writing an article? Here’s a story about some guys who have created a framework called CrowdForge to do just that. CrowdForge is a Django implementation based on research that one of the authors did at Carnegie Mellon. In a nutshell, their approach splits complex problems into smaller problems until they are small enough to be successfully handled by MTurk, then aggregates the results to form the answer to the original problem. It’s Map Reduce applied to human tasks instead of data clusters.

You should read the original post, but to summarize it, the story talks about an experiment that the team did around collaborative content creation. They applied their framework to the task of writing travel articles. They split the task into 36 sub-tasks and gave each sub-task to an author, then aggregated the results into a coherent article. The partitioning, writing, and re-assembly (the “reduce” part of Map Reduce) was all done through Mechanical Turk by CrowdForge. Total cost for each article? About $3.26.

Then, for comparison, they assigned individual authors to write articles on the same topics using the traditional approach of one author per article paying roughly what they paid for the collaboratively created content. When the results were reviewed, the crowd sourced content beat the single author content in terms of quality. It’s important to note that in both cases, authors were Turkers. This wasn’t Mechanical Turk versus Rick Steves. But still, the researchers were able to use Mechanical Turk to break the problem down, perform each task, and then clean up the result, all for about the same cost without sacrificing quality. That’s pretty cool.

As you know, I’m a huge fan of Django, and I think it is more than okay for the presentation tier of a solution like this. But it seems like a workflow engine like Activiti or jBPM would be a better tool for implementing the actual process flow for a framework like CrowdForge because it could potentially mean less coding and maybe more accessibility by business analysts. Imagine using a process modeling tool to lay out your business process and then dropping in a “Mechanical Turk Partition Task” node, graphically connecting it with a “Mechanical Turk Map Task”, and then hooking that to a “Mechanical Turk Reduce Task”. In and around those you’re wiring up email notifications, internal review tasks, etc.

Metaversant has been working with a client who’s doing something very similar. Editors make writing assignments which are outsourced to Mechanical Turk. When the assignments are complete, they are published to one or more channels. Instead of the Django CrowdForge framework, we’re using Alfresco and the embedded jBPM workflow engine. Alfresco stores the content while the jBPM workflow engine orchestrates the process, making calls to Mechanical Turk and the publishing endpoints.

This approach can be generalized to apply to all kinds of problems beyond content authoring. If you are an Alfresco, jBPM, or Activiti user, and you have a business problem that might lend itself to being addressed by a micro task marketplace like Mechanical Turk, let me know. Maybe we can get my client to open source the specific integration between jBPM and Mechanical Turk. If you’ve already done something like this, let me know that too. I’m interested to hear how others might be integrating content repositories and BPM engines with Mechanical Turk.

Book Review: Alfresco 3 Business Solutions by Martin Bergljung

[UPDATED: To remove my comment about the absence of workflow config in Share, which Martin does cover. Sorry about that, Martin]

Packt Publishing sent me a copy of Martin Bergljung’s new book, Alfresco 3 Business Solutions. I just finished reading it, so I thought I’d write a quick review.

Overall, I think it is a good book with a lot of useful information across a variety of topics. The preface says the book is for “systems administrators and business owners”. Inclusion of “business owners” is a stretch–they’d have to be pretty technical to get something out of this book. I think the intent of including “business” in the title and the target audience is to set up the book as more of a solution-oriented look at Alfresco and less of an exhaustive technical how-to.

Bergljung attempts to organize the book around “business solution” focused chapters. For example, if your main concern is letting your users access the repository through file protocols, then Chapter 5, File System Access Solutions, is for you. If you are doing a content migration to Alfresco, Chapter 8, Document Migration Solutions outlines the different approach available for doing that. I think this is a good approach for the stated audience and most of the chapters fit the approach.

The book starts out with an overview of the Alfresco platform and various repository concepts. Although there are places that risk going into too much detail too early, that first chapter would be a good read for anyone new to Alfresco. The chapters on authentication and synchronization (Chapter 4) and CIFS/WebDAV (Chapter 5) are very thorough and provide some of the best coverage of those topics I’ve seen in any of the Alfresco books. The vigor with which Bergljung attacked those topics makes me think those areas are a particular passion for him. If you are strict about the target audience, Chapters 4 & 5 are definitely the strongest in the book.

However, at various points, the book strays from its intended audience and starts to go into developer topics. Don’t get me wrong, that’s the most interesting stuff to me, but I think it is potentially confusing (or, at best, superfluous) to system administrators. For example, the end of Chapter 1 covers the underlying database schema. Maybe it is a good idea to discuss what the tables are and how they are used so that a DBA gets a feel for the schema and can tune the database appropriately. But Alfresco’s schema is not public and shouldn’t be accessed directly unless you know what you’re doing and if you are willing to sign up for the inevitable maintenance down the road when the schema changes without warning. Bergljung gives a soft warning to this effect at the start of the section but the negative effects could have been emphasized more.

There are a few other developer-centric concepts in the book that I just simply don’t agree with. The first is about AMPs. In a couple of places, Bergljung implies that the Java Foundation API and AMPs are somehow dependent on each other. The statement, “The Foundation API is only used when deploying extensions as an AMP,” is just not true. Later, another statement compounds the problem by saying, “AMP extensions require Java instead of JavaScript”, which, again is not accurate. For some reason, the author is trying to link an API (Java, JavaScript) with a deployment approach (AMPs) which are not related or dependent on each other at all.

Another piece I don’t agree with is about $TOMCAT_SHARED. To be fair, I’ve seen this in other places and have heard certain Alfresco Engineers encouraging the use of $TOMCAT_SHARED for things I think belong in the web app instead. Regardless of where it comes from, I think it’s really bad advice to tell people to use $TOMCAT_SHARED for anything other than alfresco-global.properties and server- or environment-specific settings. Proponents of $TOMCAT_SHARED will say they like deploying their customizations there for two reasons. First, when Alfresco and Share are deployed in the same Tomcat instance, you can deploy your extensions as one package and both web apps will use it. Second, your extensions go into an extension directory external to the Alfresco WAR, which keep them well away from Alfresco’s code. In my option, both of these are actually reasons NOT to use $TOMCAT_SHARED. Why? As to the Alfresco/Share sharing bullet point, why unnecessarily couple those two web apps together? The Alfresco and Share WAR are built to run on completely separate nodes which is helpful. We shouldn’t ruin that by making them both rely on the same shared directory.

As for the “keep your extensions away from Alfresco’s” reason, that’s what the extension directory is for. I can keep my customizations separate and still have them reside in Alfresco’s WAR. In fact, most clients I’ve dealt with prefer that because they have IT Operations teams that only want to deal with self-contained WARs. Being a “special case” is not how you win the hearts and minds of your infrastructure team.

Now, with these picked nits out of the way, I should say that there are some developer-oriented topics that were very good. Bergljung has some good Java Foundation API and JavaScript API examples in Chapter 2 covering Node Service and other commonly-used services. And I like the section in Chapter 3 that talks about setting up Apache Hudson for continuous integration. Chapters 9 through 11 provide good coverage of Advanced Workflows, from designing workflows with swimlane diagrams to a lengthy example showing super states, sub-processes, and custom workflow management dashlets.

In keeping with the “solutions” approach, I think I would have combined the portlet chapter and the mobile app/Grails chapter into a single “integration solutions” chapter and talked less about the specific implementation details and more about the touch points: options for integration (Web Services, CMIS, custom web scripts), single sign-on approaches, what’s available out-of-the-box, caveats, etc. Most of this is covered one way or another between the two chapters. It just seems like a common thing people think through is “What’s the best approach for doing X on top of Alfresco” where X is a portal like Liferay, a community platform like Drupal, a mobile app, and so on.

So, if you’re a “business owner” and by that you mean “non-technical end-user”, you’d probably be better off with Munwar’s book, which is definitely end-user focused. If you are a “system administrator” or someone who needs to know the capabilities of Alfresco and how to integrate Alfresco with various touch points (LDAP, Active Directory, portals), it’s definitely worth a read, particularly if you need to deal with external authentication sources or you are responsible for getting CIFS working. Developers will benefit from the API examples and the chapters on Alfresco’s embedded jBPM engine.

Congrats to Martin and Packt. It’s good to see another title (I think we’re up to 8 or 9 now) added to the Alfresco bookshelf.

Top Takeaways from the Alfresco Kickoff

Alfresco kicked off their fiscal year with a meeting last week in Orlando. About 100 Alfresco employees and 50 partners attended two days of Alfresco-led talks on business and technical topics. As an aside, the conference food at the JW Marriott was maybe the best I’ve ever had at any tech event. Lobster Corn Dog. Enough said.

More importantly, the trip helped clarify in my mind Alfresco’s message around “social content management”. I now see it as taking two different forms: social content management and social content management. The first is basically just marketing what they’ve already got: content services, collaboration, task tracking, wikis, and blogs, exposed through a modern user interface that’s closer to the experience most users have come to expect from using consumer-facing sites and services. The idea is that when you add things like comments, tags, and ratings you go from boring, old-school Enterprise Content Management to fun and exciting social content management. Obviously, there’s other stuff going on here about how when people collaborate, they don’t do it in a vacuum, they do it around content.

The second form–social content management–is when you need to manage content that is published to one or more social channels. For example, Marketing might have a press release, a video, and a tweet that all need to go live at the same time. Order matters, and if one step fails, none of the steps should be performed. Alfresco is building a social publishing framework to handle this kind of use case. So, in this example, “social” doesn’t describe the features of the system–it describes the type of content being managed.

Alfresco didn’t explicitly differentiate between these two forms of social content management but they have current and future functionality that addresses both.

One of my other purposes of the trip was to find out what’s coming in Project Swift, which is the code name for an up-coming release of Alfresco. It sounds like Marketing has the final say on what specific release Swift will be, but after hearing what’s slated for the release, most of us in the room agreed it should be labeled as a major release (4.0). We’ll see.

So what’s going to be in Swift? Lot’s of cool stuff, but here are the top five technical takeaways from the Project Swift Roadmap that jumped out at me:

#5: CIFS, SharePoint, and FTP will be clusterable. CIFS and SharePoint performance are both issues at one of my clients so this one caught my eye.

#4: New Share extension points are coming in Swift including a framework for custom actions, dialogs, and evaluators. The goal is to reduce the amount of copy-and-paste that goes on during typical customizations of Share and to make upgrades a bit easier.

#3: Alfresco is developing a “social content publishing framework” with publishing endpoints for YouTube, Facebook, Twitter, Drupal, and more to address the social content management use case I described earlier in this post. I like this one a lot because I think a lot of people have this problem and because it leverages Alfresco in exactly the “right” way.

#2: Swift will sport a new Apache-licensed workflow engine called Activiti, which is a separate Alfresco-sponsored open source project founded by the creators of JBoss jBPM, which is currently the workflow engine embedded in Alfresco. With Swift, both engines will exist side-by-side. It sounds like you may be able to have jBPM continue to handle running workflow instances and use Activiti to handle new instances if you want to. Activiti will show up in Community soon for people to start playing with.

#1: Apache Solr will be implemented as an optional, separate shared search server and index. As part of this, Lucene will no longer be updated in the same transaction. Instead, the index will be eventually consistent. This should result in a huge performance gain and easier clustering. You’ll also get better control over what gets indexed. In Swift you’ll be able to configure full-text indexing by things like content type and path. The Solr server will accept CMIS Query Language and Alfresco FTS queries but not the current raw Lucene syntax so it might make sense to start moving your queries over to one of these two options if you anticipate leveraging Solr when it is available. Note that it is possible Alfresco may choose to make the Solr server an Enterprise-only feature. It didn’t sound like a final decision had been reached on that.

A Community release of Swift should happen some time in August, but we should start seeing a lot of activity in subversion starting in April. The Enterprise release is slated for mid-November. I predict some late nights ahead for QA and Engineering between Thanksgiving and Christmas. I know there’s not a huge difference between November and January but I’d love to see Swift go GA before year-end.

One comment about Share customizations: I get asked a lot about when I will be updating the Alfresco Developer Guide to include a chapter on Share. I have most of the SomeCo examples ported to Share in an as-yet-unshared code base but, as you can see from some of the changes coming in Swift, Share is still changing a lot with respect to customizations, so I’ve been hesitant to update the book. If you’re looking for Share examples you should take a look at Will Abson’s Share Extras project on Google Code. He’s got about 18 different examples of varying complexity and type. I believe each one is individually deployable.

Not bad for the price of a couple of days in Orlando.

Now available: Alfresco Fivestar Ratings add-on for Alfresco Share

A couple of weeks ago I posted a survey asking if anyone saw any value in a five star ratings widget for Alfresco Share. Honestly, it would have only taken one or two positive responses–even if no one needed one, there’s value in it for example’s sake. It turns out about 20 readers of this blog voted positively, so I went ahead and knocked it out.

This Alfresco Share customization makes it possible for any document in the repository to become “rateable”. When a document is rateable, the Alfresco Share user interface will show a clickable five star ratings widget. The stars light up to indicate the average rating for that document. Users simply click one of the stars to post their own rating. When clicked, the widget refreshes itself with the updated average.

Here is a short screencast that demonstrates the customization. You’ll want to make it full screen.

To implement this, I took the Someco Ratings Service from the Alfresco Developer Guide, moved it to the Metaversant namespace, and changed the names of my Spring beans and JavaScript root variable. Even though my initial target Alfresco version is 3.3, I didn’t want the code to conflict with Alfresco’s new back-end-only ratings service in 3.4 which uses some of the same names that were in the book. I also changed the JSON that the ratings web scripts use to be closer to what exists in 3.4. That way, when I do make a version that works with 3.4, it could potentially work with either my ratings back-end or Alfresco’s.

I then went to work on the UI side, integrating the widget into Share’s document details page, document library (both Share and repository views), search results page, and document-related dashlets. To go from what was in the book to a working integration I revamped the client-side ratings JavaScript from a set of functions to an actual object. Then, I started injecting my own methods into Alfresco’s client-side object prototypes to drop my widget in where appropriate.

Alfresco is still working to make customizations like this more modular and easier to plug in alongside their code and code from the community. Until then, be aware that if your Alfresco implementation already has customizations that override some of the same web scripts and client-side components this module does, there may be some manual integration needed. If you have an out-of-the-box installation (or a set of customizations that won’t conflict with this one) you can deploy the AMP to the Alfresco WAR and the Share customizations to the Share WAR and you’ll be set.

The Alfresco Fivestar Ratings project lives at Google Code. Feel free to check out the source, try it out, and use it on your projects. If you find a bug, log it, then fix it!

Thoughts on Alfresco’s Recognized Developer Accreditation Test

Alfresco is rolling out a new accreditation program. A certain subset of partners (I’m not sure what the criteria is–ask your rep) can take an online test that validates whether or not you know a thing or two about Alfresco. If you pass, you’re a “Recognized Alfresco Developer”, which ostensibly comes with a secret handshake and a map to the club house.

When I first heard about the program I was a little skeptical. Don’t get me wrong–anything that my company and I can use to help differentiate ourselves from more, um, what’s the word I’m looking for–casual? less-experienced? opportunistic?–partners is a good thing. And, as we continue to grow, I can use accreditation as a potential data point when making hiring decisions. My concern was that the test would lack both depth and breadth and we’d end up where we are now: A bunch of partners of varying capability lumped into the same “Gold” category (or “Platinum” if you fork over the big dough).

After taking the test (and passing–c’mon, was there any doubt?) I feel better about it. The test actually appeared to do a decent job of covering many different aspects of the product including configuration and developer issues we see on real world implementations, so it’s not an easy test. Kudos to Carlos Miguens and the Alfresco Training team–I know it must have been a big project to get the accreditation program pulled together and construct a test that, at least to me, feels like the right level of detail and difficulty.

I do think the test could be improved. I think there was way, way too much emphasis on the WCM product. The value of the 30 or so questions asked on WCM out of the 100 or so total asked is that it really does take someone who’s been around the product a while to get those right. I just think the test is over-weighted toward WCM compared to the proportion of actual “project share” the WCM product gets in real life versus the core repository.

I think there were also too many questions on the Explorer client. Again, maybe the goal is to be slightly biased towards those partners who have been working with Alfresco long enough to have done an Explorer customization or even simply to know certain details about the Explorer UI. But most clients are now using and customizing Share as their primary interface versus Explorer. On the other hand, I think it is a bit early for “Share customization” to be an accreditation test topic, so let’s not get ahead of ourselves!

I also think it would be helpful feedback to give the test taker an idea of what was missed. It doesn’t have to be the exact question (although there were several answers I’d love to vigorously defend if I did indeed miss them) but other similar tests I’ve taken in the past have offered up the “weak” areas so the student could shore those up. Thanks for the passing grade and everything, but, not giving any hint as to what I missed is like starting a great joke with a huge build-up and then dropping dead before you can tell me the punch line.

What I don’t know is how well formal training prepares you for the test. I would hope that, in the aggregate, real world implementers score as good or better than groups of test takers who are fresh out of Alfresco training but lack on-the-ground experience. Really, it has to be that way for accreditation to be meaningful. I haven’t taken any of Alfresco’s training, so I’m curious to hear feedback on the test from those that have.

If you haven’t taken your test yet, good luck! I wish I could tell you what to study that will help you do well, but, honestly, I can’t think of any one reference that’s going to do the trick. And, really, I guess that’s a good thing.

Apache Chemistry cmislib 0.4 incubating now available

Apache Chemistry LogoThe Apache Chemistry development team is pleased to announce that the 0.4 incubating release of cmislib, the Python client API for CMIS, is now available for download. You may have to use one of the backup servers until the mirrors fully update. Alternatively, you can use easy_install to install cmislib by typing “easy_install cmislib”.

This release has various fixes and enhancements that the community has contributed since cmislib joined the Apache Chemistry project with its 0.3 release. If you are using Alfresco, you might be interested in an enhancement in cmislib 0.4 that makes it possible to use ticket-based authentication instead of basic auth.

For those who haven’t used it, cmislib makes it easy to work with CMIS-compliant repositories from Python.

Any interest in a pre-3.4 five star ratings widget for Alfresco Share?

One of the examples I put in the Alfresco Developer Guide is a five star ratings widget. The example company in the book, SomeCo, uses it to let their web site users rate whitepapers published on the site. Metaversant recently took that code and implemented it in a production web site, and that client gave us permission to take the code (which wasn’t really changed that much from the book anyway) and release it as open source.

But in Alfresco 3.4 Community and Enterprise, Alfresco has implemented partial ratings functionality. Right now, it consists of the back-end, repository tier services and it is darn close to what’s in the book.

With the back end in place, I expect Alfresco to release ratings functionality in the Share front-end in an upcoming release. So, I’m somewhat reluctant to package up my ratings stuff and release it as it is partially obsolete in 3.4 anyway. On the other hand, it would give those of you on releases earlier than 3.4 a functional ratings widget, and, with some adjustments, I could make the front-end work with Alfresco’s rating service in 3.4, which would mean functional ratings in Share for those of you that have already upgraded.

If the code was ready to go, I’d just package it up and if only one person used it, great. But, the client we did the widget for was using a heavily customized version of Share–we didn’t integrate the widget into the out-of-the-box document library or document details page. That’s the last bit of work that needs to be done. Even if there is no interest, I might do it anyway just to have as an example. But, I’d be even more motivated if there were folks out there who would use it in their production implementations.

So what do you say? Any interest? Take the survey below and let me know.

Create your free online surveys with SurveyMonkey, the world’s leading questionnaire tool.

Alfresco Developer Guide source updated, uploaded to Google Code

Back in July of 2009, I did a reorg of the source code that accompanies the Alfresco Developer Guide. Shortly after that I validated against 3.2 Enterprise, but I didn’t keep up with the Alfresco releases after that. I’m happy to say that I’ve caught up. And, to make it easier for you to get to the source code, I’ve checked it all in to a Google Code project.

So, if you’re reading the book and you’re finding a couple of problems with the source code on Alfresco versions after 3.2 Enterprise, go get a version that matches up with the chapter you are on and the release of Alfresco you are running, and see if it fixes the problem.

I’ve now validated that the code works with the following versions:

  • Alfresco 2.2.4 Enterprise
  • Alfresco 3d Labs
  • Alfresco 3.0.1 Enterprise
  • Alfresco 3.1.1 Enterprise
  • Alfresco 3.2 Community
  • Alfresco 3.2 Enterprise
  • Alfresco 3.3 Community
  • Alfresco 3.3 Enterprise
  • Alfresco 3.4c Community
  • Alfresco 3.4 Enterprise

Deconstructing DeckShare: A brief look at Alfresco Web Quick Start

Last Fall, just before the Developer Conference in New York City, Alfresco approached Metaversant with a small project–they needed a web site to share presentations from the DevCon sessions and other events. There are several generic slide-sharing sites out there–Alfresco and I have both used SlideShare for that sort of thing pretty extensively. But Alfresco was looking for something they could have complete control over, plus they were looking to exercise their new Web Quick Start offering. So I rounded up a sucker–I mean a collaborator–named Michael McCarthy from over at Tribloom, and we knocked it out.

Although it makes good marketing sense for Alfresco to use its own offering for the site, I do think the “private presentation-sharing” use case is also generally applicable to many other businesses out there. Technology companies, of course, but also any company with a large sales force or even more modest extranet needs could benefit from a solution like this. SlideShare is great when what you want to share is public. When presentations need to be securely shared, many companies use expansive portals to share sales collateral, marketing presentations, or company communications, but those often come with a very low signal-to-noise ratio. The solution we built–we call it “DeckShare”–is laser-focused on one thing: Making it easy for content consumers to find the presentations they need, quickly.

DeckShare is built on top of Alfresco’s new Web Content Management (WCM) offering called Web Quick Start. Honestly, Web Quick Start, as the name implies, provides a good starting point for building a dynamic web site on top of Alfresco, and the sample site was so close to what a basic slide-sharing site needed, we didn’t have to do a whole lot of work. Even so, if you want to do slide-sharing on top of Alfresco, you can save even more time by starting with DeckShare.

I thought it might be cool to walk you through how we got from the Web Quick Start sample app to DeckShare as a way of getting you familiar with Web Quick Start and to help you understand how DeckShare works in case you want to use it on your project.

Some of this write-up is also included in the “About” page within the DeckShare site. Alfresco is currently branding DeckShare to meet their needs so I have no idea if the About page will still be there when the site goes live, so I may or may not be repeating myself somewhat.

What is DeckShare?

Alfresco’s DeckShare implementation isn’t live yet, so let me give you the nickel tour. DeckShare is a solution for quickly creating a self-hosted presentation-sharing site on top of Alfresco without having to write any code. DeckShare lets non-technical users manage and categorize presentations that are then consumed by end-users. End-users can find presentations by browsing one or more hierarchies (“Topics”, “Events”, “Audience”, for example), performing a full-text search, or by browsing the list of the latest and “featured” presentations. Content managers can associate presentations with “related” presentations and can link supporting files with a presentation.

In this solution, Content Managers are a different set of users than Content Consumers. As it stands, the solution doesn’t accomodate user-contributed presentations, unlike SlideShare. Obviously, it could be customized to do that.

Here’s what it looks like when a Content Manager edits the metadata for a particular presentation (click the image for full-size):

DeckShare Manage Details Screenshot

Those familiar with Alfresco can tell from the screenshot that Content Managers use Alfresco Share as the “content administration” user interface. Optionally, you could turn on Alfresco’s Web Editor Framework and allow Content Managers to edit the site in-context, but that will require some minor tweaking if you go that route. Using Share for content administration works really well. What started out as a team collaboration web app has essentially turned into Alfresco’s uber client.

Now let’s take a look at the Content Consumer user interface:

DeckShare Home Screenshot

If you’ve seen either of the sample Web Quick Start applications, the layout will look familiar. The home page includes a featured content carousel, a recently added document list, and a document category tree. We’re using a different carousel widget than the Web Quick Start sample apps and the category tree is also something we’ve added.

The set of categories is almost completely arbitrary and can be managed by DeckShare administrators. In fact, if you are running DeckShare on top of an existing Alfresco repository, you can specify the subset of categories you want to show in the category tree. When you click a specific category, the resulting page looks like this:

DeckShare Category Page Screenshot

The category page shows only the presentations for a specific category and features the same category tree browser that appears on the home page.

Ultimately, an end-user will either download a presentation or they’ll click on the details page link to learn more. The details page, shown below, shows high-level metadata about the presentation, supporting files that accompany the presentation, related presentations, and a flash-based document preview. The flash-based preview allows end-users to browse the presentation without downloading the entire file.

DeckShare Presentation Details Screenshot

That’s really it from a Content Consumer perspective. The site also has full-text search and that page is very similar to the category list page.

From a Content Manager’s point of view, DeckShare, like all Web Quick Start sites, is built on concepts familiar to anyone who has built or managed a web site: A site consists of collections of assets that get categorized and tagged and then presented in various ways across one or more pages. The look-and-feel of the end-user web site can be completely customized to match client branding needs. And, both the metadata model and category hierarchy can be extended to support specific client requirements.

Assets are stored in Alfresco’s Document Management repository and managed through the Alfresco Share User Interface. Alternatively, Content Managers have several other options for getting presentations into the repository, including FTP, drag-and-drop via Windows Explorer or Mac Finder, emailing content into the repository, drag-and-drop from within Microsoft Outlook or Lotus Notes, or saving directly from Microsoft Office as if the repository were a Microsoft SharePoint Server. Regardless of how the files arrive, Alfresco automatically takes care of creating thumbnails and PDF renditions of the presentations.

What is Web Quick Start?

Web Quick Start is a sample web application built with Spring, Spring Surf, and Apache Chemistry’s OpenCMIS library. It is essentially a sample web application that sits on top of the Web Quick Start API (Java), some presentation tier services, some repository tier services, and an extended content model.

Assets are stored in Alfresco’s Document Management repository and managed through the Alfresco Share User Interface. That means you can use all of the familiar building blocks present in the repository such as custom types and aspects to model your data, behaviors, and web scripts.

The presentation tier uses Alfresco Surf to lay out pages and to define regions on those pages. Regions get their content from presentation tier web scripts. What’s different with Web Quick Start as opposed to previous Surf-based web application examples is the use of the Apache Chemistry OpenCMIS library. Instead of using Surf’s object dispatcher to load and persist objects, the Web Quick Start API uses OpenCMIS to make CMIS requests between the front-end and the repository tier. There are some places where the Web Quick Start API uses non-CMIS web scripts, so it is not a pure CMIS implementation, however.

The Web Quick Start API is exposed to the Alfresco JavaScript API and Freemarker API on the presentation tier, so everything you’ve already learned about Spring Surf is immediately leverageable when you build the front-end. However, if you’ve decided on another framework, you can still use the Web Quick Start API, the services, the content model, and Share for editing content. For example, Metaversant recently worked with a client that chose Spring 3 and Apache Tiles for the front-end because that was their standard, but they used Web Quick Start for everything else from the API, back.

Web Quick Start sites have a flexible deployment configuration which can be boiled down to “single-server” or “multi-server”. In the single-server approach, the same Alfresco server is used for content authoring and content serving whereas in the multi-server approach, Alfresco’s transfer service is used to move content from the “authoring” or “editorial” web server to the “Live” server (it doesn’t have to be one hop, you could throw in one or more “QA” servers as well if you want to, for example).

High-level steps

Web Quick Start is meant as a starter application. It’s functional out-of-the-box, but you’re expected to use as much (or as little) of it as you need. Here are the high-level steps we took to reshape the starter app into DeckShare.

Step 1: Extend the content model

Web Quick Start has a simple content model that provides for articles, images, and user feedback. For DeckShare, we added two new aspects to our own model. One is used to associate a presentation with zero or more related presentations. The other is used to associate a presentation with zero or more supporting files (like source code downloads). At some point we may also add an aspect to track fine-grained event metadata such as session time, speaker, etc. Extending the model with your own aspects is pretty easy–it’s some XML to define the model and then some XML to expose the model to the Share interface.

Step 2: Set up site sections and collections

Web Quick Start has a default folder structure that lives in the Document Library of the Share site being used to manage the content. Within that there is one folder for editorial content and one for live content. Then, it breaks down into “section” folders for site assets (publications, in this case) and collections. Each section will typically map to a section of a site and will therefore show up in the site navigation, but you can exclude sections from the navigation.

Every section folder has a set of collections. A collection is an arbitrary grouping of site assets. Collections can either be static or query based. For a static collection, Content Managers choose the assets that belong in that section with standard Share “picker” components. For a query-based collection, Content Managers can use either CMIS Query Language or Lucene to define queries that identify the assets that are included in the collection. Either way, from the Web Quick Start API, a developer just says, “Give me this collection and let me iterate over the assets in it” to produce a list of the assets in a collection.

There are two sample data sets that ship with Web Quick Start. One is for a Finance example site and the other is for a Government site. We started with the Finance site (the choice was based on luck–there’s really not much of a difference between the two other than images and content). Once we imported the sample site, we deleted all of the sample content and then tweaked the collection definitions. The “latest” collection, for example, contains the most recently-added presentations. The “featured” collection is static out-of-the-box, but we wanted it to be dynamic. So, we simply edited the collection metadata to add a query that returns all of the presentations that have been categorized as “Featured”. Alfresco runs the query periodically so that Content Consumers don’t take the performance hit when the page is rendered.

The carousel works similarly. We wanted Content Managers to be able to specify which presentations appear in the carousel simply by applying the “Carousel” category to the content from within Share, so the carousel collection looks for content that has that category applied.

The other tweaks we made to the sample data set include telling Web Quick Start which rendition should be used for the various thumbnails in the site. We added a new rendition definition for the images shown in the carousel and a new rendition for the thumbnails used everywhere else. Web Quick Start has its own rendition definitions, but the thumbnails aren’t set to maintain the aspect ratio when they are resized and that looks a little weird for images of thumbnail presentations, hence the need for our own.

Step 3: Customize page layout

Once we had some sample data in place and our collections defined, it was time to start laying out the pages. As luck would have it, the requirements matched up fairly closely to the layout of the sample financial site. Surf pages and templates are used for layout, but you don’t have to be a Surf Guru to make changes. Surf templates are just FreeMarker, after all. Other than minor re-arranging, our template modifications were limited to adding additional regions to existing templates.

Once you define your page layouts, you use metadata on the section folders to tell Web Quick Start which page layout to use for a given piece of content. The mapping is type-based. For a given section, you might say, “All instances of ws:indexPage in this section should use the ‘list’ template” which is expressed like this:

ws:indexPage=list

The template mapping is hierarchical. That means for a given piece of content, Alfresco will look at that content’s section for the template mapping but if it isn’t specified, it will look in that section’s parent, and so on. So if we specify:

cmis:document=publicationpage1

in a high-level section folder, all instances of cmis:document in child sections will be layed out using publicationpage1 unless it is overridden by a child section.

Note: If you are familiar with Surf, don’t be confused by the use of the word “template” here. It’s used by Web Quick Start in a generic sense. In the example above, “list” and “publicationpage1” are actually page data objects in Surf. For the Metaversant client that used Spring 3 and Tiles, for example, the template mapping specified which Tile definition to use.

Step 4: Add custom components

We spent most of our time on components. In Surf, components are implemented as web scripts. A web script implements the Model-View-Controller (MVC) pattern. In this case, controllers are JavaScript. Here’s what one of the controllers looks like:


model.articles = collectionService.getCollection(context.properties.section.id, args.collection);

if (args.linkPage != null) 
{
	model.linkParam = '?view='+args.linkPage;
}

This one is simply grabbing a collection of articles and a parameter and sticking them on the model for the FreeMarker-based view to pick up. Behind the scenes, the Web Quick Start API is making RESTful calls to the repository to retrieve (and cache) objects. This is a typical controller–in this app, most of the controllers are less than 10 lines long.

Web Quick Start already has components for different types of content lists. So doing something like the carousel was relatively easy: The collection service returns the “carousel” collection and a FreeMarker template dumps the collection metadata into a list that the YUI carousel control can grok.

We had a couple of places where we had to write our own components. One was the Categories component and another was the Related Presentations component. The Categories component is rendered using a YUI tree control. It gets its data by invoking a Surf-tier web script which, in turn, invokes a repository-tier web script that returns a list of category metadata as JSON. YUI then styles the data as a tree.

The Related Presentations component similarly invokes a repository-tier web script, but its JSON contains only node references. It then asks the Web Quick Start API to convert those into CMIS objects. That allows us to take advantage of the cache if the object has been loaded before, and it means we can re-use the Web Quick Start view templates that already know how to format lists of CMIS objects.

But wait, there’s more

Web Quick Start also has a user feedback mechanism that you can use for comments, “contact us” forms, and, at some point, ratings, but none of that was a requirement for this solution at the time we built it. The capability is there, so we’ll probably expose it at some point. Refer to the wiki for more information on the user feedback functionality.

There’s also a workflow that is used to submit content for review. Once approved, the content is queued up for publishing to the live site. The workflow is the same jBPM engine you are probably already familiar with so the process can be modified to fit your needs.

Although it is a new product and there are still a few hitches here and there, I was happy with Web Quick Start and the time it saved for building a site like this. I’m looking forward to seeing its continued evolution.

DeckShare is a simple site. There is a lot more that could be done with it and, based on client demand (or contributions from others, hint, hint), new features will be added over time. I think it’s a good example of what you can do quickly with Web Quick Start.

Hopefully, this write-up gave you some ideas you can use on your own projects. If you want to dive deeper into the Web Quick Start web application, check out the Web Quick Start Developer Guide on the Alfresco wiki. Feel free to grab the DeckShare source from Google Code and use it on your own projects.

If you’re interested in major customizations to DeckShare or you have a project that might be a fit for Alfresco Web Quick Start, let me know.