Category: Alfresco

Alfresco open source content management

Enjoyed the Atlanta Alfresco Meetup last week

Last week I joined about 15 other Alfresco fanatics for the Atlanta Alfresco Meetup. The attendees braved some seriously crappy weather to attend. We’re talking about trees falling on roads and power outages so I was pleasantly surprised it was more than just me and the guys who work in the building that showed up.

I gave a talk on my high-level plan for the Alfresco Community. Then, Dimy Jeannot of Armedia gave a project walkthrough based on some work they did with Alfresco Web Quick Start, the Web Editor Framework, Google Fusion Tables, and Salesforce.com. It was a good progression from business need to code. I think this particular meetup group is looking to get even more hands-on in the future–they’ve got a hack-a-thon style get together in the works.

Thanks to Dimy and Doug Bock for organizing the meetup and to Jim Nasr of Armedia for providing the location and snacks. I look forward to more great events from this group in the future.

This meetup was what I hope will be the start of several locally-driven Alfresco meetups happening around the world (see “Getting Involved with a Local Alfresco Community“). I know that the Boston, Washington, D.C., and Southern California groups are all planning on getting together soon. I’ll be at “Alfresco Day” in Madrid on June 22nd, which is an Alfresco-led event. I’m hoping to see (and attend) locally-driven Alfresco meetups in Spain and other parts of Europe later this year. South Africa is also planning an event that I’m really excited about.

If there’s not already a meetup in your corner of the world, put your name on the Alfresco Meetups Everywhere page and you can collaborate with others to get one started.

Screencasts highlight upcoming Alfresco Share features

[UPDATE: Due to the popularity of this post, Thorsten had to relocate his screencasts to this page, so I’ve removed the individual links and added a link to the entire collection. I’m also adding bullets for two new features he recorded screencasts for since the original post]

I came across this today via Twitter. Thorsten Schminkel (@schminkel) has posted a collection of screencasts that highlight new Alfresco Share features currently checked in to Alfresco Community head. Each is about a minute or less–just enough to get the gist of the following features:

  • Trashcan
  • In-place file name editor
  • Video preview
  • Manage system users
  • Change logo
  • Image preview dashlet
  • Drag-and-drop import
  • User-defined document library sorting
  • Like counter
  • Email notification configuration
  • User import via CSV

The latest official release of Alfresco Community is 3.4.d. Last month Alfresco released a preview release called 3.4.e meant to give everyone a first look at the new Activiti integration. None of the features Thorsten shows in these screencasts are available yet in an official release. Look for an official Community release to happen in late Summer or early Fall with an Enterprise release by the end of the year. If you can’t wait, you can do what Thorsten did: Check out the latest Community code from subversion, build it, and have fun playing with these and other new features.

Alternatively, functionally similar versions of some of these features have already been implemented by members of the community for versions of Alfresco in use today:

Thanks for doing those screencasts, Thorsten, and let me know if you create any more so we can add them to the list.

Alfresco DevCon 2010 presentations now available

I’ve uploaded most of the presentations from Alfresco DevCon 2010 to SlideShare. The easiest way to get to them is to use the DevCon 2010 tag.

You may be thinking, “Damn, the conference was seven months ago, why do I care?” and to that I have two responses. First, sorry. We’ll do better this year. Second, the collection includes some really helpful resources on a variety of topics. I think every one of them could help someone out there on projects today.

Here are some of my favorites:

Okay, that’s half of the sessions, but it is hard to narrow them down. Anyway, take a look and favorite the ones you really like. Also, if you are planning on attending DevCon this year, feel free to give me feedback like, “More sessions like this would be great,” or “Maybe not so much of this one this year”. That will help me plan the conference tracks and content.

Getting involved with a local Alfresco community

Even though there are still two weeks to go in this year’s Alfresco Community Survey, I couldn’t help but start to review the 1200 or so responses we’ve received so far. There are some great insights and suggestions coming through, but there’s one I wanted to jump on right away: It’s clear that a significant portion of the Community would like to see more local, Alfresco-focused, non-marketing,  gatherings (aka, meetups). And I’m right there with you. I think it is extremely important that local groups of people interested in Alfresco are able to get together regularly to share tips and tricks, to network, and to have fun. In this post I want to outline my perspective on events, my plan for local meetups, and some ideas on how to get involved with a local Alfresco community.

Alfresco Community Meetups are different from other events

Alfresco drives many types of events worldwide, including presence at third-party conferences, lunch-and-learns, training, and webinars. We also do an annual developer’s conference called Alfresco DevCon. Last year DevCon was in New York and Paris. We’re starting to plan for this year’s DevCon. We’re still finalizing cities and dates and I’ll let you know when that happens.

The events I’ve listed so far are completely driven by Alfresco. But there are several groups around the world that get together and talk about Alfresco on their own. These are grassroots, locally-organized meetups. Some meet more regularly than others. Some are a handful of people getting together for an informal happy hour while others are large groups with formal agendas, name tags, and everything.

In addition to these locally-run meetups, in the past, Alfresco has conducted “Community Meetups” that were really more like mini-conferences that happened in multiple geographies. These were fun and informative events, but they can’t happen with the frequency and scale that locally-driven meetups can.

Going forward, I’d like you, the community, to drive local meetups. And I’d like to see these happening more frequently, in more parts of the globe, for technical and non-technical audiences regardless of the Alfresco product they use. I want more people to feel that sense of family that I feel when I walk into a room full of people who share the same hopes, joys, and frustrations with Alfresco.

Local Alfresco communities should be driven by the local community

In short, I don’t want Alfresco to own, control, or constrain local Alfresco communities in any way. Ideally, anywhere there are two or more people that care about Alfresco, a local meet-up would form and those people would get together fairly regularly and, hopefully, grow to include others over time.

Alfresco’s role is to foster and support these local communities. I think we can add value in the following ways:

  • Alfresco can serve as a “connector”, matching up groups of interested community members with people willing to organize the local community
  • Alfresco can supply presentation content and, in some cases, people to deliver it in-person
  • Alfresco can help promote your meetup and drive attendance
  • Alfresco can support communities with Alfresco-branded giveaways and other small incentives

What we lack is the hyper-local perspective into the topics the local community is most interested in, the ability to know all of the cool projects going on in your area, and the feet on the ground to make every meeting a success. That’s where you come in. Local community events shouldn’t be driven by Alfresco’s Marketing team–they should be driven by you, the community, and Alfresco will do everything we can to support you.

So, as part of this, I’ve been reaching out to various communities around the world. If they haven’t met in a while, I’m encouraging them to get together, even if it is an informal meet-and-greet. If it is a group that was just thinking about getting together, I’m asking them to take that first step. And, if it is a group that has been meeting a while, I’m asking what, if anything, you need from me to keep it going.

How can you get involved?

This wiki page is the master list of existing local communities we know about as well as communities that people are interested in forming. If you are participating in a local community or are interested in forming one and that’s not reflected on the list, please update the wiki page.

Take the first step

If you are lucky enough to live near an established community, sign up and attend. If there isn’t a meeting happening any time soon ask the innocent question, “Why isn’t there a meeting happening any time soon?”. Maybe you’ll be the spark that gets it going again.

If you want to organize a meetup, it’s pretty easy. Decide on a time and a place, then let everyone know about it. You can use sites like Meetup.com or Google Groups to facilitate sign-up and collaboration, but that’s not a requirement.

If there isn’t a meetup already organized near you and you’d like to find out if others are interested, go to http://www.meetup.com/Alfresco, search for your city, and add your name to the list.

Decide where to take it from here

That first meeting doesn’t have to be a big production. It isn’t much work to get together and talk about what you are doing with Alfresco. While you’re talking, you may want to:

  • Set a focus. Is the goal to network, to learn from others, or something more specific? For example, I have been talking to multiple communities about organizing Alfresco-focused hack-a-thons/code sprints that would have a goal of creating new or contributing to existing Alfresco community projects.
  • Decide how often you want to get together. Meet too often and you’ll burn out the group. Don’t meet often enough and your group will lose interest. Somewhere in the neighborhood of monthly or quarterly is probably best.
  • Decide on an agenda for future meetings (or whether to have an agenda at all). You might have an end-user focused group that discusses tips/tricks for using the product and walks through case studies. Or, you might have a more technical group that dives into the details of a different part of the platform each meeting.
  • Establish ground rules. Maybe for your group, the rules are there are no rules. Or maybe a couple of common sense ground rules would help. It depends on the focus you’ve set. For example, you might want to ban blatant sales pitches and recruiters.
  • Pick an organizer. Someone needs to be on point for reminding the group about upcoming meetings. If you’ve decided on a more formal sort of group, that person will also need to facilitate setting the agenda and find people to speak. I’d recommend rotating this responsibility every 3 to 6 months, but you can decide.

Keep me posted

If you get a meetup going I want to know about it so I can support your group in the ways I’ve outlined above. Who knows, maybe I’ll even show up in person at one of your meetings.

Take the 2011 Alfresco Community Survey

When I announced my new role as Alfresco’s Chief Community Officer I mentioned that I would be asking you, the community of end-users, developers, partners, and Alfresco employees, for your input on how to make the Alfresco Community the example for all other commercial open source companies to follow. Obviously, I’ll take feedback in any form I can get it, but what would be great is if you would take 15 minutes to complete this survey. If you complete the survey by May 31, 2011, you could win one of two $250 Amazon gift cards.

Now, I know surveys can be a beating. But the information it helps me gather will allow us to plan all kinds of great things for the Alfresco community, from events to community tools. So, please speak up and give me your opinion and I’ll promise to listen and then to push for changes that matter most to the community. I’ll also summarize and present this data back to the community. If we do this year after year, we can hopefully see some cool trends emerge as we make progress.

Will Abson’s Wonderful World of Dashlets

Back when Alfresco first launched Surf, the framework on which Alfresco Share is based, a handful of us went to Chicago to hang out in a conference room on a ship in the harbor where we did a deep dive on the framework and then came up with proposed add-ons that would leverage it. I was at Optaros at the time. Our add-on was the Alfresco Share microblogging component and we also did some Surf Code Camps. The goal, of course, was to get the word out about Surf and encourage others to develop and contribute Share customizations.

The deep dive was great and the code camps that followed were valuable and well-attended. What I think the approach missed was that you don’t need to be a Surf expert to code some simple dashlets. We were handing out “How to Fly the Space Shuttle” when we probably should have started with “Building and Launching Your First Model Rocket”.

That’s why Will Abson is my current Alfresco community hero. At this year’s Alfresco Kickoff meeting in Orlando (notes), Will showed a project he and a few others have been working on called Share Extras. Share Extras is a collection of small projects ranging from “Hello World” dashlets to custom theme, data lists, and document action examples.

For example, the list of what I’d call simple, mash-up examples includes things like:

  • Twitter Feed Dashlet – Shows a specific Twitter user’s feed.
  • Twitter Search Dashlet – Shows a Twitter feed based on a hashtag.
  • BBC Weather Dashlet – Shows weather feed from BBC.
  • Flickr Dashlets – Shows flickr photos in a slideshow.
  • Google Site News – Shows the last ten blog posts from Google News.
  • iCal Feed – Shows entries from an iCal feed.
  • Notice Dashlet – Stores/shows arbitrary text, like what you’d use for a maintenance message or an announcement.
  • Train times – Shows the National Rail train schedule.

From there, you can move on to more extensive examples. For example, rather than simply displaying data from public services, these examples start to store/retrieve data in the underlying Alfresco repository:

  • Site Tags Dashlet – Displays a tag cloud consisting of tags used in your site.
  • Site Poll Dashlet – Uses a custom data list type called Poll to configure a simple poll. Shows results in bar chart.
  • Document Geographic Details – Adds a map using the document’s geocoding metadata just below the permissions section.
  • Sample Data Lists – A simple data list example that lets you capture info on Books (author, title, ISBN).
  • Execute Script Custom Document Action – Shows an example of adding a custom action to the action list that runs server-side JavaScript against a node.

The nice thing is that (almost) every one of these extensions deploys as a self-contained JAR file. Will’s build assumes you are running the repository and the Share web apps in the same container, so it deploys the JAR to $TOMCAT_HOME/shared/classes/lib, but you can obviously tweak that if your config is different. The ability to run everything out of a JAR, including what would normally be file system based resources like CSS, client-side JavaScript, and images is a relatively new feature (3.3, I think). It’s much nicer than fooling with AMPs.

Here is a list of my five favorites from the collection:

  • Node Browser – A port of the Explorer client’s node browser to the Share UI. I like this one because it brings an extremely useful developer tool into Share, which is where most of us are spending time these days. It also shows how you can plug your own tools into Share’s admin console.
  • Red Theme – A simple custom theme example. This is on my favorites list because creating a custom theme is something that is requested often and should be easy to do. Follow this example to create your own.
  • Site Geotagged Content Dashlet – Adds a dashlet that shows a map of geotagged content contained in the document library. I can’t help it. I like maps.
  • Site Blog Dashlet – Dashlet that shows site blog posts. This is a favorite because it plugs a hole in the product. If you’re going to use the blog tool in a Share site, you’re going to want to show those posts somewhere and a dashlet makes a lot of sense.
  • Wiki Rich Content – Automatically puts a table of contents at the top of a wiki page based on the headings contained within the page. Also does a nice job with pre-formatted text. This is another example of a feature that should probably be in the core product.

The Google Code project includes screenshots for each of these projects, but it is really easy to do a checkout on the code, import the projects into Eclipse, create a build.properties file in your home directory to override the tomcat.home prop, then run “ant hotcopy-tomcat-jar” to deploy one and see it in action for yourself. I tried them all out on Alfresco 3.4d Community and they worked great. I think all but one or two will work on 3.3.

The Share Extras project includes a Sample Project with a folder structure and Ant build that you can clone and use as a starting point for your own development. If you create something cool, you should share it on Google Code and then let me know about it. Or give it to Will and he can add it to his ever-growing pile of cool Share add-on examples.

Trying out Activiti: Examples that leverage Alfresco’s new workflow engine

I’ve been playing with Activiti. It’s an open source, BPMN 2.0 compliant business process engine. The project is sponsored by Alfresco, who hired Tom Baeyens and Joram Barrez, the founders of jBPM, to create the Apache-licensed engine (take a look at the rest of Activiti’s all-star cast).

The first thing I did was head over to Activiti’s site and read through the user guide. I followed the tutorial and got a standalone instance of Activiti going with very little fuss. The concepts and terminology aren’t terribly different from jBPM, so if you’ve used jBPM, you’ll be familiar with the basics of Activiti in no time. The user guide is well-written so I urge everyone to start there.

Last week, Alfresco released a preview release of their Community product, labeled 3.4.e. This release, which I stress is only for preview purposes, was made available to let everyone get a first look at Alfresco’s integration of Activiti. If you watched the screencast showing an Alfresco workflow based on Activiti you may have thought, “Gee, that looks just like a jBPM-based workflow,” and you’re right–from a user standpoint, it is nearly identical. The difference, of course, is how the processes are described and the underlying implementation that executes the processes.

The screencast showed that the end users won’t see much of a change. That’s good, but I was anxious to find out how big a deal this transition will be from a developer’s perspective. The 3.4.e release gave me the perfect opportunity to dig in. I decided to take the examples from the Advanced Workflow chapter in the Alfresco Developer Guide (2008, Packt) and make them work with Alfresco’s embedded Activiti engine in 3.4.e. In this post, I’ll talk about how that went and I’ll give you the code so you can try it out yourself.

The code that accompanies this blog post includes the same set of four workflows implemented both in jBPM and Activiti as well as a readme that explains how to install and run everything. I’ll let you inspect that to see what the exact differences are rather than go over them here. Instead, I’ll spend the rest of the post covering the major differences in general.

Before we go any further, I guess we should have a quick terminology discussion. First, in jBPM, everything is a node. Specialized node types do different things like joins, splits, decisions, wait-states, sub-processes, and enclose tasks that get assigned to humans. In Activiti (and really, in BPMN) there are essentially events (start, stop, timer), tasks, and gateways. Of course, I’m simplifying greatly here–you should read the spec and the Activiti user guide. The important thing to note for people coming from jBPM is that in Activiti a “task” might be something a human does (“userTask”) or it could be automated (“scriptTask”, “serviceTask”, etc.). In jBPM connections between nodes are called “transitions” while in Activiti they are called “sequenceFlows”.

Designing Processes

I use Eclipse, so the first step was to get the Activiti BPMN 2.0 Designer plug-in working. Installation is well-documented on the Activiti wiki and it installs just like any other Eclipse plug-in, so it went fairly smooth. I had some sort of dependency conflict that I had to deal with, but nothing major.

All in all, designing processes in Activiti works just like it does in jBPM. The tool is different, but you’re still laying out a business process graphically, connecting steps in the workflow, and setting properties on those objects.

There are some known issues with the Designer that made creating and editing processes painful at times. I’m not going to call every one of those out in this post because this is a preview release–I expected to work through a few bumps. I will warn you of a few to hopefully save you some time:

  • You cannot save the diagram until it is syntactically correct. This means the BPMN 2.0 XML will not get generated until the diagram is correct. On a new process, when the editor complains about the diagram, you’d kind of like to just drop in to the XML source and fix what needs fixing. If that’s what you want to do, you have to open the .activiti file in the XML editor, make the change, re-open in the diagram, and then make a change and save to force the generation of the BPMN 2.0 XML.
  • You cannot change things like IDs, names, form keys, and task assignment in the BPMN 2.0 XML. You have to change these in the Activiti diagram. If you change the BPMN 2.0 XML the settings in the Activiti diagram will overwrite the BPMN XML. This doesn’t sound like a big deal until you come across the next issue.
  • There is a known problem enabling the properties for an object in the diagram: clicking an object in the diagram doesn’t refresh the properties view. I worked around it by first clicking some other tab in the properties view, then double-clicking on the object (and sometimes repeating that) until the properties view refreshed with the appropriate property set.

Again, I didn’t expect everything to be fully functional, so I am not complaining. I just want you to have your expectations properly set when you play with this on your own.

I should mention that the overall look-and-feel of the Activiti Designer seems a lot crisper and more visually appealing than the JBoss Graphical Process Designer (GPD) Eclipse plug-in. As an example, I loved the alignment helper rules. And I liked that you can bend sequence flows.

Adding Business Logic to Processes

My goal was to take four Alfresco jBPM processes and port them to Activiti. The first three are variations on Hello World. The fourth is a more real-life process that is used to review and approve whitepapers. In the book, the Publish Whitepaper workflow uses an action to set properties on the approved whitepaper. And I show how to combine a wait state with a mail action and a web script to allow third parties without direct access to Alfresco to participate in a workflow. For the initial cut at this exercise, I skipped all of that. For now, I really wanted to focus on the basics of the workflow engine. But the state idea and the web script interaction are interesting so I’ll do that later and will provide the update in a future blog post.

Challenge 1: Alfresco JavaScript in automated steps

The first problem I came to was how to handle workflow steps that have no human intervention. In jBPM those steps are implemented as nodes. Alfresco JavaScript can live inside events within the node or on transitions between nodes. Tasks assigned to users are typically enclosed in a task-node. In Activiti, tasks assigned to users are called userTasks. All of Alfresco’s sample Activiti workflows consist entirely of userTasks. But Activiti includes several node types that aren’t user tasks: a scriptTask uses JavaScript or Groovy to implement its logic and a serviceTask delegates to a Java class. My helloWorld processes consist entirely of automated steps, so a scriptTask sounded good to me. The problem was that scriptTask uses Activiti’s JavaScript implementation, not Alfresco’s JavaScript. So doing something simple like invoking the “logger” root object doesn’t work in a scriptTask.

Fine, I thought, I’ll use one of Alfresco’s listener classes to wrap my logger call and stick that listener in the scriptTask. But that didn’t work either because in the current release Alfresco’s listener classes don’t fully implement the interface necessary to run in a scriptTask.

After confirming these issues with the Activiti guys I decided I’d put my Alfresco JavaScript in listeners either on a userTask or on a sequenceFlow (we called those “transitions” in jBPM) depending on what I needed to do. Hopefully at some point we’ll be able to use scriptTask for Alfresco JavaScript because there are times when you need automated steps in your process that can deal with the Alfresco JavaScript root objects you’re used to.

Challenge 2: Processes without user tasks

As I mentioned, my overly simple Hello World examples are nothing but automated steps. I could implement those without userTasks by placing my Alfresco JavaScript on sequenceFlows. But Alfresco complained when I tried to run workflows that didn’t contain at least one user task. I didn’t debug this, and it is possible I could have worked through it, but I decided for now, the Activiti versions of my Hello World examples would all have at least one userTask.

Challenge 3: Known issue causes iBatis exceptions

In 3.4.e, there is a known issue in which user tasks will cause read-only iBatis exceptions unless you set the due date and priority. Search my examples for “ACT-765” to find the workaround.

Challenge 4: Letting a user pick between multiple output paths

Suppose you have a task in which a human must decide whether to “Approve” or “Reject”. In Alfresco jBPM, you’d simply have two transitions and you’d set the label for those transitions in a properties bundle. In Alfresco Activiti that is handled a bit differently. Instead of having two transitions leaving the task, you have a single transition to an “exclusive gateway” (called a “decision”, in polite company). The task presents the “outcome” options–in this case “Approve” and “Reject”–to the user in a dropdown, as if it were any other piece of metadata on the task. Once the user picks an outcome and completes the task, the exclusive gateway checks the outcome value and takes the appropriate sequence flow. This difference will impact your business process logic, your workflow content model, and your end user experience so it is a significant difference.

For comparison, here’s what this looks like in the Alfresco Explorer UI for jBPM (click to enlarge):

And here is what it looks like in the Alfresco Explorer UI for Activiti (click to enlarge):

So in Explorer, with jBPM, the user can just click “Approve” or “Reject” while in Activiti, the user must make a dropdown selection and then click “Next”.

Here is the same task managed through the Alfresco Share UI for jBPM:

Versus Alfresco Share for Activiti:

Similar to the Explorer differences, in Share, with jBPM, the user gets a set of buttons while with Activiti, the user makes a dropdown selection.

One open question I have about this is how to localize the transition steps for Activiti workflows if the steps are stored as constraints in the content model. On a past client project we implemented a Share-based customization to localize constraint list items but our approach won’t work in Explorer. Maybe the Activiti guys can help me out on that one.

Exposing Process to the Alfresco User Interface

And that brings us to user interface configuration. Overall, the process is exactly the same. First, you work on your process definition, then you create a workflow content model. Once the workflow content model is in place, you expose it to the user interface through the normal Alfresco user interface configuration approach. For the Explorer client that means web-client-config-custom. For the Share client that means share-config-custom. Labels, workflow titles, and workflow descriptions are localized via properties bundles.

One minor difference is that in jBPM, task names are identical to corresponding type names in your workflow content model. In Activiti, a userTask has an attribute called “activiti:formKey” that is used to map the task to the appropriate content type in the workflow content model.

Assigning Tasks to Users and Groups

The out-of-the-box workflows for both jBPM and Activiti show how to use pickers to let workflow initiators assign users and groups to workflows. My example workflows use hardcoded references rather than pickers so that you’ll have an example of both approaches. In my Hello World examples, I assign the userTask to the workflow initiator. This is done by using the “activiti:assignee” attribute on userTask, like this:

<userTask id="usertask3" name="User Task" activiti:assignee="${initiator.properties.userName}" activiti:formKey="bpm:task">

If you need to use a more complex expression there’s a longer form that uses a “humanPerformer” tag. See the User Guide.

In the Publish Whitepaper example I use pooled group assignment by using the “activiti:candidateGroups” attribute on userTask, like this:

<userTask id="usertask7" name="Operations Review" activiti:candidateGroups="GROUP_Operations" activiti:formKey="scwf:activitiOperationsReview">

Again, if you need to, there’s a longer form that uses a “potentialOwner” tag.

In my jBPM examples I use swimlanes for task assignment. I didn’t get a chance to use the equivalent in Activiti.

Deploying Processes

In standalone Activiti there are multiple options for deploying process definitions to the engine, including uploading a BAR (Business Archive) file into the running engine. I couldn’t find the equivalent of that in Alfresco’s embedded Activiti implementation or the equivalent of the jBPM deployer servlet, so for this exercise I used Spring configuration for both Activiti and jBPM processes. I hope by the time the code goes into Enterprise there will be a dynamic deployment option because that’s really helpful during development.

Workflow Console

Alfresco’s workflow console is a critical tool for anyone doing anything with advanced workflow. It has always been a puzzle to me as to why the workflow console (along with others) can only be navigated to directly using an unpublished URL. That head-scratcher still remains, but rest assured, all of your favorite console commands now work for both jBPM and Activiti workflows.

Summary

I hope this post has given you a small taste of the new Activiti engine embedded in Alfresco. I haven’t spent any time talking about the higher level benefits to Activiti. And there are many more details and features I didn’t have time to go into. My goal was to give all of you who have experience with Alfresco jBPM some start at getting your head around the new option for advanced workflow.

If you haven’t done so, grab a copy of Alfresco 3.4.e, download these examples, and play around. The zip is an Eclipse project that will deploy the workflows and associated configuration to your Alfresco and Share web applications via ant. The included readme file has step-by-step directions for running through each jBPM and Activiti example.

It is entirely possible that I’ve done something boneheaded. If so, do let me know so that all of us can benefit.

Resources

Three watershed moments in my career (Hint: One just happened)

I’ve recently made a big shift in the career department. But rather than tell you what it is right off, I want to build up to it. I think it’s kind of a cool story, so if you’ll bear with me, here are the three watershed moments of my career thus far…

Watershed moment #1: Specialization leads to consulting

In 1992, I graduated college and went to work for Texas Instruments working on mainframes. Somehow, I got exposed to Lotus Notes development. I loved it. I dove in deep, eventually leaving for a job where I could be completely focused on Notes. Notes taught me a lot about managing unstructured data and how people collaborate to get work done. I learned that, for me, interesting IT problems are those where humans and systems have to work together to get something done. And it taught me a lot about what a passionate technical community looks like. Ultimately it led to a job at a small, but up-and-coming consulting firm where I would spend the next nine years. That decision to focus on Notes development was a watershed moment.

Watershed moment #2: My blog gets me a job in open source

Fast-forward to 2001. My content management practice was making a shift. Notes was falling out of favor and many of our clients were looking at WCM and DM solutions from large proprietary vendors. We started looking at open source technologies as well, but it was a tough sell to our traditional clients who had never heard of open source, and if they had, were skeptical or even fearful. We started implementing Documentum-based solutions and did that for the next three years, but I continued to dabble in open source. A revolution seemed afoot, but I couldn’t figure out the best way to jump in.

I started blogging in 2001, stopped, then started again in 2002. My rationale was simple: Writing helped me learn. And, for virtually no added cost, I could multiply the benefit by sharing what I learned–particularly with coworkers, but if others got value out of it, that was okay too. The idea that if my writing helped enough people it might help the open source movement in some tiny way was a romantic notion, but seemed remote.

Then I came across Alfresco. In October of 2005 I wrote my first Alfresco-related blog post. It said simply, “Alfresco is an open source enterprise content management solution founded by one of the co-founders of Documentum,” and then included a lengthy excerpt from a Gilbane post on Alfresco’s release candidate. A month later I published a more detailed review of the product. After three or four years of blogging, I was starting to find my voice. Little did I know that I had also found a passion.

By 2006, my firm had been acquired and Alfresco was starting to look like it had legs. I looked back on my past Documentum projects and realized that Alfresco was a viable alternative as the underlying repository in every case. Open source had been around for years but it had been sneaking quietly in the back doors of my clients in the form of operating systems, developer libraries, databases, and tooling. Alfresco, and other commercial open source companies, were poised to crash through the front door with business-facing open source applications. I wanted in. I left my firm to join Optaros, an open source consultancy I had discovered through fellow content management blogger and then Optaros employee, Seth Gottlieb. My blog had gotten me a job working with a technology I loved. That was the second watershed moment.

Watershed moment #3: Wait for it…

My four years at Optaros gave me the opportunity to focus on Alfresco full-time. Not just implementing projects, although there were many. Just as important, I was able to fully-engage with the Alfresco community. I wrote blog posts and tutorials. I created add-ons and integrations and released those as open source projects. I wrote a book. I conducted code camps. I attended every event Alfresco ever put on and gave talks at most of those. I didn’t set out to be an evangelist, but that’s what I became. Did it benefit me, Optaros, and later, my own start-up, Metaversant? Of course it did. But, here’s the kicker: Acting in my own self-interest turned out to be a huge benefit to the greater Alfresco community. And I’m not alone. Many people all around the world are participating in the community in all kinds of ways to everyone’s mutual benefit.

Which brings us to the next watershed moment: Alfresco has hired me as their new Chief Community Officer. My mission is essentially to make the Alfresco community an example for all other commercial open source companies to follow. It’s a significant challenge, and I’m going to need your help. Alfresco may sign my check, but I work for the community. Therefore, you’ve got to tell me where we should take this thing. We have our ideas but yours are critical.

What this means

I’ll give specifics on how you can help in a future post. I expect that the specific strategies we undertake together will fall roughly into these buckets:

  • Motivating community members, regardless of skill set or relationship to Alfresco to engage more deeply in the community
  • Enabling the community with tools, resources, and product enhancements that leverage community contributions
  • Exposing the greatness already existing in the community, whether that’s in the form of contributions that have been made that people just don’t know about or shining a light on community contributors doing awesome things

And, of course I get to continue to work on my own community contributions like my work with Apache Chemistry, my Google Code projects, the blog, and new stuff I haven’t even thought of yet.

It was a tough decision to put the growth of my content management-focused consulting firm, Metaversant, on hold, but when Alfresco approached me about this opportunity, I had to take it. My career and my passion are already dovetailed. I do what I love, and for that I am very lucky. Who wouldn’t take the opportunity to make that an even tighter fit?

I am very excited about what this means for the community and the importance Alfresco places on its growth and well-being. I hope you are excited too. Actually, “hope” is the wrong word–I need you to be excited. Who’s with me? Ready to pitch in?

Improving Alfresco Share performance by using getChildren

I had a client that was seeing response time in the neighborhood of several seconds for the Alfresco Share document library and data list pages across all of his sites. The client’s Share install had just over 1,000 Share sites. The volume of the data lists in each site was insignificant. This is the story of how we resolved the issue, but note that the resolution may not be appropriate for everyone in all cases.

The Symptoms

The client was seeing slow performance of the document library and data list pages in Share. They noticed that the folder tree in the document library view responded quickly but the actual document list itself took a long time to render. This was happening for all users in all sites.

Looking for a quick resolution, even if that meant solving the symptom but not necessarily the underlying cause, we decided to see if we could optimize the repository tier web script that returns the document library contents to see if we could get it to perform a little closer to what we were seeing with the folder tree. I made a copy of the repository tier’s doclist.get web script into our project’s extension directory and started tweaking.

The Bunny Trail

First, I’ll fess up to a mistaken assumption I had: I thought that everything in Alfresco always went through Lucene. I knew that separating out full-text index searches and property searches into Lucene queries and DB queries, respectively, was on the roadmap, but I had it in my head that in 3.4, even a call to something like ScriptNode.getChildren() was ultimately a Lucene index hit. If everything is a Lucene hit, I figured, there had to be a different reason for the folder list control to perform so much better than the document library list.

So, instead of starting with what, in hindsight, would have yielded the most bang for the buck, I started tuning what turned out to be little things. For example, our app didn’t use favorites, so I removed any references to the preferences service. Our app didn’t allow users to check out documents so out went any logic that dealt with that. Goodbye, Google Docs code blocks. Adios, type and aspect checking for types and aspects our app doesn’t support. Farewell, filters. I hardcoded permissions to avoid the lookup. I set created by and modified by values to empty strings to avoid the lookup to the person object. I jettisoned anything that wasn’t crucial to simply producing a list of the folder contents. All of this did speed up the repository-tier web script, but only a little bit. I needed an order of magnitude improvement.

The Lightbulb

Next, I did what I should have done initially: Add some simple log statements to see which part of the code was taking the longest to execute. Of course, it was the query. As it turns out, it is much, much faster to ask a node for its children (which is what the treenode web script does) than it is to do a Lucene search with a PARENT clause that yields the same result set (which is what the doclist web script does). On a dev machine with a small dataset, you don’t notice the difference. But on our integration and prod servers the difference is huge.

The Fix

The Share document library page uses a YUI data table to produce the list of documents for the currently selected folder. The data table is bound to a web script that lives on the repository tier that is responsible for returning the requested data as JSON. Out-of-the-box, the repository tier web script that returns the document list calls a function called getFilterParams which is responsible for setting up a bunch of query predicates based on the document library filter the user has selected in the Share UI. The script then asks the filterParams object for the Lucene query it needs to run to return the document list. It then uses the search service to invoke the query and return the results.

My optimization was to bypass building and executing the query completely because, in our case, we don’t care about filters. All we want is the list of children in the current folder, and ScriptNode already has a function to do that called getChildren. So instead of performing a Lucene search, we ask the current “root node” for its children. We then iterate over the results and filter out a couple of content types that otherwise would have been excluded had we used the Lucene query instead of getting all children.

Oh man, that did it. The document library went from rendering in 6+ seconds to rendering in less than 1 second.

I gave the data lists web script the same treatment. In that case, our customized Share app still makes use of filters, so the “getChildren bypass” is only used when the “All” filter is selected. When any other filter is selected the original out-of-the-box Lucene query is used.

Now, again, I completely acknowledge that we may have succeeded in speeding up performance for those two cases, but failed to resolve the underlying issue, and addressing that may result in a system-wide performance boost, but it was good to get the quick fix in place and it should be easy enough to revert if and when we resolve the underlying index issue, if one exists.

Here’s a code snippet from the custom doclist.get.js controller if you are curious:


if (parsedArgs.path == "")
{
    parentNode = parsedArgs.rootNode;
}
else
{
    parentNode = parsedArgs.rootNode.childByNamePath(parsedArgs.path);
}      
// We are iterating over the parent node's children instead of iterating
// over search results... 
for each (node in parentNode.getChildren())
{
   try
   {
      // ...so we need to filter out some system types that would have otherwise been
      // filtered out by the lucene query
      if (node.typeShort == "cm:systemfolder" || node.typeShort == "cm:thumbnail")
      {
         // do nothing. we don't want these.
      }
      else if (node.isContainer || node.typeShort == "app:folderlink")
      {
         folderNodes.push(node);
      }
      else
      {
         documentNodes.push(node);
      }
   }
   catch (e)
   {
      // Possibly an old indexed node - ignore it
   }
}