Category: General

General thoughts that defy categorization.

Tutorials updated for Alfresco SDK 3.0.0

A week or so ago Alfresco released version 3.0.0 of their Maven-based SDK. The release was pretty significant. There are a variety of places you can learn about the new SDK release:

I recently revised all of the Alfresco Developer Series tutorials to be up-to-date with SDK 3.

Upgrading each project was mostly smooth. The biggest change with SDK 3.0.0 is that the folder structure has changed slightly. At a high level, stuff that used to go in src/main/amp now goes in src/main/resources. These changes are good–they make the project structure look like any other Maven-based project which helps other developers (and tools) understand what’s going on.

I did have some problems getting the repo to build with an out-of-the-box project which I fixed by using the SNAPSHOT rather than the released build.

Support for integration tests seems to have changed as well. I actually still need to work that one out.

Other than that, it was mostly re-organizing the existing source and updating the pom.xml.

A small complaint is that the number of sample files provided with the SDK has grown significantly, which means the stuff you have to delete every time you create a new project has increased.

I understand that those are there for people just getting started, but no one, not even beginners, needs those files more than once, so I’d rather not see them in the main archetype. Maybe Alfresco should have a separate archetype called “sample project” that would include those, and configure the normal archetype as a completely empty project. Just a thought.

The new SDK is supposed to work with any release back to 4.2 so you should be able to upgrade to the new version for all of your projects.

This is How I Work

On any given day I might be writing code, designing the architecture for a content-centric solution, meeting with clients, collaborating with teammates, installing software on servers, writing blog posts, answering questions in forums, writing and signing contracts, or doing bookkeeping. Some days I do all of that. My day is probably similar to that of anyone else doing professional services work as a small business. Here are some of the tools I use to keep all of that going smoothly.

Hardware & Operating System

In 2006 I left Windows behind and never looked back. I ran Ubuntu as my primary desktop for three years, then switched to Mac OS X. I still love Linux, and all of my customers run Linux servers, but for my primary machine, I am most productive with my MacBook Pro and OS X.

I’ll admit that the latest MacBooks shook my faith by incorporating a shiny feature I’ll never use and not upgrading the CPU or RAM. I briefly considered moving back to Linux on something like a System76 or a Lenovo, but I am so deep into the Apple ecosystem in both home and office it may not be practical to switch. I’m hopeful the new MBP’s will get beefed up in terms of RAM and CPU later this year.

Collaborating with teammates and clients

I’ve been using Trello for project and task tracking for a long time and it really agrees with me. I set up Trello teams with some of my clients and it helps keep us all on track.

For real-time collaboration I tend to use Slack. It doesn’t always make sense to do so, but for selected projects, I invite my clients to my corporate Slack team and we use private channels to work on projects. We use Slack’s integrations to get notified when changes happen in Trello or in our codebase which resides in either Git or Bitbucket.

For real-time collaboration via IRC, Jabber, and GChat I use Adium.

Creating content & code

For plain text editing, I use either Aquamacs or Atom depending on what I’m doing. If it’s just a free-form text file, like maybe a blog post or just some rough meeting notes or something like that I’ll use Aquamacs, which is a Mac-specific distribution of Emacs. If I am doing non-compiled coding in something like JSON, XML, Python, Groovy, or JavaScript I’ll typically use Atom.

For more intense JavaScript projects I will often switch to WebStorm and sometimes for Python I’ll use PyCharm instead of Atom.

For Java projects I’ve recently moved from Eclipse to IntelliJ IDEA. I’ve used Eclipse for many, many years, but IntelliJ feels more reliable and polished. I’ve been pretty happy with it so far.

IntelliJ, WebStorm, and PyCharm are all available from JetBrains and the company offers an all-in-one subscription that is worth considering.

I do a lot with markdown. For example, if I’m taking notes on a customer’s installation and those notes will be shared with the customer, rather than just doing those notes in plain text with no structure, I’ll use markdown. Then, to preview the document and render it in PDF I use Marked. This is also handy for previewing Github readme files, but Atom can also be used for that.

Another time saver is using markdown to produce presentations. I wouldn’t necessarily use it for marketing-ready pitches, but for pulling together a quick deck to review thoughts with a customer or presenting a topic at a meetup, markdown is incredibly fast. To actually render and display the presentation from markdown I use Deckset. It produces beautiful presentations with very little effort while maintaining the editing speed that plain text markdown provides.

Sometimes I’ll create a video to illustrate a concept, either to help the community understand a feature or extension technique or to demo some new functionality to a customer when schedules won’t align. Telestream is a wonderful tool for creating such screencasts.

My open source projects live at Github while my closed source projects live at Bitbucket. The primary draw to Bitbucket is the free private repositories, but I really like what Atlassian offers. When git on the command-line just won’t do, I switch to Sourcetree, Atlassian’s visual git client.

Automation, Social & News Feeds

I have a client where I have to manage over 60 servers across several clusters. To automate the provisioning of new nodes, upgrades, and configuration management, I use Ansible. It’s much easier to learn than Chef and it requires no agents to be installed on the servers being managed. Plus, it’s Python-based.

Lots of servers, lots of customers, and lots of business and personal accounts means password management can become an issue. I use KeePassX on my desktop and iKeePass on my iOS devices to keep all of my credentials organized.

I have a ton of different news sources I try to keep up with. Feedly helps a lot. I use it on my mobile devices and on the web.

I have a personal Twitter account and a corporate Twitter account, plus I help out with other accounts from time-to-time. Luckily, Hootsuite makes that easy, and it handles more than just Twitter.


When you run your own business there are two things that can be a time suck without the right tools: contracts/forms and bookkeeping. PDF Expert lets me fill in PDF forms and sign contracts and other documents right on my mobile device. It has direct integrations with Google Drive, Dropbox, and other file share services so it is easy to store signed documents wherever I need to.

I used QuickBooks Desktop Professional Services Edition to handle my bookkeeping for several years. But that edition was only for Windows, so I ran it in a Windows VM on my Mac with VMWare, which was kind of a drag. I finally got tired of that and migrated to QuickBooks Online. The migration was completely painless. I just called them up, and within about an hour they had taken my money and moved all of my data without anything getting screwed up. It was a pretty awesome experience.

Physical Office

I’ve been working from home for over ten years, so I appreciate the importance physical space plays in a productive working environment. My desk is a custom-built UPLIFT standing desk with a solid cherry top. I love it and haven’t had a problem with it. I do need to make myself stand up more often, though.

I found that even with the desk raised completely, my cinema display wasn’t quite high enough. So I snagged a Humanscale M8 articulating mount with an Apple VESA adapter. Now I just grab my display and put it exactly where it needs to be.

My Mac hooks up to my display via Thunderbolt, which makes connecting and disconnecting a breeze. But I didn’t like how much real estate the laptop took up on my desk. There are a variety of solutions for this. I went with a Twelve South BookArc vertical desktop stand and that’s worked really well. I have a minor concern about whether or not using a MacBook in a vertical position is bad with regard to heat dissipation but I’ve decided to roll the dice on that.

I love my home office setup, but I do get tired of the same four walls sometimes. To combat that and to just change things up a bit, every week I try to spend some time in a co-working space. Here in my hometown there’s one right on the square in the historic part of downtown that has got a good vibe and is close to good food. You might check Sharedesk to see if there’s something similar near you.

So there you have it. Those are some of the tools I use every day. Got any favorites you’d like to share?

The plain truth about Alfresco’s open source ethos

There was a small flare-up on the Order of the Bee list this week. It started when someone suggested that the Community Edition (CE) versus Enterprise Edition comparison page on put CE in a negative light. In full disclosure, I collaborated with Marketing on that page when I worked for Alfresco. My goal at the time was to make sure that the comparison was fair and that it didn’t disparage Community Edition. I think it still passes that test and is similar to the comparison pages of other commercial open source companies.

My response to the original post to the list was that people shouldn’t bother trying to get the page changed. Why? Because how Alfresco Software, Inc. chooses to market their software is out-of-scope for the community. As long as the commercial company behind Alfresco doesn’t say anything untrue about Community Edition, the community shouldn’t care.

The fact that there is a commercial company behind Alfresco, that they are in the business of selling Enterprise support subscriptions, and at the same time have a vested interest in promoting the use of Community Edition to certain market segments is something you have to get your head around.

Actually there are a handful of things that you really need to understand and accept so you can be a happy member of the community. Here they are:

1. CE is distributed under LGPLv3 so it is open source.

If you need to put a label on it and you are a binary type of person, this is at the top of the list. Alfresco is “open source” because it is distributed under an OSI-approved license. A more fine-grained description is that it is “open core” because the same software is distributed under two different licenses, with the enterprise version being based on the free version and including features not available in the free version.

2. Committers will only ever be employees.

There have been various efforts over the years to get the community more involved in making direct code contributions. The most recent is that Aikau is on github and accepting pull requests. Maybe some day the core repository will be donated to Apache or some other foundation. Until then, if you want to commit directly to core, send a resume to Alfresco Software, Inc. I know they are hiring talented engineers.

3. Alfresco Software, Inc. is a commercial, for-profit business.

Already mentioned, but worth repeating: The company behind the software earns revenue from support subscriptions, and, increasingly, value-added features not available in the open source distribution. The company is going to do everything it can to maximize revenue. The community needs this to be the case because a portion of those resources support the community product. The company needs the community, so it won’t do anything to aggressively undermine adoption of the free product. You have to believe this to be true. A certain amount of trust is required for a symbiotic relationship to work.

4. “Open source” is not a guiding principle for the company.

Individuals within the company are ardent open source advocates and passionate and valued community members, but the organization as a whole does not use “open source” as a fundamental guiding principle. This should not be surprising when you consider that:

  1. “Drive Open Innovation” not “Open Source” is a core value to the company as publicly expressed on the Our Values page.
  2. The leadership team has no open source experience (except John Newton and PHH whose open source experience is Alfresco and Activiti).
  3. The community team doesn’t exist any more–the company has shifted to a “developer engagement” strategy rather than having a dedicated community leadership or advocacy team.

Accept the fact that this is a software company like any other, that distributes some of its software under an open source license and employs many talented people who spend a lot of their time (on- and off-hours) to further the efforts of the community. It is not a “everything-we-do-we-do-because-open-source” kind of company. It just isn’t.

5. Alfresco originally released under an open source license primarily as a go-to-market strategy

In the early days, open source was attractive to the company not because it wanted help building the software, but because the license undermined the position of proprietary vendors and because they hoped to gain market share quickly by leveraging the viral nature of freely-distributable software. Being open was an attractive (and highly marketable) contrast to the extremely closed and proprietary nature of legacy ECM vendors such as EMC and Microsoft.

I think John and Paul also hoped that the open and transparent nature of open source would lend itself to developer adoption, third-party integrations and add-ons, and a partner ecosystem, which it did.

I think it is this last one–the mismatch between the original motivations to release as open source and what we as a community expect from an open source project–that causes angst. The “open source” moniker attracts people who wish the project was more like an organic open source project than it can or ever will be.

For me, personally, I accepted these as givens a long time ago–none of them bother me any more. I am taking this gift that we’ve been given–a highly-functional, freely-distributable ECM platform–and I’m using it to help people. I’m no longer interested in holding the company to a dogmatic standard they never intended to be held to.

So be cool and do your thing

The “commercial” part of “commercial open source” creates a tension that is felt both internally and externally. Internal tension happens when decisions have to be made for the benefit of one side at the expense of the other. External tension happens when the community feels like the company isn’t always acting in their best interest and lacks the context or visibility needed to believe otherwise.

This tension is a natural by-product of the commercial open source model. It will always be there. Let’s acknowledge it, but I see no reason to antagonize it.

If you want to help the community around Alfresco, participate. Build something. Install the software and help others get it up and running. Join the Order of the Bee. If you want to help Alfresco with its marketing, send them your resume.

Quick Hack: Creating default folder structures in Alfresco Share sites

Someone recently asked how to create Alfresco Share sites with a default folder structure. Currently, out-of-the-box, when you create an Alfresco Share site, the document library is empty. This person instead wanted to define a set of folders that would be created in the document library when a new Alfresco Share site is created.

Although this exact functionality is not available out-of-the-box, there is something similar. It’s called a “space template”. Space templates have been in the product since the early days but haven’t yet been exposed to Alfresco Share. In the old Alfrexsco Explorer client you could specify a space template when you created a new folder, and the resulting folder would have the same set of folders and documents that were present in the template.

With a little bit of work using the out-of-the-box extension points we can get Alfresco to use space templates when creating new sites in Share. I’ve created this as an add-on and the code lives on GitHub. I thought it might be instructive to review how it works here.


The first thing to realize is that a space template isn’t special. It’s just a folder that happens to live in Data Dictionary/Space Templates. In fact, there aren’t any API calls specific to space templates. If you go looking for a createFolderFromTemplate() method on a Folder object you’ll be disappointed. When the Alfresco Explorer client creates a folder from a template, it simply finds the template folder and copies its contents into the newly-created folder.

On the Alfresco Share side, a site isn’t that special either. It’s just a special type of folder. The document library that sits within a Share site is also just a folder, albeit a specially-named folder with an aspect. Normally, the document library folder does not get created until the first user actually opens the site and navigates to the document library.

So all we really need to do is write some code that gets called when a site is created, looks up the space template, and then creates the document library folder with the contents of the space template.

What’s the best way for the code to know which space template to use? One way would be to use a specially-named space template for all Share sites. But using a single space template for all Share sites seems limiting. Alfresco Share already has a mechanism for selecting the “type” of site to create–it’s called a preset. So a better approach is to use the preset’s ID to determine which space template to use.


We need to run some code when a site is created. One way to do this is with a behavior. The behavior is Java code that will be bound to the onCreateNode policy for nodes that are instances of st:site. The init() method does that:

// Create behaviors
this.onCreateNode = new JavaBehaviour(this, "onCreateNode", NotificationFrequency.TRANSACTION_COMMIT);

// Bind behaviors to node policies
this.policyComponent.bindClassBehaviour(QName.createQName(NamespaceService.ALFRESCO_URI, "onCreateNode"), TYPE_SITE, this.onCreateNode);

So any time a node is create that is an instance of st:site, the onCreateNode() method in this class will get called.

The first thing the onCreateNode() method needs to do is find the space template. To keep things simple, I’m going to assume that the space template is named the same thing as the site preset ID, so all I need to do is grab that preset ID and do a Lucene search to find the template:

NodeRef siteFolder = childAssocRef.getChildRef();

if (!nodeService.exists(siteFolder)) {
logger.debug("Site folder doesn't exist yet");

//grab the site preset value
String sitePreset = (String) nodeService.getProperty(siteFolder, PROP_SITE_PRESET);

//see if there is a folder in the Space Templates folder of the same name
String query = "+PATH:\"/app:company_home/app:dictionary/app:space_templates/*\" +@cm\\:name:\"" + sitePreset + "\"";
ResultSet rs = searchService.query(StoreRef.STORE_REF_WORKSPACE_SPACESSTORE, SearchService.LANGUAGE_LUCENE, query);

If there aren’t any space templates the method can return, otherwise, the space template should be copied into the site folder as the new document library folder:

documentLibrary = fileFolderService.copy(spaceTemplate, siteFolder, "documentLibrary").getNodeRef();

//add the site container aspect, set the descriptions, set the component ID
Map<QName, Serializable> props = new HashMap<QName, Serializable>();
props.put(ContentModel.PROP_DESCRIPTION, "Document Library");
props.put(PROP_SITE_COMPONENT_ID, "documentLibrary");
nodeService.addAspect(documentLibrary, ASPECT_SITE_CONTAINER, props);

I used the fileFolderService to perform the copy, then I set the properties and aspect on the new document library folder with the values that Alfresco Share expects a document library to have.

I’ve left out some insignificant code bits here and there–you can look at the source if you need to. The project was bootstrapped using the Alfresco Maven SDK and includes a unit test that makes sure the behavior works as expected.


The project creates an AMP file. If you’ve checked out the source you can build the AMP using mvn install. Then you can install the AMP into the Alfresco WAR by placing the AMP in the amps directory and run Alternatively, you can use mvn alfresco:install. After installing the AMP into the Alfresco WAR you can test it out.

Out-of-the-box Alfresco Share has a single site type called “Collaboration Site”. The preset ID for that type of site is “site-dashboard”. You might have additional site types configured in your installation. To set up the template for the default collaboration site, navigate to Data Dictionary/Space Templates and create a folder called “site-dashboard”. Then add whatever folders and documents you want to that folder. Now, anytime a “Collaboration Site” gets created in Alresco Share, the document library will automatically be created with the same folders and documents you’ve set up in your space template.

This was not a mind-blowing, deeply-technical extension to Alfresco. Hopefully, it shows you that, with even a few lines of code, you can easily add useful functionality to Alfresco.

Alfresco 4.2.d: The Public API & some CMIS changes you should be aware of

Well, Alfresco Community Edition 4.2.d has been out for a couple of weeks now and everyone I’ve talked to seems pretty excited about the new release. The new ribbon header in the Share user interface, the filmstrip view, and the table view for the document library seems to have gotten all of the attention (and it does look sharp) but my favorite new feature is actually behind-the-scenes: The Alfresco Public API has finally made its way from Alfresco in the cloud to on-premise.

What is the Alfresco Public API?

We’ve talked about it a lot over the last year, but it’s always been cloud-only, so you’re forgiven if you need a refresher. Basically, it means that we now have a documented, versioned, and backward compatible API that you can use to do things like:

  • Get a list of sites a user can see
  • Get some information about a site, including its members
  • Add, update, or remove someone from a site
  • Get information about a person, including their favorite sites and preferences
  • Get the tags for a node, including updating a tag on a node
  • Create, update, and delete comments on a node
  • Like/unlike a node
  • Do everything that CMIS can do

The Public API is comprised of CMIS for doing things like performing CRUD functions on documents and folders, querying for content, modifying ACLs, and creating relationships plus non-CMIS calls for things that CMIS doesn’t cover. When you run against Alfresco in the cloud you use OAuth2 for authentication. When you run against Alfresco on-premise, you use basic authentication.

Some Examples

I created some Java API examples for cloud a while ago. This week I spent some time on those examples to get them cleaned up a bit here and there and to have a really clear separation between what’s specific to running against Alfresco on-premise versus Alfresco in the cloud and what’s common to both. The result is a single set of code that runs against either one.

Here’s how the project is set up:

The remaining classes in com.alfresco.api.example are the actual examples. These would probably be better structured as unit tests but for now, they are just runnable Java classes. Currently there are only four:

  • simply fetches the short names of up to 10 sites the current user has access to.
  • is the simplest CMIS example you can write. It retrieves some basic information about the repository.
  • creates a folder, creates a document, likes the folder, and comments on the document. This is a mix of CMIS and non-CMIS calls.
  • imports some images, uses Apache Tika to extract some metadata from the binary image file, then uses the OpenCMIS Extension to store those metadata values in properties defined in aspects. The extension has to be used because CMIS 1.0 doesn’t support aspects.

Take a look at the file to see what changes you need to make to run these examples in your own environment.

New CMIS URLs in Alfresco Community Edition 4.2.d

One thing to notice is that starting with this release there are two new CMIS URLs. Which one you use depends on which version of CMIS you would like to leverage. If you want to use CMIS 1.0, the URL is:

If you would rather use CMIS 1.1, the URL is:

New with 4.2.d is the CMIS 1.1 browser binding. Currently Alfresco in the cloud only supports Atom Pub, but if you are only running against on-premise, you might decide that the browser binding, which uses JSON instead XML, is more performant. If you want to use the browser binding, the service URL is:


CMIS Repository ID and Object ID Changes

When using the new URLs, you’ll notice a couple of things right away. First, the repositoryId is coming back as “-default-” instead of the 32-byte string it used to return.

Second, the format of the object IDs has changed. The CMIS specification dictates that object IDs are supposed to be opaque. You shouldn’t build any logic on the content of a CMIS object ID. However, if you have built a client application that stores object IDs, you may want to pay particular attention to this when you test against the new version. The object IDs for the same object will be different than they were when using the old CMIS URLs.

You can see this in 4.2.d. If I use the old “/alfresco/cmisatom” URL, and I ask for the ID of the root folder, I’ll get:

But if I use either the new CMIS 1.0 URL or the new CMIS 1.1 URL and ask for the ID of the same root folder, I’ll get:

In order to minimize the impact of this change, you can issue a getObject call using either the old ID or the new ID and you’ll get the same object back. And if you ask for the object’s ID you’ll get it back in the same format that was used to retrieve it, as shown in this Python example:

>>> repo.getObject('ae045ec3-5578-4c7b-b5a6-f893810e63dc').id
>>> repo.getObject('ae045ec3-5578-4c7b-b5a6-f893810e63dc').name
u'Company Home'
>>> repo.getObject('workspace://SpacesStore/ae045ec3-5578-4c7b-b5a6-f893810e63dc').id
>>> repo.getObject('workspace://SpacesStore/ae045ec3-5578-4c7b-b5a6-f893810e63dc').name
u'Company Home'

Still Missing Some CMIS 1.1 Goodness

The browser binding is great but 4.2.d is still missing some of the CMIS 1.1 goodness. For example, CMIS 1.1 defines a new concept called “secondary types”. In Alfresco land we call those aspects. Unfortunately, 4.2.d does not yet appear to have full support for aspects. You can see what aspects a node has applied by inspecting the cmis:secondaryObjectTypeIds property, which is useful. But in my test, changing the value of that property did not affect the aspects on the object as I would expect if secondary type support was fully functional.


The Public API is new. There are still a lot of areas not covered by either CMIS or the rest of the Alfresco API. Hopefully those gaps will close over time. If you have an opinion on what should be the next area to be added to the Alfresco API, please comment here or mention it to your account rep.

Don’t get tripped up by MySQL/MariaDB skip-networking default in MacPorts

I recently decided to switch one of my MacBooks from MySQL installed with the binary package to MariaDB installed through MacPorts. I’ve used MacPorts for years and it has worked great for me, although I realize that isn’t the case for everyone.

After installing MariaDB, when I started Alfresco, it couldn’t talk to the database. I thought it might be a MariaDB problem, so I poked around a bit and then installed MySQL from MacPorts. Same problem.

I noticed that the problem was only with JDBC connections. Python didn’t have any trouble connecting nor did any of my non-JDBC database tools. A clue.

Then I noticed that “netstat -an|grep 3306” came back with nothing indicating that the database wasn’t listening on the port at all. I thought maybe it was a networking problem. Maybe permissions. Maybe a my.cnf issue. I tried tweaking all of that but wasn’t making any progress.

I decided there must be a .cnf file somewhere that was hosing me. It turns out that MacPorts installs a default .cnf file for both MySQL and MariaDB that has “skip-networking” turned on. That turns off all network-based connections to the server and that includes JDBC.

I have no idea why that is turned on by default but if you don’t know to look for it, you may chase your tail for a while. The fix is simply to edit the default .cnf file and comment out skip-networking. For MySQL, the file lives here:


And for MariaDB the file lives here:


Alfresco Roma Meetup Agenda

UPDATE: I’ve updated this post with links to the presentations from the meetup. We had a great turnout. Thanks for coming, everyone!

We did a half-day meetup in Rome on Friday, March 22. The agenda with links to what was presented is as follows:

15:00 to 15:15 Welcome (Jeff)

15:15 to 15:45 Introducing the Alfresco API (Jeff)

15:45 to 16:15 Customer Case Study (Fastweb)

16:15 to 16:45 Alfresco WebScript Connector for Apache ManifoldCF (Piergiorgio Lucidi, Sourcesense)

16:45 to 17:00 BREAK

17:00 to 17:30 Part 1: Alfresco & Semantics, Part 2: RedLink (Ainga Pillai, Zaizi)

17:30 to 18:00 The New Maven SDK (Gab Columbro, Alfresco)

18:00 to 18:15 Invitation to Join the Community (Jeff)

18:15 to 19:00 Pizza, Beer, & Networking

Parenting Hack: Even and Odd Days

I have a sister three years younger than I am. When we were kids we fought over everything. Who gets the front seat. Whose turn is it to take out the garbage. Or to feed the dog. Or to do the dishes. And on and on. As a parent of two kids I now know how crazy that must have made my parents.

Their solution to this was usually along the lines of “If you can’t work it out I’ll work it out for you,” followed by a forced compromise neither of us liked, delivered in exasperation, leaving each of us seething.

Luckily, with my two kids we’ve been using a solution that has worked great for many years, and one that came to us from my sister, ironically enough. We call it “even and odd days”. The way it works is that each child gets a day. When it is that child’s day, they have to do all of the chores, but they also get to reap whatever perks might occur that day. To keep track of whose day it is, one child takes even days and the other takes odd days. In our case, my son’s birthday falls on an odd day and my daughter’s on an even, so making the assignment (and remembering who has even and who has odd) was easy.

For example, the 24th is an even day so it is my daughter’s day. She’s got to feed the dog (both times), empty the dishwasher if it needs it, set the table, and rinse and load the dinner dishes into the dishwasher. But, if there’s a choice about what to have for dinner, she gets to pick. If we’re playing a game, she decides who goes first. So when it is your day you work hard, but you also reap the benefits.

The even and odd days approach takes the emotion and guesswork out of it. Who was the last person to clean the dog poop out of the back yard? Doesn’t matter–today’s an odd day, which means it is your turn, pal!

We started doing this when the kids were probably 5 and 8 and after sticking to it for six years I can say we’ve never had the kind of sibling strife over chores and perks like my sister and I experienced. Every now and then, when one kid’s schedule prevents them from doing a chore on their day and we ask the other one to do it, we’ll get a, “Sorry, not my day” response, particularly if that child has felt like the system hasn’t treated them fairly of late, but that can be dealt with. And when we first started out my son felt a little aggrieved because there are times when odd days occur twice in a row (31st, followed by 1st), but other than those blips, we’re happy with it.

The system even works for chores that are not daily tasks. For example, in our neighborhood, the trash and recycle bins have to be put out on the curb Sunday night. If a Sunday night falls on an even day, it’s my daughter’s task and when it falls on an odd it’s my son’s. Works like a charm.

Tips on Working with Google Fusion Tables

We had a need to see Alfresco forum users by geography. Google Fusion Tables provides the capability to see any geographic location stored in one or more columns on a map. We had successfully used this before for smaller batches of mostly static data, so I decided to see if it would work well for our forum data. This blog post is about what I did, including some useful tips for working with the Google Fusion Table API.

Determining the Location

First, I needed a city and country for each forum user. In our forums, users can declare their location, but not everyone does. So I wrote a little Python script that uses the MaxMind GeoLite database to determine a location for each user based on IP address. The script then compares the IP-determined location with the user’s declared location, and if they are different, it asks the person running the script to choose which one is likely to be more accurate. For example, the IP address based lookup might come back with “Suriname” but the user’s declared location is “Paramaribo, Suriname”, so you’d choose the latter. The script saves each decision so that it doesn’t have to ask again for the same comparison on this run or subsequent runs.

Loading the Data into Google Fusion Tables with Python

Once I had a city and country for each forum user I had to get those loaded into a Google Fusion Table. I found this Python-based Fusion Tables client and it worked quite nicely.

Here are a few tips that might save you some time when you are working with Google Fusion Tables, regardless of the client-side language…

Don’t Update–Drop, then Add

I started by trying to be smart about updating existing records rather than inserting new ones. But this meant that for each row, I had to do a query to test for the existence of a match and then do an update. This was incredibly slow, especially because you can’t do bulk updates (see next point).

So every time I run an update, the script first clears out the table. That means I load the entire dataset every time there is an update, but that is much faster than the update-if-present-otherwise-insert approach.

Batch Your Queries

The Google Fusion Tables API supports bulk operations. You can execute up to 500 at-a-time, if I recall correctly. This is a huge time-saver. My script just adds the insert statements to a list, and when it gets 500 (or runs out of inserts) it joins the list on “;” and then executes the batch with a single call to the Fusion Tables API.

The one drawback, as mentioned in the previous point is that it does not support bulk updates–only inserts are supported. But with the performance gain of bulk operations, I don’t mind clearing out the table and re-inserting.

Throttle Your Requests

If the script exceeds 30 requests per minute it is highly likely you will get rate-limited. So it is important to throttle your requests. I found that a 2.5 second wait between queries was fine and because the queries are batched 500 at-a-time, it really isn’t a big deal to wait.

Geocoding Takes Time

So the whole thing is pretty slick but there is a small pain. Because all rows get dropped every time I load the table, every row has to be geocoded and that takes time. I believe there is an API call to ask the table to be geocoded but I haven’t found that to work reliably. Instead, I have to go to the table in my browser and tell Fusion Tables to geocode the table. This takes a LONG time. For a table of about 10,000 rows it could easily take 45 minutes or more. At least it is something I can kick off and let run. I only update the table once a month. If it were more often, it would be an issue.


That’s it! Thanks to Python and Google Fusion Tables, I now have an interactive map of forum users. Not only is it useful to use interactively, it also lets me run geographic queries against it from Python, such as, “find me the 20 forum users with more than X posts who work within a 20 mile radius of this spot” which can be handy for doing local community outreach.

Thoughts on the Alfresco forums

Back in 2009 I wrote a post called, “The Alfresco forums need your help.” It was about how I happened to come across the “unanswered posts” page in the Alfresco forums and noticed, to my horror, that it was 40 pages long. I later realized that the site is configured to show no more than 40 pages so it was likely longer.

Now that I’m on the inside I’ve got access to the data. As it turns out, as of earlier this month, in the English forums we had a little over 1100 topics created over the past year that never got a reply. That represents about 27% of all topics created for that period.

Last year I ran a Community Survey that reported that 55% of people have received responses that were somewhat helpful, exactly what they were looking for, or exceeding expectations. A little over 10% received a response that wasn’t helpful. About 34% said they never saw a response. If you look at the actual numbers for the year leading up to the survey, there were about 1500 topics created that never got a reply, which is again about 28% of all topics created for the same period.

That day in 2009 I suggested we start doing “Forum Fridays” to encourage everyone to spend a little time, once a week, helping out in the forums. I kept it up for a while. The important thing for me was that even if I didn’t check in every Friday, I did form a more regular forum habit. It felt good to see my “points” start to climb (you can see everyone’s points on the member list) and I started to feel guilty when I went too long without checking in.

Since joining Alfresco I’ve been in the forums more regularly. In fact, this month, I decided to make February a month for focusing on forums. I spent a significant amount of time in the forums each day with a goal of making a dent in unanswered posts. I also wanted to see if I could understand why posts go unanswered.

Some topics I came across were unanswered because they were poorly-worded, vague, or otherwise indecipherable. I’d say 5% fit this category. More often were the questions that were either going to require significant time reproducing and debugging or were in highly-specialized or niche areas of the platform that just don’t see a lot of use. I’d say 20% fit this category. These are questions that maybe only a handful of people know the answer to. But at least 50% or maybe more were questions a person with even a year or two of experience could answer in 15 minutes or less.

Alfresco is lucky. Our Engineering team spends significant time in the forums. The top posters of all time–Mike Hatfield, Mark Rogers, Kevin Roast, Gavin Cornwell, Andy Hind, David Caruana, Derek Hulley–are the guys that built the platform. Somehow they manage to do that and consistently put up impressive forum numbers. We also have non-Alfrescans that spend a lot of time in the forums racking up significant points. Users such as zaizi, Loftux, OpenPj, savic.prvoslav, and jpfi, just to name a few, are totally crushing it. It isn’t fair or reasonable for me to ask either of these groups to simply spend more time in the forums. And, while I have sincerely enjoyed Focus on Forums February, I’m not a scalable solution. Instead, I’d like to mobilize the rest of you to help.

I think if we put our minds to it, we should be able to address every unanswered post:

  • Questions that are essentially “bad questions” need a reply with friendly suggestions on how to ask a better question.
  • Time-consuming questions need at least an initial reply that suggests where on, the wiki, other forum posts, or blogs the person might look to learn more, or even a reply that just says, “What you’re asking can’t easily be answered in a reasonable amount of time because…”. People new to our platform don’t know what is a big deal and what isn’t, so let’s explain it.
  • Highly-specialized or niche questions should be assigned to someone for follow-up. If you read a question and your first thought is, “Great question, I have absolutely no idea,” your next thought should be, “Who do I know that would?”. Rather than answering the question your job becomes finding the person that does know the answer. Shoot them a link to the thread via email or twitter or IRC. Some commercial open source companies I’ve spoken to about this topic actually assign unanswered posts to Jira tickets. That’s food for thought.
  • Relatively easy questions have to be answered. Our volume is manageable. We tend to get about 400 new topics each month with 100 remaining unanswered, on average. With a company of our size, with a partner network as big as we have, with as many community members as there are in this world, I see no good reason for questions of easy to medium difficulty to go without a reply.

So here are some ideas I’ve had to improve the unanswered posts problem:

  • Push to get additional Alfrescans involved in the forums, including departments other than Engineering.
  • Continue to encourage our top posters and points-earners to keep doing what they are doing.
  • Identify community members to become moderators. Task moderators with ownership of the unanswered post problem for the forums they moderate. This doesn’t mean they have to answer every question–but it does mean if they see a post that is going unanswered they should own finding someone who can.
  • Continue to refine and enhance forums reporting. I can post whatever forums metrics and measures would help you, the community, identify areas that need the most help or that motivate you to level up your forums involvement. Just let me know what those are.

What are your thoughts on these ideas? What am I missing? Please give me your ideas in the comments.

By the way, while I’m on the subject, I want to congratulate and thank the Top 10 Forum Users by Number of Posts for January of this year:

1. mrogers* 74
2. amandaluniz_z 36
3. MikeH* 30
4. jpotts* 25
5. fuad_gafarov 22
6. zomurn 21
7. Andy* 19
8. RodrigoA 16
9. ddraper* 16
10. mitpatoliya 16

As noted by the asterisk (*) half of January’s Top 10 are Alfresco employees.

And if you are looking for specific forums that need the most help in terms of unanswered posts, here are the Top 10 Forums by Current Unanswered Post Count (as of 2/16):

1. Alfresco Share 210
2. Configuration 169
3. Alfresco Discussion 89
4. Alfresco Share Development 85
5. Installation 82
6. Repository Services 54
7. Workflow 47
8. Development Environment 44
9. Web Scripts 44
10. Alfresco Explorer 40

I’ll post the February numbers next week, and will continue to do so each month if you find them helpful or inspiring.