Category: General

General thoughts that defy categorization.

Tips on Working with Google Fusion Tables

We needed to see Alfresco forum users by geography. Google Fusion Tables can plot any geographic location stored in one or more columns on a map. We had successfully used this before for smaller batches of mostly static data, so I decided to see if it would work well for our forum data. This blog post is about what I did, including some useful tips for working with the Google Fusion Tables API.

Determining the Location

First, I needed a city and country for each forum user. In our forums, users can declare their location, but not everyone does. So I wrote a little Python script that uses the MaxMind GeoLite database to determine a location for each user based on IP address. The script then compares the IP-determined location with the user’s declared location, and if they are different, it asks the person running the script to choose which one is likely to be more accurate. For example, the IP address based lookup might come back with “Suriname” but the user’s declared location is “Paramaribo, Suriname”, so you’d choose the latter. The script saves each decision so that it doesn’t have to ask again for the same comparison on this run or subsequent runs.
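Here is a minimal sketch of the "ask once, remember the answer" part of that script. It is not the original code: the function names, the JSON cache file, and the key format are all assumptions made for illustration, and the MaxMind lookup itself is left out.

```python
import json
import os


def load_decisions(path):
    """Load decisions saved by an earlier run, or start empty (hypothetical cache file)."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    return {}


def save_decisions(path, decisions):
    """Persist decisions so later runs do not have to ask again."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(decisions, f, indent=2, sort_keys=True)


def choose_location(ip_location, declared_location, decisions, ask=input):
    """Return the preferred location, prompting only for comparisons not seen before."""
    if not declared_location:
        return ip_location
    if not ip_location or ip_location == declared_location:
        return declared_location
    # Assumed key format: the two candidate locations joined with "|".
    key = ip_location + "|" + declared_location
    if key not in decisions:
        answer = ask(f"1) {ip_location}  2) {declared_location}  Choose [1/2]: ")
        decisions[key] = declared_location if answer.strip() == "2" else ip_location
    return decisions[key]
```

Calling load_decisions() at startup and save_decisions() after each new answer is what would carry the choices over to subsequent runs.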

Loading the Data into Google Fusion Tables with Python

Once I had a city and country for each forum user I had to get those loaded into a Google Fusion Table. I found this Python-based Fusion Tables client and it worked quite nicely.

Here are a few tips that might save you some time when you are working with Google Fusion Tables, regardless of the client-side language…

Don’t Update–Drop, then Add

I started by trying to be smart about updating existing records rather than inserting new ones. But this meant that for each row, I had to do a query to test for the existence of a match and then do an update. This was incredibly slow, especially because you can’t do bulk updates (see next point).

So every time I run an update, the script first clears out the table. That means I load the entire dataset every time there is an update, but that is much faster than the update-if-present-otherwise-insert approach.

Batch Your Queries

The Google Fusion Tables API supports bulk operations. You can execute up to 500 at-a-time, if I recall correctly. This is a huge time-saver. My script just adds the insert statements to a list, and when it gets 500 (or runs out of inserts) it joins the list on “;” and then executes the batch with a single call to the Fusion Tables API.

The one drawback, as mentioned in the previous point, is that bulk updates are not supported–only inserts are. But with the performance gain of bulk operations, I don’t mind clearing out the table and re-inserting.
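The batching logic described above can be sketched roughly like this. The helper name is mine, the statements are placeholder strings rather than real Fusion Tables SQL, and the 500 limit is the figure I recalled above, so treat it as an assumption.

```python
def batch_statements(statements, batch_size=500):
    """Yield ';'-joined strings, each holding at most batch_size statements."""
    batch = []
    for statement in statements:
        batch.append(statement)
        if len(batch) == batch_size:
            yield ";".join(batch)
            batch = []
    if batch:
        # Flush whatever is left once the inserts run out.
        yield ";".join(batch)


# Usage with placeholder statements; each yielded string would go to the
# Fusion Tables client in a single call.
placeholder_inserts = [f"INSERT placeholder {i}" for i in range(1201)]
batches = list(batch_statements(placeholder_inserts))
print(len(batches))
```

With 1,201 placeholder statements this produces three batches, so three API calls instead of 1,201.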

Throttle Your Requests

If the script exceeds 30 requests per minute it is highly likely you will get rate-limited. So it is important to throttle your requests. I found that a 2.5 second wait between queries was fine and because the queries are batched 500 at-a-time, it really isn’t a big deal to wait.
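A simple way to enforce that wait is to track when the last request went out and sleep for the remainder of the interval. This is a sketch, not the original script: execute stands in for whatever function actually sends a batch to the API, and the sleep and clock parameters exist only so the helper can be tested without really waiting.

```python
import time


def run_throttled(batches, execute, min_interval=2.5,
                  sleep=time.sleep, clock=time.monotonic):
    """Call execute(batch) for each batch, at least min_interval seconds apart."""
    last_call = None
    results = []
    for batch in batches:
        if last_call is not None:
            wait = min_interval - (clock() - last_call)
            if wait > 0:
                sleep(wait)
        last_call = clock()
        results.append(execute(batch))
    return results
```

At 2.5 seconds between calls the script tops out at 24 requests per minute, which stays under the 30-per-minute mark mentioned above.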

Geocoding Takes Time

So the whole thing is pretty slick but there is a small pain. Because all rows get dropped every time I load the table, every row has to be geocoded and that takes time. I believe there is an API call to ask the table to be geocoded but I haven’t found that to work reliably. Instead, I have to go to the table in my browser and tell Fusion Tables to geocode the table. This takes a LONG time. For a table of about 10,000 rows it could easily take 45 minutes or more. At least it is something I can kick off and let run. I only update the table once a month. If it were more often, it would be an issue.


That’s it! Thanks to Python and Google Fusion Tables, I now have an interactive map of forum users. Not only is it useful to use interactively, it also lets me run geographic queries against it from Python, such as, “find me the 20 forum users with more than X posts who work within a 20 mile radius of this spot” which can be handy for doing local community outreach.
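To give a feel for that kind of radius query, here is a local stand-in written in plain Python. It does not call Fusion Tables at all: it assumes the rows have already been pulled down as dictionaries with hypothetical "posts", "lat", and "lon" keys, and it uses the haversine formula for the distance.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def users_near(users, lat, lon, radius_miles=20, min_posts=0, limit=20):
    """Return up to `limit` users with more than min_posts inside the radius, busiest first."""
    nearby = [
        u for u in users
        if u["posts"] > min_posts
        and haversine_miles(lat, lon, u["lat"], u["lon"]) <= radius_miles
    ]
    nearby.sort(key=lambda u: u["posts"], reverse=True)
    return nearby[:limit]
```

Letting the table service do the filtering is still the better option for large tables; this version is only meant to show the logic of the query.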

Thoughts on the Alfresco forums

Back in 2009 I wrote a post called, “The Alfresco forums need your help.” It was about how I happened to come across the “unanswered posts” page in the Alfresco forums and noticed, to my horror, that it was 40 pages long. I later realized that the site is configured to show no more than 40 pages so it was likely longer.

Now that I’m on the inside I’ve got access to the data. As it turns out, as of earlier this month, in the English forums we had a little over 1100 topics created over the past year that never got a reply. That represents about 27% of all topics created for that period.

Last year I ran a Community Survey that reported that 55% of people have received responses that were somewhat helpful, exactly what they were looking for, or exceeding expectations. A little over 10% received a response that wasn’t helpful. About 34% said they never saw a response. If you look at the actual numbers for the year leading up to the survey, there were about 1500 topics created that never got a reply, which is about 28% of all topics created for the same period, close to this past year’s 27%.

That day in 2009 I suggested we start doing “Forum Fridays” to encourage everyone to spend a little time, once a week, helping out in the forums. I kept it up for a while. The important thing for me was that even if I didn’t check in every Friday, I did form a more regular forum habit. It felt good to see my “points” start to climb (you can see everyone’s points on the member list) and I started to feel guilty when I went too long without checking in.

Since joining Alfresco I’ve been in the forums more regularly. In fact, this month, I decided to make February a month for focusing on forums. I spent a significant amount of time in the forums each day with a goal of making a dent in unanswered posts. I also wanted to see if I could understand why posts go unanswered.

Some topics I came across were unanswered because they were poorly-worded, vague, or otherwise indecipherable. I’d say 5% fit this category. More common were questions that were either going to require significant time reproducing and debugging or were in highly-specialized or niche areas of the platform that just don’t see a lot of use. I’d say 20% fit this category. These are questions that maybe only a handful of people know the answer to. But at least 50% or maybe more were questions a person with even a year or two of experience could answer in 15 minutes or less.

Alfresco is lucky. Our Engineering team spends significant time in the forums. The top posters of all time–Mike Hatfield, Mark Rogers, Kevin Roast, Gavin Cornwell, Andy Hind, David Caruana, Derek Hulley–are the guys that built the platform. Somehow they manage to do that and consistently put up impressive forum numbers. We also have non-Alfrescans that spend a lot of time in the forums racking up significant points. Users such as zaizi, Loftux, OpenPj, savic.prvoslav, and jpfi, just to name a few, are totally crushing it. It isn’t fair or reasonable for me to ask either of these groups to simply spend more time in the forums. And, while I have sincerely enjoyed Focus on Forums February, I’m not a scalable solution. Instead, I’d like to mobilize the rest of you to help.

I think if we put our minds to it, we should be able to address every unanswered post:

  • Questions that are essentially “bad questions” need a reply with friendly suggestions on how to ask a better question.
  • Time-consuming questions need at least an initial reply that suggests where on the wiki, in other forum posts, or on blogs the person might look to learn more, or even a reply that just says, “What you’re asking can’t easily be answered in a reasonable amount of time because…”. People new to our platform don’t know what is a big deal and what isn’t, so let’s explain it.
  • Highly-specialized or niche questions should be assigned to someone for follow-up. If you read a question and your first thought is, “Great question, I have absolutely no idea,” your next thought should be, “Who do I know that would?”. Rather than answering the question, your job becomes finding the person who does know the answer. Shoot them a link to the thread via email or Twitter or IRC. Some commercial open source companies I’ve spoken to about this topic actually assign unanswered posts to Jira tickets. That’s food for thought.
  • Relatively easy questions have to be answered. Our volume is manageable. We tend to get about 400 new topics each month with 100 remaining unanswered, on average. With a company of our size, with a partner network as big as we have, with as many community members as there are in this world, I see no good reason for questions of easy to medium difficulty to go without a reply.

So here are some ideas I’ve had to improve the unanswered posts problem:

  • Push to get additional Alfrescans involved in the forums, including departments other than Engineering.
  • Continue to encourage our top posters and points-earners to keep doing what they are doing.
  • Identify community members to become moderators. Task moderators with ownership of the unanswered post problem for the forums they moderate. This doesn’t mean they have to answer every question–but it does mean if they see a post that is going unanswered they should own finding someone who can.
  • Continue to refine and enhance forums reporting. I can post whatever forums metrics and measures would help you, the community, identify areas that need the most help or that motivate you to level up your forums involvement. Just let me know what those are.

What are your thoughts on these ideas? What am I missing? Please give me your ideas in the comments.

By the way, while I’m on the subject, I want to congratulate and thank the Top 10 Forum Users by Number of Posts for January of this year:

1. mrogers* 74
2. amandaluniz_z 36
3. MikeH* 30
4. jpotts* 25
5. fuad_gafarov 22
6. zomurn 21
7. Andy* 19
8. RodrigoA 16
9. ddraper* 16
10. mitpatoliya 16

As noted by the asterisk (*), half of January’s Top 10 are Alfresco employees.

And if you are looking for specific forums that need the most help in terms of unanswered posts, here are the Top 10 Forums by Current Unanswered Post Count (as of 2/16):

1. Alfresco Share 210
2. Configuration 169
3. Alfresco Discussion 89
4. Alfresco Share Development 85
5. Installation 82
6. Repository Services 54
7. Workflow 47
8. Development Environment 44
9. Web Scripts 44
10. Alfresco Explorer 40

I’ll post the February numbers next week, and will continue to do so each month if you find them helpful or inspiring.

Screencasts highlighting a few new Alfresco 4 Community features

Alfresco 4 Community was released last week. There’s a nice presentation on slideshare that summarizes what’s new in Alfresco 4, so I’m not going to give a comprehensive list here. And we’re going to be covering the technical details on all of the new features at DevCon in San Diego and London so I’ll save the code snippets for DevCon.

Next week, people all over the world will be celebrating the Alfresco 4 release with informal meetups. So I thought in this post I’d prime the pump a bit with a brief list of the more buzz-worthy features, along with some short screencasts of each. That way, if you aren’t able to join one of the worldwide release parties, you can have your own little soiree at your home or office. Just try not to let it get out of control. If the cops do show up, you might mention that the New York Police Department uses Alfresco.


I’ve been showing Alfresco 4 at JavaOne all week and drag-and-drop was pretty popular. You can drag one or more files from your machine into the repo. And you can move them from one folder to another by dropping onto the folder hierarchy. You’ll need an HTML5-enabled browser for this to work. Here it is in action (this one didn’t get created in HD for some reason):

Document Library In-Line Edit

It’s a little thing, but it’s handy. You can change file names and add tags from the document list without launching the edit metadata panel.

Configurable Document Library Sort Order & Better Site Config

How many times has a customer asked you to change the document library sort order? I know, I know. Now they can do it themselves. Also, you can now brand sites individually, so each site can have its own theme. And components can be renamed, so things like your document library don’t have to be called “Document Library”.

Better Administration

The Share Administration panel now has a Node Browser, a Category Manager, and a Tag Manager. The Node Browser and the Category Manager were actually direct community contributions. Tell me again why you are still using the old Alfresco Explorer client?

DM to File System Publish

Last year at DevCon in New York, a bunch of us tackled Brian Remmington, wrestled him to the ground, and refused to let him up until he agreed to add this to the product. Once security was able to break up the scrum we apologized and had a good talk. I think deep down he appreciated our passion. I’m joking, of course, but what’s not a joke is that the DM-to-file system publish functionality is now in there. I’ll update this post with a screencast as soon as I figure out how it works.

So take a look at the presentation for a more complete summary. I didn’t show Activiti or Solr, which are two much-anticipated additions to the product, because the value they add is hard to convey in a short screencast. Feel free to record your own screencasts of your favorite new features and point me to them.

Seven tips after five years working from home

Yesterday morning I was enjoying a bike ride before work and it occurred to me that this month marks my fifth year of working from home. As with all things in life, there are both good and bad aspects to working remotely, but on the whole I think working from home nets out to a Good Thing: I see more of my family, I spend less time and money on driving, and I’m healthier.

Most of these didn’t take five years to figure out, but here they are anyway: Seven Tips on Working From Home:

Tip #1: Take a shower and get dressed, for crying out loud

I know there are a lot of people that like to work from home in their pajamas, but I don’t see how they do that consistently. Can you really have a serious conference call about global strategy when you’ve got Yoda staring up at you from your lounge pants? Plus, I need that shower to wake me up. And, even if it’s just shorts and a t-shirt, putting clothes on is part of getting into the “I’m working” mindset and a good external signal to others around you.

Tip #2: Set expectations with your kids

My kids were 5 and 8 when I started working from home. That meant both were in school for most of the day for most of the year. The other key factor is that they were old enough to understand that Dad’s at home during the day to work, not to play. Younger kids don’t get that at all. And little kids don’t quickly grasp the all-critical Signs of Interrupt-ability:

  • Door Open = Come on in.
  • Door Closed = Think twice!
  • Door Closed with Headphones On = Interrupt only if you are bleeding uncontrollably or the house is on fire, also realizing that it may take several minutes for Dad to come out of The Zone such that he can form words and coherent thoughts.

Tip #3: Set expectations with your spouse/partner/roommate

Similar to the previous point, you’ve got to set some ground rules with your mate. For example, I don’t answer the door or the home phone during work hours. Or work on honey-do’s. Or tell one sibling to stop bugging the other. Or figure out why the printer doesn’t work. When I’m in work mode, I’m at work. Sure, I’m happy to have lunch with the rest of the fam or take a quick break to find out how the kids’ day at school went–that’s part of the appeal to working from home–but my family understands the limits of what they can get away with when I’m in the home office.

Tip #4: Establish a clean break between work and non-work modes

A common complaint from the families of people who work from home is that “they work all of the time”. It is easy to fall into that pattern. I think you’ve got to already have a handle on work-life balance before you start working from home or it can become a bigger problem. It helps if you have space you can dedicate as your work area and a time window you can designate as work-time and try to stick to that. When you are in serious work mode, don’t work from the couch. And on the weekends, don’t hang out in your office. Sure, in crunch times you’ll burn the midnight oil, but don’t let that be all of the time. And, if it is any consolation to your family, at least when you are working all night, you don’t have to drive home in the wee hours.

Tip #5: Collaborate with co-workers/clients in-person from time-to-time

It’s important to form bonds with the rest of your teammates. You can do this when you collaborate with remote tools like Skype and Webex, but it happens much faster in-person. My job involves a lot of travel, so I get plenty of opportunities for face-time with colleagues. When I collaborate remotely with people I don’t see in-person often, I make sure some part of our online collaboration is spent talking about non-work stuff. On client projects, we always tried to be on-site at the start of a project and again at major milestones.

Tip #6: Get out of the house

When your commute is measured in steps, not miles, it is easy to get cabin fever. Staring at the same four walls every day can be a drag. In America, our average day contains an appallingly low amount of walking or other physical activity. Working from home can compound the problem–you’re not getting that vigorous walk from the parking garage to the cubicle twice a day, after all! I try to go out to lunch with friends or family, ride my bike or go for a walk, or attend meetups or networking events. Anything to get out of the house, interact with people, and get the blood flowing. A nice thing about working from home is that it is easier to do a mid-day exercise break, whereas most people in a traditional office have to settle for working out before or after work. If you can take advantage of the opportunity for more exercise and combine that with less eating out, I think working from home can have positive health effects.

Tip #7: Invest in tools

If your company relies on a remote workforce you need to make sure you are providing top-notch tools and infrastructure to facilitate that (disclaimer: I work for a software company that produces content management and collaboration tools). At Optaros, we were a globally distributed team. We used Alfresco for document management, but for project collaboration we used Trac because, although Alfresco Share is awesome for content collaboration, it lacks some of the tools critical for collaborating on code-based projects, like source code control integration and automatically-logged real-time chat. (Those would actually make good community contributions, by the way, hint, hint). Regardless of what you use, the point is, there are a lot of great tools out there (both on-premise and SaaS) that can really make remote teams hum, and this ought to be considered critical infrastructure at your company.

So, overall, it’s been a productive and happy five years working from home and it would be hard to change now. I do miss the higher level of face-time with my teammates, and actually, sometimes I miss the drive–that’s when I did most of my music listening and thinking about the day. But the pros outweigh the cons, for sure.

How about you? Got any working from home tips I’ve missed?

After more than a year, I’m still in love with my Bacchetta recumbent

It’s been about a year and a half since I bought my Bacchetta recumbent. I’ve put about a thousand miles on it so far and I still love it. I ride as much as my schedule will allow, which, unfortunately, isn’t enough.

Jeff on his Bacchetta Giro 26

People are always curious about the bike. Other cyclists chat me up, pedestrians shout questions at me as I ride by, and one couple in a car even pulled out a camera and snapped a bunch of pics while we sat at a light. I have seen other recumbents in the ‘hood and on large organized rides, but they are a very well kept secret around here so the bike tends to get noticed.

To celebrate my Bacchetta-versary, I’ve compiled a list of the most common questions I’ve been asked about the bike over the last year. Maybe it’ll motivate you to give one a spin the next time you’re at your local bike shop.

“What is that thing?”

It’s a recumbent bicycle. It’s called a recumbent because you ride in a reclined or semi-reclined position.

Recumbents come in all shapes and sizes. There are Long Wheel Base (LWB) models where the front wheel is in front of the pedals and Short Wheel Base (SWB) models where the pedals are above or in front of the front wheel. There are some with Above-Seat Steering (ASS; seriously, that’s what they call it) and Below- or Under-Seat Steering (BSS/USS). Many recumbents have a standard-sized wheel in the back and a smaller wheel in the front. Others have the same-sized wheel front and back. There are tandem recumbents. There are recumbent tricycles. There are tandem recumbent tricycles. It is unbelievable how many different styles there are, with significant differences in application (heavy loads, long-distance rides, racing), performance, and cost.

Bacchetta Giro 26

My bike is a Bacchetta Giro 26 (ba-KET-a is the Italian pronunciation, but ba-SHET-a is okay too). It’s a “high racer” with a Short Wheel Base, Above-Seat Steering, and 26 inch wheels on the front and back. The Bacchetta Giro and similar models have a unique-looking frame design: the bike essentially has only one main tube running from the rear wheel to the crank on the front-end. The word Bacchetta is actually Italian for “stick” and that name is certainly fitting when you look at the bike. I chose the Giro 26 because it is exciting to ride and looks sportier than its Long Wheel Base cousins.


I could try to make an argument about better efficiency or aerodynamics but that’s not why I ride it. I ride it because it’s comfortable. Try this: Sit in a comfortable chair, put your feet up, extend your arms, but keep them relaxed, and lay back a little bit. That’s what it’s like. While my upright cycling friends are crunched over, staring at their front tires, I ride reclined in comfort, my head naturally positioned to take a look around and enjoy the scenery. The other day a red-tailed hawk flew next to me for a short stretch. I don’t think I would have seen it had I been heads-down on an upright. It’s this open, laid-back style of riding that makes me want to ride as often as possible, as far as possible.

When people ask “why” the question often includes either an implied or an explicit question-within-a-question: “Is there a medical reason you ride a recumbent?” It’s funny that these bikes are so different, people assume something must have happened to you physically to persuade you to not go with an upright bike by default.

In my early teens I did long-distance rides with my father and uncle. I did Freewheel, a week-long ride across the State of Oklahoma, several times. On those rides, it wasn’t uncommon to end a 60- or 70-mile day with various sore spots–hands, ass, crotch, and sometimes knees. On Freewheel, you camp out, so heading to dinner, the shower, anywhere, usually means getting back on the bike. There were times when that was the last thing I wanted to do.

Hardcore upright cyclists will say that on a properly fitted bike with the right gear, you should be able to ride pain-free. That may be the case and to each his own. But for me, on my recumbent, pain is just not an issue. Sure, on a long ride I get tired, but nothing hurts. I’m not an athlete in any sense of the word, but I really feel like time is my only limiting factor when I’m riding my Bacchetta.

(When I first began riding the recumbent I did have some problems with foot numbness but some Specialized shoe inserts fixed that right up).

“Is it hard to ride?”

Honestly, it can take some getting used to. Starts are the biggest issue. On my Bacchetta, my pedal is 27 inches from the ground at its lowest point and 41 inches from the ground at its highest. On an upright, when you first start, your weight pushes down on the pedal and gives you enough forward momentum to get going. On a recumbent your weight gives you no vertical advantage. When starting, you have to push hard on the pedal with one foot while you get your planted foot off the ground and up to the pedal fast enough to get that second stroke in before you lose momentum and fall over. I’ve had some awkward starts and some close calls but I haven’t dumped it over yet.

Heel touch

Tight turns are a little exciting. Because the crank is in front of the front wheel, and because I’ve got giant size 12 feet, it is possible for my heel to come in contact with the front tire. So turns take a little bit of foresight. Even with that in mind, tight turns can be a little unsettling. I can easily turn within the width of my neighborhood street, but our neighborhood bike paths aren’t happening for me. For that reason, you may want to think twice about a recumbent if you primarily ride on tight bike paths rather than on the road.

When I first started riding it, I found the balance to be a little twitchy. When I was really stroking, it felt like my pedaling was throwing my balance off and I had to consciously correct. Even twisting the grip shift added a wobble. Over time, my body learned to compensate automatically and now it’s a very stable, comfortable ride. When you’re on a long, flat road, maybe with a little tail wind, clipping along at about 18-20 mph, a quiet purr coming from the wheels on the road, you’re not thinking about balance any more–you’re in some sort of recumbent Zen state.

“Is it slow/is it fast/is it hard to climb hills?”

I’m not a hardcore cyclist. I try to ride 50 miles a week, but I don’t always hit that. And I’m definitely not a fast rider. Garmin says my average speed (with stops) is about 15 mph. The Giro, a touring/commuting bike, weighs around 32 lbs, depending on seat choice and pedals. Bacchetta does offer a performance line of bikes, including the all-carbon Aero that weighs about 20 lbs and costs three times as much as the Giro. So I’m a slow rider on a heavy bike. Are you faster than me? Probably. Am I faster on my recumbent than I am on an upright? I think so, but I kind of don’t care.

Hill-climbing is a drag on a recumbent. You can’t stand up, so there’s not much to do about it other than down shift and pedal hard. At least there’s a back rest to push against. When you do make it to the top, though, the reward is oh so sweet. There’s nothing quite like flying down a hill feet first.

“Have you had any problems with it?”

The usual bike stuff. I used to throw my chain all of the time until I figured out how to adjust the derailleur. I’ve had to adjust the disc brakes a bit. From time to time, it sounds like my brake pads are dragging, but it’s intermittent. Every once in a while the chain grabs a leg hair or two, but I figure that’s a self-correcting problem.

Transporting a recumbent can be challenging. I can get it into the back of a mini-van with no problem. I’ve got a Thule roof rack with standard trays. My Bacchetta just barely fits in the tray, but the clamp that would normally hold the down tube on an upright bike doesn’t work for the Bacchetta so I bought an arm that holds the front wheel instead. I’ve seen a Thule rack with a really long tray that actually pivots to make it easier to get a long and heavy bike (like a tandem) onto the roof, which is probably what you’d need for a recumbent with a Long Wheel Base. I’ve seen recumbents on hitch- and trunk-mounted racks as well.

I bought a Garmin Edge 705 GPS cycling computer that came with a magnetic pace/speed sensor add-on. The add-on works great when the magnet on your crank arm and the magnet on your wheel spokes can pass by the same sensor. I think making that work on my Bacchetta would require some soldering, so I don’t monitor my pace and I use the GPS for speed. The included maps still made the bundle worth it, though.

“Can I ride your bike?”

If you’re thinking about buying a recumbent, you need to ride as many different styles as you can. They all feel different. And, at least when you are starting out, it helps to have someone that knows what they are doing give you some pointers and fit the bike to you.

I’m 6’1″, so I ride the “large” version of the Giro 26. Once you’ve got the right frame size, there are still several things you can tweak. The seat moves forward and backward and the recline angle is adjustable. The handle bars can be brought forward or backward and the handle bar tilt can be adjusted. Because it is so adjustable, it is a bit of a pain to let someone with different dimensions ride it. But bike shops deal with that all of the time so don’t be afraid to ask for that test ride.

“Where can I get one?”

Some bike shops specialize in recumbents. These will have the best selection and the most knowledgeable staff. For example, if you’re in the Austin area, I highly recommend a visit to Easy Street Recumbents. Mike is super friendly and very helpful. Unfortunately, there’s not a shop that specializes in recumbents here in North Texas. Shops around here tend to have a limited selection or none at all. I bought my Bacchetta at Plano Cycling and Fitness. They were great and I’d recommend them in a heartbeat.

Sometimes you can buy directly from the manufacturer. Bacchetta has that option if a dealer isn’t in your area. Also check out used bikes on Craigslist, recumbent forums, etc.

Jeff comes home after a ride

I hate cold weather but I rode straight through winter this year. I couldn’t let it sit. I wanted to be out there, kicked back and pedaling hard. There’s something about riding that bike that I miss terribly when I don’t do it. I suspect many recumbent riders feel this way. The next time you’re about to yell, “Hey, nice bike!” at someone on a ‘bent, check to see if he isn’t already smiling.

Review: Professional Alfresco, Wrox

I recently finished reading Professional Alfresco, the new Alfresco book written by some of the Alfresco engineers and John Newton, Alfresco’s CTO. Before I share my thoughts, a few disclaimers: First, I wrote my own book on Alfresco called the Alfresco Developer Guide (Packt, 2008). Second, Wrox provided the book to me free of charge in the hopes that I’d write something about it here. Third, I have a strategic relationship with Alfresco, part of which includes bringing each other business.

Okay, with that out of the way, let’s get to it. Professional Alfresco is a new book written about Alfresco 3.2 by a team of authors with a unique insider perspective: all are Alfresco employees. It’s not an end-user focused book nor is it strictly for developers–it’s actually aimed at several different audiences:

  • Part 1, “Getting to Know Alfresco”, is aimed at IT managers or other folks who might be evaluating Alfresco. It covers the business benefits of Alfresco and provides a high-level overview of the platform.
  • Part 2, “Getting Technical”, looks at the platform’s components and services at a closer level. These chapters are directed at Technical Architects or anyone who’s trying to figure out the technical capabilities of Alfresco.
  • Parts 3 & 4 are aimed squarely at Developers. More specifically, these 6 chapters cover Web Scripts (primarily JavaScript, but a Java example is given) and Alfresco Share. The last 3 chapters of Part 4 provide a step-by-step example of building a Knowledgebase application by customizing Alfresco Share, including a few (brief) pages on the new Form Service. A lot of people are working on Share customization projects these days so many will find this a welcome set of material.

While the majority of the technical how-to in the book is focused on Web Scripts and Share, I particularly liked the chapter on Advanced Workflows. It did a good job of explaining what you can do with the jBPM engine without getting too far into the weeds. The section on the Authentication subsystem, LDAP config, and chaining was also very good, particularly as the subsystem setup is a fairly recent development that not everyone is familiar with.

While I did find a lot to like about the book, there were a few things to pick on. First, if you read the book front-to-back, you’ll notice a significant amount of repetition. I suppose that could be a good thing when one of the three audiences the book is written for picks up the book and goes directly to their area of interest. I wonder, though, if some of it was due to having so many authors collaborating on the writing project. The repetition left me feeling like I was really slogging through the material rather than cutting to the chase.

I thought the content modeling chapter was thorough, but I had to wonder why the author chose to step us through the modelSchema.xsd file instead of providing example content model XML. It’s good to know the content model schema is there if I need it, but I think examples of what I can do with the XML are far more illustrative than walking through the schema.

The form service and Share/Surf aren’t covered in nearly enough detail. Other aspects of the platform simply aren’t addressed at all. I think some of that may be because of timing. The form service, for example, continues to evolve with version 3.3, and when you undertake a project this big, you have to draw a line somewhere. Plus, the focus on web scripts and Share is aligned with where Alfresco is focused right now.

The organization of the book is good. It follows a logical progression through the platform. And I like that the end-to-end Knowledgebase example is placed at the end as a sort of capstone applying the concepts learned earlier in the book. If you’re looking for a tutorial-style book though, you may be frustrated by the amount of theory up-front. It’s just not that kind of book. One side note on organization: Chapter 17 is a bit of an odd duck. It’s got interesting content–the chapter discusses various patterns of Alfresco implementation and integration with other systems. I just thought it was weird that it was at the end of the book instead of in one of the first two parts. Not a huge deal, and I’m glad they included it, even if its placement makes it seem like an afterthought.

Overall, Professional Alfresco is a good book appropriate to several different types of readers. Apart from the repetition issue I noted, the transitions between its several authors weren’t very noticeable–the editors did a great job stitching everything together and making it seem like one voice.

The bottom line is that if you are evaluating Alfresco and are trying to understand the architecture of the platform, or if you are a developer focused on web scripts and Share, you’ll find this book to be a valuable resource.

New Tutorial: Getting Started with CMIS

I’ve written a new tutorial on the proposed Content Management Interoperability Services (CMIS) standard called “Getting Started with CMIS”. The tutorial first takes you through an overview of the specification. Then I walk through several examples. The examples start out using curl to make GET, PUT, POST, and DELETE calls against Alfresco to perform CRUD functions on folders, documents, and relationships in the repository. If you’ve been dabbling with CMIS and you’ve struggled to find examples, particularly of POSTs, here you go.
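To give a flavor of what one of those POSTs involves, here is a minimal Python sketch (not taken from the tutorial, which uses curl) that builds the Atom entry for creating a folder and wraps it in a request object without sending it. The namespace URIs are the ones I know from the final CMIS 1.0 specification, so a repository that implements an earlier draft may expect different ones. The function names and the children-collection URL are hypothetical, and a real call would also need authentication.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Namespace URIs as published with CMIS 1.0 (assumption: your repository uses the same ones).
ATOM_NS = "http://www.w3.org/2005/Atom"
CMIS_NS = "http://docs.oasis-open.org/ns/cmis/core/200908/"
CMISRA_NS = "http://docs.oasis-open.org/ns/cmis/restatom/200908/"

# Ask ElementTree to use readable prefixes when serializing.
ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("cmis", CMIS_NS)
ET.register_namespace("cmisra", CMISRA_NS)


def build_folder_entry(name: str) -> bytes:
    """Return an Atom entry describing a new cmis:folder with the given name."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    title = ET.SubElement(entry, f"{{{ATOM_NS}}}title")
    title.text = name

    # The CMIS-specific payload lives in cmisra:object/cmis:properties.
    obj = ET.SubElement(entry, f"{{{CMISRA_NS}}}object")
    props = ET.SubElement(obj, f"{{{CMIS_NS}}}properties")
    type_id = ET.SubElement(
        props,
        f"{{{CMIS_NS}}}propertyId",
        {"propertyDefinitionId": "cmis:objectTypeId"},
    )
    value = ET.SubElement(type_id, f"{{{CMIS_NS}}}value")
    value.text = "cmis:folder"

    return ET.tostring(entry, encoding="utf-8")


def build_create_folder_request(children_url: str, name: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to a folder's children collection."""
    return urllib.request.Request(
        children_url,
        data=build_folder_entry(name),
        headers={"Content-Type": "application/atom+xml;type=entry"},
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would send it. The children URL itself has to come from the repository, typically by following links from its service document.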

I used Alfresco Community built from head, but yesterday, Alfresco pushed a new Community release that supports CMIS 1.0 Committee Draft 04 so you can download that, use the hosted Alfresco CMIS repository, or spin up an EC2 image (once Luis gets it updated with the new Community release). If you don’t want to use Alfresco you should be able to use any CMIS repository that supports 1.0cd04. I tried some, but not all, of the command-line examples against the Apache Chemistry test server.

Once you’ve felt both the joy and the pain of talking directly to the CMIS AtomPub Binding, I take you through some very short examples using JavaScript and Java. For Java I show Apache Abdera, Apache Chemistry, and the Apache Chemistry TCK.

For the Chemistry TCK stuff, I’m using Alfresco’s CMIS Maven Toolkit which Gabriele Columbro and Richard McKnight put together. That inspired me to do my examples with Maven as well (plus, it’s practical–the Abdera and Chemistry clients have a lot of dependencies, and using Maven meant I didn’t have to chase any of those down).

So take a look at the tutorial, try out the examples with your favorite CMIS 1.0 repo, and let me know what you think. If you like it, pass it along to a friend. As with past tutorials, I’ve released it under Creative Commons Attribution-Share Alike.

[Updated to correct typo with Gabriele’s name. Sorry, Gab!]

Summer grilling tips for your CMS vendor

I like this post from Jon Marks at JonOnTech. It’s about questions you should be asking your CMS vendor that you might not have thought to ask. The first five are especially good (see his post for the explanation of each question and the rest of the list):

  1. Who was the last vendor to beat you in the last round of a selection exercise? Why do you think they won?
  2. If, in a few years time, we decided to move away from your product, how would I go about migrating all my content into a new system?
  3. How many active developers do you have on your developer forums?
  4. All of these are important, but please rate these in order of your priority: a) Product Features b) Performance and Stability c) Usability d) Security
  5. How much would I expect to pay a contractor developer that is skilled with your CMS, and are they easy to find?

I am consistently disappointed with how companies evaluate and choose software vendors. Part of the problem is when companies use RFP processes that handle software purchases the same way that factory equipment purchases are handled, but that’s another post (see Making RFP’s More Effective).

The other part of the problem is the questions that never get asked during the vendor pitch. To Jon’s list, I would add:

  1. How long and how many resources did it take to build this demo? You’re looking for closeness of fit, effort to customize, and skillsets involved.
  2. What are the top three technical resources my team should have at the ready during the implementation? You’re looking for availability and helpfulness of documentation. How much of it is vendor-produced versus community-produced? It’s not necessarily bad if the majority of the resources are community-produced–it’s just a data point.
  3. If it makes sense for the kind of software, ask whether they use their own software in-house. If they don’t, that’s certainly a data point. If they do, ask: as an end user, what are your top three headaches when using the software? This is sort of a “what is your biggest area for improvement” kind of question–watch out for a turn-your-weakness-into-a-positive kind of answer (“The software is just too powerful!”). Every piece of software has idiosyncrasies. They should be able to name a few.
  4. Tell us about the last implementation that just went completely sideways for reasons attributable to the technology, not to project mismanagement, politics, or other issues. Obviously, the vendor scores points for honesty on this one, but it’s also interesting to hear how much/little the vendor was involved in salvaging the deal (if it was able to be salvaged).
  5. What is your maintenance renewal rate? I’ve never heard this one asked, but I would think this would be a very telling stat. Customers have all sorts of reasons for not renewing maintenance, but the obvious one is that they feel like the vendor isn’t giving them enough support value for the expense. For commercial open source vendors, support may be their sole source of revenue (excluding professional services, hosting, etc.), so for them you’d think this would be a very high number. Otherwise, what’s the point?

By the way, giving your vendors a good grilling isn’t limited to software companies. Picking a services firm also deserves a good set of probing questions, but that’s also another post.

What about you? Got any good questions to ask CMS or other software vendors?

Google App Engine Now Supports Java

I’ve been playing with the newly-released Java support in Google App Engine and it is pretty cool. You can do more than I expected you could:

  • The Google App Engine Eclipse plug-in gives you a template project and associated config files, Ant build scripts, a deployment tool, and a local run-time environment that acts like GAE (user service, data store, limitations imposed by the platform).
  • You’ve got full persistence and query capability via JDO. You pretty much just model your entities as POJOs, then you annotate the fields in those classes as “persistent” and you’re good to go. You use JDOQL to query your objects. Queries will only return the first 1000 results.
  • You can run cron jobs. A cron job wakes up on a schedule and invokes a URL you specify.
  • Servlets and JSPs are supported but you can also use things like Struts and Spring (See Will it Work in Google App Engine?).
  • You can take advantage of Google’s User service, which means anyone with a Google account can sign in to your app without creating a new account.
  • You can take advantage of Memcache if you need it (JCache).
  • You can fetch URLs via the URL Fetch service or
  • You can send mail via JavaMail.
  • You can use their Image service to resize, rotate, flip, and crop images.
  • Both JDK 5 and JDK 6 are supported.

There are some limits:

  • Execution of requests is limited to 30 seconds and that includes URLs invoked by cron jobs.
  • You can’t write to the file system. If you need to write out files, I assume you’d use S3 or something.
  • You can’t open sockets.
  • Each developer can create up to 10 applications and apps can’t be deleted so don’t fill up on Hello Worlds.
  • You can run an app that has up to 500 MB of storage and serves 5 million page views per month at no cost.
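
One way to live with the 30-second limit is to do work in small batches and stop before the deadline, leaving the rest for the next request or cron invocation. The sketch below is plain Python, not App Engine API code: the function name, the 5-second safety margin, and the injectable clock are all assumptions made for illustration.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")

REQUEST_LIMIT_SECONDS = 30.0  # the per-request limit described above
SAFETY_MARGIN_SECONDS = 5.0   # assumption: headroom for finishing up and responding


def process_within_budget(
    items: Iterable[T],
    handle: Callable[[T], None],
    budget_seconds: float = REQUEST_LIMIT_SECONDS - SAFETY_MARGIN_SECONDS,
    clock: Callable[[], float] = time.monotonic,
) -> List[T]:
    """Handle items in order until the time budget runs out.

    Returns the items that were not handled so the caller can pick them
    up in a later request. The clock is checked before each item, so one
    very slow item can still overrun the budget.
    """
    deadline = clock() + budget_seconds
    pending = list(items)
    done = 0
    while done < len(pending) and clock() < deadline:
        handle(pending[done])
        done += 1
    return pending[done:]
```

A URL invoked by a cron job could call something like this, persist whatever comes back, and resume from there the next time it wakes up.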

The beauty, obviously, is that as a developer, you get to focus on the code and let Google worry about scaling. For many applications, this Platform-as-a-Service (PaaS) will be preferred over Infrastructure-as-a-Service (IaaS). In an IaaS setup, you can use solutions like RightScale to automatically provision new nodes to handle spikes in demand, but you still have to set that up. Plus, you’ve got the additional cost and headache of installing, configuring, and maintaining the application server and database software (and making sure it is set up to work when new nodes are auto-provisioned). With the app engine, scaling globally is pretty simple: Step 1 – Write (Good) Code; Step 2 – Deploy Code to GAE.