Back to the Future of Content Repositories

Five years ago I wrote a blog post called “Alfresco, NoSQL, and the Future of ECM”. Today is “Back to the Future Day”, the exact date Marty McFly time-traveled to in the movie Back to the Future. The movie made many observations about what life would be like on October 21, 2015–some were spot on, some not so much. I thought it fitting that we take a look at my old blog post and see where we are now with regard to content repositories and NoSQL.

One point of the post was that NOSQL might be a more fitting back-end for content repositories than relational:

But why shouldn’t the Content Management tier benefit from the scalability and replication capabilities of a NOSQL repository? And why can’t a NOSQL repository have an end-user focused user interface with integrated workflow, a form service, and other traditional DM/CMS/WCM functionality? It should, it can and they will.

This has definitely turned out to be the case. New content management vendors like Cloud CMS have built their platforms on NOSQL technology, while older vendors, such as Nuxeo, have started to integrate NOSQL into their solutions.

Open source projects are also taking advantage of the technology. Apache Jackrabbit provides an implementation of the JCR standard. Its “next generation” offering, Jackrabbit Oak, is essentially JCR with MongoDB as the back-end.

The second point of the post was that as NOSQL repositories become more widely adopted, they would compete directly with content repositories in use cases where those content repositories are used primarily as a back-end for developers’ custom content-centric solutions.

In other words, 5 or 10 years ago, if you were a developer looking to implement a custom application and you wanted something other than a relational back-end, you might build your application on top of something like Alfresco. Now developers may be less likely to go that route. That’s because today there is an explosion of stacks out there, and many of them assume a NOSQL back-end. Look at the MEAN stack as just one example, which combines Node.js, Express (the Node.js web framework), AngularJS, and MongoDB and wraps it all up with time-saving tooling.

Many people use Alfresco as a back-end. Their front-end uses a RESTful API, implemented as web scripts, to talk to the repository. The value the repository brings to the table is the ability to store documents in a hierarchy along with custom metadata defined in a content model. They may not be using Alfresco Share or much of the other functionality that Alfresco bundles with its offering–for their custom solution, Alfresco is just a repository. When it is used like this, Alfresco is doing nothing more than what NOSQL repositories offer, and, in fact, it does less, because NOSQL repositories have a more flexible schema and are built to be clustered and massively distributable–for free.
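
To make that concrete, here’s a rough sketch of what a front-end-to-repository call often looks like. The /acme/invoices web script, credentials, and response shape are hypothetical; out of the box, Alfresco web scripts accept HTTP Basic authentication.

```python
import requests

# Hypothetical custom web script registered at /acme/invoices.
ALFRESCO = "http://localhost:8080/alfresco/service"

resp = requests.get(ALFRESCO + "/acme/invoices",
                    params={"status": "unpaid"},
                    auth=("admin", "admin"))  # placeholder credentials
resp.raise_for_status()

# The response shape is whatever your web script's JSON template returns
for invoice in resp.json()["invoices"]:
    print(invoice["name"])
```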

Years ago, Alfresco shifted its focus away from developers looking to build custom solutions on top of a bare repository. Its developer outreach is now more about customizing Alfresco Share and the underlying repository. Nuxeo, on the other hand, has doubled down on its developer focus. I’ll spend some more time on this in a future post.

I guess this trend wasn’t terribly hard to predict five years ago, but it does feel kinda nice to see it come to pass. Now, if I could just have a hoverboard.


10 Things to Consider When Planning Your Elasticsearch Project

I am seeing a lot of interest in Elasticsearch from clients and colleagues. Elasticsearch is an open source search engine that is commercially supported by a company called Elastic. It’s used for web search, log analysis, and big data analytics. You’ll often see it compared with Apache Solr. Both depend on Apache Lucene for low-level indexing and analysis. People like Elasticsearch because it is easy to install, scales out to hundreds of nodes with no additional software needed, and is easy to work with thanks to its built-in RESTful API.
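
To give you a feel for how approachable it is, here’s a minimal sketch using the official Python client; the index name, type, and fields are made up for illustration, and the same calls map directly to the REST endpoints if you’d rather use curl.

```python
from elasticsearch import Elasticsearch  # pip install elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Index a JSON document; refresh=True makes it searchable immediately
# (fine for a demo, not something you'd do on bulk loads)
es.index(index="articles", doc_type="article", id=1,
         body={"title": "Back to the Future Day", "published": "2015-10-21"},
         refresh=True)

# Run a full-text query against it
results = es.search(index="articles",
                    body={"query": {"match": {"title": "future"}}})
print(results["hits"]["total"])
```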

Multiple folks have asked me what they need to think about when leveraging Elasticsearch as part of their solution, so I thought I’d summarize those thoughts and share them here. This isn’t a detailed technical list but is more like a set of buckets that need time and attention.

1. Cluster sizing

The nice thing about Elasticsearch is how easy it is to scale out. But you should still have an idea of your near- and medium-term hardware footprint. Indexing and querying time can vary depending on many factors, and every installation is different. You’ll want to establish your “unit of scale” early so you know roughly what you’ll have to do to get a target level of throughput and CPU utilization.

I wrote a blog post on using Apache JMeter to load-test Elasticsearch, which works well for establishing how much load your cluster can take and where the bottlenecks are.

2. Cluster footprint

Related to cluster sizing is your cluster footprint. Elasticsearch nodes can be master nodes, data nodes, client nodes, or some combination. Most people opt for dedicated master nodes (3 at a minimum) and then some number of data and client nodes.

I like using dedicated nodes for everything because it separates responsibilities and lets you optimize each type of node for its particular workload. For example, I’ve seen a performance boost by separating client and data nodes. The client nodes handle the incoming HTTP requests which leaves the data nodes to service the queries.

Like sizing, the footprint that works well for you depends on what you’re doing, so use something like JMeter to test repeatedly until you get it right.

3. Security

You’ll need to secure your Elasticsearch cluster, both between the application/API and Elasticsearch layers and between the Elasticsearch layer and your internal network. Shield, which is a paid product from Elastic, can take you a lot of the way here, and if you pay for support from Elastic, Shield is included.

One of my projects uses Shield to provide LDAP authentication, to encrypt all data between Elasticsearch nodes with SSL, and to control authorization for all of the indices in the cluster. We’ve been happy with it so far.

If you can’t justify a support subscription you’ll need to do something else to prevent unauthorized access to your cluster. Using something like nginx as a proxy is a common choice.
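
Whichever route you take, your client code will need to authenticate. Here’s a minimal sketch with the Python client; the hostnames and credentials are placeholders, and the same settings apply whether Shield or an nginx proxy is doing the checking.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(
    ["https://es1.example.com:9200", "https://es2.example.com:9200"],
    http_auth=("search_app", "secret"),  # placeholder credentials
    use_ssl=True,
    verify_certs=True,
)
print(es.info())  # fails fast if authentication or SSL is misconfigured
```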

4. Index/Alias/Type Mapping approach

You might call this your data partitioning and data modeling approach. You should figure out early what your approach to indices and aliases will be. You’ll definitely want to use aliases–that’s a given. Aliases insulate your app from index name changes among other things. But some thought also needs to be given to how you partition data across indices.

You’ll also need to identify how you’ll leverage type mappings. Elasticsearch is schema-less but type mappings of some kind are almost always needed so that Elasticsearch knows how to index the data (longs versus dates versus strings, for example).
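
Here’s a minimal sketch of that approach using the Python client: create an index with an explicit type mapping, then point an alias at it so applications never reference the physical index name. The index, type, and field names are invented, and the syntax is the 1.x-era string/not_analyzed style.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

es.indices.create(index="products-v1", body={
    "settings": {"number_of_shards": 5, "number_of_replicas": 1},
    "mappings": {
        "product": {
            "properties": {
                "sku":     {"type": "string", "index": "not_analyzed"},
                "title":   {"type": "string"},
                "price":   {"type": "double"},
                "created": {"type": "date"}
            }
        }
    }
})

# Applications query the "products" alias; reindexing into products-v2 later
# just means repointing the alias, with no application changes.
es.indices.put_alias(index="products-v1", name="products")
```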

I’m building a dynamic content service on top of Elasticsearch for one of my clients. They have many different types of content that will be indexed in Elasticsearch and returned to their e-commerce app as JSON chunks. A lot of time is going into defining the JSON structure for those types which ultimately gets translated into type mappings.

It is worth spending some time looking at index templates, default mappings, and dynamic mappings and thinking about how you will manage your mappings as the number of types grows.
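
As a hedged sketch of what an index template looks like (the template name, pattern, and settings here are invented), anything created later with a matching name picks up the settings and mappings automatically:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# Any index created later with a name like "content-2015.10" inherits these
# settings and mappings, which keeps per-index setup manageable as types grow.
es.indices.put_template(name="content-template", body={
    "template": "content-*",
    "settings": {"number_of_shards": 3},
    "mappings": {
        "_default_": {
            "dynamic_templates": [{
                "strings_not_analyzed": {
                    "match_mapping_type": "string",
                    "mapping": {"type": "string", "index": "not_analyzed"}
                }
            }]
        }
    }
})
```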

5. Query approach & relevance tuning

The Elasticsearch query DSL is vast. At a high level you will deal with queries and filters, depending on exactly what you need to do. You’ll want to avoid queries where possible and lean toward filters: because filters skip relevance scoring and can be cached, they are much, much more performant.
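
For example, here’s what the query-versus-filter distinction looks like with the 1.x-era DSL (index and field names invented): the query clause is scored full-text search, while the filter clause is an exact, cacheable yes/no check.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

results = es.search(index="products", body={
    "query": {
        "filtered": {
            "query":  {"match": {"title": "coffee"}},      # scored, full-text
            "filter": {"term": {"category": "beverages"}}  # cacheable, no scoring
        }
    }
})
print(results["hits"]["total"])
```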

More than just query design, you’ll want to figure out how you’re going to expose queries to the API. On a recent project we started by having our Java-based API layer translate developer-friendly query string params into Elasticsearch filters. We didn’t stick with that, though, because tuning and tweaking our queries required the API layer to be re-compiled and deployed. We now do everything with search templates, which pull our query logic out of the Java code and make it easier to manage.
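
A search template is basically a Mustache-parameterized query stored on the Elasticsearch side, so the API layer only passes parameters. Here’s a rough, hedged sketch against the 1.x-era _search/template endpoints; the index, template name, and parameters are invented, so double-check the syntax for your version.

```python
import requests

ES = "http://localhost:9200"

# Register the template on the server (1.x stores these in the .scripts index)
requests.post(ES + "/_search/template/products_by_category", json={
    "template": {"query": {"term": {"category": "{{category}}"}}}
})

# Execute it by id, passing only parameters. Tuning the query later means
# updating the template, not recompiling and redeploying the API layer.
resp = requests.post(ES + "/products/_search/template", json={
    "template": {"id": "products_by_category"},
    "params": {"category": "beverages"}
})
print(resp.json()["hits"]["total"])
```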

Understanding how to write efficient queries is one thing, but making them return the results that end-users expect is another. Once your queries are written, expect to spend some time tweaking analyzers and scoring so that the engine returns the right hits. If this is a particular concern for you, take a look at the Relevant Search book from Manning.

6. Monitoring & Alerting

Be sure to factor in a completely separate “monitoring” cluster that will only be used to capture stats about the health of the cluster and alert you when something goes wrong. Two tools that work great for this are Marvel and Watcher.

Marvel keeps track of the health of the cluster and includes a dashboard that reports on it. Sense (built into Marvel) is used to run ad hoc operations against the cluster.

Elastic just released a new tool called Watcher. It watches for certain conditions and alerts you when those conditions are met. So when some stat (JVM heap, for example) reaches a threshold you can take some action (send an email, call a web hook, etc.).

Watcher isn’t just for monitoring the health of the cluster. Watcher can monitor searches against any index. In fact, Watcher can invoke any HTTP end point and then take action based on what comes back.
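
A watch, at its core, is just JSON: a trigger, an input, a condition, and one or more actions. Here’s a hedged sketch that registers a watch over HTTP; the watch name, index pattern, and email address are placeholders, so check the Watcher docs for the exact syntax of your version.

```python
import requests

watch = {
    "trigger":   {"schedule": {"interval": "5m"}},
    "input":     {"search": {"request": {
                     "indices": ["logs-*"],
                     "body": {"query": {"match": {"level": "ERROR"}}}}}},
    "condition": {"compare": {"ctx.payload.hits.total": {"gt": 0}}},
    "actions":   {"notify_ops": {"email": {
                     "to": "ops@example.com",
                     "subject": "Errors found in the last 5 minutes"}}}
}

requests.put("http://localhost:9200/_watcher/watch/log_errors", json=watch)
```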

7. Node provisioning and config management

Once you have more than a handful of nodes it becomes challenging to keep every node in sync with regard to software versions, configuration, etc. There are a number of open source tools that can help with this. I’ve used both Chef and Ansible to help manage Elasticsearch clusters. By far, my favorite tool for this is Ansible. It automates upgrades and configuration propagation without requiring any additional software to be installed on any of the Elasticsearch nodes.

You may not see a huge need for automation now, but if you’re going to start small and grow, you’ll want to be able to grow quickly. Having a library of common tasks scripted with Ansible will allow you to go from bare server to fully-provisioned Elasticsearch node in minutes with no manual intervention.

In addition to automating installs and config changes, you’ll have a need for scheduling routine administrative tasks like copying an index or cleaning up old indices that Marvel and Watcher create daily. I use a “job server” that I built from open source components to do this. Cron jobs are also a common approach.
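
Whatever scheduler you use, the task itself is usually a small script. Here’s a hedged sketch that deletes Marvel’s daily indices after two weeks; it assumes the .marvel-YYYY.MM.DD naming pattern, and Elastic’s Curator tool covers the same ground if you’d rather not roll your own.

```python
from datetime import datetime, timedelta
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])
cutoff = datetime.utcnow() - timedelta(days=14)

# Marvel writes one index per day, e.g. .marvel-2015.10.21 (pattern assumed here)
for name in es.indices.get_settings(index=".marvel-*"):
    try:
        day = datetime.strptime(name, ".marvel-%Y.%m.%d")
    except ValueError:
        continue  # skip anything that doesn't match the daily pattern
    if day < cutoff:
        es.indices.delete(index=name)
```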

8. Backup and recovery

Properly tuned, indexing can run pretty fast even for very large data sets. So some people opt to simply re-index if they lose data. Elasticsearch has built-in “snapshot” functionality that can back up your indices. If you do something to handle scheduled operations (see “job server”, above) then taking regular snapshots is easy to do. Relying on OS-level file system backups may be dicey once you have multiple nodes due to how the data is stored.
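
Once you have somewhere to schedule it, a snapshot is just a couple of API calls against a registered repository. Here’s a minimal sketch with the Python client; the repository name and path are placeholders, and a shared filesystem repository generally needs its location listed in path.repo on every node.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

# One-time setup: register a snapshot repository on shared storage
es.snapshot.create_repository(repository="backups", body={
    "type": "fs",
    "settings": {"location": "/mnt/es-backups"}
})

# Take a snapshot; snapshots are incremental relative to what's already
# in the repository, so running this daily is cheap.
es.snapshot.create(repository="backups",
                   snapshot="snap-2015-10-21",
                   wait_for_completion=True)
```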

9. API & UI development

It is likely that you’ll put Elasticsearch behind an API layer that provides an agnostic API to applications that are leveraging your search cluster. You may also want to do some transformation of input or output before and after requests hit Elasticsearch. Exactly what you use for this is up to you–there are Elasticsearch clients for most popular languages and you can always just use the REST endpoints if needed. I’ve implemented this layer using Node.js, Java, and Lua and they each have pluses and minuses, as usual.

Everything you index into Elasticsearch is JSON. So any tool that can speak HTTP and post JSON can be used to work with the server. Elastic offers a tool called Marvel that embeds another tool called Sense (also available as a Chrome extension) that is extremely useful for doing this. Of course command-line tools like curl also work well.

Depending on the makeup of your team and the use case, you may find that writing a custom UI to manage the documents indexed into Elasticsearch, rather than using Sense or curl, is the way to go. This is just a web development task, thanks to the Elasticsearch REST API, but obviously it takes time that needs to be accounted for in your plans.

10. Data indexing

It is easy to index data into Elasticsearch. Depending on the data source and other factors, you might write this yourself or you can use another tool from Elastic called Logstash. Logstash can watch log files or other inputs and then efficiently index the data into your cluster.


Installing and running an out-of-the-box Elasticsearch cluster is easy. Making it work for your exact use case and keeping everything humming along takes a bit more effort. Hopefully this list has given you a rough idea of the areas where you’ll likely need to spend time as you move forward with your Elasticsearch project.


Alfresco cancels Summit, asks community to organize its own conference

Earlier this week, in a post to a public mailing list, Ole Hejlskov, Developer Evangelist at Alfresco, announced that the company will not be putting on its annual conference, Alfresco Summit, this year as originally planned. Instead, the company is focusing on smaller, shorter, sales-oriented events which have been very successful in several cities around the globe.

Ole said that Alfresco will be adding developer content to its Alfresco Day events, which have historically been mostly end-user and decision-maker focused. In contrast, Alfresco’s yearly events started out as developer-focused conferences, but in recent years had a more balanced agenda with both technical and non-technical tracks.

Alfresco had announced earlier in the year that their annual conference would be in New Orleans in November. In each of the last five years the company put on two conferences–one in Europe and the other in the United States. For 2015 the plan was to have a single conference only in the U.S., which drew criticism from a community that skews heavily toward a non-U.S. demographic.

When the community realized Alfresco Summit 2015 would be held only in the U.S., an independent community organization called The Order of the Bee began making plans to hold their own conference in Europe. Alfresco says it will support the community’s efforts to hold its own event and wants to explore “…ways in which participation from Alfresco corporate makes sense”.

I understand where Alfresco is coming from. Annual conferences are expensive in both real dollars and the time and attention it takes to plan and execute. When you multiply that times two it obviously represents an even bigger investment.

You also have to look at what Alfresco gets out of the conference. Alfresco is increasingly sales-focused. The conference has historically been focused on knowledge-sharing and camaraderie. Yes, there were deals closed at Alfresco Summit but it was not geared towards selling. It was more about coming together to share stories, good and bad.

The Alfresco Day events are unabashedly about sales and marketing. The attendees (and they get very large turnouts) know this, which means Alfresco does not have to apologize for coming off too sales-y. Multiple cities with hundreds of prospects is a better investment for them than two cities with 1400 attendees who are existing customers and community members.

As the guy who led DevCon and Alfresco Summit and, together with my team, grew it year after year, I find it weird to see Alfresco cancel the conference for 2015. I was looking forward to attending.

As a member of The Order of the Bee, I’m intrigued by the challenge of using an all-volunteer organization to potentially put together a replacement conference of some sort. If you have any interest in helping and you did not see my email to the mailing list, we’ll probably be meeting next week to get organized. Reach out to me and I’ll add you to the invitation.


The plain truth about Alfresco’s open source ethos

There was a small flare-up on the Order of the Bee list this week. It started when someone suggested that the Community Edition (CE) versus Enterprise Edition comparison page on Alfresco’s website put CE in a negative light. In full disclosure, I collaborated with Marketing on that page when I worked for Alfresco. My goal at the time was to make sure that the comparison was fair and that it didn’t disparage Community Edition. I think it still passes that test and is similar to the comparison pages of other commercial open source companies.

My response to the original post to the list was that people shouldn’t bother trying to get the page changed. Why? Because how Alfresco Software, Inc. chooses to market their software is out-of-scope for the community. As long as the commercial company behind Alfresco doesn’t say anything untrue about Community Edition, the community shouldn’t care.

The fact that there is a commercial company behind Alfresco, that they are in the business of selling Enterprise support subscriptions, and at the same time have a vested interest in promoting the use of Community Edition to certain market segments is something you have to get your head around.

Actually there are a handful of things that you really need to understand and accept so you can be a happy member of the community. Here they are:

1. CE is distributed under LGPLv3 so it is open source.

If you need to put a label on it and you are a binary type of person, this is at the top of the list. Alfresco is “open source” because it is distributed under an OSI-approved license. A more fine-grained description is that it is “open core” because the same software is distributed under two different licenses, with the enterprise version being based on the free version and including features not available in the free version.

2. Committers will only ever be employees.

There have been various efforts over the years to get the community more involved in making direct code contributions. The most recent is that Aikau is on GitHub and accepting pull requests. Maybe some day the core repository will be donated to Apache or some other foundation. Until then, if you want to commit directly to core, send a resume to Alfresco Software, Inc. I know they are hiring talented engineers.

3. Alfresco Software, Inc. is a commercial, for-profit business.

Already mentioned, but worth repeating: The company behind the software earns revenue from support subscriptions, and, increasingly, value-added features not available in the open source distribution. The company is going to do everything it can to maximize revenue. The community needs this to be the case because a portion of those resources support the community product. The company needs the community, so it won’t do anything to aggressively undermine adoption of the free product. You have to believe this to be true. A certain amount of trust is required for a symbiotic relationship to work.

4. “Open source” is not a guiding principle for the company.

Individuals within the company are ardent open source advocates and passionate and valued community members, but the organization as a whole does not use “open source” as a fundamental guiding principle. This should not be surprising when you consider that:

  1. “Drive Open Innovation”, not “Open Source”, is a core value of the company, as publicly expressed on its Our Values page.
  2. The leadership team has no open source experience (except John Newton and PHH whose open source experience is Alfresco and Activiti).
  3. The community team doesn’t exist any more–the company has shifted to a “developer engagement” strategy rather than having a dedicated community leadership or advocacy team.

Accept the fact that this is a software company like any other, one that distributes some of its software under an open source license and employs many talented people who spend a lot of their time (on- and off-hours) furthering the efforts of the community. It is not an “everything-we-do-we-do-because-open-source” kind of company. It just isn’t.

5. Alfresco originally released under an open source license primarily as a go-to-market strategy.

In the early days, open source was attractive to the company not because it wanted help building the software, but because the license undermined the position of proprietary vendors and because it hoped to gain market share quickly by leveraging the viral nature of freely-distributable software. Being open was an attractive (and highly marketable) contrast to the extremely closed and proprietary nature of legacy ECM vendors such as EMC and Microsoft.

I think John and Paul also hoped that the open and transparent nature of open source would lend itself to developer adoption, third-party integrations and add-ons, and a partner ecosystem, which it did.

I think it is this last one–the mismatch between the original motivations to release as open source and what we as a community expect from an open source project–that causes angst. The “open source” moniker attracts people who wish the project was more like an organic open source project than it can or ever will be.

For me, personally, I accepted these as givens a long time ago–none of them bother me any more. I am taking this gift that we’ve been given–a highly-functional, freely-distributable ECM platform–and I’m using it to help people. I’m no longer interested in holding the company to a dogmatic standard they never intended to be held to.

So be cool and do your thing

The “commercial” part of “commercial open source” creates a tension that is felt both internally and externally. Internal tension happens when decisions have to be made for the benefit of one side at the expense of the other. External tension happens when the community feels like the company isn’t always acting in their best interest and lacks the context or visibility needed to believe otherwise.

This tension is a natural by-product of the commercial open source model. It will always be there. Let’s acknowledge it, but I see no reason to antagonize it.

If you want to help the community around Alfresco, participate. Build something. Install the software and help others get it up and running. Join the Order of the Bee. If you want to help Alfresco with its marketing, send them your resume.


When to consider Cloud CMS for your content management project

Cloud CMS announced today that it has added support for CMIS. This is a nice addition for all sorts of reasons, but near the top from Cloud CMS’s perspective is that it makes it easier to migrate content from existing solutions into the Cloud CMS repository.
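
If you want to kick the tires on the CMIS support, any CMIS client should work. Here’s a minimal sketch using the cmislib Python library; the service URL and credentials are placeholders, so check the Cloud CMS documentation for your tenant’s actual CMIS endpoint.

```python
from cmislib import CmisClient  # pip install cmislib

# Placeholder endpoint and credentials; substitute your tenant's CMIS service URL
client = CmisClient('https://api.cloudcms.com/cmis/atom', 'someuser', 'somepassword')
repo = client.defaultRepository

# Walk the repository and add a document, just like against any other CMIS server
root = repo.getObjectByPath('/')
with open('migrated-doc.pdf', 'rb') as f:
    doc = root.createDocument('migrated-doc.pdf', contentFile=f)
print(doc.getName())
```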

Back in November I did a series of reviews on content-as-a-service providers. One of my posts was on Cloud CMS. The post assumes you are looking for hosted content-as-a-service and shows how Cloud CMS compares to other cloud offerings.

What I think we’re going to start seeing more and more, however, are people who might consider Cloud CMS as an alternative to traditional on-premises ECM vendors like Alfresco, Nuxeo, Documentum, and Microsoft. Although Cloud CMS was originally built to be a hosted, content-centric back-end for mobile and web applications, it can just as easily function as your hosted intranet or document management repository.

With custom content models, event triggers, and custom workflows, you may find that the only difference between Cloud CMS and your current on-premises document repository is that you don’t have to worry about software or hardware installation and upgrades any longer.

Considering Cloud CMS as an alternative to traditional players may make sense when:

  1. A 100% cloud-native solution is preferred. While Cloud CMS could be run on-premises, it is certainly built to be hosted on your behalf. Plus, one of the benefits of letting Cloud CMS run, upgrade, and scale your repository is that you don’t have to.
  2. Customization is important. Some of the traditional vendors have made cloud add-ons for their products, but they then lock down the content model and the user interface so that it cannot be customized. Cloud CMS offers the benefits of hassle-free operations while maintaining your ability to customize it to meet your exact requirements.
  3. Budget is constrained. Clients who need enterprise-grade features and the peace of mind that a support contract brings but can’t justify the high cost of a traditional vendor’s enterprise license may find Cloud CMS to be a lower-cost alternative. Rather than licensing by the seat or the server, Cloud CMS cost is based on how and how much you use the system.

Clients who have very straightforward needs (simple file sync and share, for example) will probably choose something a little more utilitarian, like Google Drive, Box, Dropbox, or Amazon Zocalo. And, despite Cloud CMS recently having undergone an extensive security audit, I know some clients may still be reluctant to move to the cloud. Everyone else, though, should take a hard look at Cloud CMS.



Alfresco tutorials updated to SDK 2.0 and Alfresco 5.0

I’ve recently updated the Alfresco Developer Series tutorials to work with version 2.0 of the Alfresco SDK and Alfresco 5.0.d (and Enterprise 5.0).

Note that the SDK is not backwards compatible. If you are running Alfresco 4.x you need to use the older version of the SDK. When you move to 5.0 you need to move to SDK 2.0. The steps to do that are roughly:

1. Merge your pom.xml with the one generated by the 2.0 archetype.
2. Copy/merge tomcat/context.xml.
3. Copy from a 2.0 project into yours. Only needed if you are using spring-loaded.
4. Copy/merge src/test/resources.
5. Copy/merge src/test/properties.

That part was easy for all of the tutorial projects. The time-consuming part was just updating the screenshots and a few of the steps. The code stayed the same across all projects.

If you are still on 4.x and you want to use the tutorials that are specific to the older version, just use the source tagged with “4.x” on GitHub.


Quick Hack: Restricting Create Site Links to a Site Creators Group

At Alfresco Summit 2014 there was a wonderful session from Angel Borroy called, “10 Enhancements That Require Less Than 10 Lines of Code“. If you missed it you should follow that link and watch the recording.

Angel said the talk was inspired by my blog post about an example add-on I created that allows you to define default folder structures that get populated in the document library when you create a new site (see share-site-space-templates on GitHub).

One of the other 9 enhancements Angel showed was how to hide the “Create Site” link. I’ve seen so many of my clients and people in the forums asking for this functionality that I decided to enhance it a little bit, put it in an AMP, and make it available on GitHub. You can download share-site-creators and try it out for yourself.

Here’s a little more about how it works…

Instead of hiding the “Create Site” link from everyone but administrators, my add-on allows you to create a group that is used to determine who can create sites. The group, appropriately enough, is called “Site Creators”. If you aren’t in that group, you won’t see the “Create Site” link in any of these places:

  • The Sites menu
  • The “My Sites” Dashlet
  • The “Welcome” dashlet

Additionally, the add-on changes the underlying permissions at the repository tier so that if your teammates are hackers, they cannot circumvent the user interface and create their sites using other means.

The screenshot below (click to enlarge) shows what it looks like when you aren’t a member of the Site Creators group:

[Screenshot: The Share Site Creators add-on restricts the Create Site link to a specific group]

You might notice that the text of the “Sharing” column in the welcome dashlet also changes to be more applicable to someone who cannot create their own sites.

The new text is in a properties file. Currently I have only an English version, so if any of my multi-lingual friends want to translate the new string, that might be useful to others.

Just like my Share Site Space Templates add-on, this one is not mind-blowing. But it is useful, both in terms of functionality, and as an example of how to override Alfresco Share web scripts without copying-and-pasting tons of code.

I’ve tested this with Alfresco 4.2.f Community Edition. If you want to get it working with other versions, or you have other improvements, bring on the pull requests!


Richard Esplin steps down as Head of Alfresco Community Relations

Things continue to be in flux at Alfresco with regard to how it manages its community. Today, Richard Esplin announced that he is stepping down as Head of Alfresco Community Relations to become a Product Manager for Alfresco Community Edition. In his blog post, Richard says that although his title changed a while back, his day-to-day job has still been mostly focused on the community, until now.

It sounds like rather than having a centralized team focused on managing the community, the various community touch points will be diffused throughout the organization.

Last month, Alfresco hired long-time community member, author, and former Ixxus employee, Martin Bergljung. I know through the grapevine there are more community hires on the way. These seem to be focused on “developer outreach” and “developer ecosystem” which is one aspect of community management.

I hope the “community is everyone’s job” approach does not lead to a “community is no one’s job” problem at Alfresco.

Related to Community Edition, Richard said, “I will be rethinking our approach to Alfresco Community Edition in order to make it a better product for its target audience”. My worry here is that there hasn’t always been agreement on what is the “target audience” for Alfresco Community Edition. In the past, Alfresco Software, Inc. has wanted the target audience to be developers who experiment and test out code that will ultimately become Enterprise Edition. The reality has been that many people want to run Community Edition in production–they want a high quality, free/libre open source software product that helps them solve document management problems.

Hopefully, Richard and the rest of the Alfresco team are aligned to the new reality of how Community Edition is being used.

It will definitely be interesting to see how these staffing shifts work out for the community.


Book Review: Learning Alfresco Web Scripts

Packt Publishing recently sent me a copy of Learning Alfresco Web Scripts by Ramesh Chauhan in exchange for my thoughts on the book, which I’ll share with you now…

Web Scripts are an essential part of Alfresco. If you are extending or customizing the platform and you have time to learn only one thing about it, web scripts might very well be that thing. The reason is that they are key to so much else you might want to do, such as integrating Alfresco with a third-party system or customizing Alfresco Share, which is, at its core, composed of web scripts.

Most technical books on Alfresco give some attention to web scripts, but this one dives into the details. After reading it, you’ll know how to do simple things with web scripts and you’ll have some idea of how to do more complex work beyond understanding the Model-View-Controller basics.

While the comprehensiveness is a good thing, I think it also presents a bit of a challenge which comes down to this: Who is this book for? If it is for beginners, it needs a lot more examples and could cut back on a lot of the technical detail. If it is for experts, point to existing sources for the basics and drill deeper on the interesting technical topics.

I did see a couple of bad practices in the book. First, in the chapter on Java-backed web scripts (Ch. 6) the author provides an example Spring configuration XML file that injects the lower-cased Alfresco beans (nodeService, for example) into the web script class. This may sound like a trivial nitpick, but it’s actually a big no-no that I see repeatedly. If you have a good reason for using the unsecured, internal-only, lower-cased beans, explain it. Otherwise, stick to the public, secured, upper-cased beans (NodeService) so that beginners don’t pick up a bad habit.

Similarly, there is a part that discusses where web script files can live in the Alfresco WAR. The author does point out that the extension directory is “preferred” but I don’t think this is worded strongly enough. Those other locations should have been left out entirely or maybe it could have said, “Place your files only in the extension directory. While these other locations may technically work, you should never use them.”

I was happy to see the chapter on the Maven SDK and the discussion of AMPs. And I think putting the Eclipse details later was a good idea, as one of the features of web scripts is that you don’t have to use Java or an IDE of any kind if you don’t want to. The book doesn’t cover Surf, Aikau, or Share customization in any detail, and I think that was also a good decision as those areas are too fluid at the moment and because the singular focus on web scripts is one of the book’s assets.

Overall, Learning Alfresco Web Scripts is a very thorough and comprehensive treatment of an important technical topic for Alfresco developers.


Alfresco Community Edition needs sensible version labels

See if you can answer this question: What is the current stable release of Alfresco Community Edition?

Some of you probably blurted out “5.0”. But that’s not specific enough. Alfresco Community Edition releases have letters as part of the release name. Did I hear someone say “5.0.c”? That is certainly the latest version but is that the current stable release? I would argue that it’s not and that the correct answer to the question is actually “4.2.f”. That’s the newest version I would recommend to anyone wanting to run Community Edition in production.

The problem is that you can’t actually tell what is supposed to be the stable release by looking at the version labels like you can with virtually every other open source software project. Hindsight is actually the only tool we have. The reason 4.2.f is the latest stable release is because it was the last release in the 4.2 Community Edition code line. We won’t know which 5.0 release is the stable one until Alfresco stops creating 5.0 releases!

Really, 5.0.a, 5.0.b, and 5.0.c should be labeled 5.0-RC1, 5.0-RC2, and 5.0-RC3. I’m using RC for “Release Candidate” here because that’s basically what they are, but “snapshot” or “milestone” could also work. We just need something that indicates that, eventually, we’re going to see an end to the iterations and finally arrive on a stable release and that you should really wait to deploy to production until that stable release comes out.

If you look at 4.2, I think 4.2.e was the “final” release and then 4.2.f was a special release to address a serious security vulnerability. So 4.2.a, b, c, and d should have been “release candidates”, 4.2.e should have just been 4.2.0, and 4.2.f should have been 4.2.1. I wouldn’t expect the third digit, which normally signifies a “service pack”, to be anything other than 0 for the vast majority of Community Edition releases. The 4.2.f release was an exception to the norm, which is that Alfresco doesn’t provide service packs for CE.

The reason an easily identifiable release label is important is because today people in the community are going to the Alfresco download page and assuming that what they are downloading is a stable release. They are then installing and running that release in production. This leaves those people disappointed down the road when they find out they installed software with numerous known issues or partly-implemented features (I know those issues are often documented in the release notes and in Jira). The point is that downloaders, particularly newcomers, don’t have (and shouldn’t need) the insight that Alfresco releases don’t really settle down until the fourth iteration or so. That should be explicit.

The reason Alfresco doesn’t use “stable” to describe a release is partly commercial. The thinking is that doing so would make running the freely-available Community Edition in production seem less risky, or even encouraged, by the company that depends on revenue from the paid edition.

The other challenge is about process. I don’t think engineering always knows that a given Community Edition release is going to be the last one until after the fact.

Both of these could be addressed with a mindset change. Instead of “Let’s iterate on release X.0 until we’re ready to work on the next major release” the thinking ought to be, “Let’s drive toward a stable, production-ready X.0 release and once we think we have it, let’s call it that”.

I’ve heard chatter that Alfresco might, at some point, consider offering support for those running CE. Based on the number of SMBs who have told me that Alfresco One is out of reach for them financially, there ought to be strong demand for a low-priced subscription around Community Edition. If that happens, I assume both the mentality and the process around CE version labels will get cleaned up. But I hope we don’t have to wait.
