Category: Open Source

Sneak peek: Optaros Ext-JS Web Client for Alfresco

As I mentioned in my “Assembling 2.0 Solutions with Alfresco” talk, Optaros has written a streamlined web client for Alfresco based entirely on REST (web scripts) and Ext JS. We’re calling it DoCASU, which stands for Document Access for Casual Users, and we’re making it freely available as an open source project.

We haven’t made the official announcement of its availability yet, but so many people have asked me about it after seeing the presentation, I thought I’d spill the beans, at least for faithful ecmarchitect.com readers.

Read more about the project and download the AMP from code.optaros.com. If you see things that need fixing, create a bug report or fix it yourself and contribute it back. Have fun!

Slinging some ideas around RESTful content

Via Seth Gottlieb comes news that Apache Sling has been officially released. Sling is interesting–I’ve played with it only a bit. You can read more about it in Seth’s post, but essentially it is a REST API that sits on top of Apache Jackrabbit. Jackrabbit is the reference implementation of the JCR spec.

I’m not crazy about the JCR API because it is Java-only (yes, I know there are bridges out there). Plus, it doesn’t seem to be rich enough for many types of implementations. For example, Alfresco is a JCR-compliant (Level One) repository, but you don’t see too many people doing JCR-only interactions with Alfresco.

What is interesting to me, though, is the idea that you can abstract repositories at a higher level: the REST API. If we’re all going to talk to our repositories via REST, why not do it in a standard way?

Alfresco introduced a REST framework in 2.1 called “web scripts” (learn more). But Alfresco does not yet have a full-blown “REST API”. Yes, there are a few out-of-the-box REST calls but for the most part, when you interact with Alfresco via REST you are going to roll your own API. On Optaros projects, this has not yet been a huge burden. Quite the opposite, in fact–we’ve been able to develop everything from web script-backed JSR-168 portlets to a streamlined version of the Alfresco web client (soon to be released as an open source project), all on web scripts.

As part of 3.0, Alfresco anticipates rolling out additional out-of-the-box URLs to more fully establish the REST API. The new 3.0 web clients are based almost entirely on REST so they might as well build a REST API that we can all use.

When I look at Apache Sling, I’m thinking, why don’t we agree on a standard REST API for working with content repositories. Then, we could write a front-end once and theoretically use it with multiple back-end repositories. If someone then wants to use a JCR-compliant repository behind the scenes, then that’s cool, but it isn’t a requirement.

Obviously, there is a granularity challenge here. If the API is too granular the front-end ends up making too many calls. If you are aggregating multiple calls from within an intermediate application and then returning the result to the front-end (rather than making a bunch of AJAX calls initiated from the browser client), that’s not as much of an issue. If the API is too coarse, it is too hard to reuse across many different types of front-end applications.
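To make the granularity trade-off concrete, here is a minimal sketch of the "aggregate in an intermediate application" option. Everything in it is hypothetical: the three fine-grained functions stand in for separate REST requests, and none of the names come from Sling or Alfresco.

```python
# Hypothetical fine-grained "repository" calls. Each one stands in for a
# separate REST request; the log lets us count how many were made.
CALL_LOG = []


def get_metadata(node_id):
    CALL_LOG.append(("metadata", node_id))
    return {"id": node_id, "name": f"doc-{node_id}.txt"}


def get_tags(node_id):
    CALL_LOG.append(("tags", node_id))
    return ["draft"] if node_id % 2 else ["final"]


def get_comment_count(node_id):
    CALL_LOG.append(("comments", node_id))
    return node_id * 2


def folder_summary(node_ids):
    """Coarse-grained call an intermediate application could expose.

    The front end makes one request; the fan-out to the fine-grained
    calls happens server-side, close to the repository.
    """
    return [
        {
            **get_metadata(n),
            "tags": get_tags(n),
            "comments": get_comment_count(n),
        }
        for n in node_ids
    ]


summary = folder_summary([1, 2, 3])
print(len(summary), "rows from", len(CALL_LOG), "fine-grained calls")
```

The browser sees a single response, while the nine small calls stay reusable for other front ends that need a different combination.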

Still, if we agree that REST is the preferred interaction model in the “content as a service” world, at least for the moment, and further, that front-end developers tend to want to have the option of using non-Java technologies for the presentation layer, a standard REST API for interacting with content repositories makes sense, whether that’s Sling, whatever Alfresco comes up with, or the best of both.

Maybe the Atom Publishing Protocol is close to what I’m thinking here. Maybe Alfresco thinks so too. I noticed the Abdera JAR was added to the Alfresco Community dependencies fairly recently. Abdera is an Apache incubator project for working with Atom.

We actually have an engagement with Alfresco right now to develop some of the new 3.0 web client modules. When I get some time I’ll explore this idea a little further by checking in on the 3.0 web client team, looking at Abdera, and doing a deeper dive on Sling.

Toughest ECM Job: VP of HR at Vignette?

Quick, name an Alfresco SE who did not come from Vignette. Okay, you can shout out any name you want–I don’t really know the answer. But it does seem like every week I’m meeting an Alfresco new-hire that just turned in his Vignette badge. Most companies have an “attrition” number they keep an eye on. Vignette must have a “left to go to Alfresco” metric that keeps those guys up at night.

Of course the market is relatively small and incestuous and everybody’s gotta have a former employer, but I know when someone leaves you always want to know where they are going and if it turns out to be a competitor, it puts a pit in your stomach.

I guess as the legacy commercial ECM vendors get taken down by Alfresco and, more generally, the open source movement, this kind of thing is going to happen and it will get even worse. It’s kind of like watching Animal Planet. You see the tiger stalking the gazelle, you know what’s coming and you know it is the “circle of life” and all of that, but when the claws inevitably sink into those fleshy hindquarters, you still have to feel something for the gazelle, even if only for a moment. At least until you realize that’s what the gazelle gets for not embracing open source.

Of course hiring a bunch of folks from the same company doesn’t always work out for the best. People tend to be loyal and they move in packs (don’t worry, no one’s getting eaten in this metaphor). For example, someone recently connected the dots for me with regard to the Alfresco WCM engineer departures. It turns out Kevin Cochrane, Britt Park, and Jon Cox all came over from Interwoven so the timing of their departures is not coincidental at all.

Thanks for attending the Open Source ECM event

I want to thank everyone for attending the Open Source ECM event in Dallas this morning. In case you missed it, the slides I presented on “Assembling Enterprise 2.0 Solutions with Alfresco” are available on share.acrobat.com (which is powered by Alfresco, BTW) here.

The deck covers a bit about the general components of Enterprise 2.0 solutions and how a repository like Alfresco can be central to that architecture because it is so open. I then give a brief intro to web scripts (recycled from the talk I gave at the user conference in San Jose earlier this year) and walk through Endeca and a few other client examples.

I’ve also got some Alfresco-Ringside thoughts in there that include screenshots of the Alfresco-Facebook demo app running on Ringside and a list of potential features that might be interesting to implement with an Alfresco-Ringside combination.

Finally, I’ve got some never-before-seen screenshots of the yet-to-be-announced Optaros-built streamlined Alfresco web client, which we will release as an open source project under the GPLv3 soon.

Is Alfresco the “near beer” of open source?

I grew up in Oklahoma. For my international readers (I have quite a few), Oklahoma is in the central US, is quite beautiful, and is often called the “belt buckle” of the “bible belt”. This last characteristic gives rise to some quite asinine laws, one of which is that beer sold in Oklahoma grocery stores must be no more than 3.2% alcohol. As a kid I remember people ridiculing Oklahoma’s “near beer” to my father who would inevitably retort, “The 3.2 restriction is by weight while liquor stores measure by volume so it’s not a big deal.” I know–it always sounded lame to me too, but he’s a mathematician. (For details on the math, look here).
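For the curious, the weight-versus-volume argument can be checked with a few lines of arithmetic. This is a rough sketch, assuming an ethanol density of about 0.789 g/mL and treating the beer itself as having the density of water (real beer is slightly denser). The function name is made up for illustration.

```python
ETHANOL_DENSITY = 0.789  # g/mL at roughly room temperature


def abw_to_abv(abw_percent, beer_density=1.0):
    """Convert alcohol by weight (%) to alcohol by volume (%).

    beer_density is in g/mL; 1.0 is a simplifying assumption.
    """
    return abw_percent * beer_density / ETHANOL_DENSITY


print(round(abw_to_abv(3.2), 2))
```

Under those assumptions, 3.2% by weight works out to roughly 4% by volume.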

One of the criticisms of Alfresco by hardcore open source types is that it isn’t really open source. Like my home state’s beer, it’s almost open source. What does this mean? Certainly, the reasons I cited as to why clients choose open source (fit, standards, source code, transparency) hold true for Alfresco (See this post). But there’s a characteristic of “true” open source projects that’s missing for Alfresco that may not be as high on clients’ care-abouts, but is important to those of us in the community and that is this: In the current Alfresco model, none of us can ever be a committer. Yes, you can contribute patches and enhancements by opening a Jira ticket, but you’ve got to be an employee to be able to write to the SVN repository.

In the early days of Alfresco, this was more defensible than it is now–the code lines were the same, the product was still maturing, and, most importantly, Alfresco needed to protect its interests. Alfresco didn’t necessarily have time to let the community take the product wherever it wanted to. Instead, it needed to establish a critical mass, get things pointed in the right direction, and get some maintenance subscriptions flowing. Unlike other open source projects that start altruistically, Alfresco was a commercial enterprise from the start and there’s nothing at all wrong with that.

But now things have changed. There have been over 1 million downloads. There are tens of thousands of registered members of the community. The Community and Enterprise code lines have been separated. Why not give up some of the control of the Community edition to the, uh, community? Alfresco is still a small company with limited resources. Couldn’t a fraction of those thousands of registered developers be enlisted to help?

Alfresco often compares its model to that of Fedora/RHEL and JBoss.org/JBoss.com which is a good way to illustrate the difference between Community and Enterprise from a development build versus enterprise-ready build perspective. But what about the development model? For those not familiar, the JBoss Development Process is roughly that all code starts in JBoss.org where it is available to early adopters. When it starts to look viable, it is pulled into JBoss.com, where it is scrubbed (maybe even recoded), integrated with the rest of the platform, tested, and productized. The key difference is that JBoss.org contributions include not just JBoss employees but others in the community who’ve earned the right to do so. Why can’t Alfresco work this way?

I imagine the answer comes down to resources and control. I concede that having the same engineers contributing to Community that must then pull the features forward into Enterprise is very efficient. Especially in the beginning, I could see how Alfresco engineers might have to spend more time integrating Community code with Enterprise code than they would have under the closed community policy. Surely that would improve over time, though.

Regarding control, I can understand that a commercial software company would feel inclined to tightly control the project’s growth and that an open community would be seen as a threat to that. But if the community takes the product down a substantially different path than the planned roadmap, wouldn’t that tell you something? And this wouldn’t be completely giving up control–Alfresco product management and marketing would still be responsible for understanding what clients want, setting the roadmap, and owning the overall vision.

Maybe this is something we can get John and others to talk about next week in San Jose. Over a beer.

JBoss World Highlights

I already mentioned Hibernate Search and Shards. Here are the rest of the highlights from my perspective…

JBoss World Notes

  • 750 – 1000 people at the conference, 30% from outside of the U.S.
  • This was the biggest JBoss World ever.
  • There were many sessions on Seam and SOA. The Hibernate sessions were overflowing. The BOF sessions, which ran from 8:00pm to 10:00pm, were also packed.
  • JBoss’ middleware is on track to grow to be twice their server business.
  • JBoss recently reached their 20 millionth download.
  • There were about two dozen vendors in the exhibit hall, offering a mixture of tools/add-on solutions, and service providers.

JBoss Portal

JBoss Portlet Container 2.0

  • More here
  • Implements the Portlet 2.0 standard (JSR-286)
  • Going forward, JBoss Portal will be based on this.
  • This is the first portal from JBoss that does not require JBoss Application Server which is an advantage Liferay currently has over JBoss Portal.

In one of the keynotes they did a demo of a Seam app. Then they jumped over to JBoss Portal and the Seam app was running there as well within a portlet. I didn’t get the details on how much additional config is required to make the Seam app work within a portlet.

JBoss SOA Platform

The JBoss SOA Platform is the commercially-supported bundle of JBoss ESB, JBoss jBPM, JBoss Rules, JBoss Messaging, JBoss Application Server. The SOA Platform is a single distribution/install but you can configure out what you don’t need.

JBoss ESB

  • Routes messages (messages in the generic sense, not in the JMS sense, although JMS is one of the optional transports available).
  • Declarative, clusterable, supports hot deployment. Its ability to be clustered is one potential advantage over other open source ESB implementations.
  • Listeners for many different types of transports are available such as Web Services, JMS, File, FTP, Email, Socket, etc. SFTP and HTTPS will be supported in future versions.
  • Demo showed an order being placed in SalesForce. ESB picked up the payload and parsed/transformed it using Smooks. If the order was below $5000, it was approved, which triggered a call back into SalesForce to update the order. Otherwise, an approval process was triggered.
  • Many ESBs in production today were custom developed. A lot of times this means a limited number of transports are supported (e.g., only Web Services).
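The routing decision in the demo is a small example of content-based routing: the destination depends on what is in the message. A minimal sketch follows, with the caveat that this is plain Python rather than JBoss ESB configuration, and the two handler functions are hypothetical stand-ins for the SalesForce update and the approval process.

```python
APPROVAL_THRESHOLD = 5000  # dollars, the cutoff used in the demo


def route_order(order, approve, start_approval):
    """Pick a destination based on the content of the message."""
    if order["amount"] < APPROVAL_THRESHOLD:
        return approve(order)
    return start_approval(order)


# Hypothetical stand-ins for the two destinations.
def approve(order):
    return {"id": order["id"], "status": "approved"}


def start_approval(order):
    return {"id": order["id"], "status": "pending-approval"}


small = route_order({"id": "A-1", "amount": 1200}, approve, start_approval)
large = route_order({"id": "A-2", "amount": 5000}, approve, start_approval)
print(small)
print(large)
```

An order of exactly $5000 is not "below $5000", so in this sketch it goes to the approval process.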

European Railroad Case Study

  • 100,000 passengers, $1b in revenue
  • Used JBoss SOA Platform to move unsold tickets to eBay auctions (and back)

North State Communications

  • Configure and track comm equipment down to the neighborhood level
  • Integrated apps and workflow using JBoss ESB and jBPM

Big Lots Case Study

Inventory Management application

  • Deploying to each of their thousand or so stores. JBoss app server and MySQL run in each location.
  • Developed (and open sourced) a data replication solution called Symmetric DS that moves data between each store (running MySQL) and the central office (running Oracle RAC)
  • Multi-channel app: Web client, PC-based cash register, Handheld devices (IE running on Symbol)
  • 10 developers, 5-6 months

They had several constraints that shaped their decisions:

  • Limited bandwidth to home office (56k frame relay)
  • User proficiency with web
  • Need offline support in case the link goes down between store and central office

Big Lots did some interesting work with AJAX. Not all of their data resides locally in each store’s MySQL database. On pages with a mix of local data and remote data, the page renders immediately with the data it can get locally, then invokes asynchronous calls to retrieve the remote data. If it gets the data, the fields are updated. If not, the page still functions.
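The pattern described here (show local data first, fill in remote data if and when it arrives) can be sketched in a few lines. This is only an illustration of the idea, not Big Lots’ code: the field names are invented, and `fetch_remote` is a hypothetical stand-in for a call to the central office.

```python
import asyncio


async def fetch_remote(fields, link_up):
    # Hypothetical stand-in for a request over the slow link to the home office.
    await asyncio.sleep(0.01)
    if not link_up:
        raise ConnectionError("link to central office is down")
    return {name: f"remote:{name}" for name in fields}


def initial_page(local_data, remote_fields):
    """Build the page from local data, with placeholders for remote fields."""
    page = dict(local_data)
    for name in remote_fields:
        page[name] = None
    return page


async def fill_remote(page, remote_fields, link_up, timeout=0.5):
    """Try to fill the placeholders; on failure the page is left usable."""
    try:
        remote = await asyncio.wait_for(fetch_remote(remote_fields, link_up), timeout)
    except (ConnectionError, asyncio.TimeoutError):
        return False
    page.update(remote)
    return True


async def demo(link_up):
    page = initial_page({"sku": "12345", "on_hand": 7}, ["chain_wide_stock"])
    # At this point the page could already be shown to the user.
    await fill_remote(page, ["chain_wide_stock"], link_up)
    return page


print(asyncio.run(demo(link_up=True)))
print(asyncio.run(demo(link_up=False)))
```

With the link down, the remote field simply stays empty while the locally stored fields remain available.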

Seam

Seam is a Java web application development framework that significantly speeds up development by removing a good chunk of the XML configuration typically required. You can think of Seam as being JSF plus EJB3 with a lot less XML configuration and no JSF backing beans. Most of the configuration is handled through annotations in the Java code.

The framework includes many add-ons (such as AJAX and rich controls) that can result in very compelling user experiences and interesting applications, but it seems pretty easy to use only what you need.

Check out SeamFramework.org. It’s a wiki built with Seam with documentation and downloads. Also, Joseph F. Nusairat did a pretty good Seam intro talk where he built an app from the ground up. He’s going to post the demo (and what it took to build it) in a screencast on his web site. I’ll update this when he’s posted the demo.

Seam 2.0.1 is now GA. Seam 2.1.0.A1 adds support for Portal and Wicket.

Adobe Flex

There are many cool examples of Rich Internet Applications (RIAs) out there but the eBay Desktop application is one I hadn’t seen before and is quite impressive. Aside from the sexy graphics it really gets interesting when you run it on Adobe AIR. AIR gives your Flex-based web application the ability to run like a desktop application. Because it has its own database, you don’t even have to be connected to the net. In the eBay example, when you close the app, it continues to run in the background and pops up little alerts when certain events happen (such as getting outbid or winning an auction).

Some goals of RIAs

  • Richer, desktop-like user experience
  • Remove view logic from the server
  • Do more on the client (such as sorting, caching, etc.)

Flex is made up of two parts

  • ActionScript 3 (JavaScript 2)
  • MXML (Declarative markup that wraps around the ActionScript)

Flex’s two-stage compiler converts the MXML into ActionScript then compiles the ActionScript into bytecode which is saved as a SWF and then played by either the Flash player or AIR. Developing a Flex application essentially involves creating a mock dataset, coding up the ActionScript and MXML, compiling, testing, and iterating, and then implementing the thin layer that sits between the ActionScript and your back-end data sources.

More on Flex here

JBoss Single Sign-On (SSO) Framework

The JBoss SSO Framework allows users to sign in to a webapp once and automatically be authenticated in others even if the webapps are running in different domains. If the webapps are already using JAAS, this requires no change to the participating webapps. Unlike other SSO implementations, there is no central authentication server. If a user is unauthenticated, the webapp authenticates the user. When the user visits the next participating webapp, their credentials are trusted and the user doesn’t have to log in.

Four components of the JBoss SSO Framework:

  • Token management
  • Identity connector
  • Webapp coordinator
  • Federation server

Currently, only Java web applications are supported. Because the payload is based on XML (SAML 1.0) it should be possible to make the framework suitable for other types of webapps. If someone were willing to port the identity connector and webapp coordinator to another language such as PHP or Python, you could have a mix of Java and non-Java web apps in your federation. Of course JBoss would be happy to take the donation.

More on the framework here.

Two interesting Hibernate projects you should check out

I attended two very interesting Hibernate sessions at JBoss World yesterday. One was on Hibernate Search, the other was Hibernate Shards.

Hibernate Search

What do you do when your customers add “Google-like search” to the list of requirements for the web application you are delivering? With a straight relational back-end, implementing that requirement might be tougher than you’d think. Just a few challenges are:

  • “Keyword” search means you’re probably going to want to use wildcards which perform poorly.
  • Your data is scattered across multiple columns. Writing a SQL query to search them all is ugly.
  • SQL doesn’t really know how to deal with typical search constructs like proximity, synonyms, or relevance.

Hibernate Search is the answer. It combines the power of the proven Apache Lucene search engine with the ease of configuration of Hibernate. Hooking Hibernate Search into your app is a matter of dropping in a couple of JAR files and adding annotations to your classes to describe what should and shouldn’t be indexed.
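To see why an index sidesteps the wildcard and multi-column problems listed above, here is a toy inverted index. It is a deliberately simplified sketch of the general idea behind search engines like Lucene, not how Hibernate Search or Lucene is actually implemented, and the sample rows and function names are invented.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(rows, fields):
    """Map each term to the ids of rows containing it in any listed field."""
    index = defaultdict(set)
    for row_id, row in rows.items():
        for field in fields:
            for token in tokenize(row.get(field, "")):
                index[token].add(row_id)
    return index


def search(index, query):
    """Return ids of rows containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


rows = {
    1: {"title": "Red widget", "description": "A small widget for testing"},
    2: {"title": "Blue gadget", "description": "Pairs with the red widget"},
    3: {"title": "Green gizmo", "description": "No relation to the others"},
}
index = build_index(rows, ["title", "description"])
print(sorted(search(index, "red widget")))
```

The query is answered with set lookups rather than a scan of every column, and it matches row 2 even though the terms appear in the description instead of the title. Proximity, synonyms, and relevance ranking are where a real engine earns its keep.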

Hibernate Shards

What do you do when you can’t (or don’t want to) put all of your data in the same relational database? Google’s word for horizontal partitioning–taking horizontal slices of your database and storing each slice in a separate physical database–is “sharding”. Give a few Google engineers passionate about the subject a few months and Hibernate Shards is the result (Google donated the project to JBoss).

Shards lets you come up with your own scheme for how rows will be partitioned. Each partition is called a shard. Once you’ve settled on a scheme it’s a matter of configuration through familiar Hibernate configuration constructs. If you don’t use any of the out-of-the-box implementations for how to decide which shard to create and find objects on or for how to generate IDs, you’ll have to implement those interfaces as well. Once you’ve got everything in place, persisting and querying objects works the same way as straight Hibernate.
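As a language-neutral illustration of what a partitioning scheme has to decide, here is a minimal sketch. It does not use the Hibernate Shards API; the class, the hash-by-customer-id scheme, and the dictionaries standing in for physical databases are all assumptions made for the example.

```python
import zlib


class ShardedStore:
    """Toy horizontal partitioning: each shard is a dict standing in for a database."""

    def __init__(self, shard_count):
        self.shards = [dict() for _ in range(shard_count)]

    def select_shard(self, customer_id):
        # Partitioning scheme (an assumption): a stable hash of the customer id,
        # so the same id always lands on the same shard.
        return zlib.crc32(customer_id.encode("utf-8")) % len(self.shards)

    def save(self, customer_id, record):
        self.shards[self.select_shard(customer_id)][customer_id] = record

    def get(self, customer_id):
        # A lookup by key only has to touch one shard.
        return self.shards[self.select_shard(customer_id)].get(customer_id)

    def find_all(self, predicate):
        # Queries not keyed by customer id have to fan out to every shard.
        return [r for shard in self.shards for r in shard.values() if predicate(r)]


store = ShardedStore(shard_count=3)
for cid, state in [("c-100", "OK"), ("c-200", "TX"), ("c-300", "OK")]:
    store.save(cid, {"customer": cid, "state": state})

print(store.get("c-200"))
print(len(store.find_all(lambda r: r["state"] == "OK")))
```

The two decisions the sketch hard-codes, which shard a new object goes to and how to find it again, are the kind of thing the post says you either take out of the box or implement yourself.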

Why open source?

It may be surprising to those not actively engaged in the open source revolution, but when I talk to clients about their business problems I don’t have to spend much time, if any, selling the benefits of open source. Most of the businesses we talk to already get it. But every once in a while someone will ask me, “Why open source?”. So I thought I’d talk about some of the reasons why Optaros clients choose open source.

There are many reasons why our clients choose open source. Some clients are initially attracted to open source by the idea that they may be able to lower their total cost of ownership by shifting a portion of their license dollars to services and saving the rest. While that is a consideration, it’s not the whole story. In addition to lower cost, there are at least three other major factors that make assembling solutions from open source components an attractive option for our clients:

  1. Open source solutions are often a better fit.
  2. Open source solutions are often standards-based.
  3. Open source solutions are more transparent.

Let’s look at each of these.

1. Open source solutions are often a better fit.

It has been fairly well documented that companies have over-spent on things like application servers and content management solutions. As a whole the number is easily in the billions of dollars every year. Clients buy into the vendor pitch that a particular package can address all current and most future needs. They spend six figures on up front licenses, three to five times that on services to implement and customize the solution, and then 25% to 40% of the license fee every year on support and maintenance.

In the end they have a very complex, heavily-customized solution that takes a disproportionately large staff to run-and-maintain. They might use a fraction of the functionality the product provides out-of-the-box.

That would be bad enough if the functionality used was a fit with what the business needed. But the real kick in the shins is that after the product is installed the client may still need to go through major efforts to customize, extend, or tweak the system to make it fit their business.

Open source projects tend to offer the lowest common denominator of functionality. Without spending anything on up-front licenses, you get what’s common across most installations. You can take what you would have spent on those licenses and spend it on services to fill in the gaps, resulting in a closer fit to the business needs at a lower total cost.

As a quick aside regarding lower total cost, let me not paint too rosy of a picture here. Most clients pay someone to support their open source solutions. (Unlike with proprietary vendors, clients actually have a choice of support providers.) Commercial support for open source software certainly isn’t free and often suffers from the same issues as closed source software support (slow response, foggy escalation procedures, shallow technical depth).

The point is that you don’t have to buy giant, sprawling platforms to get something that merely approximates your requirements—instead you can assemble solutions from open source components that get you closer to exactly what you need without up-front licenses.

2. Open source solutions are often standards-based.

Open source solutions are often based on established standards (or even built on top of other open source projects). This happens naturally–it is a lot easier, from a collaboration standpoint and for sheer level of effort, for an open source community to build upon an established industry standard or other open source projects than it is to come up with a proprietary solution from scratch. Conversely, closed-source vendors invest time and money in proprietary standards to use as a competitive advantage.

Why are open standards important? For a couple of reasons. First, the pool of people in the marketplace who can quickly get up to speed on your system is far greater than for proprietary systems. As an exercise in the content management space, try searching Monster for people who know Documentum’s WDK. Then, try searching for people who know Spring, Hibernate, or JavaServer Faces (JSF), some of the core technologies behind Alfresco, an open source Documentum competitor.

The other reason why open standards are important is that they mean greater flexibility. This flexibility might manifest itself as the ability to swap components in and out to fit the technology preferences at the client or to take advantage of specialized functionality. And it can mean lower switching costs. For example, maybe the repository solution needs to be switched out and because it is standards-based, content migration is easier. Or maybe switching out the portal isn’t as painful because the repository is JSR-170 compliant which makes it easier for other systems, like portals, to get content into and out of the repository.

Proprietary vendors often claim to be “open” and “standards-based”. Usually they are putting lipstick on a pig. The fact that your proprietary repository can store XML does not make it “open” or “standards-based”.

Of course, just because a system is open source does not guarantee 100% interoperability but, in general, open source solutions are much more open and standards-based than their closed source counterparts.

3. Open source solutions are more transparent.

The obvious example of transparency is that clients get the source code. Access to the source is invaluable when troubleshooting or customizing. Clients with proprietary solutions are often forced to decompile vendor code–a clear violation of most license agreements–just to figure out how to properly extend a component.

Beyond troubleshooting or customizing, the source offers an opportunity to help the community improve the product with bug fixes or enhancements. And clients can influence product growth and direction much more easily in the open source world because they can download and build nightly software releases as development is happening. Contrast that with the partner-only “early beta” software release approach of closed source companies.

The way open projects are run is also transparent. When a client invests in a critical piece of infrastructure, they need to know what the issues are with the system and they need access to people who have a deep understanding of the system. Closed-source vendors work to hide bug information for fear of hurting sales. And they tightly control access to the hardcore engineers, usually reserving access to them for “Enterprise” customers paying top dollar for the privilege.

Open source projects, on the other hand, usually provide full access to bug tracking. Clients can vote for and monitor issues they care about. Beyond that, they often make product road maps available in wikis, and low-level product planning and technical discussions available in forums. (Alfresco is one good example of this behavior but is by no means unique).

Open source developers aren’t secluded–they are usually active participants in the community around the project. These communities, made up of product development engineers (often working for commercial open source companies), integrators, consultants, corporate developers, and end-users serve as a valuable support resource for clients that is usually much more useful than the closed pay-for-support system offered by proprietary vendors.

Conclusion

I could add that many clients see solutions assembled from open source as having better stability, faster performance, improved security, and a shorter implementation time but these advantages all tend to accrue as a result of the benefits I’ve outlined above.

I’ll leave you with an anecdote I think is fairly telling. Six or seven years ago, clients would tell me they’d only consider open source as a last resort (if they even knew what it was). Last year, I was talking to a large client with a household name who said, “Our CIO has issued an edict which is essentially ‘open source first’. If we’re going to propose a solution using closed source software, we’d better have a very good reason.” For that client, I’d say open source had reached a tipping point. How far can we be from the rest of the world’s CIOs taking the same stand?

CIO magazine on what makes Alfresco different

Esther Schindler wrote a short article on CIO.com about Alfresco with the premise that the presence of a marketing department and a PR firm makes them unique in the open source world. I liked how she summed up how Alfresco’s approach was different than many open source projects when she said, “Instead of a project that began with the attitude of ‘My Dad has a barn; let’s put on a play!’ the Alfresco team started with a core competency in content management and looked for new market opportunities”.

She also rightly identified Alfresco’s competition as Documentum, OpenText, and FileNet rather than Joomla, Plone, and Drupal although Microsoft (and anyone else on Gartner’s Mysterious Magic Quadrant) should be considered fair game as well.

But I don’t think they are unique in the larger realm of open source. There are many examples of commercial open source companies with much bigger marketing budgets than Alfresco’s, although in the ECM space, I can’t think of one.