Category: Open Source

Now available: Alfresco Fivestar Ratings add-on for Alfresco Share

A couple of weeks ago I posted a survey asking if anyone saw any value in a five star ratings widget for Alfresco Share. Honestly, it would have only taken one or two positive responses–even if no one needed one, there’s value in it for example’s sake. It turns out about 20 readers of this blog voted positively, so I went ahead and knocked it out.

This Alfresco Share customization makes it possible for any document in the repository to become “rateable”. When a document is rateable, the Alfresco Share user interface will show a clickable five star ratings widget. The stars light up to indicate the average rating for that document. Users simply click one of the stars to post their own rating. When clicked, the widget refreshes itself with the updated average.
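For a rough feel of the arithmetic behind the widget, here is a minimal Python sketch. It is not the add-on's code (the real thing lives in Spring beans, web scripts, and client-side JavaScript); the class and method names are made up for illustration, and it assumes every click simply adds one more rating to the pile.

```python
class RatingSummary(object):
    """Hypothetical in-memory stand-in for one document's ratings."""

    def __init__(self):
        self.total = 0  # sum of all stars posted so far
        self.count = 0  # number of ratings posted so far

    def average(self):
        """Average rating, or 0.0 when nobody has rated the document yet."""
        if self.count == 0:
            return 0.0
        return self.total / float(self.count)

    def rate(self, stars):
        """Post a rating and return the updated average.

        Returning the new average mirrors the widget refreshing itself
        after a click.
        """
        if stars not in (1, 2, 3, 4, 5):
            raise ValueError("stars must be a whole number from 1 to 5")
        self.total += stars
        self.count += 1
        return self.average()
```

A caller would post a rating with something like summary.rate(4) and light up the stars based on the number that comes back.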

Here is a short screencast that demonstrates the customization. You’ll want to make it full screen.

To implement this, I took the Someco Ratings Service from the Alfresco Developer Guide, moved it to the Metaversant namespace, and changed the names of my Spring beans and JavaScript root variable. Even though my initial target Alfresco version is 3.3, I didn’t want the code to conflict with Alfresco’s new back-end-only ratings service in 3.4 which uses some of the same names that were in the book. I also changed the JSON that the ratings web scripts use to be closer to what exists in 3.4. That way, when I do make a version that works with 3.4, it could potentially work with either my ratings back-end or Alfresco’s.

I then went to work on the UI side, integrating the widget into Share’s document details page, document library (both Share and repository views), search results page, and document-related dashlets. To go from what was in the book to a working integration I revamped the client-side ratings JavaScript from a set of functions to an actual object. Then, I started injecting my own methods into Alfresco’s client-side object prototypes to drop my widget in where appropriate.

Alfresco is still working to make customizations like this more modular and easier to plug in alongside their code and code from the community. Until then, be aware that if your Alfresco implementation already has customizations that override some of the same web scripts and client-side components this module does, there may be some manual integration needed. If you have an out-of-the-box installation (or a set of customizations that won’t conflict with this one) you can deploy the AMP to the Alfresco WAR and the Share customizations to the Share WAR and you’ll be set.

The Alfresco Fivestar Ratings project lives at Google Code. Feel free to check out the source, try it out, and use it on your projects. If you find a bug, log it, then fix it!

Apache Chemistry cmislib 0.4 incubating now available

The Apache Chemistry development team is pleased to announce that the 0.4 incubating release of cmislib, the Python client API for CMIS, is now available for download. You may have to use one of the backup servers until the mirrors fully update. Alternatively, you can use easy_install to install cmislib by typing “easy_install cmislib”.

This release has various fixes and enhancements that the community has contributed since cmislib joined the Apache Chemistry project with its 0.3 release. If you are using Alfresco, you might be interested in an enhancement in cmislib 0.4 that makes it possible to use ticket-based authentication instead of basic auth.

For those who haven’t used it, cmislib makes it easy to work with CMIS-compliant repositories from Python.

Introducing the Alfresco Community Committer Program

It’s been a little over two years since I wrote a blog post entitled, “Is Alfresco the ‘near beer’ of open source?”. In that post, I lamented the fact that the Alfresco code line is entirely closed to community developers and that Alfresco seems unwilling to relinquish any amount of control over the development of their open source product. Writing that post had me a bit riled up, so during the Q&A session at the community meetup in San Jose later that week, I asked John Newton, Alfresco CTO, and former Alfrescan Kevin Cochrane when and if it would ever be different. They said they were “working on it” (See Alfresco pledges to open community by 3.0).

I’m glad to say that, although it took a while, there is now a process by which your code can find its way into the Alfresco code base (Community, and even, potentially, Enterprise). It’s called the Alfresco Community Committer Program (ACCP). The ACCP is a motley crew of volunteers from Alfresco customers and partners around the world. Although not a requirement for membership, I think most of us have developed at least one open source add-on for Alfresco. Our goal is to help community-developed code find its way into the product. Does this mean Alfresco is now as open as “true” open source community projects like Apache and Drupal? No, and honestly, I’m not sure it will ever get there. But Alfresco’s support of the ACCP process is a start. Here’s how the process works.

First step: Nomination to the ACCP Incubator

Today, developers in the community create add-ons, utilities, extensions, language packs and all kinds of software built to work with Alfresco. Some of these might make great additions to the Alfresco product. At a high level, what the ACCP seeks to do is to act as an on-ramp or incubator for that subset of projects. We want you, real world Alfresco developers and end-users, to nominate community-developed extensions that you find useful and that you would eventually like to see as part of the Alfresco product. The ACCP then reviews these nominations and votes for their inclusion into the incubator. The project’s developers can then decide to leave their code where it is (Google Code, SourceForge, Alfresco forge, etc.) or they may choose to migrate to the Alfresco-hosted ACCP incubator subversion repository.

Projects accepted to the incubator so far include:

As a side note, it’s great that there are so many community-developed add-ons for Alfresco. But the lack of a central index makes it hard to see what’s available. As a related effort, Nancy Garrity is working on something that would provide a central index, support ratings, etc.

Second step: Community code line

Once a project has been in the incubator for a while, the ACCP may recommend its inclusion as part of Alfresco Community, making it much easier for Alfresco Community users to leverage these add-ons. The exact nature of how these will be made available is still being worked out. You could imagine a “community-extensions” directory under the Alfresco Community subversion root or something similar. For certain types of contributions, maybe the installer could even provide an optional “install community extensions” step. Again, although we have recently voted some projects into the ACCP incubator, none has reached Community yet, so the details of exactly how those will be incorporated into the Community code base are still being worked out.

Third step: Enterprise code line

The ACCP may then recommend Enterprise adoption. This step is subject to Alfresco Engineering approval, which may be a significant hurdle for some, but if it happens, the entire Alfresco customer base gets the benefit of Alfresco’s ongoing support of the community-developed code. Note that the Enterprise approval step is the only one where Alfresco employees have a say about how an ACCP project is handled–per our charter, Alfresco employees cannot be voting members of the committee.

How you can get involved

First, and foremost, you can nominate an open source Alfresco add-on/extension/customization project. If you want to take an active role on the committee or know someone who would be a good addition, there are spots available. So, another way to help out would be to serve on the committee or nominate someone who should. The committee meets regularly to review and vote on project and committee member nominations. All you have to do is get in touch with me or one of the other regular members of the committee. You’ll find the list on the Alfresco Community Contributor Program wiki page.

We’ll be doing a webinar on July 27th to talk about this more and answer questions. Check out the Alfresco events page to register.

Updated Python CMIS library released

I’ve tagged and released a new version of cmislib, the Python CMIS client library. What’s cool about this release is that it is the first one known to work with more than one CMIS provider. Yay for interoperability! The beauty of CMIS, realized! Okay, it wasn’t that beautiful, it’s still “0.1”, and there are known issues. But I can now say the library works with both Alfresco and IBM FileNet and that’s a Good Thing.

IBM was a big help with this. Al Brown, one of the CMIS spec leads, turned one of his colleagues, Jay Brown, onto cmislib. Jay called me up and asked, “If I give you access to a FileNet P8 server, can you test cmislib against it?” I was on it faster than you could say, “unittest.main()”.

I think the effort was valuable for all sides. Our little “mini plugfest” turned up issues in my client as well as both CMIS providers. Jay worked hard to chase down everything on the FileNet side. Dave Caruana chased a few down on the Alfresco side as well. Thanks to everyone for the team effort.

Anyway, give the new cmislib release a try and give me your feedback. If you want a feel for how easy it can be to work with CMIS repositories using the cmislib API, check out the documentation or dive right in. Installation is as easy as “easy_install cmislib” (easy_install instructions).

Next up is Nuxeo. Can the open source ECM vendor achieve cmislib Unit Test Greatness faster than Big Blue? We shall see!

Alfresco Share Status microblog component now supports 3.2 Enterprise

I’ve posted a new release of the Alfresco Share Status microblog component (original post). There are no new features–it’s just a few bug fixes. But one of the bugs was keeping it from working on 3.2 Enterprise. That’s now fixed, so if you’ve upgraded to Alfresco 3.2 Enterprise and you want to use this component in your Share sites, have at it.

Although I’ve tagged both the repository extensions and the surf extensions as “0.2”, only the surf extensions have changed since the last release. If you want to know specifically what’s changed, refer to the Release Notes.

cmislib: A CMIS client library for Python

I’ve started a new project on Google Code called cmislib. It is an interoperable client library for CMIS in Python that uses the RESTful AtomPub Binding of a CMIS provider to perform CRUD and query functions on the repository.

I created it for a couple of reasons. First, it’s been bugging me that, unlike our Drupal Alfresco integration, our Django Alfresco integration does not use CMIS. After talking it over with one of our clients we decided it would make more sense to create a more general purpose CMIS API for Python that Django (and any other Python app) could leverage, rather than build CMIS support directly into the Django Alfresco integration.

Second, around the time I was putting together the Getting Started with CMIS tutorial, it struck me that there needed to be an API that didn’t have a lot of dependencies and was very easy to use. Otherwise, it’s too easy to get lost in the weeds and miss the whole point of CMIS: Easily working with rich content repositories, regardless of the underlying implementation.

Even if you’ve never worked with Python before, it is super easy to get started with cmislib. The install takes three steps at most and the API should feel very natural to anyone who’s worked with a content repository before. Check it out.

Install

  1. If you don’t have Python installed already, do so. I’ve only tested on Python 2.6 so unless you’re looking to help test, stick with that.
  2. If you don’t have setuptools installed already, do so. It’s a nice tool to use for installing Python packages.
  3. Once setuptools is installed, type easy_install cmislib

That’s all there is to it. Now you’re ready to connect to your favorite CMIS-compliant repository.

Examples

There’s nothing in cmislib that is specific to any particular vendor. Once you give it your CMIS provider’s service URL and some credentials, it figures out where to go from there. But I haven’t tested with anything other than Alfresco yet, and this thing is still hot out of the oven. If you want to help test it against other CMIS 1.0cd04 repositories I’d love the help.

Anyway, let’s look at some examples using Alfresco’s public CMIS repository.

  1. From the command-line, start the Python shell by typing python then hit enter.
     Python 2.6.3 (r263:75183, Oct 22 2009, 20:01:16)
     [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
     Type "help", "copyright", "credits" or "license" for more information.
     >>>
  2. Import the CmisClient and Repository classes:
     >>> from cmislib.model import CmisClient, Repository
  3. Point the CmisClient at the repository’s service URL
     >>> client = CmisClient('http://cmis.alfresco.com/s/cmis', 'admin', 'admin')
  4. Get the default repository for the service
     >>> repo = client.getDefaultRepository()
     >>> repo.getRepositoryId()
     u'83beb297-a6fa-4ac5-844b-98c871c0eea9'
  5. Get the repository’s properties. This for-loop spits out everything cmislib knows about the repo.
     >>> repo.getRepositoryName()
     u'Main Repository'
     >>> info = repo.getRepositoryInfo()
     >>> for k,v in info.items():
     ...     print "%s:%s" % (k,v)
     ...
     cmisSpecificationTitle:Version 1.0 Committee Draft 04
     cmisVersionSupported:1.0
     repositoryDescription:None
     productVersion:3.2.0 (r2 2440)
     rootFolderId:workspace://SpacesStore/aa1ecedf-9551-49c5-831a-0502bb43f348
     repositoryId:83beb297-a6fa-4ac5-844b-98c871c0eea9
     repositoryName:Main Repository
     vendorName:Alfresco
     productName:Alfresco Repository (Community)

Once you’ve got the Repository object you can start working with folders.

  1. Create a new folder in the root. You should name yours something unique.
     >>> root = repo.getRootFolder()
     >>> someFolder = root.createFolder('someFolder')
     >>> someFolder.getObjectId()
     u'workspace://SpacesStore/91f344ef-84e7-43d8-b379-959c0be7e8fc'
  2. Then, you can create some content:
     >>> someFile = open('test.txt', 'r')
     >>> someDoc = someFolder.createDocument('Test Document', contentFile=someFile)
  3. And, if you want, you can dump the properties of the newly-created document (this is a partial list):
     >>> props = someDoc.getProperties()
     >>> for k,v in props.items():
     ...     print '%s:%s' % (k,v)
     ...
     cmis:contentStreamMimeType:text/plain
     cmis:creationDate:2009-12-18T10:59:26.667-06:00
     cmis:baseTypeId:cmis:document
     cmis:isLatestMajorVersion:false
     cmis:isImmutable:false
     cmis:isMajorVersion:false
     cmis:objectId:workspace://SpacesStore/2cf36ad5-92b0-4731-94a4-9f3fef25b479
  4. You can also use cmislib to run CMIS queries. Let’s find the doc we just created with a full-text search. (Note that I’m currently seeing a problem with Alfresco in which the CMIS service returns one less result than what’s really there):
     >>> results = repo.query("select * from cmis:document where contains('test')")
     >>> for result in results:
     ...     print result.getName()
     ...
     Test Document2
     example test script.js
  5. Alternatively, you can also get objects by their object ID or their path, like this:
     >>> someDoc = repo.getObjectByPath('/someFolder/Test Document')
     >>> someDoc.getObjectId()
     u'workspace://SpacesStore/2cf36ad5-92b0-4731-94a4-9f3fef25b479'
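If you would rather keep these steps in a file than retype them at the prompt, the session could be gathered into a small script along these lines. Treat it as a sketch: the helper names (describe_repository, add_document, find_names, main) are mine, not part of cmislib, and the only cmislib calls used are the ones shown in the session above. It assumes a test.txt file in the working directory and that the public service still accepts the admin/admin credentials.

```python
def describe_repository(repo):
    """Return the repository info as sorted "key:value" lines."""
    info = repo.getRepositoryInfo()
    return ["%s:%s" % (k, v) for k, v in sorted(info.items())]


def add_document(repo, folder_name, doc_name, file_path):
    """Create a folder under the root, then a document inside it."""
    root = repo.getRootFolder()
    folder = root.createFolder(folder_name)
    # The file is closed again as soon as the document has been created.
    with open(file_path, 'r') as content:
        return folder.createDocument(doc_name, contentFile=content)


def find_names(repo, text):
    """Run a full-text CMIS query and return the matching object names."""
    # Sketch only: text is dropped into the statement without any escaping.
    statement = "select * from cmis:document where contains('%s')" % text
    return [result.getName() for result in repo.query(statement)]


def main():
    # Imported here so the helpers above can be reused (or tried against a
    # stand-in object) on a machine that does not have cmislib installed.
    from cmislib.model import CmisClient

    client = CmisClient('http://cmis.alfresco.com/s/cmis', 'admin', 'admin')
    repo = client.getDefaultRepository()
    for line in describe_repository(repo):
        print(line)
    doc = add_document(repo, 'someFolder', 'Test Document', 'test.txt')
    print(doc.getObjectId())
    print(find_names(repo, 'test'))
```

Calling main() runs the whole thing against the server. Because the helpers take the Repository object as an argument, they do not care which CMIS provider sits behind it.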

Set Python loose on your CMIS repository

These are just a few examples meant to give you a feel for the API. There are several other things you can do with cmislib. The package comes with documentation so look there for more info. If you find any problems and you want to pitch in, you can check out the source from Google Code and create issues there as well.

Give this a try and let me know what you think.

[UPDATE: I had the wrong URL for the Alfresco-hosted CMIS service. It’s fixed now.]

Alfresco Share microblogging component released as open source

Back in February (I know, it’s been simmering on the back burner for too long), I did a couple of screencasts on Optaros Labs showing a demo of Alfresco Share (part 1, part 2). In part 2 of that screencast I showed two custom components: Status and Bookmark. Alfresco made Bookmark obsolete by releasing their own shared bookmarks module for Share, and that’s a Good Thing. I kind of expected them to release a microblog component as well, but they haven’t yet. Well, I finally got around to making ours available, so until a similar feature makes it into the product, feel free to use it in your own projects.

The component is simple: A “My Current Activity” dashlet lets you and your team give a quick blurb about what you’re working on. Another dashlet aggregates all of the status entries from your teammates. A global dashlet aggregates the entries from all Share sites. All status changes automatically show up in Alfresco’s Activity Feed as well.

My Current Activity Dashlet

Unlike Twitter, the status component lets you mark an entry as “done”. When you do that, your current status gets reset and the old entry moves to the archive. So it’s a little more task-oriented than more general purpose, free-form microblogging tools.

Deployment is pretty easy. An AMP gets deployed to your Alfresco WAR, and a ZIP gets unzipped into your Alfresco Share web application. That’s it. No configuration necessary. All of the data lives in the same structure as the other tools in your Share site.

I’ve put the code out on Google Code under a BSD license. There’s a pre-built AMP and a ZIP for download or you can check out and build from source. There’s one Eclipse project for the repository tier and one for the Surf tier. I’ve tested this on Alfresco 3.2 Community. I’ll test it out on the Enterprise releases when I get a chance. There were some changes in the Activity Feed that I had to deal with and I’m not sure how far back those go so I may have to have version-specific releases.

Have a look and give me your feedback. If you want to dig in and make enhancements, bring ’em on.

Screencast: Basic Alfresco-Kaltura integration

Bryan Spaulding, Media Practice Lead at Optaros, and I have been thinking about lightweight digital asset management and Alfresco. Alfresco can manage any kind of asset, including rich media. It has some built-in functionality for doing image transformations and you can easily integrate with open source solutions like ffmpeg to work with video. But many of our clients need something more, especially when it comes to video.

That’s where Kaltura comes in. Kaltura is a fully hosted video solution that provides full analytics, flexible and customizable players and playlists, and robust back-end CDN and hosting services. You can also download the open source Kaltura Community Edition and run it yourself if you want.

There are a variety of ways Alfresco and Kaltura could work together. We decided to start with a basic integration focused on the Alfresco DM repository. The idea is to use that as a foundation, expanding in the future based on community and client feedback to include deeper functionality for the DM repository or broader integration with other Alfresco products like Alfresco Share and Alfresco WCM.

In this short screencast, I demo the basic CRUD functions the integration provides. You will probably want to hit the “full screen” icon on the Kaltura player to see the detail.

The integration is available as open source. You can download the integration from Kaltura’s community site and use it on your projects, or better yet, expand on it and contribute back the code. The readme that is included with the source includes installation and configuration instructions.

Yet another reason to love Open Source Content Management

Man, I don’t miss delivering solutions on top of Documentum. After reading Laurence Hart’s post on Documentum Developer Edition, I’m reminded how much I take for granted working exclusively in the open source content management world.

Laurence’s post was intended to discuss the ins and outs of Documentum’s efforts to make it easier for developers, and, as usual, he’s done a good job of that. But it also underscores the benefits enjoyed by those who work in open source land. In case you don’t know how good you’ve got it, my open source brothers and sisters, check it out:

Developers working with closed source ECM vendors have to pay to get the software

As Laurence points out,

“There are lots of independent consultants out there that have trouble keeping-up with the technology because they can’t afford to become partners for the requisite fee.”

If you are a developer looking to go deep on closed source software, you have no choice but to pay. There’s no other way to get access to the software. Sometimes you can’t even get access to the documentation or the bug database without a paid-up partner account (or a client that lets you use theirs).

[UPDATE: Jerry Silver, from EMC, points out that the Documentum Developer Edition is a free download. My original post made it sound like you had to be part of the partner program to obtain the download.]

With open source, the barrier to entry is much lower. You pay nothing to get the software. It’s all about the time and energy you put into learning the product and implementing cool solutions.

To be fair, commercial open source vendors often charge partner fees as well, but the bottom line is that it costs nothing to get started with the code.

Developers working with closed source ECM vendors struggle with giant developer footprints

I feel sorry for Laurence’s laptop:

“The complete Development install calls for 3GB of RAM (after a 1.7+GB download).  That is no small thing for a development laptop.  It needs to be on a newer machine.  If you can move the database service to a different box, that will make your life easier.”

Oh dear. A 1.7GB download for a developer setup? Am I downloading a VM image or a content management server? Let’s look at Alfresco for a comparison. Assuming you are starting from scratch, and assuming you are going to go full-on with the Alfresco platform, your total download is right around 300MB. That includes:

  • Alfresco SDK
  • Alfresco WAR
  • Alfresco WCM (Deployment listener and add-on to core repo)
  • Apache Tomcat
  • Sun JDK
  • MySQL (Server and connector)

All of which runs comfortably in 2GB of RAM and won’t even cause your fan to kick on in 4GB.

Developers working with closed source ECM vendors have less choice

Optaros consultants are now split fairly evenly in their choice of OS across Windows, Mac OS X, and some flavor of Linux. Some people prefer MySQL and some prefer PostgreSQL. Mostly we use Eclipse for Java development but everyone’s got a preference. I use Tomcat for everything locally while others like JBoss. The point is, developers want to use their tools the way they want to. It’s not a stubbornness thing, it’s an efficiency thing.

Within my CMS I want the same flexibility. I want to tweak settings. I want to name my database what I want. I want the flexibility to deploy across as many (or as few) nodes as I need to. From Laurence’s post, it sounds like Documentum clearly falls down here.

Developers working with closed source ECM vendors can’t see the code

It’s obvious, I know. For developers that work with open source it is extremely natural to use the CMS source code when debugging or for reference. You don’t even think about it–it’s just there and you use it. Imagine the frustration of someone who works with closed source CMS who has to routinely decompile classes to figure out what’s going on. That truly sucks. What good is a “Developer Edition” that doesn’t come with source code?

Partner defections from closed source are on the rise

I’ve seen recent announcements from multiple partners who were previously exclusive to closed source vendors but are now adding open source to their partner list. This reflects increasing demand from customers who are realizing the business value of open source, especially in tough economic times, as well as partners’ desire to make up for sagging demand in the proprietary world. But could it also be that more firms are realizing how much more productive and pleasant it is to work with open source content management?

Help your employer/client see the light

Open source ECM technologies like Alfresco, Drupal, Liferay, Lucene, and many others, are now at or beyond their closed source equivalents. If you are a developer who’s sick of the shackles closed source CMS places on you, why not suggest exploring open source alternatives?

Notes from OSCON 2009 in San Jose

I’m back from San Jose. My colleague, Dave Gynn, and I had fun at the O’Reilly Open Source Conference (OSCON) and learned a lot. Dave’s ability to pick out open source rockstars from a crowd is uncanny. It was pretty sweet seeing Larry Wall (and his family) hanging out and then hearing him speak. Although there are all kinds of topics on all things Open Source, the conference does have a heavy Perl bias.

Dave and I decided we were glad we went but we don’t feel like we have to be there every year going forward. This was my first time, but Dave said the general excitement level seemed low for some reason. Maybe it was Allison Randal’s seriously downbeat welcome address. Not sure. Anyway, here are my rough notes from some of the sessions I attended…

“Open Source in Government” was a big theme at OSCON this year. Speakers tried to instill a sense of urgency in the audience by saying that the window of opportunity for getting the government behind open source in a big way will only be open for a few more months. If you want to get involved, check out some of these links:

Data.gov mash-up contest
http://sunlightlabs.com/contests/appsforamerica2/

Machine readable datasets from the US Govt
http://www.data.gov/

Help the government make better use of open source
http://www.opensourceforamerica.org/

Some folks from Liferay presented on a new UI framework they’ve created called Alloy. Alloy is aimed at providing a single framework that addresses HTML, CSS, and JavaScript in a way that is abstracted from the underlying libraries. Alloy basically extends/subclasses jQuery and YUI. Liferay is migrating a lot of their OOTB portlets now to the new framework. It is expected to ship as part of 5.3. This talk was more about the “why” and less about the “what”. I would have liked to see more examples/demos.

Went to a talk on “using Django for election audits” that turned out to be more about how screwed up our elections process is, and about the minutiae of auditing election results, than about how Django was used to solve the problem. The speaker did give a shout out to the Django Debug Toolbar, which might prove useful. The presenter is looking for help with the project. He needs everything from UI help to people who can send him election results from their local election boards.

Saw a decent talk on Apache CouchDB. Couch is a schema-less database that is built for massive distributed scalability. Instead of SQL you use map-reduce functions to query. Key to Couch is the concept of “eventual consistency”: in a Couch app, data can be consistent over time instead of right now. Couch always knows either the correct old value or the correct current value, but it may take time to propagate the current value to every node in the system.

Noteworthy bullet points:

  • Couch can idle in 4MB of RAM. With a couple of production databases Couch will use about 20MB.
  • Canonical is including Couch in the Karmic Koala release. This will give apps running on Karmic the ability to easily sync data between nodes. Couch will also be running as part of Ubuntu One which means Karmic desktops can sync data with the Ubuntu cloud (See the Ubuntu wiki).
  • Someone is currently working on a JavaScript implementation of Couch. Among other things, this would give you the ability to replicate your CouchDB to a local version of Couch running in someone’s browser.
  • Current ACL is limited to “you are either an admin or you aren’t”. ACL for writers *might* make it into 1.0. ACL for readers won’t.
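To make the “map-reduce instead of SQL” idea from the Couch talk a bit more concrete, here is a plain-Python sketch of the pattern. It is not the CouchDB API (Couch views are normally written in JavaScript and stored in the database); the documents, field names, and function names below are invented for illustration. The map step emits key/value pairs for each document, and the reduce step combines the values that share a key.

```python
from collections import defaultdict


def map_doc(doc):
    """Map step: emit (key, value) pairs for one document."""
    # Schema-less: documents need not share fields, so a document
    # without a 'tags' field simply emits nothing.
    for tag in doc.get('tags', []):
        yield tag, 1


def reduce_values(values):
    """Reduce step: combine all values emitted under one key."""
    return sum(values)


def run_view(docs):
    """Group the mapped pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return dict((key, reduce_values(values)) for key, values in grouped.items())


docs = [
    {'_id': 'a', 'title': 'OSCON notes', 'tags': ['oscon', 'couchdb']},
    {'_id': 'b', 'title': 'Erlang history', 'tags': ['oscon', 'erlang']},
    {'_id': 'c', 'author': 'someone'},  # no tags field at all
]
counts = run_view(docs)
print(sorted(counts.items()))
# prints [('couchdb', 1), ('erlang', 1), ('oscon', 2)]
```

The rough SQL equivalent would be a GROUP BY with a COUNT, except that nothing here requires the documents to agree on a schema first.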

I went to the “JRuby on AppEngine” talk not for the JRuby, but because it was the only Google AppEngine session I could find. I was looking for some factoids on who’s using AppEngine. Here’s what they said:

  • 200,000 registered developers
  • 85,000 applications
  • Household names such as: eBay, Best Buy, Forbes, Whitehouse.gov.

Whitehouse.gov was a cool scalability story for AppEngine. They used AppEngine to moderate questions submitted during Obama’s first online town hall. According to the Google Code blog,

“During the 48-hour open voting period, the site peaked at 700 hits per second, and 92,934 people submitted 104,073 questions and cast 3,605,984 votes. In total, over one million unique visitors visited the site before the town hall. Even while the site was featured on major news outlets and even the Google homepage the other 50,000 apps built on App Engine were fully supported and experienced no adverse effects.”

The Erlang talk provided a good history of the language. I would have liked more on the language itself and less of the detailed history behind Ericsson’s telecom switches (even though Erlang played a critical role in those products). I was aware that CouchDB is built with Erlang but the speaker mentioned a couple of other open source projects that leverage Erlang that I hadn’t heard of: ejabberd is an Erlang-based chat server and RabbitMQ is an Erlang-based messaging server.

The “building a business on an open source distributed cloud” talk by Bradford Stephens was good. The speaker’s company, Visible Technologies, mines social networks and the internet in general for consumer sentiment on its customers’ brands. Their system ingests vast subsets of the Internet, parses the results, processes it, and indexes it so that they can run analytics against it for their clients. They moved from an all-Microsoft stack to an open source stack and have been very happy with it.

This was the third “noSQL”-themed talk I saw. He made a good point that when we design apps, we should be saying, “I need persistence” and then figure out what is the best provider of that given scalability and other constraints rather than starting out with “I need a relational database”.

The open source stack used by Visible Technologies includes the usual search players (Lucene, Nutch, Solr) as well as one I haven’t heard of: Katta is used to shard large Lucene indexes across multiple servers. They also use a couple of Hadoop sub-projects, HBase and ZooKeeper, and several others.

The New York Times API and NPR API talks were very good. I didn’t realize how many different APIs NYT has exposed. You can check out their APIs around people, news, search, movies, and books at http://developer.nytimes.com. Their blog is also worth checking out.

Lots of apps have been built using the NYT API. A personal favorite is InstantWatcher. It is a mash-up of NYT’s movies API with Netflix that helps you find good movies available to watch instantly.

NPR’s talk focused less on their specific API and more on how it is being used. Noteworthy bullets:

  • You can build API calls with their query generator (requires a free API key) or by hand (doc).
  • NPR offers tiered key levels. If you create something cool and drive a little traffic their way, you can get your key upgraded to a higher tier.
  • There are no rate limits. NPR believes they have built an infrastructure that can take “anything we can throw at it”.
  • The API has 2,000 users and serves 24 million requests (per ?) averaging 2 million requests per month.
  • 50% of the API requests are for NPRML with less than 0.1% requesting ATOM. NPR API results are also available as JSON, RSS, and several other formats.
  • The NPR Digital Media team blogs at http://www.npr.org/blogs/inside/
  • Interesting side-note: NPR is currently migrating off of Oracle 10g to MySQL

After the NYT and NPR talks, they held a developer meet-up of sorts. Unfortunately I had to head to the airport so I missed out on that.