Tag: development

My initial experience with Antsle, a virtual machine appliance

I love virtual machines and containers because they make it easy to isolate the applications and dependencies I’m using for a particular project. Tools like Docker, VirtualBox, and Vagrant are indispensable for most of my projects, and I’m still using them, but in this post I’ll describe a product called Antsle that has given me additional flexibility and freed up some local resources.

My daily developer workstation is a MacBook Pro with 16 GB of RAM and a 500 GB SSD. From a memory and CPU perspective, it can handle running a handful of virtual machines simultaneously without a problem. But disk space is starting to be an issue.

I use Vagrant and Ansible to make virtual machine provisioning repeatable–I can delete any VM at any time without remorse because I can always recreate it easily. But I get tired of continually cleaning up machines and pruning back base boxes just to reclaim space.

I decided to do something about it. My options were:

  • When Apple releases a MacBook Pro that can take 32 GB of RAM, buy that with at least 1 TB SSD, then continue with my current toolset.
  • Buy a Mac Pro or some other desktop to use exclusively for virtual machines.
  • Buy or build an actual server and set it up with virtualization. Something like this, for example.
  • Use AWS for my development virtual machines.

Then I came across a little company based out of San Diego called Antsle. Antsle builds virtualization appliances. What makes their product attractive to me versus buying a workstation or server or building my own is that:

  • The machines have no fan or other moving parts–they are completely silent. The case acts as a heat sink.
  • The machines are energy-efficient. The docs say mine will run at 45 watts.
  • They are built on Linux with standard virtualization technology (LXC and KVM) plus some additional optimizations from Antsle.
  • They are ready-to-go out-of-the-box, saving me the time and effort of building my own solution.

I really like using AWS, and I think for production workloads, no one, not even your own internal IT data center, can do it cheaper or more securely. Plus the breadth of their service offering is nuts. But for my modest developer needs, I’m pretty sure I’ll break even within a year, and that’s not counting the productivity gain of not having to wait for instances to spin up or having to fool with the complexity of the AWS console.

So, after that analysis, I was ready to buy. The biggest struggle was deciding which model to buy and whether or not to do any upgrades. I went for an Ultra, which has an 8-core 2.4 GHz Intel processor, 32 GB of ECC RAM, and two Samsung 850 EVO 1 TB SSDs. The drives are mirrored, so that’s 1 TB of usable space. I could have expanded the RAM to 64 GB and increased the storage up to 16 TB, but it was hard to justify the added expense based on my needs.

My Antsle arrived last week and I’ve been pretty happy with it so far. I’ve got a set of “base” images created so that I can easily instantiate new machines based on typical components and configuration. For example, I have an image for every recent Alfresco release. When I need to work on one for a client project or to help someone in the forums, I can just clone one of my base images and start it up. I can let it run as long as I want without worrying about cost, and then kill it or keep it around as needed.

Here is a summary of my experience, thus far:

  • No setup necessary. I plugged it in, started it up, and was starting up machines in minutes.
  • Creating machines from templates, cloning machines, taking snapshots, and startup/shutdown happens very quickly.
  • Templates and instantiated machines take up less space than I would have thought, which is great. So far, I’m glad I stuck with the base storage option.
  • I haven’t pegged the CPU yet, but I have seen it spike briefly to as high as 50%, and that was when I was only running a single VM. I continue to see brief spikes here and there, but since I won’t have too many machines under load at any given time, I’m not that worried about it yet.
  • Documentation seems thorough and helpful. The company has been really responsive and helpful so far as well. They responded to a minor billing issue quickly and resolved it without a fuss.
  • I noticed when you clone a machine that has a bridged network adapter, the MAC address doesn’t change. You have to drop and re-add the NIC if you want a new MAC address, otherwise DHCP will assign it the same IP address as the original machine. This isn’t a big deal once you know the behavior.
  • I had to change the vm.max_map_count setting to make Elasticsearch happy, which is a typical setup task for Elastic. It took me a minute to realize that it needs to be done on the Antsle host and applies to all guests–it cannot be done on the individual VM, at least for LXC. There’s a quick sketch of the change right after this list.
  • There does not appear to be a way to tag or comment on virtual machines. Additionally, the name you assign to each image is fixed-length and fairly short. So I’m somewhat concerned that, as my library grows, I’ll start to lose track of what’s installed on which machine. AntMan, the management console, seems to be evolving fairly rapidly so maybe this will change in a future release.
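For reference, here is roughly what that Elasticsearch-related change looks like. This is a minimal sketch run as root on the Antsle host; 262144 is the minimum Elasticsearch typically asks for, not anything Antsle-specific, and I’m assuming the host reads /etc/sysctl.conf on boot.

# apply the setting immediately
sysctl -w vm.max_map_count=262144
# persist it across reboots
echo "vm.max_map_count=262144" >> /etc/sysctl.conf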

I’ve also created a few videos if you want to see it in action.

This video is the unboxing.

This video shows an Ubuntu and a CentOS image being created and then configured for bridged networking.

This video shows how image templates work and gives you a little bit of a feel for the performance using a real-world app (in this case, Alfresco running on CentOS) while other machines are running simultaneously (one Mail/LDAP machine and a four-node Elastic cluster).

Yet another reason to love Open Source Content Management

Man, I don’t miss delivering solutions on top of Documentum. After reading Laurence Hart’s post on Documentum Developer Edition, I’m reminded how much I take for granted working exclusively in the open source content management world.

Laurence’s post was intended to discuss the ins and outs of Documentum’s efforts to make it easier for developers, and, as usual, he’s done a good job of that. But it also underscores the benefits enjoyed by those who work in open source land. In case you don’t know how good you’ve got it, my open source brothers and sisters, check it out:

Developers working with closed source ECM vendors have to pay to get the software

As Laurence points out,

“There are lots of independent consultants out there that have trouble keeping-up with the technology because they can’t afford to become partners for the requisite fee.”

If you are a developer looking to go deep on closed source software, you have no choice but to pay. There’s no other way to get access to the software. Sometimes you can’t even get access to the documentation or the bug database without a paid-up partner account (or a client that lets you use theirs).

[UPDATE: Jerry Silver, from EMC, points out that the Documentum Developer Edition is a free download. My original post made it sound like you had to be part of the partner program to obtain the download.]

With open source, the barrier to entry is much lower. You pay nothing to get the software. It’s all about the time and energy you put into learning the product and implementing cool solutions.

To be fair, commercial open source vendors often charge partner fees as well, but the bottom line is that it costs nothing to get started with the code.

Developers working with closed source ECM vendors struggle with giant developer footprints

I feel sorry for Laurence’s laptop:

“The complete Development install calls for 3GB of RAM (after a 1.7+GB download).  That is no small thing for a development laptop.  It needs to be on a newer machine.  If you can move the database service to a different box, that will make your life easier.”

Oh dear. A 1.7GB download for a developer setup? Am I downloading a VM image or a content management server? Let’s look at Alfresco for a comparison. Assuming you are starting from scratch, and assuming you are going to go full-on with the Alfresco platform, your total download is right around 300MB. That includes:

  • Alfresco SDK
  • Alfresco WAR
  • Alfresco WCM (Deployment listener and add-on to core repo)
  • Apache Tomcat
  • Sun JDK
  • MySQL (Server and connector)

All of which runs comfortably in 2GB of RAM and won’t even cause your fan to kick on in 4GB.

Developers working with closed source ECM vendors have less choice

Optaros consultants are now split fairly evenly in their choice of OS across Windows, Mac OS X, and some flavor of Linux. Some people prefer MySQL and some prefer PostgreSQL. Mostly we use Eclipse for Java development, but everyone’s got a preference. I use Tomcat for everything locally, while others like JBoss. The point is, developers want to use their tools the way they want to. It’s not a stubbornness thing; it’s an efficiency thing.

Within my CMS I want the same flexibility. I want to tweak settings. I want to name my database what I want. I want the flexibility to deploy across as many (or as few) nodes as I need to. From Laurence’s post, it sounds like Documentum clearly falls down here.

Developers working with closed source ECM vendors can’t see the code

It’s obvious, I know. For developers who work with open source, it is extremely natural to use the CMS source code when debugging or for reference. You don’t even think about it–it’s just there and you use it. Imagine the frustration of someone working with a closed source CMS who has to routinely decompile classes to figure out what’s going on. That truly sucks. What good is a “Developer Edition” that doesn’t come with source code?

Partner defections from closed source are on the rise

I’ve seen recent announcements from multiple partners who were previously exclusive to closed source vendors but are now adding open source to their partner lists. This reflects increasing demand from customers who are realizing the business value of open source, especially in tough economic times, as well as partners’ desire to make up for sagging demand in the proprietary world. But could it also be that more firms are realizing how much more productive and pleasant it is to work with open source content management?

Help your employer/client see the light

Open source ECM technologies like Alfresco, Drupal, Liferay, Lucene, and many others are now at parity with or beyond their closed source equivalents. If you are a developer who’s sick of the shackles closed source CMS places on you, why not suggest exploring open source alternatives?

Curl up with a good web script

Curl is a useful tool for all sorts of things. One specific example of when it comes in handy is when you are developing Alfresco web scripts. On a Surf project, for example, you might divide into a “Surf tier” team and a “Repository tier” team. Once you’ve agreed on the interface, including both the URLs and the format of the data that goes back-and-forth between the tiers, the two teams can start cranking out code in parallel.

If you’re on the repo team, you need a way to test your API, and you probably don’t have a UI to test it with (that’s what the other team’s working on). There are lots of solutions to this but curl is really handy and it runs everywhere (on Windows, use Cygwin).

This post isn’t intended to be a full reference or how-to for curl, and obviously, you can use curl for a lot of tasks that involve HTTP, not just Alfresco web scripts. Here are some quick examples of using curl with Alfresco web scripts to get you going.

Get a ticket

It’s highly likely that your web script will require authentication. So the first thing you do is call the login web script to get a ticket.

curl -v "http://localhost:8080/alfresco/service/api/login?u=admin&pw=somepassword"

Alfresco will respond with something like:

<?xml version="1.0" encoding="UTF-8"?>
<ticket>TICKET_e46107058fdd2760441b44481a22e7498e7dbf66</ticket>

Now you can take that ticket and append it to your subsequent web script calls.

Any web script you’ve got that accepts GET can be tested using the same simple syntax.
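If you’re doing this a lot, it’s handy to capture the ticket in a shell variable and append it to each call. Here’s one way to do that; the sed expression assumes the response looks like the one above, and someco/someScript is just a placeholder for your own GET web script:

TICKET=$(curl -s "http://localhost:8080/alfresco/service/api/login?u=admin&pw=somepassword" | sed -n 's/.*<ticket>\(.*\)<\/ticket>.*/\1/p')
curl -v "http://localhost:8080/alfresco/service/someco/someScript?alf_ticket=$TICKET"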

Post JSON to your custom web script

If all you had were GETs you’d probably just test them in your browser. POSTs, PUTs and DELETEs require a little more doing to test. You’re going to want to test those web scripts so that when the front-end team has their stuff ready, it all comes together without a lot of fuss.

So let’s say you’ve got a web script that the front-end will be POSTing JSON to. To test it out, create a file with some test JSON, then post it to the web script using curl, like this:

curl -v -X POST "http://localhost:8080/alfresco/service/someco/someScript?alf_ticket=TICKET_e46107058fdd2760441b44481a22e7498e7dbf66" -H "Content-Type: application/json" -d @/Users/jpotts/test.json
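For reference, the test.json file being posted can be as simple as this; the customerName field is just an example, so use whatever your web script expects:

{
    "customerName": "Test Customer"
}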

By the way, did you know that starting with 3.0, if you name your controller with “.json” before the “.js”, the posted JSON will be sitting in a root variable called “json”? So in this case, instead of naming my controller “someScript.post.js”, I’d name it “someScript.post.json.js”. Then, in my JavaScript, I can just eval the “json” variable that was created for me automatically and start working with the object, like this:

var postedObject = eval('(' + json + ')');
logger.log("Customer name:" + postedObject.customerName);

Run a CMIS query

With 3.0 Alfresco added an implementation of the proposed CMIS spec to the product. CMIS gives you a Web Services API, a RESTful API, and a SQL-like query language. Once you figure out the syntax, it’s easy to post CMIS queries to the repository. You can wrap the CMIS query in XML:

<cmis:query xmlns:cmis="http://www.cmis.org/2008/05">
  <cmis:statement><![CDATA[select * from cm_content where cm_name like '%Foo%']]></cmis:statement>
</cmis:query>

Then post it using the same syntax as you saw previously, but with a different Content-Type in the header, like this:

curl -v -X POST "http://localhost:8080/alfresco/service/api/query?alf_ticket=TICKET_e46107058fdd2760441b44481a22e7498e7dbf66" -H "Content-Type: application/cmisquery+xml" -d @/Users/jpotts/cmis-query.xml

Alfresco will respond with ATOM, but it’s a little verbose so I won’t take up space here to show you the result. Also, I noticed this bombed when I ran it against 3.1 Enterprise but I haven’t drilled down on why yet.

Create a new object using CMIS ATOM

Issuing a GET against a CMIS URL returns ATOM. But CMIS URLs can also accept POSTed ATOM to do things like create new objects. For example, to create a new content object you would first create the ATOM XML:

<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:cmis="http://www.cmis.org/2008/05">
  <title>Test Plain Text Content</title>
  <summary>Plain text content created via CMIS POST</summary>
  <content type="text/plain">SGVyZSBpcyBzb21lIHBsYWluIHRleHQgY29udGVudC4K</content>
  <cmis:object>
    <cmis:properties>
      <cmis:propertyString cmis:name="ObjectTypeId"><cmis:value>document</cmis:value></cmis:propertyString>
    </cmis:properties>
  </cmis:object>
</entry>

Note that the content has to be Base64 encoded. In this case, the content is plain text that reads, “Here is some plain text content.” One way to encode it is to use OpenSSL like “openssl base64 -in <infile> -out <outfile>”. The exact syntax of ATOM XML with CMIS is the subject for another post.
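If you just want to encode a short string without creating a file, piping to OpenSSL works too; this produces the encoded value used in the entry above:

echo "Here is some plain text content." | openssl base64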

Once you’ve got the XML ready to go, post it using the same syntax shown previously, with a different Content-Type in the header:

curl -v -X POST "http://localhost:8080/alfresco/service/api/node/workspace/SpacesStore/18fd9821-42a5-4c6a-86d3-3f252679cf7d/children?alf_ticket=TICKET_e46107058fdd2760441b44481a22e7498e7dbf66" -H "Content-Type: application/atom+xml" -d @/Users/jpotts/testCreate.atom.xml

The node reference in the URL above is a reference to the folder in which this new child will be created. There’s also a similar URL that uses the path instead of a node ref if that’s more your thing.

Refreshing Web Scripts from Ant

One of the things you do quite frequently when you develop web scripts is tell Alfresco to refresh its list of web scripts. There are lots of ways to automate this, but one is to create an Ant task that uses curl to invoke the web script refresh URL. This lets you deploy your changes and tell Alfresco to refresh the list in one step (and makes sure you and your teammates never forget to do the refresh).

<target name="deploy-webscripts" depends="deploy" description="Refreshes the list of webscripts">
  <exec executable="curl">
    <arg value="-d"/>
    <arg value="reset=on"/>
    <arg value="http://${alfresco.web.url}/service/index"/>
  </exec>
</target>
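If you want to kick off the refresh outside of Ant, the equivalent curl command looks like this, assuming the alfresco.web.url property resolves to something like localhost:8080/alfresco:

curl -d "reset=on" "http://localhost:8080/alfresco/service/index"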

In this example, the “deploy” target that this task depends on is responsible for copying the web scripts to the appropriate place in the exploded Alfresco WAR. (Thanks to my colleague Eric Shea (http://www.eshea.net/2009/01/30/alfresco-dev-survivors-kit-part-1/) for this tip.)

So there you go. It’s not earth-shattering, but it might give you a productivity boost if you don’t already have curl or an alternative in your bag of tricks.