Tag: REST

Seven Options for Scripting Against Alfresco

It is often necessary to perform bulk operations against the data in your Alfresco repository. Frequently, writing a quick script, rather than a full application, is the best approach. Sometimes those scripts need to be scheduled, like with a cron job, or given to power users to run ad hoc.

The good news is that there are many options available depending on what you need to do and personal preference.

All of these options can be used to develop full-blown applications, but the focus in this post is on how well each option works specifically for the scripting use case.

JavaScript Console
License: Apache 2.0
Link: https://github.com/share-extras/js-console
Install: Repo and Share AMPs, requires restart

The JavaScript Console provides a user interface within Share to interactively run server-side scripts that leverage the Alfresco JavaScript API. The UI features syntax highlighting and code completion.

This is a must-have tool for both developers and administrators. It is the fastest way to perform bulk tasks or to test out JavaScript logic before using it in something harder to debug or change, such as in the controller of a web script. I install this on every server I touch. I suppose that, eventually, as Share falls out of favor, this will be less of a go-to option, but, for now, it is essential.

The JavaScript Console is only available to administrators, so this is not an option if you are developing something that will be invoked by non-admins.

Apache Chemistry CMIS Workbench
License: Apache 2.0
Link: https://chemistry.apache.org/java/developing/tools/dev-tools-workbench.html
Install: On your own desktop, no server config/restart required

The Apache Chemistry CMIS Workbench is a desktop application that uses pure CMIS calls to communicate with the repository. It contains a lot of useful features such as a CMIS Query Language console, a node browser, a data dictionary browser, and a Groovy console. The Groovy console is similar to the JavaScript console in that it can be used to quickly script repetitive tasks, but instead of JavaScript it uses Apache Groovy. If you aren’t familiar with Groovy but you know Java, you can use Java in the Groovy console–Groovy is a JVM language and is happy to interpret your Java syntax.

This tool is particularly useful if you are writing code that leverages CMIS. If you can do it in the Workbench you can be confident you can do it from your CMIS-based application. However, it is also useful when you just need to execute some repetitive tasks, especially if the server in question does not have the JavaScript Console installed.

Unlike the JavaScript Console, however, you are limited by what is supported in the CMIS specification. For example, you cannot manage users, groups, or workflows, just to name a few. It is pretty much limited to documents, folders, and items (content-less objects). Custom content models, however, are fully-supported.

The Swing-based UI of the workbench has a face only a mother could love so this is definitely not something you’d put in the hands of typical end-users.

Apache Chemistry cmislib (Python client)
License: Apache 2.0
Link: https://chemistry.apache.org/python/cmislib.html
Install: Client library, no server config/restart required

If what you need to do is covered by the CMIS specification but you prefer Python, then Apache Chemistry cmislib might be a good choice. It is a client library that you can use from your own Python scripts and applications. For example, if I need to generate a bunch of folders or documents, I’ll often just write a quick Python script to do it.

The library has the same limitation as the Workbench in that it only makes CMIS calls, but if that’s all you need it should work well. The current release requires Python 2.7 but a new release that supports Python 3.x is being tested now.

If you are giving the script to power users who might want to make tweaks this can be a good choice if they already have Python installed.

Apache Chemistry OpenCMIS (Java client)
License: Apache 2.0
Link: https://chemistry.apache.org/java/opencmis.html
Install: Client library, no server config/restart required

Rounding out my CMIS-related recommendations is Apache Chemistry OpenCMIS. Like cmislib, it is a client library, but this one is written in Java and is much more mature and more widely-used. For quick scripts you can use Groovy on the command-line to make CMIS calls against Alfresco via OpenCMIS. Often, I already have my Java IDE open anyway, so it can be just as quick to write a little runnable Java class that uses OpenCMIS to carry out some tasks.

I use Groovy scripts and OpenCMIS when I want to automate tasks that power users are going to run from the command-line that they may want to tweak, but who do not have Python installed.

When a single script needs to perform operations that are a mix of CMIS and non-CMIS, I use Groovy’s HTTP client to call the Alfresco REST API (more on that, below).

Alfresco Web Script Framework
License: LGPL v3
Link: https://docs.alfresco.com/5.2/concepts/ws-framework.html
Install: For quick scripts, copy to Data Dictionary

Strictly speaking, web scripts weren’t built to be a “quick scripting” option, but they’ll work in a pinch. Basically, you just write a web script that does what you need it to do, but instead of packaging the web script into an AMP like you would for a production extension, you deploy the web script to the running repository by copying the web script related files into the Data Dictionary’s Web Scripts folder. The next step is to refresh the web scripts via the web script console. After that the web script can be invoked in a browser, via CuRL, or Postman.

Just like the JavaScript Console, you are limited to what the Alfresco JavaScript API can do, but that is definitely a broader set of capabilities than a CMIS-based solution.

Files in the Data Dictionary can only be edited by administrators, so if the folks running this script need to make changes and aren’t administrators, this won’t work. You can always build that flexibility into the web script and let them control various options by what they pass to the web script.

For more information on Web Scripts, see the documentation link above or check out my tutorial.

Alfresco Client-side JavaScript API
License: Apache 2.0
Link: https://github.com/Alfresco/alfresco-js-api
Install: Node Package Manager

A relatively new kid on the block is the Alfresco Client-side JavaScript API, aka, alfresco-js, which is part of the Alfresco Developer Framework (ADF). Of course you can use it to develop custom user interfaces in your favorite JavaScript framework, but you can also use it for quick scripting needs. For example, you could write a Node.js script and run it from the command-line.

The alfresco-js library relies on the Alfresco REST API that was created in 5.2, so if you need to work with an older Alfresco version, this isn’t an option. The alfresco-js library and the REST API are both under active development, so there could be gaps and some instability, but it is still worth taking a look, especially if you are already invested in the Node.js and NPM ecosystem.

Alfresco Content Services REST API (And others)
License: LGPL v3
Link: https://api-explorer.alfresco.com
Install: Your favorite REST client

Last, but not least, is the Alfresco Content Services REST API. If none of the other options in this list appeal to you, grab your favorite REST client or import your preferred scripting language’s HTTP client library, head over to the Alfresco API Explorer, and get busy.

As mentioned above in the alfresco-js description, this option relies on Alfresco 5.2 or higher (5.2.d Community Edition, distributed as 201612).

If the Alfresco Content Services REST API doesn’t have what you need or you aren’t yet running 5.2 or higher, you could always hit the CMIS browser (JSON) binding, the CMIS atompub (XML) binding, out-of-the-box web scripts, or custom web scripts.

Depending on what you need to do, using a lower-level REST API may not be the fastest option in terms of development time because instead of working with higher-level wrapper classes it is up to you to issue the REST calls and parse the responses.

Summary

Hopefully you see some options in the above list that will work for you and your environment. I definitely recommend you get a few of these in place now (especially the JavaScript Console) because if you ever need them in a hurry (oh, if this keyboard could talk!), you’ll want to have a quick scripting option you are comfortable and proficient with.

Photo Credit: Quality Quick Cleaning by GmanViz, by-nc-nd 2.0

Curl up with a good web script

Curl is a useful tool for all sorts of things. One specific example of when it comes in handy is when you are developing Alfresco web scripts. On a Surf project, for example, you might divide into a “Surf tier” team and a “Repository tier” team. Once you’ve agreed on the interface, including both the URLs and the format of the data that goes back-and-forth between the tiers, the two teams can start cranking out code in parallel.

If you’re on the repo team, you need a way to test your API, and you probably don’t have a UI to test it with (that’s what the other team’s working on). There are lots of solutions to this but curl is really handy and it runs everywhere (on Windows, use Cygwin).

This post isn’t intended to be a full reference or how-to for curl, and obviously, you can use curl for a lot of tasks that involve HTTP, not just Alfresco web scripts. Here are some quick examples of using curl with Alfresco web scripts to get you going.

Get a ticket

It’s highly likely that your web script will require authentication. So the first thing you do is call the login web script to get a ticket.

curl -v "http://localhost:8080/alfresco/service/api/login?u=admin&pw=somepassword"

Alfresco will respond with something like:

<?xml version="1.0" encoding="UTF-8"?>
<ticket>TICKET_e46107058fdd2760441b44481a22e7498e7dbf66</ticket>

Now you can take that ticket and append it to your subsequent web script calls.

Any web script you’ve got that accepts GET can be tested using the same simple syntax.

Post JSON to your custom web script

If all you had were GETs you’d probably just test them in your browser. POSTs, PUTs and DELETEs require a little more doing to test. You’re going to want to test those web scripts so that when the front-end team has their stuff ready, it all comes together without a lot of fuss.

So let’s say you’ve got a web script that the front-end will be POSTing JSON to. To test it out, create a file with some test JSON, then post it to the web script using curl, like this:

curl -v -X POST "http://localhost:8080/alfresco/service/someco/someScript?alf_ticket=TICKET_e46107058fdd2760441b44481a22e7498e7dbf66" -H "Content-Type: application/json" -d @/Users/jpotts/test.json

By the way, did you know that starting with 3.0, if you name your controller with “.json” before the “.js” the JSON will be sitting in a root variable called “json”?  So in this case instead of naming my controller “someScript.post.js” I’d name it “someScript.post.json.js” and then in my JavaScript, I can just eval the “json” variable that got created for me automatically and start working with the object,  like this:

var postedObject = eval('(' + json + ')');
logger.log("Customer name:" + postedObject.customerName);

Run a CMIS query

With 3.0 Alfresco added an implementation of the proposed CMIS spec to the product. CMIS gives you a Web Services API, a RESTful API, and a SQL-like query language. Once you figure out the syntax, it’s easy to post CMIS queries to the repository. You can wrap the CMIS query in XML:

<cmis:query xmlns:cmis="http://www.cmis.org/2008/05" >
<cmis:statement><![CDATA[select * from cm_content where cm_name like '%Foo%']]></cmis:statement>
</cmis:query>

Then post it using the same syntax as you saw previously, but with a different Content-Type in the header, like this:

curl -v -X POST "http://localhost:8080/alfresco/service/api/query?alf_ticket=TICKET_e46107058fdd2760441b44481a22e7498e7dbf66" -H "Content-Type: application/cmisquery+xml" -d @/Users/jpotts/cmis-query.xml

Alfresco will respond with ATOM, but it’s a little verbose so I won’t take up space here to show you the result. Also, I noticed this bombed when I ran it against 3.1 Enterprise but I haven’t drilled down on why yet.

Create a new object using CMIS ATOM

Issuing a GET against a CMIS URL returns ATOM. But CMIS URLs can also accept POSTed ATOM to do things like create new objects. For example, to create a new content object you would first create the ATOM XML:

<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:cmis="http://www.cmis.org/2008/05">
<title>Test Plain Text Content</title>
<summary>Plain text content created via CMIS POST</summary>
<content type="text/plain">SGVyZSBpcyBzb21lIHBsYWluIHRleHQgY29udGVudC4K</content>
<cmis:object>
<cmis:properties>
<cmis:propertyString cmis:name="ObjectTypeId"><cmis:value>document</cmis:value></cmis:propertyString>
</cmis:properties>
</cmis:object>
</entry>

Note that the content has to be Base64 encoded. In this case, the content is plain text that reads, “Here is some plain text content.” One way to encode it is to use OpenSSL like “openssl base64 -in <infile> -out <outfile>”. The exact syntax of ATOM XML with CMIS is the subject for another post.

Once you’ve got the XML ready to go, post it using the same syntax shown previously, with a different Content-Type in the header:

curl -v -X POST "http://localhost:8080/alfresco/service/api/node/workspace/SpacesStore/18fd9821-42a5-4c6a-86d3-3f252679cf7d/children?alf_ticket=TICKET_e46107058fdd2760441b44481a22e7498e7dbf66" -H "Content-Type: application/atom+xml" -d @/Users/jpotts/testCreate.atom.xml

The node reference in the URL above is a reference to the folder in which this new child will be created. There’s also a similar URL that uses the path instead of a node ref if that’s more your thing.

Refreshing Web Scripts from Ant

One of the things you do quite frequently when you develop web scripts is tell Alfresco to refresh its list of web scripts. There are lots of ways to automate this, but one is to create an Ant task that uses curl to invoke the web script refresh URL. This lets you deploy your changes and tell Alfresco to refresh the list in one step (and makes sure you and your teammates never forget to do the refresh).

<target name="deploy-webscripts" depends="deploy" description="Refreshes the list of webscripts">
<exec executable="curl">
<arg value="-d"/>
<arg value="reset=on"/>
<arg value="http://${alfresco.web.url}/service/index"/>
</exec>
</target>

In this example, the “deploy” ant task this task depends on is responsible for copying the web scripts to the appropriate place in the exploded Alfresco WAR. (Thanks to my colleague Eric Shea (http://www.eshea.net/2009/01/30/alfresco-dev-survivors-kit-part-1/) for this tip).

So there you go. It’s not Earth-shattering but it might give you a productivity boost if you don’t already have curl or an alternative already in your bag of tricks.

Catching up on XForms, XRX, XProc, and Orbeon

I recently spent a little time looking at open source components we could assemble to provide a basic web form authoring solution embedded within one of our SaaS offerings. Rather than full-blown Web Content Management, all that the solution really called for was the ability for non-technical users to enter data in a form and to upload binary objects which may be related in some way to that form data. There could be several forms with some chunks of forms being reused, and at some point, it might be nice for non-technical people to create their own forms.

For the forms piece I immediately thought of XForms because (1) I knew we wanted the data stored as XML and (2) I like the MVC pattern that XForms follows.

It had been a while since I played with XForms directly. Alfresco’s web forms engine is currently based on the Chiba implementation of XForms, but you don’t normally get exposed to the XForms details. There are a few things going on in the world of XForms that caught my attention:

XProc. XProc is a W3C specification for an XML Pipeline Language. If you’ve ever worked with Apache Cocoon you’ll get this concept immediately as Cocoon was an early implementor of the XML pipeline approach. Think of raw XML going in a pipeline on one end, having it processed with one or more steps as it goes through the pipeline, and then possibly new XML emerging from the other end. Those processing steps can be thought of as modules that can be reused and recombined in different ways to build new pipelines.

One of our past clients was doing something similar to this with their own home grown solution. They were taking XML data feeds from sporting events, and then performing various operations on that XML before it was eventually posted on the web site in the form of scoreboards and stats pages. They called the process definition a “workflow” and it was described in XML. XProc would be ideal for something like that.

XRX. XRX stands for XForms/REST/XQuery. It is not a standard–it’s an approach for building web applications. It means using XForms on the front-end to present and capture data, REST between the front-end and the back-end, and XQuery to retrieve and transform XML from the back-end. This approach allows you to build a web application without any object-relational mapping. The data you are dealing with is always XML so there is no translation necessary.

eXist. eXist is an open source, native XML database. If you’re dealing exclusively in XML, why go to the trouble of translating your XML into rows and columns (and then back in to XML when it is retrieved)? Native XML databases do a better job of storing XML with no translation required while preserving your ability to efficiently do things like XQuery and XPath statements across the entire scope of your dataset. I had previously played with Apache Xindice but Xindice doesn’t support XQuery which is a major focus for eXist (plus, things seem a little quiet over at the Xindice project).

Orbeon Forms is a server-side XForms implementation. If you’re looking for an open source forms solution, you need to take a look. Orbeon is XForms, XProc, and eXist, all rolled into a single offering. You can merge the Orbeon WAR file with your web app’s WAR or you can deploy Orbeon in its own web app and simply tell it to handle all of the XForms tasks. Orbeon also has a graphical forms builder but I didn’t get a chance to play with that.

Thinking I might want to use Apache Sling/Jackrabbit as my repository, I decided to see how easy it would be to persist the XForms data into Jackrabbit instead of eXist, as Orbeon’s tutorial does by default. As I suspected, it turned out to be a 2 minute task. Because Sling provides a REST API into Jackrabbit, and because XForms can persist data via REST natively, it was simply a matter of changing the post URL from the eXist REST URL to the Sling REST URL and it was a done deal. Deciding whether or not Jackrabbit (instead of or in combination with eXist) is the right way to go is a decision for another day.

I’ll provide an update at some point down the road after we’ve done some implementation work on this embedded forms stuff and we’ll see how it actually held up.

Running Alfresco web scripts as Liferay portlets

I’ve seen a lot of Liferay and Alfresco forum posts from people having trouble getting Alfresco running within a Liferay portal. Once that’s done, people usually want to invoke Alfresco web scripts as portlets without requiring a separate single sign-on (SSO) infrastructure. Some people have pointed to the Alfresco wiki (Deploying 2.1 WAR Liferay 4.3). That is a helpful reference but it isn’t the full story. Here are some notes that may help.

1. Download the Liferay 4.3.6 + Tomcat 5.5 JDK5 bundle. I had mixed results with the latest release 4.4.2. You may be tempted to try to download the WAR-only distribution and configure it in your existing Tomcat instance. In this case, save yourself the time and headache and get the bundle. Fool with the WAR distribution later.

2. Unpack the Liferay distribution and fire it up. Make sure you can log in as the test@liferay.com (password: test) user to validate that all is well with the Liferay install.

2a. Create a test user. (“Create Account” on the Liferay login screen). Remember the email address. This will matter shortly. For this discussion I’ll assume Foo User with a screen name of fuser and an email address of fuser@foo.com. Make sure you create a home directory. In this example, we’ll call it “fuser”.

2b. Verify that you can log in as your test user.

3. Shut down the server.

4. Download Alfresco 2.1.2 Enterprise, WAR only. Alfresco 2.1.1 has a known issue (AWC-1686) with the way authentication is handled for web scripts in the context of Liferay so make sure you are using 2.1.2.

5. Expand the Alfresco 2.1.2 WAR into the Tomcat webapps/alfresco directory (which you’ll have to create the first time). If you are tweaking the install (such as pointing to a specific MySQL database, using something other than MySQL, pointing to a different data directory, etc.) make sure you have copied your good set of extensions into Tomcat’s shared/classes/alfresco/extension directory.

6. Copy the MySQL connector into Tomcat’s common/lib directory.

7. Start Tomcat. When it comes up, you’ll have Liferay running and you’ll have Alfresco running, but Liferay doesn’t yet know about Alfresco. Verify that you can log in to Alfresco as admin.

7a. While you are here, create a test user account. You need to create a user account that has an email address that matches the test user account you created in Liferay. In this example you created Foo User with a screen name of fuser and an email address of fuser@foo.com so you need to create an Alfresco user with the same settings. You’ll log in to Alfresco as fuser. You’ll log in to Liferay as fuser@foo.com.

7b. Verify that you can log in to Alfresco as fuser.

8. Shut down Tomcat.

9. Now you need to configure Alfresco as a Liferay plug-in. This involves adding four files to Alfresco’s WEB-INF directory: liferay-display.xml, liferay-plugin-package.xml, liferay-portlet.xml, and portlet.xml. Why aren’t these available in the Alfresco source or on the wiki? Apparently someone tried to address this at some point because there is a link on the wiki but it is broken. Until that’s addressed, I’ve put them here.

10. Remove the portlet-api-lib.jar file from Alfresco’s WEB-INF/lib directory.

11. Re-package alfresco.war. It is now ready to hand over to Liferay.

12. Start Tomcat.

13. Find your Liferay deploy directory. If you are running out-of-the-box on Linux, Liferay’s “deploy” directory is called liferay/deploy and it resides in the home directory of the user who started Tomcat. I’m running it as root so my Liferay deploy directory is /root/liferay/deploy.

14. Copy the alfresco.war you just created into the deploy directory. Watch the log. You should see Liferay working on the WAR. He’s finding the plug-in config files and essentially deploying the Alfresco portlets.

15. Now log in to Liferay using the Liferay admin account (test@liferay.com). Go to a page, then use the global navigation dropdown to select “Add Content”. The list of portlets should appear and you should see the “Alfresco” category. If you don’t, look at the log because something is amiss. Add the My Spaces portlet to the page. You may see an error at this point but ignore it. The problem is you probably don’t have a user in Alfresco that has an email address of “test@liferay.com”, which is the currently-logged in user.

16. Log out.

17. Log in as your test user that exists in both Alfresco and Liferay (fuser@foo.com).

18. Go to the page. You should see the “My Spaces” portlet. You should be able to upload content, create spaces, etc.

Exposing your own web scripts as portlets

All Alfresco web scripts are automatically exposed as JSR-168 portlets, including the ones you create. To add your web scripts as portlets, first make sure you have authentication set to “user” and transaction set to “required” in your web script’s descriptor. Then, update portlet.xml, liferay-portlet.xml, and liferay-display.xml. Follow the pattern that’s in those files already and you’ll be fine. For example, if you deploy the Hello World web script from my web script tutorial, you need to add a new portlet to portlet.xml with a “scriptUrl” like: /alfresco/168s/someco/helloworld?name=jeff. Then you update liferay-portlet.xml and liferay-display.xml with the new portlet name or portlet ID.

Single sign-on with no single sign-on?

The web script runtime has a JSR-168 authenticator. So when your web scripts get invoked by the portlet, the current credentials are passed in. That’s why your web script can run without requiring an additional sign in. Prior to this being put in place, people had to implement Yale CAS (or an equivalent) to get SSO between Liferay and Alfresco web scripts.

What’s not covered in these instructions is that you’ll probably want to (1) configure both Alfresco and Liferay to authenticate against LDAP and (2) change the configuration of either Alfresco or Liferay to use the same credential (either username or email address) for both systems so that if you do have users logging in to both, they don’t have to remember that one requires the full email address but the other doesn’t.

Troubleshooting

If you see one of the Alfresco portlets displaying “Data is not currently available” or somesuch, try hitting
Alfresco in another tab. Log in, then log out. Then go back to the
portal and open the page again. It should work now. I’m not sure what’s going on there. I think it may have to do with me switching back-and-forth between Liferay instances (4.3.2 versus 4.4.2) so maybe you won’t see it.

Open issues

You may see an error like this:

21:22:15,965 WARN [BaseDeployer:1038] Unable to format /usr/local/bin/liferay-4.3.6/temp/20080408212212978/WEB-INF/faces-config-jbpm.xml: Error on line 5 of document file:///usr/local/bin/liferay-4.3.6/temp/20080408212212978/WEB-INF/faces-config-jbpm.xml : A ‘)’ is required in the declaration of element type “application”. Nested exception: A ‘)’ is required in the declaration of element type “application”.

I haven’t chased that down yet. I’ll update this post with a comment when I find out. I’m sure fixing that will also fix the problem that you’ll see if you try to start an advanced workflow from a piece of content displayed in the My Spaces portlet.

I was also seeing an error when trying to use the “Add Content” link in the straight Alfresco client. I think it is JSF-related. Again, I’ll update this post with a comment when it is resolved (or when I find a Jira ticket).