Building a WS-Security enabled SOAP client in Maven2 to the EC2 WSDL using JAX-WS / CXF & WSS4J: tips & tricks

Generating a Java client from the Amazon EC2 WSDL that correctly uses WS-Security is not completely straightforward. This blog post from Glen Mazza contains pretty much all the info you need, but as usual there are many things to trip over along the way. So, without further ado, my contribution.

My setup: I was using Maven2 to construct a JAR file. Running "mvn generate-sources", then, downloads the WSDL and uses it to generate the EC2 object model in src/main/java.

Blogger doesn't like me quoting XML, so I've put my sample POM at pastebin, here. Inside the cxf-codegen-plugin XML you'll see two specific options: "autoNameResolution", which is needed to prevent naming conflicts within the WSDL, and a link to the JAXB binding file for JAX-WS, which is needed to generate the correct method signatures.

Once this is done, then the security credentials need to be configured. There are some peculiarities:

As laid out in this tutorial for the Amazon product advertising API, the X.509 certificate and the private key need to be converted into a pkcs12 -format file before they're usable in Java. This is done using OpenSSL:
openssl pkcs12 -export -name amaws -out aws.pkcs12 -in cert-BLABLABLA.pem -inkey pk-BLABLABLA.pem
At this point, I should admit that I spent hours scratching my head because the generated client (see below) gave me the error "java.io.IOException: DER length more than 4 bytes" when trying to read the PKCS12 file. So I switched to the Java Keystore format by using this command (JDK6 format):
keytool -v -importkeystore -srckeystore aws.pkcs12 -srcstoretype pkcs12 -srcalias amaws -srcstorepass password -deststoretype jks -deststorepass password -destkeystore keystore.jks
...and then received the error "java.io.IOException: Invalid keystore format" instead. At this point I googled a bit, and discovered two ways to verify the integrity of keystores, via OpenSSL and the Java keytool:
#for pkcs12
openssl pkcs12 -in aws.pkcs12 -info

#for keystore
keytool -v -list -storetype jks -keystore keystore.jks
Both the keystore and pkcs12 file were valid. Then, I realised that I'd put the files in src/test/resources which was being put through a filter before landing in "target". The filter was doing something to the files, so of course they couldn't be read properly. Duh me. I put the key material in a dedicated folder with no filtering and this problem was fixed.
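For reference, the fix was to declare a second, unfiltered test-resource directory in the POM. A sketch of what that looks like (the `src/test/keys` directory name is my own choice):

```xml
<build>
  <testResources>
    <!-- ordinary test resources: filtered as before -->
    <testResource>
      <directory>src/test/resources</directory>
      <filtering>true</filtering>
    </testResource>
    <!-- key material: copied verbatim, so the binary
         pkcs12/keystore files arrive in "target" untouched -->
    <testResource>
      <directory>src/test/keys</directory>
      <filtering>false</filtering>
    </testResource>
  </testResources>
</build>
```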

My next problem was the exception "java.io.IOException: exception decrypting data - java.security.InvalidKeyException: Illegal key size". This was solved by downloading the "Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files". Simple!

At this point the request was being sent to Amazon! Which then returned a new error message, "Security Header Element is missing the timestamp element". This was because the request didn't have a timestamp. So, I changed the action to TIMESTAMP+SIGNATURE (as seen in the below code sample), at which point I got a new error message: "Timestamp must be signed". This I fixed by setting a custom SIGNATURE_PARTS property also as below.

Finally, once this was all done, and everything was signed, Amazon gave me back the message "AWS was not able to authenticate the request: access credentials are missing". This is exactly the same error that you get when nothing is signed at all, which needless to say is somewhat ambiguous.

At this point I decided that I'd really like to see what was being sent over the wire. The WSDL specifies the port address with an HTTPS URL. However, I had saved the WSDL locally, and changing the URL to HTTP made the result inspectable with the inestimable Wireshark. Despite being sent over HTTP, not HTTPS, the request was still executed. According to the docs, this should not be!

Anyway, once I was looking at the bytes, I saw that the certificate was only being referred to, not included as specified in the AWS SOAP documents, in this case for SDB. This was fixed by setting the SIG_KEY_ID (key identifier type) property to "DirectReference", which includes the certificate in the request.

...and then it worked. Oh Frabjous Day, Callooh, Callay! The final testcase code that I used is more or less as follows:

package net.ex337.postgrec2.test;

import com.amazonaws.ec2.doc._2009_10_31.AmazonEC2;
import com.amazonaws.ec2.doc._2009_10_31.AmazonEC2PortType;
import com.amazonaws.ec2.doc._2009_10_31.DescribeInstancesType;
import junit.framework.TestCase;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.UnsupportedCallbackException;
import org.apache.cxf.endpoint.Client;
import org.apache.cxf.frontend.ClientProxy;
import org.apache.cxf.ws.security.wss4j.WSS4JOutInterceptor;
import org.apache.ws.security.WSPasswordCallback;
import org.apache.ws.security.handler.WSHandlerConstants;

/**
 * @author Ian
 */
public class Testcase_CXF_EC2 extends TestCase {

    public void test_01_DescribeInstances() throws Exception {

        AmazonEC2PortType port = new AmazonEC2().getAmazonEC2Port();

        Client client = ClientProxy.getClient(port);
        org.apache.cxf.endpoint.Endpoint cxfEndpoint = client.getEndpoint();

        Map<String, Object> outProps = new HashMap<String, Object>();

        //the order is important, apparently. Both must be present.
        outProps.put(WSHandlerConstants.ACTION,
                WSHandlerConstants.TIMESTAMP + " " + WSHandlerConstants.SIGNATURE);

        //this is the configuration that signs both the timestamp and the body
        outProps.put(WSHandlerConstants.SIGNATURE_PARTS,
                "{Element}{http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd}Timestamp;"
                + "{Element}{http://schemas.xmlsoap.org/soap/envelope/}Body");

        //alias, password & properties file for the actual signature.
        outProps.put(WSHandlerConstants.USER, "amaws");
        outProps.put(WSHandlerConstants.PW_CALLBACK_CLASS, PasswordCallBackHandler.class.getName());
        outProps.put(WSHandlerConstants.SIG_PROP_FILE, "client_sign.properties");

        //necessary to include the certificate in the request
        outProps.put(WSHandlerConstants.SIG_KEY_ID, "DirectReference");

        cxfEndpoint.getOutInterceptors().add(new WSS4JOutInterceptor(outProps));

        //sample request.
        DescribeInstancesType r = new DescribeInstancesType();
        port.describeInstances(r);
    }

    //simple callback handler with the password.
    public static class PasswordCallBackHandler implements CallbackHandler {

        private Map<String, String> passwords = new HashMap<String, String>();

        public PasswordCallBackHandler() {
            passwords.put("amaws", "password");
        }

        public void handle(Callback[] callbacks) throws IOException, UnsupportedCallbackException {
            for (int i = 0; i < callbacks.length; i++) {
                WSPasswordCallback pc = (WSPasswordCallback) callbacks[i];
                String pass = passwords.get(pc.getIdentifier());
                if (pass != null) {
                    pc.setPassword(pass);
                    return;
                }
            }
        }
    }
}

For completeness, the client_sign.properties file (which Blogger also mangled) configures the Merlin crypto provider like so:

org.apache.ws.security.crypto.provider=org.apache.ws.security.components.crypto.Merlin
org.apache.ws.security.crypto.merlin.keystore.type=pkcs12
org.apache.ws.security.crypto.merlin.keystore.password=password
org.apache.ws.security.crypto.merlin.keystore.alias=amaws
org.apache.ws.security.crypto.merlin.file=aws.pkcs12

Later, I repointed the code generation at the WSDL "symlink" at http://s3.amazonaws.com/ec2-downloads/ec2.wsdl.

At this point, the method signatures of the generated port abruptly changed, because I had forgotten to change the wsdlLocation in the JAXB binding file as well. Once I fixed this, it worked again.

Some thoughts:

1) Were I publishing a library for general use in accessing AWS, I would probably not use the direct "symlink" above that always points to the latest version of the WSDL. Instead, I would link deliberately to each version, and in that way always generate ports for each version of the WSDL, thus ensuring backwards compatibility.

2) I find it inelegant to have to specify the WSDL location in two places (the POM and the binding file), and so I'd like to try passing the binding file through a filter, using a ${variable} in both places that refers to a property in the POM.
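Sketched (and as yet untested), the idea is to define the WSDL URL once as a POM property, reference it from the codegen config, and run the binding file through Maven filtering so the same ${...} expands there too:

```xml
<!-- in the POM -->
<properties>
  <ec2.wsdl>http://s3.amazonaws.com/ec2-downloads/ec2.wsdl</ec2.wsdl>
</properties>

<!-- in the JAX-WS binding file, assuming it is a filtered resource -->
<jaxws:bindings wsdlLocation="${ec2.wsdl}"
                xmlns:jaxws="http://java.sun.com/xml/ns/jaxws">
  <!-- bindings as before -->
</jaxws:bindings>
```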

3) I find it likewise confusing that the password for the keystore is used in two places, firstly in client_sign.properties and secondly in the CallbackHandler that is invoked from within the bowels of the WSS4JOutInterceptor. In the code above, this is obviously duplicated data, however in the final 'production' version of this code I expect to have the data centralised & the code prettified around it.


Using CXF instead of Axis for Java from WSDL: better results faster.

In the footsteps of the same guy, Glen Mazza, I linked to at the bottom of the previous post, who did the same thing (SOAP client / server using WS-Security and X.509 certificates using CXF), I switched from Axis2 to CXF, and had immediately better results:
  • The documentation and maven plugin instructions are current and accurate.
  • The plugin works.
  • All the right JAR files are in repos.
  • The code generation worked fine, with some JAX-WS binding stuff added into the mix.
Which leads me to ask: why are there two projects at Apache doing essentially identical things, right down to the usage patterns for the tools they provide? (A: CXF, née XFire, came from Codehaus.) Anyway, I don't have to write a HOW-TO for this stuff; the docs are there and they're useful.

I have yet to look at CXF support for WS-Security, but it seems simpler from the get-go than the equivalent stuff in Axis2. (I initially thought it insisted on Java's proprietary keystore format, but I hadn't read this howto closely enough: PKCS12 files are supported.) We shall see!


Creating Java code from a WSDL using Apache Axis2: maven2 and the command line.

Four years ago, creating Java code from WSDL was difficult and annoying. Today, I'm trying to generate a client for the EC2 WSDL, that ideally would download the latest version and rebuild the API when I type "mvn clean install".

I've given up. The Axis2 Maven2 plugin does not seem to work correctly, so I've resorted to using the command-line tool, which does work. My command was:

wsdl2java.bat -d jaxbri -o -S . -R . --noBuildXML --noMessageReceiver -uri http://s3.amazonaws.com/ec2-downloads/2009-10-31.ec2.wsdl

I used the JAXBRI output format because XMLBeans is basically dormant. Unfortunately the JAXBRI compiler generates sources in an "src" subdirectory, which can't be changed via command-line options, so some manual copy-and-pasting is required.

Secondly, the generated classes then depend on axis. So this needs to be added to the POM:

(Blogger broke my XML):
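The dependency in question was along these lines (version matching the wsdl2java tool I used; adjust to taste):

```xml
<dependency>
  <groupId>org.apache.axis2</groupId>
  <artifactId>axis2</artifactId>
  <version>1.5.1</version>
</dependency>
```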


Thirdly, there's an undeclared dependency in Axis2-generated code on Axiom, so this also needs to be added:
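Something like the following (the 1.2.7 version is my best guess, given the note below about 1.2.8):

```xml
<dependency>
  <groupId>org.apache.ws.commons.axiom</groupId>
  <artifactId>axiom-api</artifactId>
  <version>1.2.7</version>
</dependency>
<dependency>
  <groupId>org.apache.ws.commons.axiom</groupId>
  <artifactId>axiom-impl</artifactId>
  <version>1.2.7</version>
</dependency>
```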


(The latest version is 1.2.8, but this doesn't seem to be in repos yet.)

Following which, attempting to run this:

AmazonEC2Stub stub = new AmazonEC2Stub("http://eu-west-1.ec2.amazonaws.com");
DescribeRegionsType t = new DescribeRegionsType();

...threw a long series of ClassNotFoundExceptions, one after another, which I attempted to solve shotgun-style by adding each new dependency to the POM as it cropped up. This was an abject failure - I stopped at org.apache.axis2.transport.local.LocalTransportSender, which apparently is only available in Axis2 v. 1.2 (I'm using 1.5.1). So instead I deleted all the Axis2-related stuff from my POM and just added the JARs from the 1.5.1 downloaded ZIP file straight to the Eclipse project.

This worked, and gave me the error message that I was looking for, to wit: "AWS was not able to authenticate the request: access credentials are missing". From here, I would just need to get the Rampart/WS-Security/WSS4J stuff working properly with the Amazon X.509 certificate, and then I should be home free. We shall see.

Further light reading can be found on this article on IBM developer works and this article on SOAP clients with Axis.

Update 2009/12/20: The work has been done (I should have googled first!), and it is herculean, as you can see by reading this impressive tutorial for creating an Axis2 SOAP Client for the Amazon Product Advertising API.


On the subject of being too harsh a critic

One of the great things about OSS is its transparency. It also permits a great response to critics of any particular project: "instead of talking so much, why don't you shut up and help?"

I'm guilty of forgetting that there are real people behind most projects. With commercial software this happens a lot more and is disguised as "I'm a paying customer and I expect good service", but there's no excuse, honestly, for criticism that isn't phrased constructively and considerately when the product itself is free.

Fnar fnar fnar.


EC2 upgrades again

At lunch today, I read that EC2 can now boot off EBS images, something that simplifies the whole AMI thing and brings it up to speed with Rackspace on the ease-of-deployment front. However, two points:
  1. EC2 is still the clear loser in price-performance, and charging for I/O to the root partition won't help. More specifically, when will EBS I/O become consistent? Probably this has a lot to do with being popular and dealing with shared resources at the lower end, see this HN thread.
  2. My next question is, how does this affect the attack surface of EC2? Can the work done in the "Get off my Cloud!" paper be expanded on?
Anyway, this makes my Postgres stuff interesting, guess I'll be using this new mechanism instead. It's always nice when new stuff arrives, even if it means reworking stuff.


Note to self: too many publicly facing websites.

So, from now on:
  • Delicious is for storing bookmarks.
  • Reader is for sharing articles.
  • Twitter is for linking to stuff. (i.e. sharing stuff that I didn't come across in Reader)
Right now there's stuff duped between all three. I expect this to continue. It's not inconveniencing anyone, but it offends my sense of order. Oh well.


Rolling your own EC2 administration code: the basics

So, instead of using a PaaS or SaaS to manage the IaaS, I'm writing code myself to do exactly what I want, to be published shortly I hope. Basic ingredients:
  • A library to access EC2, like Typica.
  • An SSH library like JSch, to exec commands and transfer files.
  • A templating engine for wrangling config files, such as Velocity.
  • Working knowledge of the platform. I'm probably going to discover a few more subtleties such as this one before I'm finished.
Aside: Maven2 may be looked upon with scorn in certain circles, but it does make life much easier. If certain packages included javadoc & source packages, and refrained from (e.g.) including log4j config files in the library JAR files, it would be even easier! Wouldn't that be nice.


Using the EC2 API: console output blank, connection refused, socket timeout, etc.

Hello there. As usual, there's a world of difference between the conceptual usage of an API and its real-world, practical use. In my adventures I'm stubbornly, block-headedly not interested in using EC2 via anything except the API - i.e. no command-line tools or management console (except for debugging) - and so I intend to be able to create my images etc. in a test harness, for reasons to be enumerated anon. Anyway, the following may be useful to people writing code against the EC2 API for the first time:
When booting an instance, it cannot be assumed that once it is "running", SSH will be serving on port 22, nor that the console output is available. So if you want to SSH into your instance, first poll the instance state, and once it's "running", poll for the console output. Once the output appears it is complete, so you can retrieve the host key fingerprints from it and go on from there.
This is good to know if one wants to sling instances around, but what I find slightly incongruous is that I'm being charged for about a minute of time on a machine I can't access yet. Of course, from Amazon's perspective, the instant (har) I'm occupying a slot on a server it's chargeable, so I guess it makes sense from where they're standing.
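The two-stage polling above can be sketched as follows. Ec2Api, getInstanceState and getConsoleOutput are hypothetical stand-ins for whatever EC2 library you use (Typica or your own generated client would supply the real equivalents):

```java
/** Sketch of the boot-polling sequence: state first, then console output. */
public class InstancePoller {

    /** Hypothetical minimal view of an EC2 client. */
    public interface Ec2Api {
        String getInstanceState(String instanceId);   // e.g. "pending", "running"
        String getConsoleOutput(String instanceId);   // null or empty until available
    }

    private final Ec2Api api;
    private final long pollMillis;

    public InstancePoller(Ec2Api api, long pollMillis) {
        this.api = api;
        this.pollMillis = pollMillis;
    }

    /**
     * Blocks until the instance is "running" AND the console output exists,
     * then returns the console output (which contains the SSH host key
     * fingerprints). Only after that is it safe to connect on port 22.
     */
    public String waitForConsoleOutput(String instanceId) throws InterruptedException {
        // step 1: poll the instance state until "running"
        while (!"running".equals(api.getInstanceState(instanceId))) {
            Thread.sleep(pollMillis);
        }
        // step 2: "running" does not mean SSH is up; poll for console output
        String console;
        while ((console = api.getConsoleOutput(instanceId)) == null || console.isEmpty()) {
            Thread.sleep(pollMillis);
        }
        return console; // once present it is complete; parse fingerprints from here
    }
}
```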


Matt Taibbi does have a way with words:

I'm glad that there's at least one reporter who takes so much cynical glee in uncovering what happens on Wall Street, even if in somewhat lurid and no-doubt slightly exaggerated form:
What really happened to Bear and Lehman is that an economic drought temporarily left the hyenas without any more middle-class victims — and so they started eating each other, using the exact same schemes they had been using for years to fleece the rest of the country. And in the forensic footprint left by those kills, we can see for the first time exactly how the scam worked — and how completely even the government regulators who are supposed to protect us have given up trying to stop it.

Just like that, with a slight nod of Paulson's big shiny head, Bear was vaporized. This, remember, all took place while Bear's stock was still selling at $30. By knocking the share price down 28 bucks, Paulson ensured that the manipulators who were illegally counterfeiting Bear's shares would make an awesome fortune.
What is interesting is that he seems to suggest that Geithner & Bernanke gave false testimony to the Senate, which would be tectonically enormous if true:
The month after Bear's collapse, both men testified before the Senate that they only learned how dire the firm's liquidity problems were on Thursday, March 13th — despite the fact that rumors of Bear's troubles had begun as early as that Monday and both men had met in person with every key player on Wall Street that Tuesday. This is a little like saying you spent the afternoon of September 12th, 2001, in the Oval Office, but didn't hear about the Twin Towers falling until September 14th.
Like with the whole torture thing (apropos of which, sunlight maybe?), the more I read about Wall Street, the worse it looks.


EC2: now having actually played with it a *little*...

So, in place of my previous bloviation on the subject, unfettered by the weight of experience, a couple of somewhat-more-tempered comments follow:
  • Using the command-line tools is slow. They're shifting gigs of data around at the touch of a button, but hey, it's a UX thing.
  • In terms of actual tools, the options seem to be:
    • Go with the command-line tools and a bunch of bash scripts
    • Go with a (generally) half-baked third-party API, with its own idiosyncrasies built in, and the traditional lack of documentation OSS projects feel they can get away with.
    • (My inevitable option) download the WSDLs & use something to generate your own API in whatever language. Regenerate it whenever the API changes.
  • This choice is especially acute since I'm not intending, ultimately, to have to do anything by hand - so programming things properly to start with seems like the only sensible option.
  • Consistent I/O on EBS is apparently not an option. This is something I think Amazon should fix tout de suite, because the likes of RackSpace (maybe) and NewServers (h.t. etbe) seem to be stomping all over the EBS I/O figures. In a different context, James Hamilton says "it makes no sense to allow a lower cost component impose constraints on the optimization of a higher cost component", and assuming that the servers are the expensive part, this is what (IMHO) may make using RDBMSs on EC2 a bit of a PIA long-term.


“If they’re too big to fail, they’re too big,”
So, the next question is, when will the current head of the Fed adopt this position?


Norwegian Irony?

I like the guy too, but:
In February, the Obama DOJ went to court to block victims of rendition and torture from having a day in court, adopting in full the Bush argument that whatever was done to the victims is a "state secret" and national security would be harmed if the case proceeded.
And all year long, the Obama DOJ fought (unsuccessfully) to keep encaged at Guantanamo a man whom Bush officials had tortured while knowing he was innocent.
- The indefatigable Glenn Greenwald

Asked why the [nobel] prize had been awarded to Mr Obama less than a year after he took office, Nobel Committee head Thorbjoern Jagland said: "It was because we would like to support what he is trying to achieve"

"Obama has a long way to go still and lots of work to do before he can deserve a reward,"
- Hamas official Sami Abu Zuhri.

"It's the prize for not being George W. Bush"
--Sky News commentator.

I find myself in the bizarre situation of agreeing with crazy right-wing nutcases (not linked to, but depressingly easy to find), terrorists, and a News Corporation talking head, all at the same time. He hasn't done anything yet!!! And in any case, torture! Sheesh!


Hotrepart: long term plans

Pace my previous post on moving hotrepart forward, the plan is as follows:
  • Patch the seemingly-dormant CloudTools to support PostgreSQL, using Londiste for replication.
  • Patch CloudTools again to allow online adding & removing slaves.
  • Patch hotrepart and PL/Proxy again, this time to add a CLUSTERS command that will allow PL/Proxy to act as a bus not just for sharding, but also for master-slave replication.
  • Patch CloudTools again to allow dynamic repartitioning with hotrepart.
Once this is done, then I should be able to run a test cluster that auto-adapts the server provisioning based on workload. If I can get this up to 500 nodes, then I'll consider myself happy, and start working on a snazzy canvas-based visualisation/management toolkit.

Canvas will probably be fully supported in IE10 by this time. Anyway.


My WiFi access point.

My Wifi access point used to be open to anyone, with the name "noP2PwebOKthanks". What this meant was that people were free to surf the web, but were politely requested not to download movies or anything else that would get me in trouble.

Of course, this didn't work, and after too many times of having slow access to my own connection, and exceeding my bandwidth quota, I added a password. The wifi network is now called "bit.ly/1wflj1", which is a link to this blog post.

If you used to use my network and enjoyed free Internet access for occasional use, then you can still use my wifi - get in touch with me and I'll see what I can do.


You were saying something about 'best intentions'?

Well, hotrepart is my most recent attempt at doing something interesting outside of work, and it took me down some interesting avenues.

I didn't figure that I'd be patching PL/Proxy, learning C, pointer arithmetic and memory management along the way. It was interesting working in a bare text editor, but also much, much slower. It took me maybe 40 hours of screen-time and 20 hours of no-screen thinking, spread across about 3 months, to get a 1.1Kloc patch out the door.

Once I did this, I then realised that my plan to scale Postgres to the moon was fundamentally flawed: queries that had to be run across the whole dataset (RUN ON ALL in PL/Proxy parlance) would eventually saturate the cluster without a decent replication system. A decent replication system (that can be automatically installed and reconfigured on-the-fly at runtime) is of course the one thing Postgres lacks right now.

During this three months I also read up more on datacenter-scale applications and Google's concept of the Warehouse Scale Computer, which altered my thinking somewhat.


At this point I should explain, to myself if nobody else, why I started this whole experiment. The thinking was that the algorithms behind DHTs and distributed key-value stores like Dynamo should in theory be implementable using RDBMS installs as building blocks. What you lose in performance overhead you gain in query language expressivity. Going further, with a decent programmable proxy it should be possible to route DB requests just like DHTs route pull requests etc. Further still, a "proxy layer" on top of the DB layer should self-heal and use a Paxos algorithm to route around failures and update the routing table.

One of the properties of the hotrepart system as-is is that it is immediately consistent. In theory, the system proposed above would trade that off for eventual consistency, with a lag equal to the replication delay, but gaining partition-tolerance, on the assumption that the replication also had a hot-failover component. Hello, there, Brewer's conjecture.

Needless to say this would be a ridiculous amount of work to implement, even if it worked in theory. Were this to be made real, however, would such a system scale to the numbers obtained with Hadoop? That is the question.

One way to answer it would be to construct a simulation cluster processing requests and then to torture the cluster in various ways (kill hosts, partition the network) and watch what happens. We shall see.

The end result is that hotrepart is stalled for the moment, pending ideas on the direction in which to take it forward. As with Gradient, even if nothing comes of the project, it's still out there in the open. Four years after I stopped work on Gradient it still proved useful to other people because of the ideas alone, and combining XMPP and hypertext is something everyone's doing now, so I don't necessarily think I've wasted my time on this so far.



The code that produced the below graph is released! Hotrepart was basically done a month ago, but I decided that releasing code into the wild, especially code that is supposed to start conversations, is probably not the best thing to do at the same time as organising my own wedding. I'm happily married now :-) Hence the release.

This is an ongoing project that I don't intend to let rot, like I did with Gradient. There are several things to do off the bat: another patch to PL/Proxy, then configuring EC2 to run Postgres nicely, then coding up some intelligence to trigger the repartitioning. Amazon releasing autoscaling just made that a whole lot easier.



The below graphs show two minutes of reads and writes against a PostgreSQL database running inside VMWare on a modern 2-core laptop, with #ops per second and response time plotted in each case. The initial high response time for writes is caused by the connection pool filling.

Half-way through, where the lines spike, the database is split (repartitioned) into two partitions - one being the original, the other newly created at that moment. Both of the new databases are on the same host, but this is a trivial detail.

More soon :-)