2006/08/31

EC2 SOAP + XMLSig + WSSE + X509 = Frustration

I'm exploring the EC2 SOAP API. Why? I have never really developed a client for a SOAP API, and this was a good occasion. The interesting constraint is that it's not just a SOAP API, but a SOAP API complying with the WS-Security standard, where the messages are signed: in the EC2 case, with an X.509 certificate carrying the public key.

A bit like what I did for S3 with my small Mac OS S3 Utility, I wanted to code a small OS X 'EC2 Console'. First problem: OS X SOAP support is not really exhaustive (although WebServices Core is a good start), and I need more: XML Signature, WSSE and BinarySecurityToken. SOAP support, for the subset I need for the EC2 API, is not that hard and can easily be written by hand. Signature is another matter: you have to canonicalize a subset of the document, compute a digest of the canonical form, and embed it in the signature part of the document (you actually do that twice, once for the SOAP body representing the request and once for the timestamp), and then canonicalize and sign this signature section.

The first issue was canonicalization. NSXMLNode has a method for it, and OpenSSL provides functions for SHA-1 digests and base64 encoding. The problem was the NSXMLNode method: it seems to be intended for canonicalizing entire documents, and canonicalizing a subset does not correctly reintegrate the XML namespaces defined on a parent tag. So the digest was not the same as my reference (a trace from the Amazon EC2 Java tool). It took me a few hours to finally catch what was happening but, well. Simple fix: force the namespace declarations onto the subset before computing the digest. Ugly, but it works.
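To make the namespace problem concrete, here is a minimal sketch in Python rather than the Cocoa and OpenSSL calls used for the real thing. The two byte strings are hand-written stand-ins for canonicalizer output (an assumption, not a trace from EC2): one where the subset kept the namespace declaration inherited from its parent, one where it was dropped. The helper name digest_value is made up for the illustration.

```python
import base64
import hashlib


def digest_value(canonical_bytes: bytes) -> str:
    """Base64-encoded SHA-1 digest, the form embedded in a DigestValue element."""
    return base64.b64encode(hashlib.sha1(canonical_bytes).digest()).decode("ascii")


# Hypothetical canonical forms of the same SOAP body subset.
# In the full document the "soap" prefix is declared on a parent element.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
with_ns = f'<soap:Body xmlns:soap="{SOAP_NS}"></soap:Body>'.encode("utf-8")
without_ns = b"<soap:Body></soap:Body>"

if __name__ == "__main__":
    # The two digests differ, so a verifier that canonicalizes the subset
    # with its inherited namespaces will reject the second one.
    print(digest_value(with_ns))
    print(digest_value(without_ns))
```

A single missing xmlns attribute is enough to change the digest completely, which is why the mismatch is hard to spot by eye.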

The second issue is still open: my signature code generates something that's different from my 'reference' message (which might be explained by slight whitespace differences in the XML subset), and indeed, EC2 refuses my SOAP request with an authentication error. So either I'm missing something else in the canonicalization, or I'm not using OpenSSL correctly to generate my rsa/sha1 signature, or XML Signature expects something slightly different from what I get with my naive openssl use (load the private key pem file with PEM_read_bio_RSAPrivateKey, then call RSA_private_encrypt(inlen, input, output, key, RSA_PKCS1_PADDING)).

So I'm left with one of: trying other openssl functions and options (RSA_sign maybe? but it seems to be intended for PKCS #1 v2.0, and not PKCS #1 v1.5), learning how to use the xmlsec library and attempting a message generation with it, or looking at the xmlsec or Axis sources to try to find out whether there's any subtlety in the rsa-sha1 signing process. Sigh.
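One candidate explanation worth checking, offered as a hypothesis and not a confirmed diagnosis: the rsa-sha1 algorithm of XML Signature refers to RSASSA-PKCS1-v1_5, where the SHA-1 digest is wrapped in an ASN.1 DigestInfo structure before padding. RSA_private_encrypt with RSA_PKCS1_PADDING only applies the padding, so feeding it the bare 20-byte digest would yield a different signature, whereas RSA_sign is documented as producing RSASSA-PKCS1-v1_5 signatures. The Python sketch below builds the block the private-key operation would be applied to; the function name and the 128-byte (1024-bit) key size are assumptions, and the RSA step itself is left out.

```python
import hashlib

# DER prefix of DigestInfo for SHA-1 (RFC 3447, section 9.2, note 1).
SHA1_DIGEST_INFO_PREFIX = bytes.fromhex("3021300906052b0e03021a05000414")


def emsa_pkcs1_v1_5_sha1(message: bytes, key_size_bytes: int) -> bytes:
    """Build the padded block that the RSA private-key operation is applied to."""
    digest_info = SHA1_DIGEST_INFO_PREFIX + hashlib.sha1(message).digest()
    padding_len = key_size_bytes - len(digest_info) - 3
    if padding_len < 8:
        raise ValueError("key too small for a SHA-1 PKCS #1 v1.5 signature")
    # Layout: 0x00 0x01 FF..FF 0x00 DigestInfo
    return b"\x00\x01" + b"\xff" * padding_len + b"\x00" + digest_info


if __name__ == "__main__":
    # Placeholder input standing in for the canonicalized SignedInfo element;
    # 128 bytes assumes a 1024-bit key, which may not match the real one.
    block = emsa_pkcs1_v1_5_sha1(b"<SignedInfo>...</SignedInfo>", 128)
    print(block.hex())
```

If the reference signature, once RSA-verified with the public key, shows the 15-byte prefix in front of the digest, the missing DigestInfo would be the culprit.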


2006/08/30

Native Unicode support in programming languages

It's much needed. Here is my humble proposal (image version here):
❖include ❝list.h❞
❖include ❰iostream❱

bool List♒printout()  
◤
   Item♂ curr;
   curr = next;

   ☁
   ◤
     ♻(curr) 
       cout ⇇ ❝ node ❞  ⇇ curr➟val ⇇ ❛⏎❜;
   ◣

   ☂ (int err)
   ◤
     ➲ ✖;
   ◣
   ➲ ✔;
◣

int main()
◤
  List♂ l = ♥ List();

  l➟printout();

  ☠ l;

  ➲ 0;
◣
APL, new generation.

2006/08/28

Useless screenshot

Somebody asked for an EC2 "screenshot".

sleepless:~ olg$ ssh -i id_rsa-gsg-keypair root@xxx.compute.amazonaws.com


         __|  __|_  )
         _|  (     / 
        ___|\___|___|

 Welcome to an EC2 Public Image
                       :-)
 
[root@domx-xx-xx-xx-xx-xx-xx ~]#    

Awfully interesting, isn't it?


2006/08/27

Stats galore

O'Reilly Labs collected statistics from 672 of their books, and made everything accessible online. There are lots of interesting ways of exploring the data: from book structure to example distribution.

My own experiment: search for the most popular tags in a given year (no direct link because the results are displayed with an ajaxy trick):

1997
files, variables, functions and arrays
1998
files, methods, text, ... web sites and HTML are also present, but farther down
1999
URLs are the big thing, with databases, files, backups, Linux and security
2000
URLs, methods, files, security and objects
2001
files, classes and methods, objects, security, servers
2002
methods, classes, functions and commands
2003
classes, methods, objects, files and configuration
2004
functions, methods, Oracle. XML, web sites, SQL Server
2005
methods, commands, functions, classes, web sites and security

Totally non-scientific, but it is quite funny to see URLs mostly visible in 1999 and 2000. It is also interesting that OOP-related tags were a bit hidden in the noise at first, but then moved clearly to the front and stayed there.


2006/08/24

Playing with Amazon EC2

Another announcement in the Amazon "WebServices" portfolio: Elastic Compute Cloud (EC2 for short) is a highly adaptable hosting environment.

I signed up for the beta; here are some notes from my (short) experience with EC2.

Don't let the first paragraph on their web page fool you: when Amazon says that it's a "web service that provides resizable compute capacity in the cloud", what they actually mean is that you use or provide a disk image for your server, choose how many instances of this image you want to run, and there's no step three. Don't worry about the web service part: you can run everything through Java command-line tools. These tools are only for administering your instances, and are unrelated to what's running on the instances themselves.

Pricing is interesting: $0.10 per instance-hour consumed (about $73 a month for an always-on server), and bandwidth cost is similar to the S3 services ($0.20 per GB). Customized images are stored on S3 ($0.15 per GB-Month).
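The $73 figure can be checked with a few lines of Python. This is just the arithmetic, spreading a 365-day year over 12 months; the function name is made up, and bandwidth and S3 storage are left out.

```python
from decimal import Decimal

HOURLY_RATE = Decimal("0.10")  # dollars per instance-hour, from the pricing above
HOURS_PER_YEAR = 24 * 365


def monthly_cost(instances: int = 1) -> Decimal:
    """Average monthly cost of always-on instances, in dollars."""
    return HOURLY_RATE * HOURS_PER_YEAR * instances / 12


if __name__ == "__main__":
    print(monthly_cost())    # one always-on instance
    print(monthly_cost(10))  # ten of them
```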

After setup (certificate generation, instance creation), you get access to a Xen-based, virtualized machine. According to Amazon, "each instance predictably provides the equivalent of a system with a 1.7Ghz Xeon CPU, 1.75GB of RAM, 160GB of local disk, and 250Mb/s of network bandwidth". I did some quick micro-benchmarks. Upload and download bandwidth seems to be in line with what I observed with S3, and the machine itself feels fast, which seems to be corroborated by various small tests.

You then just have to open the ports you need (let's say ssh and http), and log in to your new server. Yes, you're root, and you can install whatever you want (refer to the agreement for details).

Caveats: the EC2 "adaptable hosting environment" is highly dynamic. You get a lot of freedom and reactivity when deploying your instances, but it comes with constraints that are not usual in "classic" hosting environments. First, your instance is allocated dynamically when you request it, and it is assigned an IP at that point. Also, you can't use your own kernel (Xen is probably the blocker here). Then, you should now think about your instance as a new machine: when you start an instance, a machine is allocated and ready to run, but when you terminate the instance, you're not just "shutting down" the machine, you're scrapping it: all the data stored locally on the instance is lost. It is your responsibility to back up to S3 or to another instance.

Other interesting features: you can define security "groups", which are collections of access rules that control traffic to your instances (e.g. only accept connections from members of the group, only open port 80, only accept ssh access from a specific subnet). Groups can then be applied when launching a new instance. Building a custom image

This is a really interesting move from Amazon, and an interesting new piece in the puzzle, with web frameworks and resources going more and more down the stateless/replicated road. Having a hosting infrastructure where allocating a new resource can be done in five minutes is a big plus ... if the application has been designed so that it can easily take advantage of the new resource.

I'm toying with the idea of developing a small "EC2 Console" application. Too bad the REST API is not yet available.


2006/08/20

Quote of the day

We used to write algorithms. Now we call APIs. [..]

Times have changed. Welcome to a world where the programmer who knows how to tap into other people's brains and experience using the Internet has a decisive advantage.

From an old Joel on Software column.

2006/08/18

S3 Browser 0.3

Today I released v0.3 of S3 Browser (binary and source). It has some minor usability fixes (added drag-n-drop support in the objects view, double-click actions, and other small fixes). But the most important feature is that file upload and download no longer buffer everything in local memory: the data is just streamed between S3 and the filesystem.

Implementing it for file download was a piece of cake, as NSURLDownload actually does almost everything for free. You just have to give the destination for the downloaded file - et voilà.

File upload was another story: when using NSMutableURLRequest to prepare a connection you can use setHTTPBodyStream instead of setHTTPBody to send the payload.
But... using a stream means that you usually don't know the content length upfront, so Foundation will send the message as chunked, without a Content-Length header (they are mutually exclusive).

And S3 does not support file upload using chunked encoding (something I can understand). The conclusion is that you can't use a stream-based API at the NSURLRequest level to send your data if you don't want chunked transfers. That makes sense: you would need a new API, something like setHTTPBodyStream:forLength:, to specify how much you're going to send.

So I ended up implementing my own naive HTTP PUT client on top of NSInputStream/NSOutputStream and CFHTTPMessage, which works fine but does not get all the 'free' functionality available in the Foundation URL Loading system. For instance, if a web proxy is configured in system preferences, I fall back to my previous in-memory / NSURLRequest implementation.
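The idea behind the naive client, sketched in Python rather than with the Foundation and CFNetwork classes named above: announce the length up front in a Content-Length header, then copy the body in blocks so it never sits entirely in memory. The function name, host and path are hypothetical, and the headers S3 needs for authentication are deliberately left out.

```python
import io


def write_put_request(out, host: str, path: str, body, length: int,
                      block_size: int = 64 * 1024) -> int:
    """Write an HTTP/1.1 PUT with a fixed Content-Length, streaming the body.

    `out` is any writable binary stream (a socket file, for instance) and
    `body` any readable binary stream. Returns the number of body bytes sent.
    """
    head = (
        f"PUT {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {length}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    out.write(head.encode("ascii"))
    sent = 0
    while sent < length:
        # Read at most one block, and never more than was announced.
        block = body.read(min(block_size, length - sent))
        if not block:
            raise IOError("body ended before the announced Content-Length")
        out.write(block)
        sent += len(block)
    return sent


if __name__ == "__main__":
    # In-memory streams stand in for the file and the socket.
    payload = io.BytesIO(b"hello S3" * 1000)
    wire = io.BytesIO()
    write_put_request(wire, "bucket.example.com", "/key", payload, 8000,
                      block_size=1024)
    print(wire.getvalue()[:80])
```

With a real file, the length would come from the file size before the first byte is sent, which is exactly the piece of information a bare body stream does not carry.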

The plan for the next version? Improving bucket content management, first to be able to deal with very large buckets, and also to see what can be implemented to take advantage of the new "hierarchical listing of keys" feature.


2006/08/08

Ruby on Rails and Leopard (Server)

Just saw this on the OS X Leopard Server Sneak Peek page:
Internet and Web

Leopard Server also features administration for either Apache 2.2 or 1.3, MySQL 5 with Apache/MySQL/PHP integration, JBoss 4, and Tomcat 5 for hosting EJB 3.0 and J2EE 1.4-compliant enterprise applications, and Ruby on Rails with Mongrel for simplified development and deployment of web-based applications.

Insanely cool.

