All my geeky stuff ends up here. Mostly Unix-related

Sorting Certificates

As I went for a smoke the other day, I found two colleagues trying to solve a puzzle they had to code. The game is the following: first you get a list of certificates belonging to Certification Authorities. A certificate is a list of key/value pairs that are described in a notation called ASN.1, serialized to binary in a canonical encoding (DER), and then signed with a cryptographic key. Among the key/value pairs are:

  • A name for the identity corresponding to this certificate, or DN for Distinguished Name
  • A name for the entity that delivered (signed) the certificate: Issuer name
  • A serial number that is unique among the certificates delivered by this Issuer
  • Validity dates: valid from and valid until
  • … and a bunch of other fields that are irrelevant for this issue

Certificates are always delivered by a Certification Authority (CA) except the ones for Root CAs that are self-signed (or self-issued), in which case Issuer and Issuee have the same name. The way Certification Authorities work, you normally start by creating a Root CA then issue certificates for subordinate CAs (subCA) that are themselves in charge of creating their own CAs, or just issuing certificates to end-users, machines, or applications. CA hierarchies may look like this in their simplest tree-like approach:

[Figure: CA hierarchy]

Now you receive a list of unsorted certificates and you are asked to sort them so that every CA certificate has its issuing CA somewhere on its left. If there are multiple roots, they are allowed to appear anywhere in the list as long as they are left of their daughter CAs. How do you sort them?

A very straightforward approach would be re-building the CA tree. First find the Root CAs: they are easy to identify as their issuer is themselves. Then parse all remaining certificates and find the immediate daughters of the Root CAs you already have. Parse again and re-attach the next level in a tree-like structure, keeping siblings together, until no certificate is left. Once you have the tree, iterate on all root CAs, then subCAs, etc. until you reach a terminal node, i.e. a CA that has not issued CA certificates itself.

Fancy, but that requires some tree-like structures in memory that may be tricky to get right on the first attempt. I also did not like the fact that emitting CAs in a list would probably have to use recursion to remain elegant. I have very bad memories of recursive algorithms in production: I have seen stacks vaporize in flight more than once. Sure, they can be translated to iterative methods, but then forget about elegance.
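For the record, here is a minimal sketch of what the tree approach could look like without recursion, in Python. The `Cert` type and the `sort_by_tree` function are hypothetical names made up for illustration, certificates are assumed to be already parsed into subject and issuer names, and names are assumed to be unique per CA. It walks the tree level by level with a queue, which is only one of several possible traversal orders.

```python
from collections import defaultdict, deque
from typing import NamedTuple


class Cert(NamedTuple):
    """Hypothetical, already-parsed certificate: only the names matter here."""
    subject: str  # DN of the identity the certificate belongs to
    issuer: str   # DN of the CA that signed the certificate


def sort_by_tree(certs):
    """Return certs ordered so that every issuer precedes its daughters."""
    children = defaultdict(list)  # issuer DN -> certificates it signed
    roots = []
    for cert in certs:
        if cert.issuer == cert.subject:
            roots.append(cert)    # self-issued: this is a Root CA
        else:
            children[cert.issuer].append(cert)

    ordered = []
    queue = deque(roots)
    while queue:                  # breadth-first walk, no recursion needed
        cert = queue.popleft()
        ordered.append(cert)
        queue.extend(children.get(cert.subject, []))
    # Certificates whose issuer is absent from the input are never reached
    # and are silently left out of the result in this sketch.
    return ordered


if __name__ == "__main__":
    unsorted_certs = [
        Cert("Sub2", "Sub1"),
        Cert("Sub1", "Root"),
        Cert("Root", "Root"),
    ]
    print([c.subject for c in sort_by_tree(unsorted_certs)])
```

It works, but it is already a fair amount of bookkeeping for what is supposed to be a simple sort.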

My colleagues were looking into fancier ways of achieving the same result by designing some kind of clever sorting algorithm with a bit of memory to end up with a sorted list in a limited number of passes. When I joined them they had just found a sort in O(N^3). I tried to understand their method but just could not figure it out.

I thought about it for a moment and got one of those a-ha! insights:

“Guys, have you tried sorting the input list by validity date? Since a daughter CA is always younger than its parent, just sort on the valid from field.”

Problem solved.
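In Python the whole thing could boil down to a one-liner, sketched below. `DatedCert`, `sort_by_validity` and `issuers_come_first` are hypothetical names for illustration, and the certificates are assumed to be already parsed. The sort relies entirely on the assumption stated above, that a daughter CA has a later valid from date than its parent, so the sketch also includes a small check that the result really has every issuer on the left.

```python
from datetime import datetime
from typing import NamedTuple


class DatedCert(NamedTuple):
    """Hypothetical, already-parsed certificate with its valid from date."""
    subject: str
    issuer: str
    not_before: datetime  # the "valid from" field


def sort_by_validity(certs):
    """Oldest certificate first; assumes parents are older than daughters."""
    return sorted(certs, key=lambda cert: cert.not_before)


def issuers_come_first(certs):
    """Check that every non-root certificate has its issuer on its left."""
    seen = set()
    for cert in certs:
        if cert.issuer != cert.subject and cert.issuer not in seen:
            return False
        seen.add(cert.subject)
    return True


if __name__ == "__main__":
    unsorted_certs = [
        DatedCert("Sub2", "Sub1", datetime(2013, 3, 1)),
        DatedCert("Root", "Root", datetime(2010, 1, 1)),
        DatedCert("Sub1", "Root", datetime(2012, 6, 15)),
    ]
    ordered = sort_by_validity(unsorted_certs)
    print([c.subject for c in ordered])
    print(issuers_come_first(ordered))
```

If the check ever returns False, the date assumption does not hold for that set of certificates and the tree approach is the fallback.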


Written by nicolas314

Thursday 20 June 2013 at 11:31 pm
