Nicolas314

All my geeky stuff ends up here. Mostly Unix-related

Archive for the ‘programming’ Category

Put on your shoes


– Mister engineer, we are about to leave the house. Could you please lace your shoes?

– I’m afraid I can’t do that before at least next year.

– What? No! We are leaving the house right now. Tie your shoes and let’s go!

– Well, it is obvious you have not been in the shoe-lacing business for quite a while mate. See: in order to tie my shoes I’d have to get my hands closer to my feet. I see three main possibilities:

1. I lower myself down to the level of my feet (and shoes), which is dangerously close to the ground. I could trip and fall, bringing me to ground level with sufficient speed to hurt my nose, probably causing bleeding in the process. Who would want to leave blood on the floor? You don’t want me to hurt myself, do you? This would take us to a large amount of blood cleaning and nose healing, which could take a lot of time and make us both look bad in case someone on the street asks why I have a bloody nose.

2. I could bring the shoes up to my level. Considering my feet would stop touching the ground, I would have very little time to complete the movement needed to effectively tie a knot to what could be considered decent shoe-lacing. Bad knots would make us look bad, and we do not want someone to notice that we are not even able to come out on the street with properly tied shoes.

3. The third and last possibility is to wait for my feet to grow up enough so that my shoes do not fit any more. This would probably trigger some shoe-buying and shoe-replacing, which could then be put to practical use to purchase a new pair of lace-free shoes, which would then solve all the above issues once and for all.

My conclusion is that we should wait until my feet have grown enough. See you in a couple of months.

– Man, you have reached the end of my patience. Let me tie those shoes for you.

– I’m afraid I can’t let you do that, Dave. Your role as a caretaker is not to take responsibilities and do things in my stead, but to teach me to be autonomous and let me do that myself. In addition, may I let you know that I have had these shoes for a few months now and you have never laced them before in your entire life, therefore I am the only suitable person to achieve that.

– C’mere, let me do it.

– Are you questioning my authority with respect to my own shoes? When you bought them you said they were mine!

– They are still yours, let me just lace them.

– You did not understand the above mentioned points. Apologies for my poor choice of words, I always forget that English is not your native language and you may not get the full power of the most subtle nuances.

– Don’t patronize me. Just don’t.

– Oh that was never my intention. In order to patronize someone…

– WILL YOU FUCKING TIE YOUR SHOES?

– Why the harsh language? Is that really needed? I have only given you the current status and all you can do is react strongly against me. I have not invented laces, nor did I decide to place my own hands at a different altitude than my own feet. I suggest you review our options and come to your senses before we do something we might regret.

– Do you see my hand? I swear it can fly and land on your face in no time.

– Let’s not be too hasty now. I would have to inform legal of your perceived intentions and will have to quote your language. Research indicates that people in your situation have very little chance of winning a legal fight that involves strong wording and physical violence.

– … You know what? You… You just stay here, Ok?

– That’s what I have been telling you all the time. Glad you finally came to your senses mate.

Written by nicolas314

Monday 9 July 2018 at 11:08 pm

Wunder Weather

Just released this small piece of code a few days back:

https://github.com/nicolas314/wunder

I wanted to be able to bring up the weather forecast for the place I am currently visiting without having to yield my address book to a shady app, or suffer from tons of annoying ads eating through my data plan and phone storage.

The Yahoo weather app is fantastic but has too many ads. Weather web sites are incredibly data heavy, making it nearly impossible to get right to the information I am looking for: is it going to rain today or tomorrow? Expected temperatures?  Android has some ad-less widgets but they usually request GPS positioning and I’d rather not activate location services when I don’t need them.

So I hacked something. Made a web app that identifies your position by geolocating the requester’s IP address, obtains the weather forecast from a reliable source, and displays the only weather information I need on a fast loading page.

First issue: geolocate an IP address.

There are many free services on the net to achieve that. Alternatively, you can download a static list and refresh it at regular intervals, but I wanted to get something a bit more dynamic. I chose:

http://ip-api.com

Their API is dead simple and just works. Provide an IP address, get a country code, city name, latitude and longitude. You do not need to subscribe to their services, just make sure you are not choking them with too many requests.
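A minimal sketch of such a lookup in Python, assuming the JSON endpoint at http://ip-api.com/json/<ip> and its status, countryCode, city, lat and lon fields. The function names and the sample reply are made up for illustration, and this is not the code used in wunder:

```python
import json
import urllib.request


def parse_geo(payload):
    """Keep only the four values mentioned above from a JSON reply."""
    data = json.loads(payload)
    if data.get("status") != "success":
        raise ValueError("geolocation failed: %s" % data.get("message", "unknown"))
    return {
        "country": data["countryCode"],
        "city": data["city"],
        "lat": float(data["lat"]),
        "lon": float(data["lon"]),
    }


def geolocate(ip, timeout=5.0):
    """Query the service for one IP address (assumed endpoint, no API key)."""
    url = "http://ip-api.com/json/" + ip
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_geo(resp.read().decode("utf-8"))


# Hand-written reply, for illustration only (not real service output).
SAMPLE = '{"status": "success", "countryCode": "FR", "city": "Paris", "lat": 48.85, "lon": 2.35}'
print(parse_geo(SAMPLE))
```

Keeping the parsing separate from the network call makes it easy to test without hitting the service.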

Second issue: find a reliable weather source.

I first tried openweathermap.org. This is a very cool site but has a few shortcomings:

You can get the weather for a given [city, country] or [lat, lon]. The list of supported [city, country] pairs is static and can be downloaded from their web site. While they do support a lot of cities in the world, the problem was figuring out how to match [city, country] between what is returned by ip-api.com and what is understood by openweathermap.org. The matching is not 100% accurate.

Getting the weather by coordinates would work but it is far from user-friendly.  You end up with Weather forecast for location Lat=XX Lon=YY. I’d rather look up the weather for San Francisco than for a pair of coordinates that are not obviously recognizable.

I ended up looking up [city, country] by computing the smallest distance on the openweathermap list, but that is just tedious and a lot of work for very little gain.
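The smallest-distance lookup can be sketched as follows in Python. The great-circle (haversine) formula is my assumption for the distance measure, and the three-city list is a made-up stand-in for the downloadable openweathermap list:

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    radius = 6371.0  # mean Earth radius
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def nearest_city(lat, lon, cities):
    """Return the entry of `cities` closest to (lat, lon)."""
    return min(cities, key=lambda c: haversine_km(lat, lon, c["lat"], c["lon"]))


# Illustrative entries only; the real list has many thousands of cities.
CITIES = [
    {"name": "San Francisco", "country": "US", "lat": 37.77, "lon": -122.42},
    {"name": "Paris", "country": "FR", "lat": 48.85, "lon": 2.35},
    {"name": "London", "country": "GB", "lat": 51.51, "lon": -0.13},
]

print(nearest_city(37.80, -122.27, CITIES)["name"])
```

A linear scan like this is fine for one request at a time, which is about all the tedium it deserves.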

Other major issue: the weather forecast is only provided in GMT, which is utterly useless. What I want is local time, always. What do I care if I am told that it will rain from 2 to 5am GMT if I cannot relate that to local time?

Figuring out a conversion between GMT and local time is a lot trickier than it looks. Because Daylight Saving Time rules are changed at irregular intervals in various countries, it is very hard to predict the time offset in some places more than a couple of weeks ahead.

A bit of googling around revealed there is an actual API from Google Maps to convert a Unix time stamp + latitude and longitude to a local time. This API takes into account local DST rules at the considered date/time, which is exactly what we want. No need to register with Google, as usual the API is free to use and rate-limited.

Example code can be found here: https://github.com/nicolas314/tz

In summary: getting the weather from openweathermap would require:

  • One external API call to associate IP to [lat, lon]

  • A search to associate [lat, lon] to [city, country]

  • One external API call to obtain actual weather data

  • One external API call to convert GMT to local time

I have implemented that and the result is ugly. Ok let’s see if we can find something smarter.

Next try: wunderground.com

They also offer an API to obtain weather data for any place in the world and they take care of two things: converting [lat, lon] to [city, country], and converting weather forecast to local time. This is exactly what we want.

Their API can also take care of geolocating an IP address but I found their results to be a lot less reliable than what I get from ip-api.com, so I will stick to that for geolocation.

Their terms and conditions are fair. You need to register with them to obtain an API key and that’s about it. Results are delivered in metric units and can be localized in several languages. You also get a pointer to icons symbolizing the weather, which is perfect to generate a nice web page effortlessly.

Some comments about my implementation:

Results from wunderground contain a whole bunch of information I am not interested in, like temperatures in Fahrenheit. Not an issue: Go's JSON decoder ignores fields that are not declared in the target struct, so you can keep your structs small with only relevant data.

When running behind a reverse proxy, the incoming requesting IP address you see is the one for the proxy. In order to get the real incoming IP address you need to configure the reverse proxy to pass it along, usually in an HTTP header. Since I am running this service behind nginx, I get the address from X-Real-IP. That is probably different for each reverse proxy out there.
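The header lookup can be sketched like this in Python. The trusted_proxies check is my own addition, not something described above: it keeps a client that connects directly from spoofing the header.

```python
def client_ip(headers, remote_addr, trusted_proxies=("127.0.0.1", "::1")):
    """Return the real client address for a request.

    headers         -- mapping of HTTP header names to values
    remote_addr     -- address of the TCP peer (the proxy, when there is one)
    trusted_proxies -- hypothetical list of proxies allowed to set X-Real-IP
    """
    if remote_addr in trusted_proxies:
        for name, value in headers.items():
            # Header names are case-insensitive in HTTP.
            if name.lower() == "x-real-ip" and value.strip():
                return value.strip()
    return remote_addr


print(client_ip({"X-Real-IP": "203.0.113.7"}, "127.0.0.1"))
```

On the nginx side this goes together with a `proxy_set_header X-Real-IP $remote_addr;` line in the proxied location.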

Hardcoded handlers are provided to take care of requests for /favicon.ico and /robots.txt. I was tired of seeing 404 requests in my logs for these two.

Results are cached by IP address for one hour to avoid flooding upstream API services with requests. Results are displayed from a template that can easily be tweaked. The one I wrote fits nicely enough on both mobile and desktops, your mileage may vary.
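A one-hour cache keyed by IP address does not need much. Here is a rough sketch in Python; the class name and the injectable clock are my own choices for illustration, and the actual Go code may be organized differently:

```python
import time


class TTLCache:
    """Remember one result per key for `ttl_seconds`, then fetch again."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so tests can fake time
        self._entries = {}          # key -> (timestamp, value)

    def get(self, key, fetch):
        """Return the cached value for key, calling fetch(key) when stale."""
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        value = fetch(key)
        self._entries[key] = (now, value)
        return value


cache = TTLCache()
forecast = cache.get("203.0.113.7", lambda ip: "forecast for " + ip)
print(forecast)
```

Entries are never evicted here, which is acceptable for a personal service but would need a cleanup pass on a busy one.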

I installed the end result on a tiny VPS instance, for my own use. Hoping that could be useful to somebody else.

Written by nicolas314

Tuesday 16 August 2016 at 1:46 pm

Posted in go, programming

easy-rsa alternative

Glad to announce that 2cca, the two-cent Certification Authority, has now been ported to pure C with libcrypto (openssl) as its single dependency. The goal was to make it available on openwrt, since it seems pyopenssl cannot be had on this platform without a lot of effort.

As always, I swear this is the last time I ever link one of my sources against OpenSSL… until a replacement is made available.

Back to the point: you can now generate a Root CA, server, and client certificates to use with OpenVPN, with a couple of commands.

Download it from here:

https://github.com/nicolas314/2cca

Compile it with:

cc -o 2cca 2cca.c -lcrypto

Generate a root with e.g.:

2cca root O=Home CN=MyRootCA C=FR L=Paris email=postmaster@example.com

Your root is entirely defined by ca.crt and ca.key in the current directory. Its default duration is 10 years. Now that you have a root, you are going to use it to sign server and client certificates with e.g.:

2cca server CN=vpn.example.com C=FR L=Roubaix email=vpnmaster@example.com
2cca client CN=jdoe C=UK L=London email=jdoe@example.com duration=365

Your server identity is defined by vpn.example.com.crt and vpn.example.com.key. Your first client is jdoe.crt/jdoe.key.

You can verify certificates using openssl verify, e.g.:

openssl verify -CAfile ca.crt jdoe.crt

Certificate serial numbers are 128 bits long, which is large enough for them to be unique without having to keep track of an incremental index. Your certificate database is the current directory.
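To illustrate why no index is needed: assuming the serials are drawn at random (my reading, not a description of the 2cca source), a collision among 128-bit values is not something you will ever observe. A sketch in Python with made-up function names:

```python
import secrets


def new_serial():
    """Draw a random 128-bit serial number, never zero."""
    return secrets.randbits(128) or 1


def serial_hex(serial):
    """Format a serial as 32 hex digits, the usual way to display it."""
    return format(serial, "032x")


serial = new_serial()
print(serial_hex(serial))
```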

Enjoy!

Written by nicolas314

Wednesday 30 December 2015 at 10:52 pm

Posted in openvpn, openwrt, pki, programming

Easier easy-rsa

If you have ever set up an OpenVPN server, you probably had to fight your way through the certificate generation steps. Something like what is detailed here:

https://openvpn.net/index.php/open-source/documentation/miscellaneous/77-rsa-key-management.html

The official OpenVPN guide refers to easy-rsa, which is a royal pain in the butt. Even with the HOWTO in front of me, it takes me ages to set things up and if I ever have to come back later to generate more client certificates, I inevitably end up restarting from scratch because I cannot remember which steps I took and where I stored files.

Does not seem so difficult though. You need to generate a Root CA, and then use it to sign a server certificate (which is stored on your server) and client certificates which you distribute to your clients. I re-implemented the whole thing as a Python script in a couple of hours, tested it with an openvpn instance, and it works quite well. The script can be found here:

http://github.com/nicolas314/2cca

It is called two-cent CA because that is exactly what it is. There is no support for security modules like smart cards or HSMs because I do not need them, but since it is based on python-openssl it should not be too hard to make it work with P11 tokens.

Here is an example session where I create the root, a server identity, and two client identities for Alice and Bob.

$ python 2cca.py root
Give a name to your new root authority (default: Root CA)
Name: MyRoot
Which country is it located in? (default: ZZ)
Provide a 2-letter country code like US, FR, UK
Country: ZZ
Which city is it located in? (optional)
City: 
What organization is it part of? (default: Home)
Organization: Home
--- generating key pair (2048 bits)
Specify a certificate duration in days (default: 3650)
Duration: 
--- self-signing certificate
--- saving results to root.crt and root.key
done
$ python 2cca.py server
--- loading root certificate and key
Give a name to your new server (default: openvpn-server)
Name: myopenvpn-server
Which country is it located in? (default: ZZ)
Provide a 2-letter country code like US, FR, UK
Country: ZZ
Which city is it located in? (optional)
City: 
--- generating key pair (2048 bits)
Specify a certificate duration in days (default: 3650)
Duration: 
--- signing certificate with root
--- saving results to myopenvpn-server.crt and myopenvpn-server.key
$ python 2cca.py client
--- loading root certificate and key
Give a name to your new client (default: openvpn-client)
Name: Alice
Which country is it located in? (default: ZZ)
Provide a 2-letter country code like US, FR, UK
Country: UK
Which city is it located in? (optional)
City: Cambridge
--- generating key pair (2048 bits)
Specify a certificate duration in days (default: 3650)
Duration: 
--- signing certificate with root
--- saving results to Alice.crt and Alice.key
$ python 2cca.py client
--- loading root certificate and key
Give a name to your new client (default: openvpn-client)
Name: Bob
Which country is it located in? (default: ZZ)
Provide a 2-letter country code like US, FR, UK
Country: US
Which city is it located in? (optional)
City: Boston
--- generating key pair (2048 bits)
Specify a certificate duration in days (default: 3650)
Duration: 
--- signing certificate with root
--- saving results to Bob.crt and Bob.key
$ ls
2cca.py    Alice.key  Bob.key    myopenvpn-server.crt  root.crt
Alice.crt  Bob.crt    README.md  myopenvpn-server.key  root.key

You want to keep root.crt for what OpenVPN calls the CA certificate. Do not lose root.key: you will need it whenever you want to issue more client or server certificates. Install the other files as required.

Tested on Linux (Debian, Archlinux) and OSX.

Enjoy!

Written by nicolas314

Monday 28 December 2015 at 12:51 am

One-time file-sharing

Say you rent a box somewhere on the Internet. You installed Debian stable on it because you want it to be nice and stable and run a few daemons that are useful to have online. Could be to hold your vast music collection, family pictures, or use it as remote storage for backup. Imagine you wanted to share some of the files hosted on this box with your relatives, who may or may not be computer-literate. Most of them would know how to use a webmail but asking them to install an ftp client is just beyond reach. Obviously, you do not want to give these guys too many rights over your box (like an ssh access for scp). What are the solutions?

Setting up a dedicated HTTP server

Simple enough: set up an HTTP server to distribute static files. lighttpd can be set up in a couple of minutes and is very efficient for static stuff. But you do not want to distribute your files to the whole Internet. Sooner or later a web spider will crawl in and index your family pictures and all sorts of things you never meant to be public.

Next step: configure password-protection on the server

Fair enough. Now you have limited file downloads to people who know the password, provided they know how to enter a password. Do you create multiple accounts, one for each of your peers? It would be preferable, otherwise you will never know who downloaded what. But then you have to communicate their passwords to your peers and make sure they have a procedure in case they forget them. You know you are headed straight to massive butt pains.

Second issue: passwords can be shared. You shared that 2GB movie with a couple of friends and a couple of weeks later you find out that there are currently 1,549 active downloads for this file. Sharing is in human nature and that is completely Ok, but you probably did not sign up to become a content distributor over the whole Internet, only with a couple of friends and relatives.

Next step: use one-time authentication

There are better solutions out there: since you only mean to share one single file (or set of files) each time, you do not need to create accounts for your friends. You give them a one-time download token and forget about it.

A one-time download token is a URL. It looks like the kind of URLs you get from URL shorteners with the funny string at the end. Something like http://shortener/12398741

One-time tokens can be shared but since they can only be used once, the person who shared it has lost it. The token is randomly generated and invalidated immediately after it is used to avoid having robots automatically scan all possible URLs in a row until they find a valid one.
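The token life cycle fits in a few lines. This Python sketch only illustrates the idea (random token, invalidated on first use); the class and method names are hypothetical and this is not the code in the onetime repository:

```python
import secrets


class OneTimeTokens:
    """Map random tokens to file paths; each token works exactly once."""

    def __init__(self):
        self._files = {}

    def issue(self, path):
        """Create a token for `path`, to be appended to the download URL."""
        token = secrets.token_urlsafe(16)   # 128 random bits, URL-safe
        self._files[token] = path
        return token

    def redeem(self, token):
        """Return the path for a valid token and invalidate it at once.

        Unknown or already-used tokens yield None.
        """
        return self._files.pop(token, None)


store = OneTimeTokens()
token = store.issue("holidays-2013.zip")
print(store.redeem(token))   # the path, first time around
print(store.redeem(token))   # None: the token is gone
```

With that many random bits, scanning all possible URLs in a row is hopeless, and popping the entry on first use is what makes sharing a token the same as giving it away.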

There are many ways to achieve this on regular HTTP servers. Apache probably has a million configuration options for user authentication, including one-time passwords or something similar, but I have to admit I did not even try. I already wasted enough of my life in Apache config files. lighttpd can be configured to do that but the only solution I found required some Lua scripting and I did not feel up to the task.

Next step: Do It Yourself

After reviewing countless pages of configuration options for various HTTP servers, I decided that it would be shorter for me to implement this in a tiny web app rather than try and understand complex configuration options.  My first iteration made use of a Python FCGI script in web.py attached to a lighttpd process. Pointing out static files from a Python web app to the embedding lighttpd process is reasonably simple.

This implementation suffered from a number of pitfalls though. For one thing, performance was bad. For some reason, the Python process would eat insane amounts of CPU and RAM when sending big files, slowing down the server to a crawl. The second showstopper was the complexity involved for such a simple setup. I had to write a Python script to generate the lighttpd configuration file with a number of deployment options: where to put config files, log files, static files, port number, etc. And then came the inevitable issues with dependencies: Python version versus web.py version versus lighttpd version. Some combinations worked fine, some did not. Nothing specific to Python or lighttpd, but the more gears you have, the more places there are for grains of sand to get in.

I still survived with this setup for a year or so, when Go came in. I have already reviewed the language in the past and will not come back to that, but suffice it to say that developing HTTP servers in Go is the most natural thing. Adding the one-time token ingredient to the soup was implemented in just one evening.

Once rewritten in Go, I found out that the end result was just about as big as the Python implementation, excluding the script that created the lighttpd config. The main difference was of course that I do not have to maintain cross-references between package versions for Python, lighttpd, and web.py, since there is only one dependency to cover: Go itself.

It was straightforward to enhance the program to support more options, respond to favicon, and handle a JSON-readable database for active tokens.  Performance is astounding. The serving Go process never takes more than a few megs of RAM (about the size of the executable itself) and only uses tiny amounts of CPU since the process is mostly I/O based anyway.

There is one thing I should have foreseen and had to re-implement. I am sending the one-time links by email and more and more people are reading their emails from their smartphone or tablet. Many just clicked the link without thinking twice, triggering a 2-4GB download and killing both their mobile and data plan at the same time. Wrong move.

The next version features a two-time download page: the first link sends users to a page summarizing the download, offering a second link to actually start the real thing with a big warning about the size of what will actually be sent.

There are many other features I would like to add to the current version, and I am hoping other people have better ideas for new features, which is the reason why I shared it on github. Find it here:

https://github.com/nicolas314/onetime

Since we are talking about sharing private data between friends and relatives, protecting the download may be a good idea. A recently added feature was support for HTTPS. You only need to point your config to server certificate and key files and off you go. The HTTP/HTTPS thing is completely handled by Go.

The resulting program is far from top-quality but it fulfils the needs. Go give it a try if you want to. Careful though: it will only work on Linux boxes for now.

Written by nicolas314

Wednesday 24 July 2013 at 9:18 pm

Posted in fun, go, programming, webapp

Starbugs

The night was clear, we would not be playing Bomberman in the VLT control room that night. Clear skies and a sub-arcsecond seeing meant we would have a full batch of data to process every hour or so until the next morning. Once the calibrations had finished, the telescope operator launched the first observation. I re-compiled the whole processing software once more, just to be sure we had not forgotten anything, ran a series of unit tests for good measure, and waited in front of my screen for the first incoming set of frames to appear on the local disk.

First batch of sixty frames was completed after exactly sixty minutes. As the machine started doing its number-crunching, everybody in the room turned to me, waiting for the first processed image to come out. It took a good fifteen minutes for all algorithms to run through the set: calibrate all frames, remove the infrared sky, take into account bad and crazy pixels that have been hit with cosmic rays during the observation, register all frames to a common position, and finally stack them to a single image. The final result appeared on the screen above me and I could see smiles all around. It seemed the results were up to what my customers were expecting.

Now we had a clear image of a set of bright objects against a dark background. In order to assess how much infrared light is emitted by each object, it needs to be calibrated. Somewhere on the image is a standard: a star with precisely known photometry in the wavelength we had been observing. Compute how many photons were received in this image from this star and you can deduce the magnitudes for all other objects present on the same frame.
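The arithmetic behind that deduction is the standard magnitude relation m = m_std - 2.5 log10(F / F_std), where F is the number of counts integrated for an object. A bare-bones sketch in Python, which ignores everything a real pipeline has to care about (sky subtraction, aperture choice, atmospheric extinction); the function names and count values are made up:

```python
import math


def zero_point(std_mag, std_counts):
    """Zero point of the frame, from the standard star's known magnitude."""
    return std_mag + 2.5 * math.log10(std_counts)


def magnitude(counts, zp):
    """Magnitude of any other object on the same frame."""
    return zp - 2.5 * math.log10(counts)


# Illustrative numbers: a K = 10.470 standard measured at 50000 counts.
zp = zero_point(10.470, 50000.0)
# An object ten times fainter sits 2.5 magnitudes higher.
print(round(magnitude(5000.0, zp), 3))
```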

I checked once more the final frame position on the sky and then launched the photometry calibration routine. The standard star was found and identified by name, its photometry computed by integrating all received light in a small surrounding radius, and then all objects in the frame were suddenly known by magnitude rather than number of photons. Perfect score! With a sigh of relief, I finally pushed myself away from the desk and reached for some water. The memory routines had done their job, we did not crash in flight for lack of RAM this time. Eleven more hours to go and then we could all go to sleep.

Next incoming data batch was processed just fine. Another image emerged. And then another one. It seemed everything was working perfectly fine.

Around midnight, something weird happened: the result image was correctly processed but photometry calibration failed because it found no standard star in the frame.

– What? Emilio, did you include a standard in the last observation?
– Let me check… Yes I did. You should have it somewhere around the top-right corner.

The standard star was indeed there, so why did the photometry calibration routine fail to find it?
I immediately opened the database we had for infrared standards and started searching frantically for the star, finding it immediately. I reached for the debugger and re-ran the whole routine once more with breakpoints. Confirmed: the search for standard stars in this region returned nothing, and yet the database was correctly loaded and completely in memory. The debugger showed what seemed like correct values for star positions, but the search function failed for some reason.

The star database we had was pretty simple: a simple text file containing named columns: first the star name, then its position on the sky as Right Ascension and Declination (a couple of angles), and then its magnitude at various wavelengths. Something like:

# Name | Ra         |  Dec      | Sp |  J     |    H   |   K    
AS01-0 | 00 55 09.9 |  00 43 13 | -- | 10.716 | 10.507 | 10.470 
AS03-0 | 01 04 21.6 |  04 13 39 | -- | 12.606 | 12.729 | 12.827 
AS04-1 | 01 54 43.4 |  00 43 59 | -- | 12.371 | 12.033 | 11.962 
AS05-0 | 02 30 16.4 |  05 15 52 | -- | 13.232 | 13.314 | 13.381 
AS05-1 | 02 30 18.6 |  05 16 42 | -- | 14.350 | 13.663 | 13.507 
AS07-0 | 02 57 21.2 |  00 18 39 | -- | 11.105 | 10.977 | 10.946 
AS10-0 | 04 52 58.9 | -00 14 41 | -- | 11.349 | 11.281 | 11.259 
AS13-1 | 05 57 10.4 |  00 01 38 | -- | 12.201 | 11.781 | 11.648 
AS13-1 | 05 57 09.5 |  00 01 50 | -- | 12.521 | 12.101 | 11.970 
AS13-3 | 05 57 08.0 |  00 00 07 | -- | 13.345 | 12.964 | 12.812 
AS15-0 | 06 40 34.3 |  09 19 13 | -- | 10.874 | 10.669 | 12.628 
AS15-1 | 06 40 36.2 |  09 18 60 | -- | 12.656 | 11.980 | 11.792 
AS15-2 | 06 40 37.9 |  09 18 41 | -- | 13.711 | 12.927 | 12.719 
AS15-3 | 06 40 37.9 |  09 18 19 | -- | 14.320 | 13.667 | 13.415 
AS16-0 | 07 24 15.3 | -00 32 50 | -- | 14.159 | 14.111 | 13.305 
AS16-1 | 07 24 14.3 | -00 33 05 | -- | 13.761 | 13.638 | 13.606 
AS16-2 | 07 24 15.4 | -00 32 49 | -- | 11.411 | 11.428 | 11.445 
AS16-3 | 07 24 17.2 | -00 32 27 | -- | 13.891 | 13.855 | 13.818 
AS16-4 | 07 24 17.5 | -00 33 07 | -- | 11.402 | 11.106 | 11.043

J, H, K are infrared bands corresponding to a relatively narrow wavelength range.

Something went wrong in the star-loading routine, so I loaded the whole set into memory once more and dumped it back to a text file to plot it. The results were not particularly obvious:

[Figure: plot of the positions of all stars in the catalog]

Somebody in the room came up to the screen and asked what we were looking at. I said: “these are the positions of all known infrared standards we have. For some reason we cannot find tonight’s star in here.”

Looking at it again, I found our star. It was not in the right position. It should have been below the x axis but had shifted symmetrically above it. Looking at the data set again, the Declination was indeed negative: something like -00 14 41, but it was plotted on the wrong side of the x axis.

And then it dawned on me: the star was plotted at +00 14 41 instead of -00 14 41.

How do you read numeric data in C? Using scanf(). When you scanf() for “-00”, what do you think ends up in memory? Zero. Positive zero, since it is technically the same as negative zero. Except the angle has now been flipped around the x axis.

Right: plotting a denser set of stars revealed a clear white patch for Declinations between zero and minus one. I had just forgotten to take into account the first character as a sign since scanf() does not make any difference between “00” and “-00”. Once I corrected the database-loading line, everything fell into place and photometry computations could take place as expected.
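The trap is not specific to C. Here it is reproduced in Python rather than with scanf(), as an illustration only: the function names are made up and the parsing is reduced to a single Declination field such as "-00 14 41".

```python
def dec_naive(text):
    """Buggy: takes the sign from the parsed degrees, so '-00' loses it."""
    d, m, s = text.split()
    deg = int(d)                      # int("-00") is plain 0
    sign = -1 if deg < 0 else 1       # ...so this test never sees the minus
    return sign * (abs(deg) + int(m) / 60 + float(s) / 3600)


def dec_fixed(text):
    """Correct: reads the sign from the first character of the field."""
    d, m, s = text.split()
    sign = -1 if d.startswith("-") else 1
    return sign * (abs(int(d)) + int(m) / 60 + float(s) / 3600)


print(dec_naive("-00 14 41") > 0)    # flipped to the wrong side of the axis
print(dec_fixed("-00 14 41") < 0)    # where the star actually is
```

Both versions agree everywhere except in the band between zero and minus one degree, which is exactly where the white patch showed up.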

Interestingly enough, it seems the same bug hit a large number of GPS devices over the past years. The German C’t magazine told the story a few years back about somebody who planned a bike tour around Bordeaux and ended up with intermediate points in the middle of the ocean. Bordeaux is located around longitude zero (Greenwich), so you do have data points located at an angle that starts with -00. In effect, you could see all points correctly plotted on the map except for the ones located between zero and minus one degree, which flipped over the other side of the meridian. As soon as I saw the map I knew exactly what had happened.

At least the guy was clever enough not to bike into high waters. It could have been worse: though probably related to time manipulation errors rather than angles, you may want to read how F22 Raptors spontaneously rebooted upon crossing the international date line:

F22 Raptor gets zapped by international date line

There are some assumptions you should not make about handling time in software. Some of them are presented in this blog article:

Falsehoods programmers believe about time

Time and angles can be tricky scalars.

Written by nicolas314

Wednesday 26 June 2013 at 12:39 am

Sorting Certificates

As I went for a smoke the other day, I found two colleagues trying to solve a puzzle they had to code. The game is the following: first you get a list of certificates belonging to Certification Authorities. A certificate is a list of key/value pairs that are expressed in a canonical way in binary (in a format called ASN.1) and then signed with a cryptographic key. Among the key/value pairs are:

  • A name for the identity corresponding to this certificate, or DN for Distinguished Name
  • A name for the entity that delivered (signed) the certificate: Issuer name
  • A serial number that is unique for this Issuer+Certificate
  • Validity dates: valid from and valid until
  • … and a bunch of other fields that are irrelevant for this issue

Certificates are always delivered by a Certification Authority (CA) except the ones for Root CAs that are self-signed (or self-issued), in which case Issuer and Issuee have the same name. The way Certification Authorities work, you normally start by creating a Root CA then issue certificates for subordinate CAs (subCA) that are themselves in charge of creating their own CAs, or just issuing certificates to end-users, machines, or applications. CA hierarchies may look like this in their simplest tree-like approach:

[Figure: CA hierarchy]

Now you received a list of unsorted certificates and you are asked to sort them out so that any CA certificate must have its Issuing Root CA on its left. If there are multiple roots, they are allowed to appear anywhere in the list as long as they are left of their daughter CAs. How do you sort them?

A very straightforward approach would be re-building the CA tree. Find out Root CAs: they are easy to identify as their issuer is themselves. Then parse all remaining certificates and find the immediate daughters for Root CAs you already have. Parse again and re-attach in a tree-like structure, sorting siblings together. Once you have a sorted tree, iterate on all root CAs, then subCAs, etc. until you reach a terminal node, i.e. a CA that has not issued CA certificates itself.

Fancy, but that requires some tree-like structures in memory that may be tricky to get right on the first attempt. I also did not like the fact that emitting CAs in a list would probably have to use recursion to remain elegant. I have very bad memories of recursive algorithms in production, I have seen stacks vaporize in flight more than once. Sure, they can be translated to iterative methods but then forget about elegance.

My colleagues were looking into fancier ways of achieving the same result by designing some kind of clever sorting algorithm with a bit of memory to end up with a sorted list in a limited number of passes. When I joined them they had just found a sort in O(N^3). I tried to understand their method but just could not figure it out.

I thought about it for a moment and got one of these a-ha! insights:

“Guys, have you tried sorting the input list by validity date? Since a daughter CA is always younger than its parent, just sort on the valid from field.”

Problem solved.
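
The trick can be sketched in a few lines of Go. This is only an illustration, not the code we ended up writing: the helper names (sortByValidFrom, sampleCerts, commonNames) and the three made-up certificates are assumptions, and only the fields needed for the demonstration are filled in. It also relies on the same assumption as the insight itself: a daughter certificate always has a later "valid from" date than its parent, which would break if a parent certificate were re-issued after its daughters.

```go
package main

import (
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"sort"
	"time"
)

// sortByValidFrom orders certificates by their "valid from" date (NotBefore).
// Assumption: every CA certificate was issued after its parent's certificate.
func sortByValidFrom(certs []*x509.Certificate) {
	sort.SliceStable(certs, func(i, j int) bool {
		return certs[i].NotBefore.Before(certs[j].NotBefore)
	})
}

// sampleCerts builds a made-up, unsorted hierarchy (hypothetical data).
func sampleCerts() []*x509.Certificate {
	jan1 := func(year int) time.Time {
		return time.Date(year, time.January, 1, 0, 0, 0, 0, time.UTC)
	}
	mk := func(subject, issuer string, from time.Time) *x509.Certificate {
		return &x509.Certificate{
			Subject:   pkix.Name{CommonName: subject},
			Issuer:    pkix.Name{CommonName: issuer},
			NotBefore: from,
		}
	}
	return []*x509.Certificate{
		mk("SubSubCA", "SubCA", jan1(2012)),
		mk("SubCA", "RootCA", jan1(2011)),
		mk("RootCA", "RootCA", jan1(2010)), // self-issued root
	}
}

// commonNames extracts the Subject Common Name of each certificate, in order.
func commonNames(certs []*x509.Certificate) []string {
	names := make([]string, 0, len(certs))
	for _, c := range certs {
		names = append(names, c.Subject.CommonName)
	}
	return names
}

func main() {
	certs := sampleCerts()
	sortByValidFrom(certs)
	fmt.Println(commonNames(certs)) // parents now appear left of their daughters
}
```

No tree, no recursion: one call to the standard sort package does the job as long as the date assumption holds.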

Written by nicolas314

Thursday 20 June 2013 at 11:31 pm

C++ quotes


Best C++ quotes ever: C++ is good for the Economy, it creates jobs!

I like the proposed alternatives:

  • C
  • Go
  • Throwing yourself in an active volcano

Written by nicolas314

Tuesday 22 May 2012 at 11:32 am

Posted in fun, programming


Go recipe: 3DES


Just posted a really basic example of 3DES encryption with Go. Check it out on github: https://github.com/nicolas314/go-recipes

See godes.go

Can’t remember where I got the test vectors from, probably an RFC. Did not invent them myself.
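
For readers who do not want to click through, here is a minimal sketch of what 3DES looks like with the standard library. It is not the content of godes.go: the CBC mode, the function names encrypt3DES and decrypt3DES, and the key, IV and message in main are all assumptions chosen for illustration, and padding is deliberately left out.

```go
package main

import (
	"crypto/cipher"
	"crypto/des"
	"errors"
	"fmt"
)

// checkSizes validates the IV and data lengths for CBC without padding.
func checkSizes(iv, data []byte) error {
	if len(iv) != des.BlockSize {
		return errors.New("IV must be 8 bytes")
	}
	if len(data)%des.BlockSize != 0 {
		return errors.New("data length must be a multiple of 8 bytes")
	}
	return nil
}

// encrypt3DES encrypts plaintext with 3DES in CBC mode (24-byte key).
func encrypt3DES(key, iv, plaintext []byte) ([]byte, error) {
	block, err := des.NewTripleDESCipher(key)
	if err != nil {
		return nil, err
	}
	if err := checkSizes(iv, plaintext); err != nil {
		return nil, err
	}
	out := make([]byte, len(plaintext))
	cipher.NewCBCEncrypter(block, iv).CryptBlocks(out, plaintext)
	return out, nil
}

// decrypt3DES reverses encrypt3DES with the same key and IV.
func decrypt3DES(key, iv, ciphertext []byte) ([]byte, error) {
	block, err := des.NewTripleDESCipher(key)
	if err != nil {
		return nil, err
	}
	if err := checkSizes(iv, ciphertext); err != nil {
		return nil, err
	}
	out := make([]byte, len(ciphertext))
	cipher.NewCBCDecrypter(block, iv).CryptBlocks(out, ciphertext)
	return out, nil
}

func main() {
	key := []byte("0123456789abcdefghijklmn") // 24 bytes, demo value only
	iv := []byte("initvect")                  // 8 bytes, demo value only
	msg := []byte("16 bytes of text")         // two full DES blocks

	ct, err := encrypt3DES(key, iv, msg)
	if err != nil {
		fmt.Println("encrypt:", err)
		return
	}
	pt, err := decrypt3DES(key, iv, ct)
	if err != nil {
		fmt.Println("decrypt:", err)
		return
	}
	fmt.Printf("ciphertext: %x\nplaintext:  %s\n", ct, pt)
}
```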

Written by nicolas314

Thursday 10 May 2012 at 7:39 pm

Posted in go, programming


Die C++, Die


Why should I have written ZeroMQ in C, not C++
http://www.250bpm.com/blog:4

We now have official confirmation that the C++ standardization body is made of extra-terrestrial beings who came to Earth to prevent humanity from reaching the singularity stage. By creating C++ and making it appear as “the language of tomorrow”, they succeeded in stalling software engineering in a Stone Age it will never quite recover from.

The above post is just a late realization of that fact.

Written by nicolas314

Thursday 10 May 2012 at 4:22 pm

Posted in programming


Go language review



Go is a relatively new language: designed in 2007, it was just released last month in version 1.0 under a BSD license. Go was invented at Google by Robert Griesemer, Rob Pike and Ken Thompson. Thompson is mostly known for his work on the B and C languages at Bell Labs when Unix was born and this is quite obvious in many of the language design decisions.

In a nutshell: Go is a compiled language meant for systems programming. It is part of the C family and is meant to be a 21st-century C with everything you would expect from a recent programming language. Think of it as what C++ should have been if it had not been taken over by an army of mad lobbyists, or if Java had been designed with efficiency in mind.

Go was written for Unix: Linux, BSD, OSX. A courageous team of volunteers apparently ported it to Windows.

Compared to its C ancestor, Go comes with a large number of features that draw from 40 years of experience with C. Every sore point in C has been addressed and an elegant solution proposed. Let me review the main points:

Fast compiler

Go’s compiler is fast. Small projects compile in milliseconds, not seconds, which in effect gives the feel of an interpreted language. Launching a full compilation for a small project and running the resulting (native) executable takes less time than starting a Python interpreter. Various benchmarks show that generated code is close to C in terms of speed.

Impressive standard library

The standard Go library comes with a large number of utilities for everything you may need at a system level: string processing, web services, crypto primitives, image processing, etc. Far from a CPAN but you already have quite a lot to get started. If all you are looking for is a solid base to start webapp programming, you have all you need.

Enhancements compared to C

  • Go has no pre-processor. Yay!
  • Functions can return multiple values, which enables returning an error status out of band. This simplifies enormously error handling in low-level libraries.
  • Type/variable is inverted: name the variable first then give its type. This feels weird at first but it avoids a number of common mistakes like:
    int* p1, p2 ;

    where p1 is a pointer to int but p2 is just an int.

  • Switch statements do not fall through: after a case you exit the switch. This is probably one of the most common errors encountered in C.
  • Mandatory braces after if, for. Yet another common C error.
  • Native slices (growable lists) and maps
  • Native (immutable) strings and arrays with bounds checking. Forget buffer overflows!
  • Pointers but no pointer arithmetic, i.e. get the full power of pointers without the dirty tricks.
  • Garbage collection. This does not prevent you from doing memory management but at least you get some help at the language level.
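
A few of the points above are easier to judge with code in front of you. The following is a small sketch, not taken from any real project: divide, describe and at are hypothetical names made up for the illustration. It shows multiple return values used for error status, a switch that does not fall through, name-then-type declarations, a native map, and a bounds-checked slice access.

```go
package main

import (
	"errors"
	"fmt"
)

// divide returns the quotient plus an out-of-band error status.
func divide(a, b int) (int, error) {
	if b == 0 {
		return 0, errors.New("division by zero")
	}
	return a / b, nil
}

// describe shows that a matching case does not fall through to the next one.
func describe(n int) string {
	s := ""
	switch n {
	case 1:
		s += "one"
	case 2:
		s += "two"
	default:
		s += "many"
	}
	return s
}

// at returns xs[i]. An out-of-range index triggers a runtime panic instead of
// reading stray memory; the deferred recover turns that panic into an error.
func at(xs []int, i int) (v int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return xs[i], nil
}

func main() {
	var p1, p2 *int // name first, then type: both p1 and p2 are pointers
	fmt.Println(p1 == nil, p2 == nil)

	if _, err := divide(1, 0); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println(describe(1))

	ages := map[string]int{"go": 1} // built-in map type
	ages["c"] = 40
	fmt.Println(len(ages))

	if _, err := at([]int{1, 2, 3}, 5); err != nil {
		fmt.Println(err)
	}
}
```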

Enhancements compared to Java and C++

  • Go has exceptions (panic and recover) but reserves them to… exceptional cases. No more flow-control through try/catch! Returning an error status from every function that needs to signal an error to its caller is much easier to handle and does not break programs in random places.
  • Go allows you to define methods on struct types but does not have the notion of classes or inheritance. This still allows the benefits of storing related data and methods into single objects without having to suffer from inheritance pitfalls.
  • Polymorphism: Go has interfaces but does not force the programmer to declare who implements what. This is handled automatically by the compiler (as it should be). It promotes duck-typing while preserving strong typing.
  • Package names look like URLs instead of Java’s infamous reverse notation. You can import remote code with something you can understand at first glance:
    import "git.example.com/user/project"

    instead of

    import com.example.project.user.module
  • No operator or function overloading. I have seen projects fail because developers abused these in C++. What you read is what you run!
  • Anonymous functions and closures: quite handy for callbacks and lots of other functional tricks
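
Here is a short sketch of three of these points: methods on struct types, interfaces satisfied without any declaration, and closures. The types Shape, Rect and Square and the functions totalArea and counter are invented for the example.

```go
package main

import "fmt"

// Shape is satisfied by any type that has an Area method.
// Nothing has to declare that it "implements" Shape.
type Shape interface {
	Area() float64
}

// Rect and Square are plain structs with methods, no classes or inheritance.
type Rect struct{ W, H float64 }

func (r Rect) Area() float64 { return r.W * r.H }

type Square struct{ Side float64 }

func (s Square) Area() float64 { return s.Side * s.Side }

// totalArea accepts anything that behaves like a Shape.
func totalArea(shapes ...Shape) float64 {
	sum := 0.0
	for _, s := range shapes {
		sum += s.Area()
	}
	return sum
}

// counter returns a closure that remembers n between calls.
func counter() func() int {
	n := 0
	return func() int {
		n++
		return n
	}
}

func main() {
	fmt.Println(totalArea(Rect{2, 3}, Square{2}))

	next := counter()
	next()
	fmt.Println(next())
}
```

The compiler checks at build time that Rect and Square have the right method set, so the duck-typing costs nothing in type safety.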

Concurrency handled through channels

This is a point I did not have enough time to study. Go programs are inherently concurrent: they run lightweight threads (goroutines) and handle concurrency by defining blocking interfaces (channels) between them, which reduces the need for mutexes. More on that topic as soon as I have time to play with it.
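
As a first taste, here is a minimal sketch of two goroutines talking over channels. The names worker and sumSquares are made up for the example, and real programs would usually add more workers and some error handling.

```go
package main

import "fmt"

// worker squares every number received on in, sends it on out,
// and closes out once in has been closed and drained.
func worker(in <-chan int, out chan<- int) {
	for n := range in {
		out <- n * n
	}
	close(out)
}

// sumSquares feeds nums to a worker goroutine and adds up the results.
// The unbuffered channels block until both sides are ready, so no mutex
// is needed to protect shared state.
func sumSquares(nums []int) int {
	in := make(chan int)
	out := make(chan int)

	go worker(in, out)
	go func() {
		for _, n := range nums {
			in <- n
		}
		close(in)
	}()

	total := 0
	for sq := range out {
		total += sq
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3}))
}
```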

And more…

Go also includes a documentation tool, a code-formatting tool, integration with the most common Version Control tools (SVN, git, Mercurial), and a couple of compiler candidates. It can be debugged using gdb and does not need Makefiles to build packages from source. All things you would expect from a modern programming language.

So what is Go good for?

Everything! It is easy to see a million opportunities for a language that has all the features of a high-level scripting language and the benefits of a strongly-typed compiled language.

The only thing missing for now is a cookbook, and it is a crucial gap. The standard library comes with minimal documentation that could really use a more comprehensive tour. Or maybe it is just that searching Google for ‘Go’ is obviously bound to return a ton of irrelevant results?

Written by nicolas314

Monday 30 April 2012 at 6:07 pm

Posted in go, programming


Go recipes on github


A blog is not an ideal place to post code. Whatever Go snippets I have written or found interesting will be placed on github:

https://github.com/nicolas314/go-recipes

See you there.

Written by nicolas314

Thursday 26 April 2012 at 11:10 pm

Posted in go, programming


Go recipe: HTTPS server


Go recipe: implement an HTTPS server that requires a client-side certificate for authentication but does not check certificate origin. Any client-side cert will be accepted, and its Subject Common Name is printed upon visiting the page. Start the program and point your browser to https://localhost:4443

Generating a server key and cert is left as an exercise for the reader :-)

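package main

import (
    "crypto/tls"
    "fmt"
    "net/http"
)

// Hello greets the client and prints the Common Name of its certificate.
func Hello(w http.ResponseWriter, req *http.Request) {
    w.Header().Set("Content-Type", "text/plain")
    fmt.Fprintf(w, "Hello\n")
    // RequireAnyClientCert should guarantee a certificate is present,
    // but guard anyway so the handler can never panic on an empty list.
    if req.TLS == nil || len(req.TLS.PeerCertificates) == 0 {
        fmt.Fprintf(w, "No client certificate received\n")
        return
    }
    clientCert := req.TLS.PeerCertificates[0]
    fmt.Fprintf(w, "You are: %s\n", clientCert.Subject.CommonName)
}

func main() {
    http.HandleFunc("/", Hello)
    s := &http.Server{
        Addr: ":4443",
        // Demand a client certificate but do not verify who issued it.
        TLSConfig: &tls.Config{ClientAuth: tls.RequireAnyClientCert},
    }
    fmt.Println("Listening on 4443...")
    // server.crt and server.key must exist in the working directory.
    if err := s.ListenAndServeTLS("server.crt", "server.key"); err != nil {
        fmt.Printf("err: %s\n", err)
    }
}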

Written by nicolas314

Tuesday 24 April 2012 at 11:02 pm

Posted in go, programming

Go recipe: hashes


Playing with Go: here is a simple implementation that computes crypto hashes of a list of files provided on the command-line. Usage:

ghash md5|sha1|sha2|sha5 filenames...

package main

import (
    "crypto/md5"
    "crypto/sha1"
    "crypto/sha256"
    "crypto/sha512"
    "errors"
    "fmt"
    "hash"
    "io"
    "os"
)

// Hash returns the hex digest of filename computed with the named algorithm.
func Hash(hashname string, filename string) (string, error) {
    var h hash.Hash

    // Pick the hash first so that an unknown name never leaves a file open.
    switch hashname {
    case "md5":
        h = md5.New()
    case "sha1":
        h = sha1.New()
    case "sha2", "sha256":
        h = sha256.New()
    case "sha5", "sha512":
        h = sha512.New()
    default:
        return "", errors.New("unknown hash: " + hashname)
    }

    f, err := os.Open(filename)
    if err != nil {
        return "", err
    }
    defer f.Close()

    // io.Copy streams the whole file into the hash and reports read errors.
    // Unlike a hand-written loop, it does not mistake a short read for EOF.
    if _, err = io.Copy(h, f); err != nil {
        return "", err
    }
    return fmt.Sprintf("%x", h.Sum(nil)), nil
}

func main() {
    if len(os.Args)<3 {
        fmt.Println("use: md5|sha1|sha2|sha5 filenames...")
        return
    }
    for i:=2 ; i<len(os.Args) ; i++ {
        digest, err := Hash(os.Args[1], os.Args[i])
        if err!=nil {
            fmt.Println(err)
        } else {
            fmt.Printf("%s  %s\n", digest, os.Args[i])
        }
    }
}

Written by nicolas314

Tuesday 24 April 2012 at 10:56 pm

Posted in go, programming

Daily Fallacies


To the question: “Why do we do things this way?”, you often get three different kinds of answer:

Argument by Longevity
“Because we have always done things this way”, also known as “It is known.”
Argument by Numbers
“Because everybody does it this way, there must be a reason.”
Argument by Authority
“Because the experts say it must be done so.”

These arguments are examples of fallacies. A large number of examples and counter-examples are provided on the Wikipedia page; I will merely provide some here:

  • Longevity: Humanity has survived 100,000 years without need for toothbrush, therefore toothbrushes are useless for human existence.
  • Numbers: Most people believe that if you toss a coin 10 times and heads come out 10 times, the next toss has a lower chance of coming up heads. Therefore it must be true.
  • Authority: My favourite singer votes for candidate X, therefore I will vote for candidate X.

A fallacy arises when there is no logical link between the premises and the conclusion, so the conclusion is wrongly acquired. A statistical fact is independent of belief: it is a mathematical truth that can be demonstrated. Adding more people or more time does not influence the demonstration in any way. Somebody claiming to be a math expert and declaring that the 11th toss has anything other than a 50/50 chance could easily be proven wrong.

A science experiment on monkeys was apparently carried out in 1967 (Did the monkey banana and water spray experiment ever take place?). The experiment could be summarized as:

Five monkeys were locked in a cage with a banana hanging from the ceiling. Whenever a monkey tried to get the banana they were all sprayed with ice-cold water. After a while they stopped trying. Next step: they replaced one of the monkeys. The newcomer tried to get to the banana but the others would beat him up before he had a chance, knowing perfectly well that touching the banana triggered a cold shower. The researchers kept replacing monkeys one by one until the ones left had never been sprayed with cold water but kept beating up any newcomer who would dare get close to the banana.

The story is often told to illustrate why large organizations tend to crystallize around age-old processes, even when these stopped making sense a while ago. In fact I have seen it happen in nearly every company I have ever visited, no matter how big or small, in a dozen countries in Europe, Africa, and the Americas.

Who has never spent half an hour filling in expense forms for sums far smaller than the cost of the time people spend filling in and processing those forms? Why should it be done otherwise? The rules are the same for a one-euro or a thousand-euro expense, everybody has always done it this way and nobody ever complained. I know only one company that paid systematic, fixed lump sums for travel expenses. You only had to fill in forms if you could justify spending more than the allowance.

Arguing from authority is quite common in the workplace. An example would be external consultants who come in to observe for a day or two, go back to their office and end up sending a large report describing how work processes should be changed, without offering any alternative or trying to justify their suggestions. I have seen that happen more times than I am willing to admit.

I do not believe an expert just because he declares himself such. Experts are knowledgeable people, so what I expect from them are clear arguments, new data, and demonstrations. I need to follow the same path if I want to end up with the same conclusions. Of course, a talented and biased expert could provide only the data that supports the point he is trying to make. This is another kind of fallacy and it is pretty hard to detect as soon as things get outside your own fields of expertise.

Fallacies are common, they are everywhere. We all do it because it is easier than a completely logical train of thought that requires mental exercise. They tend to convince people easily, especially when they end up with a correct conclusion. Example: “Most people these days brush their teeth, therefore it is a good thing. Experts will confirm this.”

Yes, dentists will confirm it. And brushing your teeth is definitely good for your health. But you should do it because it has been proven that it helps you keep your real teeth longer and suffer less from cavities, not just because lots of people do it. Proven? Quite a bit: do a bit of search for yourself (hint: use the tubes, Luke), though this will be left as an exercise to the reader.

Coming back to my field of expertise, fallacies in software engineering are a dime a dozen.

  • A large majority of developers can code in Java, therefore my project should be coded in Java.
  • Many programs I use daily are coded in C++, therefore my project should be coded in C++.
  • 95% of desktop users are running Windows, therefore I should use Windows.
  • CORBA has been designed by a panel of experts, therefore CORBA is an expert-level technology, I should use it in my project.
  • Software is made of lines of code, therefore coding is all that matters in a software project. A programmer’s productivity is measured in the number of lines s/he writes every day.

A closely-related fallacious trait commonly found among young software engineers:

  1. My program crashes
  2. My code is bug-free
  3. Therefore: the environment around my code is faulty

I once coached an intern who told me: “I double-checked my programs and there are no faults, but the binary crashes every time I launch it so the compiler must have introduced bugs in it.” Other young colleagues have found numerous bugs in interpreters, libraries, the operating system, and when all else fails: blame the incompetent user. Yep, all these things can have bugs, but maybe you should first look into the most recent element added to the system, i.e. your own code, before looking into other directions.

Not a day goes by without meeting a young engineer who lectures me about software engineering, usually answering my “why do you do it this way?” by “this is the way we have always done software here”. I am ready to accept any kind of debatable argument but not the longevity fallacy coming from people who have not defined the rules themselves. If things have always happened this way, what was the reason for it initially? Is it still valid today? Are we in the same context now? Why can’t we touch the frigging banana?

The only lesson I got from this over time is: pick the banana, always. Best scenario: you now have a banana. Worst-case scenario: now you know exactly why nobody touched it, even though nobody could explain it to you before.

Written by nicolas314

Wednesday 26 October 2011 at 11:12 pm

Programming for kids


Interesting links for kids who want to learn programming:

Scratch from MIT
Scratch is meant just for that: provide a first approach to programming. Completely visual and exists in multiple (human) languages.
Processing
Processing is a full-fledged language used to create beautiful visualizations, equally loved by scientists and artists. Very easy to learn, and the results are executables that run everywhere.
Snake Wrangling for Kids
This book introduces Python programming for young readers through entertaining topics.

Written by nicolas314

Saturday 22 October 2011 at 12:22 am

Config file hell


Complexity

xkcd 963 nailed it once again. How much fun is it to have to open one of the zillion Unix config files on your Debian box and start tweaking until it finally works? The graph could just as well show “Time since I last opened wpa_supplicant.conf”, or “/etc/network/interfaces”, “fontconfig”, “httpd.conf”, or “crontab”.

Unix is famous for its configurability. Unfortunately it has never offered a single convenient base library to support configuration file parsing. Every little piece of software had to design its own format, generate its own syntax rules (whose complexity mostly depends on the programmer’s talent for parser writing), and force you to learn yet another language that will take you hours to understand and minutes to forget. Examples:

crontab

Seems the syntax for crontab files is just a bit beyond what my brain can absorb. I have set millions of cron jobs in my life and still cannot write one without copy/paste from an existing file.

sudoers

The associated man page does not just describe a set of options, it goes as far as defining a full-fledged language by providing a formal BNF grammar, as if end-users were yacc compilers. It even features significant whitespace. Yummy.

procmailrc

Before we had spam filters, procmail was the only wall between a sane inbox and a wave of unsolicited messages. But there was a heavy price to pay: learn how to write a bug-free procmailrc with no way to test it except to send yourself half a million fake e-mails until you got it right. I still have a couple of procmailrc templates somewhere just in case I ever have to get into this again.

sendmail.cf

Sendmail.cf will probably earn the gold medal for the most obscure, un-debuggable, impossible to write and just frankly insane configuration file there ever was. It is now mentioned in the Geneva convention about the non-proliferation of mental-illness-inducing configuration formats. But let us not be too harsh. Without it we would not have had the case of the 500-mile email, and we would have far fewer horror stories to tell our children.

Hopefully things are getting better. Now that we have XML we can tear away the last shreds of hope of ever understanding how to configure a piece of software by editing a file. And if you are really vicious, you could go as far as creating an XML-like config file format that cannot be validated.

I once had a problem, then I discovered XML, and then I had two problems.

Written by nicolas314

Wednesday 12 October 2011 at 10:44 pm

gunicorn basics


Following up on a previous post about fapws3 and its inability to spread the load on several workers, here is a quick rundown of how to get a gunicorn instance running in a matter of minutes.

If you do not want to disturb your system’s Python installation, virtualenv is the perfect tool. It will re-create the equivalent of a chrooted environment for Python stuff only, allowing you to mess up as much as you want with libraries without having to trash system libraries.

$ mkdir webtests
$ cd webtests
$ virtualenv app
New python executable in app/bin/python
Installing distribute..................................................................................................................................................................................done.
$ . app/bin/activate
$ easy_install -U gunicorn
Searching for gunicorn
Reading http://pypi.python.org/simple/gunicorn/
Reading http://github.com/benoitc/gunicorn
Reading http://www.gunicorn.org
Reading http://gunicorn.org
Best match: gunicorn 0.12.1
Downloading http://pypi.python.org/packages/source/g/gunicorn/gunicorn-0.12.1.tar.gz#md5=6540ec02de8e00b6b60c28a26a019662
Processing gunicorn-0.12.1.tar.gz
Running gunicorn-0.12.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-gf0pif/gunicorn-0.12.1/egg-dist-tmp-8ToCBM
Adding gunicorn 0.12.1 to easy-install.pth file
Installing gunicorn_paster script to /home/nicolas314/webtests/app/bin
Installing gunicorn script to /home/nicolas314/webtests/app/bin
Installing gunicorn_django script to /home/nicolas314/webtests/app/bin
Installed /home/nicolas314/webtests/app/lib/python2.6/site-packages/gunicorn-0.12.1-py2.6.egg
Processing dependencies for gunicorn
Finished processing dependencies for gunicorn

Now let us write a default Python webapp with two classes: immediate responds immediately to /im requests and delayed simulates a long-running task responding to /de: sleeps for 10 seconds and returns.

hello.py contains:

import time, web
class immediate:
    def GET(self):
        return 'immediate'
class delayed:
    def GET(self):
        time.sleep(10)
        return 'delayed'
urls = ('/im', 'immediate',
            '/de', 'delayed')
application = web.application(urls, globals(), True).wsgifunc()

Start gunicorn with 10 workers on localhost:

gunicorn -w 10 hello

Open a browser, point one tab to localhost:8000/de and the other one to localhost:8000/im (gunicorn listens on port 8000 unless told otherwise with -b). The delayed task does not (directly) impact immediate responses. QED.

gunicorn has been benchmarked with excellent performance and will take care of all multi-worker stuff for you. Integration with web.py is excellent and painless. Installation is just a couple of commands. Congratulations to the gunicorn team!


Written by nicolas314

Wednesday 6 April 2011 at 11:30 pm

fapws3 + web.py


Objective: run a web.py application with the fapws3 web server


If you are looking into fast and easy ways to run your web.py-based application, there are many exciting alternatives out there claiming to be both easier to install (easy) and faster (not so easy) than Apache+mod_wsgi. A benchmark of Python web servers summarizes all good candidates today. I decided to give them all a quick try and see what they have to offer. First in line: fapws3

Here is my HelloWorld web.py:

---hello.py---
import web
class hello:
  def GET(self):
    return 'Hello world'
urls = ('/', 'hello')
application = web.application(urls, globals(), True).wsgifunc()

and here is the glue to run it from fapws3:

---run.py---
import hello
from fapws import base
import fapws._evwsgi as evwsgi
if __name__=="__main__":
  evwsgi.start('0.0.0.0', '8080')
  evwsgi.set_base_module(base)
  evwsgi.wsgi_cb(('', hello.application))
  evwsgi.run()

Start the server with python run.py and point your browser to http://localhost:8080 to see it run. Ok, now let us modify our web app a bit: say I have a URL that requires longer computation times. This is simulated here with time.sleep:

import web, time
class immediate:
  def GET(self):
    return 'immediate'
class delayed:
  def GET(self):
    time.sleep(10)
    return 'delayed'
urls = ('/immediate', 'immediate',
           '/delayed', 'delayed')
application = web.application(urls, globals(), True).wsgifunc()

Open two tabs in your browser, point the first to /immediate and the second one to /delayed. Now reload both… and wait 10 seconds to see /immediate get refreshed. Ouch. One long-running request blocks the whole server.

Issues

  • fapws3 is not threaded and never will be, according to the FAQ
  • fapws3 does not support SSL

No support for multi-threading means that you will have to implement your own manager/worker mechanism for long-running requests. The fapws3 FAQ recommends using many parallel instances and pound for load-balancing and SSL support. WTF?

Now I am left wondering: what could fapws3 possibly be useful for? There are so many more WSGI-compatible web servers with excellent performance, a full thread stack and complete SSL support out of the box. Why should I bother with one that leaves all the work to me? I probably missed something. Oh well…

Written by nicolas314

Monday 7 March 2011 at 11:39 pm

Today’s quotes


Seen on reddit today. A bit of XML and Java-bashing never hurts…

“Java is a DSL for taking large XML files and converting them to stack traces.” (twitter)

“XML is like violence. If it doesn’t solve your problem, you’re not using enough of it.” (slashdot)

“XML is a giant step in no direction at all.” (see also XML rant)

“The essence of XML is this: the problem it solves is not hard, and it does not solve the problem well.” (Phil Wadler, POPL 2003)

Written by nicolas314

Wednesday 24 November 2010 at 3:15 pm

Posted in programming
