Nicolas314

All my geeky stuff ends up here. Mostly Unix-related

Archive for the ‘python’ Category

Easier easy-rsa


If you have ever set up an OpenVPN server, you probably had to fight your way through the certificate generation steps. Something like what is detailed here:

https://openvpn.net/index.php/open-source/documentation/miscellaneous/77-rsa-key-management.html

The official OpenVPN guide refers to easy-rsa, which is a royal pain in the butt. Even with the HOWTO in front of me, it takes me ages to set things up and if I ever have to come back later to generate more client certificates, I inevitably end up restarting from scratch because I cannot remember which steps I took and where I stored files.

Does not seem so difficult though. You need to generate a Root CA, and then use it to sign a server certificate (which is stored on your server) and client certificates which you distribute to your clients. I re-implemented the whole thing as a Python script in a couple of hours, tested it with an openvpn instance, and it works quite well. The script can be found here:

http://github.com/nicolas314/2cca

It is called two-cent CA because that is exactly what it is. There is no support for security modules like smart cards or HSMs because I do not need them, but since it is based on python-openssl it should not be too hard to make it work with P11 tokens.

Here is an example session where I create the root, a server identity, and two client identities for Alice and Bob.

$ python 2cca.py root
Give a name to your new root authority (default: Root CA)
Name: MyRoot
Which country is it located in? (default: ZZ)
Provide a 2-letter country code like US, FR, UK
Country: ZZ
Which city is it located in? (optional)
City: 
What organization is it part of? (default: Home)
Organization: Home
--- generating key pair (2048 bits)
Specify a certificate duration in days (default: 3650)
Duration: 
--- self-signing certificate
--- saving results to root.crt and root.key
done
$ python 2cca.py server
--- loading root certificate and key
Give a name to your new server (default: openvpn-server)
Name: myopenvpn-server
Which country is it located in? (default: ZZ)
Provide a 2-letter country code like US, FR, UK
Country: ZZ
Which city is it located in? (optional)
City: 
--- generating key pair (2048 bits)
Specify a certificate duration in days (default: 3650)
Duration: 
--- signing certificate with root
--- saving results to myopenvpn-server.crt and myopenvpn-server.key
$ python 2cca.py client
--- loading root certificate and key
Give a name to your new client (default: openvpn-client)
Name: Alice
Which country is it located in? (default: ZZ)
Provide a 2-letter country code like US, FR, UK
Country: UK
Which city is it located in? (optional)
City: Cambridge
--- generating key pair (2048 bits)
Specify a certificate duration in days (default: 3650)
Duration: 
--- signing certificate with root
--- saving results to Alice.crt and Alice.key
$ python 2cca.py client
--- loading root certificate and key
Give a name to your new client (default: openvpn-client)
Name: Bob
Which country is it located in? (default: ZZ)
Provide a 2-letter country code like US, FR, UK
Country: US
Which city is it located in? (optional)
City: Boston
--- generating key pair (2048 bits)
Specify a certificate duration in days (default: 3650)
Duration: 
--- signing certificate with root
--- saving results to Bob.crt and Bob.key
$ ls
2cca.py    Alice.key  Bob.key    myopenvpn-server.crt  root.crt
Alice.crt  Bob.crt    README.md  myopenvpn-server.key  root.key

You want to keep root.crt for what OpenVPN calls the CA certificate. Do not lose root.key: you will need it whenever you want to issue more client or server certificates. Install the other files as required.
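For readers who want to see what the three steps amount to without reading the script, here is a rough sketch that only builds the equivalent openssl command lines and prints them. This is not how 2cca works internally (it goes through python-openssl). The function names, file names and subject fields are made up for illustration and simply mirror the session above.

```python
# Sketch only: builds openssl command lines, does not run them.
# Function names and file layout are illustrative, not taken from 2cca.

def subject(name, country="ZZ", org=None):
    """Build an openssl -subj string such as /C=ZZ/O=Home/CN=MyRoot."""
    parts = ["C=" + country]
    if org:
        parts.append("O=" + org)
    parts.append("CN=" + name)
    return "/" + "/".join(parts)

def root_cmd(name, days=3650, bits=2048):
    """Self-signed root: key pair and certificate in one step."""
    return ["openssl", "req", "-x509", "-newkey", "rsa:%d" % bits, "-nodes",
            "-keyout", "root.key", "-out", "root.crt",
            "-days", str(days), "-subj", subject(name, org="Home")]

def leaf_cmds(name, country="ZZ", days=3650, bits=2048):
    """Server or client identity: key plus signing request, then signed by the root."""
    request = ["openssl", "req", "-new", "-newkey", "rsa:%d" % bits, "-nodes",
               "-keyout", name + ".key", "-out", name + ".csr",
               "-subj", subject(name, country)]
    sign = ["openssl", "x509", "-req", "-in", name + ".csr",
            "-CA", "root.crt", "-CAkey", "root.key", "-CAcreateserial",
            "-out", name + ".crt", "-days", str(days)]
    return [request, sign]

if __name__ == "__main__":
    commands = [root_cmd("MyRoot")]
    commands += leaf_cmds("myopenvpn-server")
    commands += leaf_cmds("Alice", "UK")
    for cmd in commands:
        print(" ".join(cmd))
```

The sketch leaves out certificate extensions and any passphrase on the keys, so treat it as a map of the steps rather than a hardened setup.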

Tested on Linux (Debian, Archlinux) and OSX.

Enjoy!


Written by nicolas314

Monday 28 December 2015 at 12:51 am

Programming for kids


Interesting links for kids who want to learn programming:

Scratch from MIT
Scratch is meant just for that: provide a first approach to programming. Completely visual and exists in multiple (human) languages.
Processing
Processing is a full-fledged language used to create beautiful visualizations, equally loved by scientists and artists. Very easy to learn and the results are executables that run everywhere.
Snake Wrangling for Kids
This book introduces Python programming for young readers through entertaining topics.

Written by nicolas314

Saturday 22 October 2011 at 12:22 am

gunicorn basics


Following up on a previous post about fapws3 and its inability to spread the load on several workers, here is a quick rundown of how to get a gunicorn instance running in a matter of minutes.

If you do not want to disturb your system’s Python installation, virtualenv is the perfect tool. It will re-create the equivalent of a chrooted environment for Python stuff only, allowing you to mess up as much as you want with libraries without having to trash system libraries.

$ mkdir webtests
$ cd webtests
$ virtualenv app
New python executable in app/bin/python
Installing distribute..................................................................................................................................................................................done.
$ . app/bin/activate
$ easy_install -U gunicorn
Searching for gunicorn
Reading http://pypi.python.org/simple/gunicorn/
Reading http://github.com/benoitc/gunicorn
Reading http://www.gunicorn.org
Reading http://gunicorn.org
Best match: gunicorn 0.12.1
Downloading http://pypi.python.org/packages/source/g/gunicorn/gunicorn-0.12.1.tar.gz#md5=6540ec02de8e00b6b60c28a26a019662
Processing gunicorn-0.12.1.tar.gz
Running gunicorn-0.12.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-gf0pif/gunicorn-0.12.1/egg-dist-tmp-8ToCBM
Adding gunicorn 0.12.1 to easy-install.pth file
Installing gunicorn_paster script to /home/nicolas314/webtests/app/bin
Installing gunicorn script to /home/nicolas314/webtests/app/bin
Installing gunicorn_django script to /home/nicolas314/webtests/app/bin
Installed /home/nicolas314/webtests/app/lib/python2.6/site-packages/gunicorn-0.12.1-py2.6.egg
Processing dependencies for gunicorn
Finished processing dependencies for gunicorn

Now let us write a default Python webapp with two classes: immediate responds immediately to /im requests and delayed simulates a long-running task responding to /de: sleeps for 10 seconds and returns.

hello.py contains:

import time, web
class immediate:
    def GET(self):
        return 'immediate'
class delayed:
    def GET(self):
        time.sleep(10)
        return 'delayed'
urls = ('/im', 'immediate',
            '/de', 'delayed')
application = web.application(urls, globals(), True).wsgifunc()

Start gunicorn with 10 workers, bound to port 8080 on localhost (the default port would be 8000):

gunicorn -w 10 -b 127.0.0.1:8080 hello

Open a browser, point one tab to localhost:8080/de and the other one to localhost:8080/im. The delayed task does not impact (directly) immediate responses. QED.
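If you want to see the effect of the worker count without a browser, here is a toy sketch using only the standard library. It uses a thread pool where gunicorn uses pre-forked worker processes, so it only illustrates the scheduling idea. The function names are mine, and the sleep is shortened to half a second.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def delayed(log):
    time.sleep(0.5)          # stands in for the 10-second task
    log.append("delayed")

def immediate(log):
    log.append("immediate")

def serve(workers):
    """Submit the slow request first, then the fast one; return completion order."""
    log = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pool.submit(delayed, log)
        pool.submit(immediate, log)
    return log

if __name__ == "__main__":
    print(serve(1))   # one worker: the fast request waits behind the slow one
    print(serve(2))   # two workers: the fast request comes back first
```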

gunicorn has been benchmarked with excellent performance and will take care of all multi-worker stuff for you. Integration with web.py is excellent and painless. Installation is just a couple of commands. Congratulations to the gunicorn team!


Written by nicolas314

Wednesday 6 April 2011 at 11:30 pm

fapws3 + web.py


Objective: run a web.py application with the fapws3 web server


If you are looking into fast and easy ways to run your web.py-based application, there are many exciting alternatives out there claiming to be both easier to install (easy) and faster (not so easy) than Apache+mod_wsgi. A benchmark of Python web servers summarizes all good candidates today. I decided to give them all a quick try and see what they have to offer. First in line: fapws3

Here is my HelloWorld web.py:

---hello.py---
import web
class hello:
  def GET(self):
    return 'Hello world'
urls = ('/', 'hello')
application = web.application(urls, globals(), True).wsgifunc()

and here is the glue to run it from fapws3:

---run.py---
import hello
from fapws import base
import fapws._evwsgi as evwsgi
if __name__=="__main__":
  evwsgi.start('0.0.0.0', '8080')
  evwsgi.set_base_module(base)
  evwsgi.wsgi_cb(('', hello.application))
  evwsgi.run()

Start the server with python run.py and point your browser to http://localhost:8080 to see it run. OK, now let us modify our web app a bit: say I have a URL that requires longer computation times. This is simulated here with time.sleep:

import web, time
class immediate:
  def GET(self):
    return 'immediate'
class delayed:
  def GET(self):
    time.sleep(10)
    return 'delayed'
urls = ('/immediate', 'immediate',
           '/delayed', 'delayed')
application = web.application(urls, globals(), True).wsgifunc()

Open two tabs in your browser, point the first to /immediate and the second one to /delayed. Now reload both… and wait 10 seconds to see /immediate get refreshed. Ouch. One long-running request blocks the whole server.

Issues

  • fapws3 is not threaded and never will be, according to the FAQ
  • fapws3 does not support SSL

No support for multi-threading means that you will have to implement your own manager/worker mechanism for long-running requests. The fapws3 FAQ recommends using many parallel instances and pound for load-balancing and SSL support. WTF?
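To give an idea of what such a home-made manager/worker mechanism involves, here is a bare-bones sketch: the request handler queues the job and returns a ticket at once, and a later request polls for the result. All names are hypothetical. I have not tried this under fapws3, and a background thread may not play well with its event loop, in which case a separate worker process would be the safer variant.

```python
import queue
import threading
import time
import uuid

jobs = queue.Queue()      # work waiting to be done
results = {}              # job id -> result, filled in by the worker

def worker():
    """Run queued jobs one at a time, outside the request handlers."""
    while True:
        job_id, func = jobs.get()
        results[job_id] = func()
        jobs.task_done()

def submit(func):
    """What a request handler would call: queue the job, return at once."""
    job_id = uuid.uuid4().hex
    jobs.put((job_id, func))
    return job_id

def poll(job_id):
    """What a later request would call: the result, or None if not ready."""
    return results.get(job_id)

def slow_task():
    time.sleep(0.2)       # stands in for the long computation
    return "delayed"

threading.Thread(target=worker, daemon=True).start()

if __name__ == "__main__":
    job = submit(slow_task)
    print(poll(job))      # the worker is most likely still busy here
    jobs.join()           # wait for the queue to drain
    print(poll(job))
```

That is already a fair amount of plumbing for something other servers give you for free.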

Now I am left wondering: what could fapws3 possibly be useful for? There are so many more WSGI-compatible web servers with excellent performance, a full thread stack and complete SSL support out of the box. Why should I bother with one that lets me do all the work? I probably missed something. Oh well…

Written by nicolas314

Monday 7 March 2011 at 11:39 pm

Fast Python webapp


Just spent the last few days trying to find the fastest way to put together a Python webapp. Not an easy task, especially since documentation on the topic is really abundant and (I found) rarely self-sufficient. I ended up choosing what I believe is the most straightforward arrangement of code to get a Python webapp up and running in minutes, and make it portable to production mode without effort.

web.py

web.py is the simplest Python framework there is. Straight and to the point: you can program very basic stuff but if you really want to, you can add templates and database and model-view-controller design as you see fit. The basic hello world in web.py would be:

 

#!/usr/bin/python
import web

class hello:
    def GET(self):
        return 'Hello world'

urls = ('/', 'hello')
app  = web.application(urls, globals(), True)

if __name__=="__main__":
    app.run()

 

Cannot get simpler than that! URLs are mapped to callables by regexes, which gives you perfect flexibility for URL design. Your classes can implement different methods for GET and POST, keeping closer to the real REST philosophy. Without having to install any further software you can immediately test your app by running:
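To make the regex mapping a little more concrete, here is a simplified imitation of that kind of dispatch using only the standard library. It is not web.py's actual code: the dispatch function, the user class and the /user/ pattern are invented for the example.

```python
import re

class hello:
    def GET(self):
        return "Hello world"

class user:
    def GET(self, name):
        return "Hello " + name

# Same flat (pattern, class name, pattern, class name, ...) layout as web.py
urls = ("/", "hello",
        r"/user/(\w+)", "user")

def dispatch(method, path, urls, namespace):
    """Find the first pattern matching the whole path and call its handler."""
    for pattern, name in zip(urls[::2], urls[1::2]):
        match = re.match("^" + pattern + "$", path)
        if match:
            handler = namespace[name]()
            func = getattr(handler, method, None)
            if func is None:
                return "405 Method Not Allowed"
            # Captured groups become positional arguments of GET/POST
            return func(*match.groups())
    return "404 Not Found"

if __name__ == "__main__":
    print(dispatch("GET", "/", urls, globals()))
    print(dispatch("GET", "/user/alice", urls, globals()))
```

A class that only defines GET will answer a POST with an error, which is the per-method behaviour described above.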

 

python hello.py

 

web.py is friendly enough to embed its own (pure-Python) web server for test purposes. Debug mode is also automatically activated in that mode so you will be able to get usable messages when things go wrong during development.

lighttpd

I have spent a lot of time with lighttpd now, browsing documentation (I even bought the book!), parsing the source and even participating on their forum. Now is time to get my return on investment. I just found out that lighttpd can launch web.py-based apps directly in fastcgi mode without need to write your own boilerplate code to convert fastcgi to wsgi. Here is a minimal lighttpd configuration that just works:

 

server.port = 8080
server.modules = ("mod_fastcgi", "mod_rewrite")
server.document-root = "/home/www/"
fastcgi.server = ( "hello.py" =>
    (( "socket" => "/tmp/fastcgi.socket",
       "bin-path" => "/home/www/hello.py",
       "max-procs" => 5
    ))
    )
url.rewrite-once = (
    "^/sta/(.*)$" => "/sta/$1",
    "^/(.*)$" => "/hello.py/$1"
)

 

The above specifies a fastcgi handler called ‘hello.py’ that is always called thanks to the last rewrite rule, except for stuff located in /sta which is directly served by lighttpd. /sta is where you are going to store your served static content like images and css.
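If the order of the two rewrite rules is not obvious, this small Python imitation may help. It is only a sketch of the first-match-wins behaviour, not of lighttpd itself, and the rewrite_once helper is a name I made up.

```python
import re

# Same two rules as the lighttpd configuration, in the same order.
# lighttpd writes captures as $1; Python's re module writes them as \1.
RULES = [
    (r"^/sta/(.*)$", r"/sta/\1"),
    (r"^/(.*)$", r"/hello.py/\1"),
]

def rewrite_once(path, rules=RULES):
    """Apply the first matching rule and stop, as url.rewrite-once does."""
    for pattern, target in rules:
        if re.match(pattern, path):
            return re.sub(pattern, target, path, count=1)
    return path

if __name__ == "__main__":
    for path in ("/sta/logo.png", "/im", "/"):
        print(path, "->", rewrite_once(path))
```

Static paths hit the first rule and come out unchanged, so they never reach the catch-all that hands everything else to hello.py.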

In production you can launch a bunch of lighttpd front-ends and configure them to talk to a fastcgi app possibly located on another server or farm of servers.

Took me quite a while to converge to this simple solution. Other paths I reviewed were:

  • Apache+mod_wsgi: too heavy
  • cherokee+uwsgi: cherokee is really nice but uwsgi is an ugly duckling
  • lighttpd+SCGI+flup+cherrypy: works but heavy and boilerplate code is ugly and un-maintainable

Not saying the other solutions are bad, they are just not as straightforward.

Written by nicolas314

Thursday 7 October 2010 at 11:24 pm

Posted in python, webapp


My 2c on Amazon


Hide the family jewels

As an early adopter I have enjoyed digital cameras at home for over 12 years now. This translates into about 20 GB of JPEGs on my home partition which I absolutely do not want to lose. I had the painful experience of being burglarized a few years back and was lucky enough to recover my computers from the police station a couple of days later. The hardware itself has no importance to me but the pictures are of course priceless. This calls for a drastic solution: backup, backup, and remote backup. First two steps are easy: multiply the copies of your pictures using rsync on various hard drives around the house and you are covered against single hard drive failure. Make sure you get into the habit of sync’ing them all every time you get a new bunch of pics and you are set. Now what are the solutions for remote backups?

Store it at work

The obvious solution is to encrypt a disk and leave it somewhere in my office, but that has obvious drawbacks. First is that I have to think about bringing the disk home every time I add more data. I tried it for a while and could never think about updating the drive. Second point is that there are lots of people going through my office every day. Even if I trust my colleagues, it is always tempting to borrow a USB hard drive you have seen sitting around the office for ages. The contents are of course encrypted, which makes the drive appear as unformatted to the untrained eye.

I do not want to lock stuff in drawers. Last time I did, I lost the keys and had to destroy a drawer to get to my stuff. Kinda cryptography in the real world, except brute force actually works.

Network storage

Network storage solutions are a dime a dozen and literally exploding these days. I tried a lot of them and came to the conclusion that Dropbox is by far the best in terms of usability and functionalities. It is the only solution I tried that has clients for Windows, Mac and Linux and that can dig through the firewall and http proxy at work without me configuring anything. It also has an iPhone app to review your files on the go and this is absolutely gorgeous. I can finally have the illusion of having the same disk at home on all machines, at work, and in my pocket.

I will probably become a paid subscriber at some point. The remaining detail I have to fix is to figure out how to upload 20 gigs of data to their servers with my puny 100kB/s home DSL connection. Dropbox also does not offer encryption, so I have to figure out a way to encrypt everything on the fly but still make contents accessible for easy retrieval, with an index or equivalent.

Amazon S3

Another shot at network storage solutions brought me to Amazon S3. This service offered by Amazon is mostly aimed at developers who want to host large amounts of data like a database backend for a dynamic web site. It is a bit rough around the edges. Lots of people have tried disguising the whole thing as a network disk without much success. Reviewing existing Python APIs and fuse-based stuff did not reveal anything revolutionary or stable. Anyway, I felt I just had to try it out.

My tests consisted of creating a dedicated directory (a bucket in Amazon terms) and uploading 100 MB of data to see how easy it would be. I want both to be able to sync my picture directories and encrypt all contents on the way up without having to recode too much stuff. I ended up with a little bit of Python glue around rsync and gpg that was not too satisfactory. It worked for basic tests but I would not have relied on my own code for production :-)
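This is not the glue I wrote back then, just a sketch of the general shape such a script could take: detect which files changed since the last run, then build a gpg command line for each. The upload step is left out entirely, and the manifest format, function names and recipient address are assumptions made for the example.

```python
import hashlib
import os

def file_digest(path):
    """SHA-256 of a file, read in chunks so large JPEGs do not fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def changed_files(root, manifest):
    """Yield (relative path, digest) for files that are new or modified.

    manifest maps relative paths to the digest seen on the previous run.
    """
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            digest = file_digest(path)
            if manifest.get(rel) != digest:
                yield rel, digest

def gpg_command(src, dst, recipient):
    """Command line that encrypts src to dst for the given GnuPG key."""
    return ["gpg", "--batch", "--yes", "--encrypt",
            "--recipient", recipient, "--output", dst, src]

if __name__ == "__main__":
    manifest = {}   # would normally be loaded from the previous run
    for rel, digest in changed_files(".", manifest):
        # Hypothetical recipient; printed rather than executed
        print(" ".join(gpg_command(rel, rel + ".gpg", "me@example.org")))
        manifest[rel] = digest
```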

Amazon S3 is not a free service, but it isn’t expensive either. Doing my whole test set ended up with a bill for less than 2 euros. Fair. But this is where it hurts: Amazon billed me in US dollars and that triggers international charges on my credit card that are far above these 2 euros. In the end I might make my bank richer and will not bring anything to Amazon.

Pained by what I had discovered on my bank monthly slip, I decided to close the lid on the S3 experience and deleted all data from the bucket I created. Next month I was charged $0.02 for this operation, which turned into an absolutely ridiculous amount in euros with a fair charge attached from the credit card because they did not appreciate my micro-payment.

This is probably the last time I ever use S3. I really do not understand why Amazon can bill me in euros for books (even when I buy in the UK) and not for services. Another good idea could be for them to accumulate bills until they reach a reasonable sum like 10 or 15 euros. It would not change much to their cash flow and would really avoid unnecessary bank feeding.

My 2c on Amazon S3 have cost me more than my phone bill this month.

Written by nicolas314

Thursday 10 December 2009 at 11:09 pm

Python web development woes


Writing a 2c web app is a lot harder than it looks.

Objective

Write a mockup prototype for a web-based application. Tools: anything you like, as long as the job is done quickly and can easily be modified to accommodate rapidly-evolving requirements. In fact, this application is meant as a living demonstration of the future full-fledged stuff re-programmed later with something industrial-grade. Think of it as specifications that compile and can actually be run.

As the local Python expert I thought I would demonstrate how it can quickly get the job done with a minimal amount of effort. Little did I know…

Choosing the web server

We are dealing with something that will essentially be web-based, choosing the appropriate web server seems the first thing to do.

I have worked with two web servers in the past: Apache and lighttpd. Apache is notoriously difficult to configure, the config file is full of traps and possible inconsistencies. There are complete books and tutorials on the Net about how to configure your own server and believe me they are all worth consulting. Apache is a really good server but if you have never used it you’d better plan a couple of weeks in advance to learn how to use it correctly.

lighttpd (pronounced “lighty”) is a really fast and lightweight server, much easier to configure. Once you have it installed you can literally have it run within minutes. Unfortunately it does not support https client-side certificates (yet) and that feature is needed for what I want to do. One guy recently submitted a related patch but unfortunately I could not get it to work against every version of lighttpd I could find. Exit lighttpd, welcome Apache!

First attempt: Python CGI script

Python comes with a cgi module that is great for writing demo scripts but quickly becomes a serious pain whenever you want to implement anything more complicated like logins, sessions or database-related stuff. This is really bare-bone but maybe a little too much. After spending half a day re-coding a session mechanism I finally gave up and moved on to the next stage.

Second attempt: Apache + mod_python

mod_python is great! A Publisher algorithm browses through your Python files and publishes on the web server anything that looks like a string or a callable. Imagine a static server responding to these URLs:

http://[server]/hello
http://[server]/world

If you have a Python module containing two top-level strings named ‘hello’ and ‘world’, they will be published by mod_python and displayed verbatim. More interesting of course is to use callables (functions or instances) for a dynamic site.
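The Publisher idea fits in a few lines of plain Python. What follows is only an illustration of the principle, not mod_python's code: the site object stands in for a user module, and hiding names that start with an underscore is the rule assumed here.

```python
import types

# A stand-in for the user's module: two strings and one callable.
site = types.SimpleNamespace(
    hello="Hello",
    world="World",
    greet=lambda name="you": "Hello " + name,
    _secret="not published",
)

def publish(module, path, **params):
    """Resolve /name to an attribute of module, calling it if it is callable."""
    name = path.strip("/")
    if not name or name.startswith("_"):
        return "404 Not Found"
    target = getattr(module, name, None)
    if target is None:
        return "404 Not Found"
    if callable(target):
        # Query parameters would be passed in as keyword arguments
        return str(target(**params))
    return str(target)

if __name__ == "__main__":
    print(publish(site, "/hello"))
    print(publish(site, "/greet", name="Bob"))
```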

Took me a complete week to finish the site with mod_python but development was a breeze. I spent more time in my application business than with the tools themselves, which is the main reason for using tools like Python.

And then I reached a stopping point: I need to authorize web clients to upload XML files to the server in an unusual MIME type. Unfortunately mod_python offers no support to do this and even worse: it silently absorbs the uploaded files and does not even bother warning your application that it missed a client request. Going through mod_python forums I could find that somebody else already mentioned this to the developers but the feature was rejected because if you want to do serious web stuff you should move to WSGI.

At that point I could have gone back to CGI for the file uploading stuff but I did not want to live with a schizophrenic code being half CGI half mod_python. Besides, I do not even want to know how much time I would have needed to make this work in the Apache configuration file. Time to leave mod_python behind and move on.

Third attempt: Python WSGI

Now I have to wade through this infamous WSGI stuff and see if it is really worth all the buzz. To make things short: WSGI is a pure Python standard that specifies how a Web framework should behave at its lowest level. The intention is to make it easier to port a WSGI-compliant application from one framework or web server to another without having to re-code anything.

I read the full specifications for WSGI and I have to admit I did not really understand the motivations behind this design. But oh well, I trust the guys to have done a good job at factorizing web frameworks. The WSGI standard itself is really low-level.  There is no way you can develop a web site just armed with it, it is only meant for middleware providers. So let’s hunt for WSGI middleware!
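For reference, this is roughly all there is to a bare WSGI application: a callable that receives the request environment and a start_response function, and returns the body as an iterable of bytes. The sketch below uses the wsgiref server shipped with Python. The port number and the greeting are arbitrary choices.

```python
def application(environ, start_response):
    """Take the request dict, announce status and headers, return the body."""
    body = ("Hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

if __name__ == "__main__":
    # Development server from the standard library, not meant for production
    from wsgiref.simple_server import make_server
    make_server("localhost", 8080, application).serve_forever()
```

Everything else (routing, sessions, templates, uploads) is left to whatever you stack on top, which is why nobody builds a site on the raw interface.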

WSGI stage 1: Django, Pylons

Django and Pylons are full-fledged frameworks that come up with all bells and whistles. Nothing bad with these but they do suffer from the same issues, namely:

  • They offer about a zillion features I do not care about
  • They cover almost everything I need, but not quite

Which means that I will probably end up deploying lots of packages I will never use and will have to code additional functionalities into their framework just to cover my own needs.

Both packages come with half a million dependencies on various additional packages, and every package means more maintenance.  I spent a couple of days on each to try and explore and came to the conclusion that it must be really great to use them as a basis for a larger project but I would not want to do it.

Mental note: I need to train myself on these frameworks, it might come in handy some day.

WSGI stage 2: Bottle

Bottle is a lightweight WSGI environment all contained within a single file. Can’t beat that in terms of the fewest dependencies!  It offers a very simple syntax to route your URLs to your objects and makes for clean code like:

from bottle import route

@route('/admin')
def administration():
    return '.... html page here ....'

@route('/')
def index():
    return '.... html page here ....'

Nice package, but I would largely prefer having the framework pick routable objects directly from my Python modules, like mod_python’s Publisher does. There were some other features missing from it and Bottle does not seem to be maintained any more, so I reluctantly decided not to use it.
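For the curious, decorator-based routing of this kind boils down to a dictionary filled in at import time. The sketch below is not Bottle's implementation, only the underlying idea. ROUTES and handle are names invented for the example.

```python
ROUTES = {}   # path -> function, filled in as the decorators run

def route(path):
    """Return a decorator that registers the function under path."""
    def register(func):
        ROUTES[path] = func
        return func
    return register

@route('/admin')
def administration():
    return '.... html page here ....'

@route('/')
def index():
    return '.... html page here ....'

def handle(path):
    """Look up the function registered for path and call it."""
    func = ROUTES.get(path)
    return func() if func else '404 Not Found'

if __name__ == "__main__":
    print(sorted(ROUTES))
    print(handle('/admin'))
```

The price is that every routable function has to be decorated by hand, which is exactly what a Publisher-style lookup avoids.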

Just out of curiosity I also tried to run Bottle within lighttpd, losing another evening in the process. lighttpd does not support WSGI, you have to install yet another middleware layer (python-flup) and run the server in FCGI mode. After a whole evening of messing around I still could not get any Hello World out of my setup and ended up tracking an obscure bug in the way lighttpd spawns sub-processes. I do not have the courage to get into that in depth.

My conclusion on lighttpd: great for serving static files, still a long way to go before it can compete with Apache. I have no doubt the lighttpd guys will eventually get there though.

WSGI stage 3: Colubrid, Werkzeug

Colubrid offers exactly the kind of thing I need: a Publisher algorithm that goes through your objects and publishes them at predictable URLs. It took me no more than an hour to transform my mod_python application to Colubrid and see it run. Documentation for this project is pretty sparse though, and it is unfortunately not maintained any more. The authors refer to Werkzeug as the tool of choice now.

Enter Werkzeug: described as a library of WSGI helpers, it tends to suffer from the same overweight issues as Django or Pylons. A lot of dependencies on other libraries and a model that is really hard to understand. I spent a couple of hours going through the tutorial and could not make sense out of it. It is probably very powerful but seems inadequate for my brain.

So Colubrid it will be. It is unmaintained but the library does not force other dependencies and even if it has little documentation I can at least understand it. If I ever face issues I will modify it to suit my needs without fear of seeing my own patches overwritten by a new version. I found a couple of bugs but no showstopper for the moment.

Wrapping it up

I learned quite a lot in the process. Python is sufficiently high-level to expect development to be quick and to the point.  And in a way, that goal is pretty much achieved. Getting a dynamic web application is just a matter of coding your business logic into classes and then hooking them into a View and a database.

On the other hand, the sheer number of dependencies for most frameworks is a definitive showstopper for production. Many of these libraries are still relatively young and lack the polish needed to adapt from the package developer’s needs to your own.

Another thing I learned is that as time goes on, Python frameworks tend to become more and more complicated, to the point that there is little left for people like me who want to just have something that handles the HTTP protocol and lets you hook in the tools you need one at a time.

Oh well… Give me enough time and I might just end up writing my own.

Looks like you are writing a framework


Written by nicolas314

Thursday 27 August 2009 at 11:43 pm