Nicolas314

All my geeky stuff ends up here. Mostly Unix-related

Posts Tagged ‘lighttpd’

Fast Python webapp


Just spent the last few days trying to find the fastest way to put together a Python webapp. Not an easy task, especially since documentation on the topic is abundant but (I found) rarely self-sufficient. I ended up choosing what I believe is the most straightforward arrangement of code to get a Python webapp up and running in minutes, and to make it portable to production mode without effort.

web.py

web.py is the simplest Python framework there is. Straight and to the point: you can program very basic stuff, but if you really want to, you can add templates, a database and a model-view-controller design as you see fit. The basic hello world in web.py would be:

 

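#!/usr/bin/python
import web

class hello:
    # Called for GET requests on any URL mapped to this class
    def GET(self):
        return 'Hello world'

# Flat tuple of (URL regex, handler class name) pairs
urls = ('/', 'hello')
# Third argument switches on module autoreload
app  = web.application(urls, globals(), True)

if __name__=="__main__":
    app.run()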

 

Cannot get simpler than that! URLs are mapped to callables by regexes, which gives you perfect flexibility for URL design. Your classes can implement different methods for GET and POST, keeping closer to the real REST philosophy. Without having to install any further software you can immediately test your app by running:

 

python hello.py

 

web.py is friendly enough to embed its own (pure-Python) web server for test purposes. Debug mode is also automatically activated in that mode so you will be able to get usable messages when things go wrong during development.
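To make the regex-to-class mapping more concrete, here is a rough sketch of the idea using nothing but the standard library. This is not web.py's actual code, only an illustration of the principle: the `dispatch` helper, the `item` class and the `/item/(\d+)` route are all made up for the example.

```python
import re

class hello:
    def GET(self):
        return 'Hello world'

class item:
    # Captured regex groups are handed to the handler as arguments.
    def GET(self, item_id):
        return 'Showing item %s' % item_id

    def POST(self, item_id):
        return 'Updating item %s' % item_id

# Same flat (pattern, class name, pattern, class name, ...) layout as the
# urls tuple in hello.py. The second route is hypothetical.
urls = ('/', 'hello',
        r'/item/(\d+)', 'item')

def dispatch(method, path, urls=urls, namespace=None):
    """Return (status, body) for an HTTP method and a URL path."""
    namespace = globals() if namespace is None else namespace
    # Walk the tuple two entries at a time: pattern, then class name.
    for pattern, name in zip(urls[::2], urls[1::2]):
        match = re.fullmatch(pattern, path)
        if match is None:
            continue
        # Look for a method named after the HTTP verb on a fresh instance.
        handler = getattr(namespace[name](), method, None)
        if handler is None:
            return '405 Method Not Allowed', ''
        return '200 OK', handler(*match.groups())
    return '404 Not Found', ''
```

With this in place, `dispatch('GET', '/')` returns the hello world page, `dispatch('POST', '/item/42')` reaches `item.POST` with `'42'` as its argument, and a POST on `/` is refused because `hello` only implements GET.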

lighttpd

I have spent a lot of time with lighttpd now, browsing documentation (I even bought the book!), parsing the source and even participating in their forum. Now it is time to get my return on investment. I just found out that lighttpd can launch web.py-based apps directly in fastcgi mode without the need to write your own boilerplate code to convert fastcgi to wsgi. Here is a minimal lighttpd configuration that just works:

 

server.port = 8080
server.modules = ("mod_fastcgi", "mod_rewrite")
server.document-root = "/home/www/"
fastcgi.server = ( "hello.py" =>
    (( "socket" => "/tmp/fastcgi.socket",
       "bin-path" => "/home/www/hello.py",
       "max-procs" => 5
    ))
    )
url.rewrite-once = (
    "^/sta/(.*)$" => "/sta/$1",
    "^/(.*)$" => "/hello.py/$1"
)

 

The above specifies a fastcgi handler called ‘hello.py’ that is always called thanks to the last rewrite rule, except for stuff located in /sta which is directly served by lighttpd. /sta is where you are going to store your served static content like images and css.
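If the interplay between the two rewrite rules is not obvious, the following Python sketch mimics it. It only models the "first matching rule wins, then stop" behaviour of url.rewrite-once and the substitution of $1. It is not how lighttpd is implemented, and the `rewrite_once` function name is my own.

```python
import re

# The two rules from the configuration above, in the same order.
REWRITE_ONCE = [
    (r'^/sta/(.*)$', '/sta/$1'),
    (r'^/(.*)$', '/hello.py/$1'),
]

def rewrite_once(path, rules=REWRITE_ONCE):
    """Apply the first matching rule, then stop (hence 'once')."""
    for pattern, target in rules:
        match = re.match(pattern, path)
        if match:
            # Replace $1, $2, ... in the target with the captured groups.
            return re.sub(r'\$(\d)',
                          lambda m: match.group(int(m.group(1))),
                          target)
    return path
```

A request for `/sta/logo.png` matches the first rule and is rewritten to itself, so the catch-all rule never sees it and lighttpd serves the file directly. Anything else, say `/users/7`, falls through to the second rule and becomes `/hello.py/users/7`, which is handed to the fastcgi handler.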

In production you can launch a bunch of lighttpd front-ends and configure them to talk to a fastcgi app possibly located on another server or farm of servers.

Took me quite a while to converge on this simple solution. Other paths I reviewed were:

  • Apache+mod_wsgi: too heavy
  • cherokee+uwsgi: cherokee is really nice but uwsgi is an ugly duckling
  • lighttpd+SCGI+flup+cherrypy: works but heavy and boilerplate code is ugly and un-maintainable

Not saying the other solutions are bad, they are just not as straightforward.


Written by nicolas314

Thursday 7 October 2010 at 11:24 pm

Posted in python, webapp


Python web development woes


Writing a 2c web app is a lot harder than it looks.

Objective

Write a mockup prototype for a web-based application. Tools: anything you like, as long as the job is done quickly and can easily be modified to accommodate rapidly-evolving requirements. In fact, this application is meant as a living demonstration of the future full-fledged stuff, to be re-programmed later with something industrial-grade. Think of it as specifications that compile and can actually be run.

As the local Python expert I thought I would demonstrate how it can quickly get the job done with minimal effort. Little did I know…

Choosing the web server

Since we are dealing with something that will essentially be web-based, choosing the appropriate web server seems the first thing to do.

I have worked with two web servers in the past: Apache and lighttpd. Apache is notoriously difficult to configure: the config file is full of traps and possible inconsistencies. There are complete books and tutorials on the Net about how to configure your own server and believe me they are all worth consulting. Apache is a really good server but if you have never used it you’d better plan a couple of weeks in advance to learn how to use it correctly.

lighttpd (pronounced “lighty”) is a really fast and lightweight server, much easier to configure. Once you have it installed you can literally have it running within minutes. Unfortunately it does not support https client-side certificates (yet) and that feature is needed for what I want to do. One guy recently submitted a related patch but I could not get it to work against any of the lighttpd versions I could find. Exit lighttpd, welcome Apache!

First attempt: Python CGI script

Python comes with a cgi module that is great for writing demo scripts but quickly becomes a serious pain whenever you want to implement anything more complicated like logins, sessions or database-related stuff. It is really bare-bones, maybe a little too much so. After spending half a day re-coding a session mechanism I finally gave up and moved on to the next stage.

Second attempt: Apache + mod_python

mod_python is great! A Publisher algorithm browses through your Python files and publishes on the web server anything that looks like a string or a callable. Imagine a static server responding to these URLs:

http://[server]/hello
http://[server]/world

If you have a Python module containing two top-level strings named ‘hello’ and ‘world’, they will be published by mod_python and displayed verbatim. More interesting of course is to use callables (functions or instances) for a dynamic site.
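The Publisher idea fits in a few lines. The sketch below is not mod_python's code, just a standard-library illustration of the principle: the `publish` function, the `pages` module and its contents are invented for the example, and hiding names that start with an underscore is an assumption of this sketch.

```python
import types

# Stand-in for a user module: two strings and one callable.
pages = types.ModuleType('pages')
pages.hello = 'Hello'
pages.world = 'World'
pages.greet = lambda name='stranger': 'Hello %s' % name
pages._secret = 'not published'

def publish(module, path, **params):
    """Resolve '/name' to a top-level object of `module` and render it."""
    name = path.strip('/')
    # Names starting with an underscore stay private in this sketch.
    if not name or name.startswith('_') or not hasattr(module, name):
        return '404 Not Found', ''
    target = getattr(module, name)
    if callable(target):
        # Dynamic page: request parameters become keyword arguments.
        return '200 OK', str(target(**params))
    # Static page: the string is displayed verbatim.
    return '200 OK', str(target)
```

Here `publish(pages, '/hello')` returns the string as is, while `publish(pages, '/greet', name='Bob')` calls the function with the request parameters. No routing table to maintain: adding a page means adding a name to the module.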

Took me a complete week to finish the site with mod_python but development was a breeze. I spent more time on my application's business logic than on the tools themselves, which is the main reason for using tools like Python.

And then I reached a stopping point: I needed to allow web clients to upload XML files to the server with an unusual MIME type. Unfortunately mod_python offers no support for this and even worse: it silently absorbs the uploaded files and does not even bother warning your application that it missed a client request. Going through the mod_python forums I found that somebody else had already mentioned this to the developers, but the feature was rejected on the grounds that if you want to do serious web stuff you should move to WSGI.

At that point I could have gone back to CGI for the file uploading stuff but I did not want to live with a schizophrenic code base, half CGI and half mod_python. Besides, I do not even want to know how much time I would have needed to make this work in the Apache configuration file. Time to leave mod_python behind and move on.

Third attempt: Python WSGI

Now I have to wade through this infamous WSGI stuff and see if it is really worth all the buzz. To make things short: WSGI is a pure Python standard that specifies how a Web framework should behave at its lowest level. The intention is to make it easier to port a WSGI-compliant application from one framework or web server to another without having to re-code anything.

I read the full specification for WSGI and I have to admit I did not really understand the motivations behind this design. But oh well, I trust the guys to have done a good job at factoring out what web frameworks have in common. The WSGI standard itself is really low-level: there is no way you can develop a web site armed with it alone, as it is only meant for middleware providers. So let’s hunt for WSGI middleware!
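To give an idea of how low-level the standard is, here is a complete WSGI application written with the standard library only. The interface itself (a callable taking `environ` and `start_response` and returning an iterable of bytes) comes from the specification. The `call` helper and the MIME type used in the example are mine, there only to exercise the application without a web server.

```python
from io import BytesIO
from wsgiref.util import setup_testing_defaults

def application(environ, start_response):
    """A complete WSGI application: one callable, two arguments."""
    if environ['REQUEST_METHOD'] == 'POST':
        # The body is a plain byte stream, whatever its MIME type.
        length = int(environ.get('CONTENT_LENGTH') or 0)
        body = environ['wsgi.input'].read(length)
        reply = 'Received %d bytes of %s\n' % (
            len(body), environ.get('CONTENT_TYPE') or 'unknown type')
    else:
        reply = 'Hello world\n'
    payload = reply.encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/plain; charset=utf-8'),
        ('Content-Length', str(len(payload))),
    ])
    return [payload]

def call(app, method='GET', body=b'', content_type=''):
    """Drive a WSGI app in-process and return (status, response body)."""
    environ = {
        'REQUEST_METHOD': method,
        'CONTENT_LENGTH': str(len(body)),
        'CONTENT_TYPE': content_type,
        'wsgi.input': BytesIO(body),
    }
    setup_testing_defaults(environ)  # fills in the other required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status

    chunks = app(environ, start_response)
    return captured['status'], b''.join(chunks)
```

Calling `call(application, 'POST', b'<a/>', 'application/x-custom+xml')` shows that at this level an upload in an unusual MIME type is just bytes to read. To serve the same function over HTTP, `wsgiref.simple_server.make_server('', 8000, application).serve_forever()` is enough (port 8000 being an arbitrary choice). Everything else, such as routing, sessions and templates, is left to you or to a framework.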

WSGI stage 1: Django, Pylons

Django and Pylons are full-fledged frameworks that come with all the bells and whistles. Nothing wrong with these but they do suffer from the same issues, namely:

  • They offer about a zillion features I do not care about
  • They cover almost everything I need, but not quite

Which means that I will probably end up deploying lots of packages I will never use and will have to code additional functionality into their framework just to cover my own needs.

Both packages come with half a million dependencies on various additional packages, and every package means more maintenance.  I spent a couple of days on each to try and explore and came to the conclusion that it must be really great to use them as a basis for a larger project but I would not want to do it.

Mental note: I need to train myself on these frameworks, it might come in handy some day.

WSGI stage 2: Bottle

Bottle is a lightweight WSGI environment all contained within a single file. Can’t beat that in terms of dependencies! It offers a very simple syntax to route your URLs to your objects and makes for clean code like:

from bottle import route

@route('/admin')
def administration():
    return '.... html page here ....'

@route('/')
def index():
    return '.... html page here ....'

Nice package, but I would much prefer having the framework pick routable objects directly from my Python modules, like mod_python’s Publisher does. There were some other features missing from it and Bottle does not seem to be maintained any more, so I reluctantly decided not to use it.

Just out of curiosity I also tried to run Bottle within lighttpd, losing another evening in the process. lighttpd does not support WSGI: you have to install yet another middleware layer (python-flup) and run the server in FCGI mode. After a whole evening of messing around I still could not get any Hello World out of my setup and ended up tracking an obscure bug in the way lighttpd spawns sub-processes. I do not have the courage to get into that in depth.

My conclusion on lighttpd: great for serving static files, still a long way to go before it can compete with Apache. I have no doubt the lighttpd guys will eventually get there though.

WSGI stage 3: Colubrid, Werkzeug

Colubrid offers exactly the kind of thing I need: a Publisher algorithm that goes through your objects and publishes them at predictable URLs. It took me no more than an hour to transform my mod_python application to Colubrid and see it run. Documentation for this project is pretty sparse though, and it is unfortunately not maintained any more. The authors refer to Werkzeug as the tool of choice now.

Enter Werkzeug: described as a library of WSGI helpers, it tends to suffer from the same overweight issues as Django or Pylons: a lot of dependencies on other libraries and a model that is really hard to understand. I spent a couple of hours going through the tutorial and could not make sense of it. It is probably very powerful but seems inadequate for my brain.

So Colubrid it will be. It is unmaintained but the library does not force other dependencies and even if it has little documentation I can at least understand it. If I ever face issues I will modify it to suit my needs without fear of seeing my own patches overwritten by a new version. I found a couple of bugs but no showstopper for the moment.

Wrapping it up

I learned quite a lot in the process. Python is sufficiently high-level to expect development to be quick and to the point.  And in a way, that goal is pretty much achieved. Getting a dynamic web application is just a matter of coding your business logic into classes and then hooking them into a View and a database.

On the other hand, the sheer number of dependencies for most frameworks is a definite showstopper for production. Many of these libraries are still relatively young and lack the polish needed to adapt from the package developer’s needs to your own.

Another thing I learned is that as time goes on, Python frameworks tend to become more and more complicated, to the point that there is little left for people like me who want to just have something that handles the HTTP protocol and lets you hook in the tools you need one at a time.

Oh well… Give me enough time and I might just end up writing my own.

Looks like you are writing a framework


Written by nicolas314

Thursday 27 August 2009 at 11:43 pm