Thursday, April 2, 2015

Introducing mod_wsgi-express.

The Apache/mod_wsgi project is now over 8 years old. Long gone are the days when it was viewed as being the new cool thing to use. These days people seeking a hosting mechanism for Python WSGI applications tend to gravitate to other solutions.

That mod_wsgi is associated with the Apache project doesn't particularly help as Apache is seen as being old and stale. Truth is that the Apache httpd server has never stopped being improved on and is quite a lot better now than it was 8 years ago around the time when mod_wsgi was started.

Even though the Apache httpd server itself has an even longer history going back almost 20 years, it is still the workhorse of the Internet and provides a rock solid platform for hosting web sites. It can still hold its own against competing solutions and for hosting Python WSGI applications using mod_wsgi, is a proven reliable solution.

Now of those 8 years since mod_wsgi was started, there were actually about 3 years where very little development was done on it. This was because I personally got burnt out over the whole WSGI on Python 3 saga. I finally got myself out of that hole about a year and a half ago, and have been working away since on quite a significant number of changes to mod_wsgi that I haven't publicly said much about to date, let alone documented.

It is therefore long overdue to formally introduce one of the projects I have been working on. This project is mod_wsgi-express.

Setting up of mod_wsgi

One of the major bugbears of mod_wsgi has been the perception that it is too hard to set up, especially if building from source code yourself. The task of getting it installed was only slightly easier if you used a pre-built binary package provided by your operating system, but using such a pre-built package could in itself result in a whole host of other problems when it wasn't compiled for the particular version of Python you wanted to use.

With the module at least installed, configuring Apache was no less of a problem, especially on Linux systems which come with a set default configuration which was tailored for static file hosting or PHP.

The end result is that most people walked away with a bad experience and a production system which was operating at a level nowhere near what it was actually capable of. For the case of using Apache/mod_wsgi for development, the need for rapid iteration on changes in an application, and the need to therefore be constantly restarting the web server, made the use of Apache/mod_wsgi seem all too hard.

A large part of what I have been working on for the past year and a half has therefore been about improving that experience. Key was coming up with a system which provided an out of box configuration which was much better suited for Python web applications than the standard Linux defaults, yet was still customisable as necessary to further tune it to suit the specifics of your particular Python web application.

Installation from PyPi using pip

The first major difference with mod_wsgi-express over the traditional path of installing mod_wsgi is that you can install it like any other Python package. In other words you can 'pip install' it directly from PyPi. You can even list it in a 'requirements.txt' file for 'pip'.

pip install mod_wsgi

If you have a complete Apache httpd server installation on your system then that is all that is required. The resulting mod_wsgi module for Apache will have been compiled against your Python installation or virtual environment, and installed as part of it.

There is more though to mod_wsgi-express than just the ability to easily compile the module for Apache. In addition to compiling the module, a separate script called 'mod_wsgi-express' is installed. It is in this script that all the magic actually occurs.

Before I get onto what exactly the 'mod_wsgi-express' script does, I do want to point out that if for some reason you don't have a complete Apache installation, so are perhaps missing the development header files that are required to build Apache modules, or the installed Apache is not the latest recommended version, then that is also covered.

For this case where you also need to be able to install a fresh version of the Apache httpd server itself, you can do:

pip install mod_wsgi-httpd
pip install mod_wsgi 

In this case we are installing two packages. We are first installing 'mod_wsgi-httpd' and then 'mod_wsgi'.

What installation of the 'mod_wsgi-httpd' package from PyPi will do is actually pull down the source code for the Apache httpd server as well as other libraries it requires and automatically compile it and install it also.

The Apache httpd server is quite a big project and so this will take a little while, but it allows you to ignore the system Apache installation. When the 'mod_wsgi' package is subsequently installed, it detects the version of Apache installed by 'mod_wsgi-httpd' and uses that instead.

It is important to note that installing Apache using 'mod_wsgi-httpd' will not interfere with any existing Apache installation you may have. Like the 'mod_wsgi' package, it will be installed as part of your Python installation or virtual environment.

Hosting the WSGI application

So we have the Apache httpd server installed and the 'mod_wsgi' module for Apache also compiled and installed. We haven't though configured Apache as yet.

This is where the 'mod_wsgi-express' script comes into play.

If we have a WSGI application defined in a WSGI script file called 'hello.wsgi', all we now need to do is run:

mod_wsgi-express start-server hello.wsgi

Doing this will yield something like:

Server URL : http://localhost:8000/
Server Root : /tmp/mod_wsgi-localhost:8000:502
Server Conf : /tmp/mod_wsgi-localhost:8000:502/httpd.conf
Error Log File : /tmp/mod_wsgi-localhost:8000:502/error_log (warn)
Request Capacity : 5 (1 process * 5 threads)
Request Timeout : 60 (seconds)
Queue Backlog : 100 (connections)
Queue Timeout : 45 (seconds)
Server Capacity : 20 (event/worker), 20 (prefork)
Server Backlog : 500 (connections)
Locale Setting : en_AU.UTF-8

You can then access the WSGI application on the specified URL, that by default being port 8000 on the localhost.

As to the configuration of Apache, there actually wasn't any.

The key benefit of the 'mod_wsgi-express' script is that it does all the configuration for you, setting up a configuration purpose built for running your specific WSGI application right there on the command line.

Running Apache/mod_wsgi has therefore become as easy as running other pure Python WSGI servers such as gunicorn.
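
For reference, a WSGI script file such as 'hello.wsgi' can be very small. The following is only a minimal sketch of what such a file might contain, not the exact file used for the output above. The one thing it relies on is that mod_wsgi looks by default for a callable named 'application' in the script file.

```python
# hello.wsgi: a minimal WSGI application (illustrative sketch only).

def application(environ, start_response):
    # By default mod_wsgi looks for a callable named 'application'.
    status = '200 OK'
    output = b'Hello World!'

    response_headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(output))),
    ]
    start_response(status, response_headers)

    # The response body is an iterable of byte strings.
    return [output]
```

With that saved as 'hello.wsgi', the 'start-server' command shown earlier should respond with the plain text greeting at the server URL.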

Alternatives to a WSGI script file

As with the more traditional approach of using mod_wsgi in Apache, the 'mod_wsgi-express' script defaults to requiring a WSGI script file. There are specific reasons, deriving from how Apache works, that a script file path is used rather than a Python module name. There are however also some benefits to how a WSGI script file is used which are lacking when a module name is used.

I'll try to explain those reasons and the benefits another time, but if you really want to use a module name instead, then that is also possible. So if instead of 'hello.wsgi' you actually had '', making it a Python module, you could instead run:

mod_wsgi-express start-server --application-type module hello

It is also even possible to provide a Paste 'ini' file as input by specifying the 'paste' application type.

mod_wsgi-express start-server --application-type paste hello.ini

Hosting static file assets

Python web applications are rarely just dynamically generated pages. Instead they are generally accompanied by a bunch of static files for CSS stylesheets, JavaScript and images.

This is where 'mod_wsgi-express' being based around mod_wsgi running under Apache brings additional value, because the Apache httpd server was primarily intended for serving static files. Even though we are hosting a dynamic Python web application, we can still make use of that capability. This can be done in a few ways.

First up, if all static file assets are to exist at a sub URL of the site, then they can be readily mapped into place using the '--url-alias' option. The arguments to this are the sub URL and then the path to the directory containing the static files.

mod_wsgi-express start-server --url-alias /static ./htdocs/static hello.wsgi

For any site though, there are often special static files which need to exist at the root of the site. These are files such as 'robots.txt' and 'favicon.ico'.

These could be mapped individually using '--url-alias' as it does also allow the file system path to be that of a file:

mod_wsgi-express start-server --url-alias /static ./htdocs/static \
--url-alias /favicon.ico ./htdocs/favicon.ico \
--url-alias /robots.txt ./htdocs/robots.txt hello.wsgi

A better alternative though is to simply contain all the files in the one directory, here called 'htdocs', with the location matching the URL they should appear at, and declare that as the document root.

mod_wsgi-express start-server --document-root ./htdocs hello.wsgi

If you are a long time mod_wsgi user you may be familiar with the problem that mounting a WSGI application at the root of the site actually hides any static files that exist in the document root for the server. In the case of mod_wsgi-express though, specific Apache configuration is used such that any static files in the directory will actually overlay and take precedence over the WSGI application.

Thus if a URL matches a static file in the document directory the static file will be served up, otherwise the request will be passed on as normal to the WSGI application. Addition of new static file assets is therefore as simple as dropping them into the document directory with a path matching the URL they are to be available at.

By using Apache/mod_wsgi we therefore get the best of both worlds: a performant way of serving up static file assets as well as the dynamic Python web application.

This is something you don't get from a pure Python WSGI server such as gunicorn. For gunicorn you would have to use a Python WSGI middleware to intercept requests and map them to any static files. This is in contrast to using Apache where handling of static file assets is all done in C code by Apache below the level that the Python interpreter would even be involved.
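
To give a sense of what such a middleware has to do, here is a minimal sketch. The class name 'StaticOverlay' and its arguments are hypothetical and are not taken from gunicorn or any particular library. It mimics the overlay behaviour described above: serve a matching file from the document directory, otherwise fall through to the wrapped WSGI application. It reads whole files into memory and sets no caching headers, so treat it as an illustration of the extra Python level work involved rather than something to deploy.

```python
import mimetypes
import os


class StaticOverlay(object):
    """Hypothetical WSGI middleware: serve a file from 'document_root' when
    the URL matches one, otherwise call the wrapped application."""

    def __init__(self, application, document_root):
        self.application = application
        self.document_root = os.path.realpath(document_root)

    def __call__(self, environ, start_response):
        # Map the URL path onto the file system below the document root.
        relative = environ.get('PATH_INFO', '').lstrip('/')
        path = os.path.realpath(os.path.join(self.document_root, relative))

        # Reject paths that escape the document root (for example via '..').
        inside = path.startswith(self.document_root + os.sep)

        if inside and os.path.isfile(path):
            with open(path, 'rb') as handle:
                data = handle.read()
            content_type = (mimetypes.guess_type(path)[0]
                            or 'application/octet-stream')
            start_response('200 OK', [
                ('Content-Type', content_type),
                ('Content-Length', str(len(data))),
            ])
            return [data]

        # No matching static file, so pass the request to the application.
        return self.application(environ, start_response)
```

Every request, static or not, passes through this Python code first, which is exactly the overhead that Apache avoids by handling static files before the Python interpreter is involved.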

Hosting just static files

Since mod_wsgi-express provides such a convenient way of hosting static files, there is even a mode which lets you say that you don't want to run a Python web application at all, and only want to host static files.

Thus instead of the quick command often used by Python users to run up a server to temporarily host some static files, of:

python -m SimpleHTTPServer

you can with mod_wsgi-express do:

mod_wsgi-express start-server --application-type static --document-root .

You are therefore running a production grade server for the task rather than the Python SimpleHTTPServer implementation.
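
For comparison, the SimpleHTTPServer one-liner amounts to roughly the following when written out. This is a sketch under a few assumptions: it uses the Python 3 'http.server' module (the Python 3 name for SimpleHTTPServer), it needs Python 3.7 or later for the 'directory' argument and 'ThreadingHTTPServer', and the function name 'serve_directory' is made up for illustration.

```python
import functools
import http.server
import threading


def serve_directory(directory, port=0):
    """Serve 'directory' over plain HTTP in a background thread.

    Returns the server object; call shutdown() on it to stop serving.
    The function name and defaults are hypothetical, for illustration only.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory)
    # Port 0 asks the operating system to pick any free port.
    server = http.server.ThreadingHTTPServer(('127.0.0.1', port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

The port actually chosen is available as 'server.server_address[1]'. Note that everything here is plain HTTP handled in Python code.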

This may not seem a big deal, but can be very convenient where you also need to be able to use a secure HTTP connection, or even use client certificates to control access to the files. These are things that you cannot do with SimpleHTTPServer, but can do with mod_wsgi-express.

And much much more

This only starts to scratch the surface of what one can do with mod_wsgi-express and what sort of configurability it provides. In future posts I will talk about other features of mod_wsgi-express, including using it to run a secure HTTP server, using it as a development server, as well as how to set it up for use in production environments, taking over from the normal Apache installation.

If you want to play with mod_wsgi-express and get a head start on what some of its other bundled capabilities are, then you can run the command:

mod_wsgi-express start-server --help

Also check out the PyPi page for 'mod_wsgi'.

If you have any questions about mod_wsgi-express, use the mod_wsgi mailing list to get help.


stuaxo said...

Fantastic work, this should help many projects sort out Apache related issues early, instead of panicking at deployment time.

Iman Yeckehzaare said...

This is just amazing. Thank you so much.