Tuesday, September 2, 2014

Using Python virtual environments with mod_wsgi.

You should be using Python virtual environments and if you don't know why you should, maybe you should find out.

That said, the use of Python virtual environments was the next topic that came up in my hallway track discussions at DjangoCon US 2014. The pain point here is in part actually of my own creation. This is because although there are better ways of using Python virtual environments with mod_wsgi available today than there used to be, I have never actually gone back and properly fixed up the documentation to reflect the changes.

When using mod_wsgi embedded mode, you would use the 'WSGIPythonHome' directive, setting it to be the top level directory of the Python virtual environment you wish to use. If you don't know what that is supposed to be, then you can interrogate it using the command line Python interpreter from that virtual environment:

>>> import sys
>>> sys.prefix

Most important is that this should refer to a directory. An all too common mistake I see is people setting the 'WSGIPythonHome' directive to the path of the 'python' executable from the virtual environment. That is plain wrong, so please do not do it. Doing so will see the setting ignored completely, and the default algorithm for finding which Python installation to use will be used instead.
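As a rough sanity check before putting a value into the Apache configuration, something like the following could be used. The 'looks_like_python_home' helper is a hypothetical name made up for this illustration and is not part of mod_wsgi; it only checks that the value is a directory, nothing more.

```python
import os
import sys


def looks_like_python_home(path):
    """Return True if 'path' is a directory, as 'WSGIPythonHome' expects.

    Hypothetical helper for illustration only; not part of mod_wsgi.
    """
    return os.path.isdir(path)


# 'sys.prefix' of the running interpreter is the kind of value wanted.
print(looks_like_python_home(sys.prefix))

# The path to the 'python' executable is the common mistake.
print(looks_like_python_home(sys.executable))
```

Run with the 'python' from the virtual environment, the first check uses exactly the directory you would give to 'WSGIPythonHome'.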

If using daemon mode of mod_wsgi and you are hosting only the one Python WSGI application, then you can again just rely on the 'WSGIPythonHome' directive, pointing it at the Python virtual environment you want to use. If you are hosting more than one WSGI application however, and you want each to use a different Python virtual environment, then you need to do a bit more work.

The mod_wsgi documentation on this steers you towards a convoluted bit of code to include in your WSGI application to do this, and explains in part why this is the safest option.

ALLDIRS = ['/usr/local/pythonenv/PYLONS-1/lib/python2.5/site-packages']

import sys
import site

# Remember original sys.path.
prev_sys_path = list(sys.path)

# Add each new site-packages directory.
for directory in ALLDIRS:
    site.addsitedir(directory)

# Reorder sys.path so new directories at the front.
new_sys_path = []
for item in list(sys.path):
    if item not in prev_sys_path:
        new_sys_path.append(item)
        sys.path.remove(item)
sys.path[:0] = new_sys_path

Part of the reasoning behind giving that as the recipe was a distrust of the 'activate_this.py' script that is included in a Python virtual environment and advertised as the solution to use for embedded Python environments such as mod_wsgi.

The reason I was cool on 'activate_this.py' was that it stomped on the value of 'sys.prefix'. In the context of mod_wsgi, because the Python installation that mod_wsgi was actually compiled against or using may be at a different location, I was worried about whether modifying 'sys.prefix' would cause something to break.

I therefore gave only guarded approval to using 'activate_this.py'.

In the many years mod_wsgi has been available though, I have to admit that no issues ever came up around 'sys.prefix' being overridden.

So, if you do not have access to make changes in the Apache configuration files for some reason, then the easiest way to activate a Python virtual environment in your WSGI script file is:

activate_this = '/usr/local/pythonenv/PYLONS-1/bin/activate_this.py'
execfile(activate_this, dict(__file__=activate_this))
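Note that 'execfile()' only exists in Python 2. A minimal sketch of an equivalent that also runs under Python 3 is shown here, on the assumption that your virtual environment actually contains an 'activate_this.py' script. The 'run_activation_script' name is made up for this example.

```python
def run_activation_script(path):
    """Execute the script at 'path' with '__file__' set, much like execfile().

    Hypothetical helper for illustration. Returns the namespace the
    script was executed in.
    """
    with open(path) as handle:
        source = handle.read()
    namespace = dict(__file__=path)
    # Compiling with the real path keeps tracebacks pointing at the script.
    exec(compile(source, path, 'exec'), namespace)
    return namespace


# In the WSGI script file this would be used as:
#
#   activate_this = '/usr/local/pythonenv/PYLONS-1/bin/activate_this.py'
#   run_activation_script(activate_this)
```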

This is still a pain to have to include because you are adding to the WSGI script file knowledge of the execution environment it is being run in, which is notionally a bad idea.

The alternative to modifying the WSGI script file was to add just the 'site-packages' directory from the Python virtual environment in the Apache configuration.

For embedded mode of mod_wsgi you would do this by using the 'WSGIPythonPath' directive:

WSGIPythonPath /usr/local/pythonenv/PYLONS-1/lib/python2.5/site-packages

If using daemon mode of mod_wsgi you would use the 'python-path' option to the 'WSGIDaemonProcess' directive:

WSGIDaemonProcess pylons python-path=/usr/local/pythonenv/PYLONS-1/lib/python2.5/site-packages

What was ugly about this was that you had to refer to the 'site-packages' directory where it existed down in the Python virtual environment. That directory name also included the Python version, so if you ever changed what Python version you were using, you had to remember to go change the configuration.
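To make the version dependency concrete, here is a small sketch of how that path is put together, assuming the usual POSIX layout of a virtual environment. The 'site_packages_dir' helper is hypothetical and exists only for this illustration.

```python
import posixpath


def site_packages_dir(venv_root, python_version):
    """Build the POSIX-style 'site-packages' path for a virtual environment.

    Hypothetical helper for illustration; 'python_version' is a
    (major, minor) tuple such as (2, 5).
    """
    major, minor = python_version[:2]
    version_dir = 'python%d.%d' % (major, minor)
    return posixpath.join(venv_root, 'lib', version_dir, 'site-packages')


# The same virtual environment root yields a different path per version,
# which is why the configuration had to change along with Python.
print(site_packages_dir('/usr/local/pythonenv/PYLONS-1', (2, 5)))
print(site_packages_dir('/usr/local/pythonenv/PYLONS-1', (2, 7)))
```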

The good news is that as of mod_wsgi version 3.4 there is a better way.

Rather than fiddling with what goes into 'sys.path' using the 'WSGIPythonPath' directive or the 'python-path' option to 'WSGIDaemonProcess', you can use the 'python-home' option on the 'WSGIDaemonProcess' directive itself.

WSGIDaemonProcess pylons python-home=/usr/local/pythonenv/PYLONS-1

As when using the 'WSGIPythonHome' directive, this should be the top level directory of the Python virtual environment you wish to use. In this case the value will only be used for this specific mod_wsgi daemon process group.

If you are therefore using a new enough mod_wsgi version, and using mod_wsgi daemon mode, then switch to the 'python-home' option of 'WSGIDaemonProcess'.
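If you want to confirm which Python environment a daemon process group actually ended up using, one option is a throwaway WSGI script file along these lines. This is only a sketch for checking your setup, not something to leave deployed.

```python
import sys


def application(environ, start_response):
    """Report 'sys.prefix' of the process handling the request."""
    body = ('sys.prefix = %s\n' % sys.prefix).encode('utf-8')
    headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ]
    start_response('200 OK', headers)
    return [body]
```

If the 'python-home' option has taken effect, the value reported should be the top level directory of the Python virtual environment.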


Kernel Kiddy said...

Graham, thank you for the post!
This is the only place where I could finally find option python-home of WSGIDaemonProcess to set path to Python executable for my virtual host.

It is very strange that this option is not mentioned in documentation: https://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIDaemonProcess

Graham Dumpleton said...

Documentation not up to date, simple as that. Has been present since mod_wsgi 3.4