Friday, March 20, 2009

Are you having WSGI problems?

Do you have problems passing data through your WSGI middleware? Is your memory fading and you are finding you have to parse your HTTP headers multiple times? Are you seeing inconsistent behaviour when your WSGI input stream is prematurely terminated? Well, PyCon'09 in Chicago is just around the corner and that means lots of smart Python people are going to be crammed into a room together, so maybe they can help.

Humour aside, what I am hearing is that there are a few people going to PyCon with a bit of interest in having some constructive discussions about WSGI and the road forward. Personally I have issues with the WSGI 1.0 specification itself, that it is poorly defined in various areas and that there are certain features of HTTP that technically you can't even make use of in WSGI applications, such as chunked request content.

Other people have in the past expressed opinions about WSGI being too low level and the desire for a standardised higher level abstraction for request/response objects and middleware that is possibly even divorced from the low level WSGI specification. After all, WSGI was in part really only aimed at being a standard interface to the web server underneath, and although one could build application middleware with it, it is arguable whether it really should have been allowed to percolate up into the higher levels of an application as it has.

Anyway, the point of the post is that if there is a chance that all these smart people are going to talk about WSGI, lets give them some constructive input about what we all feel the problems of WSGI are. Yes they may have their own list of issues, but I am sure others will have different problems they are trying to solve. So, bring out your dirty laundry and have your say by posting your own observations about WSGI and any problems it may have and maybe we can make some forward progress rather than just accepting WSGI as is, warts and all. 

If you do post your own blog entries, perhaps add a link to it in comments to this post so it is easy for those going to PyCon who are interested in your feedback to find them. If you know of any good past blogs about problems with WSGI, perhaps also link them here.

BTW, no I will not be at PyCon, so expect me to try and post about my own gripes with WSGI as well. Also note that mod_wsgi is not WSGI, it is only an implementation that provides support for the WSGI interface. As such, I am not talking here about bringing up issues with mod_wsgi, just the WSGI specification and how it is applied in web applications. If you want to talk about mod_wsgi, bring it up separately on the mod_wsgi mailing list over on Google Groups.

Thursday, March 19, 2009

Future roadmap for mod_wsgi.

Because of family commitments, progress on mod_wsgi has been slower than I would have liked for the past year. I have a bit more spare time these days, so time to talk a little about where I see mod_wsgi going from here.

Up till now, the little time I have been able to spend on mod_wsgi which hasn't been chewed up in answering questions on the mailing list and other forums, has been going towards mod_wsgi version 3.0. This has mainly consisted of a lot of bug fixes and minor refinements, however there are also a few interesting new features, the main ones being described below.
  • Support for Python 3.0. Now support Python 3.0 based on proposed amendments to WSGI specification for this latest incarnation of Python. The mod_wsgi package was the first major WSGI server to support Python 3.0, albeit that you have had to use it from the subversion repository. Just a pity that it seems it will be a while before any of the larger frameworks support Python 3.0.
  • User chroot environment. Specific daemon process groups can be delegated to run in the context of a chroot environment. Direct support for chroot environments by mod_wsgi means that you do not have to run Apache as a whole in the chroot environment and you could support many WSGI applications running in different daemon process groups as different users each in their own chroot environment.
  • Ownership of WSGI script files. Enforce that the WSGI script files corresponding to a WSGI application delegated to a specific daemon process group, are owned by or are a member of a specific group. The permissions of the directory containing the WSGI script file are also checked to ensure they are consistent. This is an additional security measure that can be applied in the case where all WSGI scripts in a directory are being delegated to run in a daemon process as a specific user, to ensure that sloppy directory permissions don't allow arbitrary other users to place WSGI scripts in that directory and therefore have it run as a different user.
  • Chunked request content. This is not something that the WSGI specification actually allows, but if you are happy to step slightly outside of the WSGI specification you can support clients which use this feature of HTTP. This appears to becoming more and more important as some mobile phone devices are automatically using chunked request content on HTTP POST requests where data is greater than a certain size. At the moment WSGI applications are ham strung in not being able to support this unless the underlying WSGI adapter uses the rather hacky method of reading in the whole request up front, calculating the CONTENT_LENGTH and passing it through as if it wasn't a chunked request.
  • Internal web server redirection. For mod_wsgi daemon mode, now support the CGI method of being able to return a HTTP 200 response with the Location header defined, to trigger an internal server redirection. The target URL in this case could be within the same Python web application, another web application, another web server proxied via the web server or a static file. The latter would give something akin to X-Sendfile, although actually more like nginx X-Accel-Redirect as the target is a URL and not a physical file.
  • Limits on processor CPU time. For mod_wsgi daemon mode, this allows one to trigger an automatic restart of a daemon process when the accumulated CPU time used by the process exceeds a specified amount. The intent of providing this feature is as a fail safe to capture when a process may go berserk and starts chewing up processor time due to some programming error.
  • Override application error pages. For mod_wsgi daemon mode, now allow any error response page returned by the WSGI application to be ignored and the default Apache error page, or any defined by Apache ErrorDocument directive, to be used instead. This is for where multiple applications, implemented in different systems or programming languages, are hosted and they must present error pages in the same style.
  • Authentication and HTTP headers. The HTTP headers are now available to authentication providers allowing them to qualify their behaviour based on information in the headers, such as cookies.
  • Preloading of WSGI applications. The process group and application group to which a WSGI application is to be delegated can now be defined as parameters to the WSGIScriptAlias directive. As a side effect of this, the WSGI script so designated will be preloaded automatically and do not have to separately use WSGIImportScript to preload it.
Although these have all been implemented and mod_wsgi 3.0 is pretty well all ready to go, have decided at this point to first back port as many as possible of the bug fixes from mod_wsgi 3.0 back to mod_wsgi 2.X stream and release mod_wsgi 2.4. This is to close off the mod_wsgi 2.X stream with a stable version, given that many people don't like jumping up to a new major version due to the perception that it will introduce its own new problems.

Once mod_wsgi 2.4 has been released, and last tidy up tasks related to mod_wsgi 3.0 are completed, would expect mod_wsgi 3.0 to follow not long after.

In respect of mod_wsgi 3.0, we did have a bit of a discussion on the mailing list about whether to disable embedded mode in mod_wsgi 3.0 by default and force that it be enabled to be used. The reasoning here was that on UNIX systems most people unknowingly use embedded mode without realising that in doing so they really should be tuning the Apache MPM settings to values more appropriate for fat Python web applications. This is necessary as the default MPM settings are more appropriate for PHP and static file serving and certainly not for Python web applications. Don't adjust things properly and you could see yourself suffering memory issues and load spikes. So, the thought was that maybe disabling it by default may act as a good flag to people to say, 'do this and you better know what you are doing'. In the end, decided to leave it as is, because without also configuring daemon mode, they would just get an error message and most likely be more confused about what is going on.

What is really needed in order to be able to disable embedded mode, or get rid of it all together, is for daemon mode to support a means of dynamically creating daemon processes without the need for any up front configuration to specifically enable it. For example, the default might be that first time that WSGI application is triggered which is associated with a specific virtual host, that a daemon process be automatically started for that virtual host. If multiple WSGI applications are mounted under that same virtual host, they would execute within the context of different sub interpreters of that daemon process.

Such a feature along with other mechanisms for supporting transient daemon processes has been on the TODO list for a while and had originally wanted to include it in mod_wsgi 3.0, but the lack of time meant I had to defer trying to implement it.

As such, one of the main tasks down for mod_wsgi 4.0 is the ability to dynamically create daemon process groups based on some template or parameterised configuration definition. This may see a daemon process created for each virtual host as above, or maybe one for each authenticated user, with each of those daemon process groups automatically running under the account of that authenticated user. Obviously, lots of possibilities exist here, it just depends on how flexible one can make it and from where one draws the input values to fill out the parameterised parts of the configuration.

Originally the thought had been to just focus on that one task for mod_wsgi 4.0, but some recent feedback from Doug Napoleone about some hacks he had done with mod_wsgi to make it possible to host Python web applications where each used a different version of Python, rekindled some old plans I had for exactly that.

The problem to be solved here is that mod_wsgi is compiled against a specific version of Python and so you are stuck using that one version of Python. If you wanted to use a newer version of Python, you would need to recompile mod_wsgi and thus upgrade all the Python web applications you host to use the newer version of Python at the same time.

Obviously this is a big drawback in a shared environment, be it a university setting or a web hosting provider. This is in part why FASTCGI is still seen as a much better solution than mod_wsgi for Python web hosting, ignoring even the fact that FASTCGI can also be used for other languages.

Now, in order to be able to support using multiple versions of Python, one has to do away with embedded mode. This is necessary as to make embedded mode reasonably efficient on memory and startup costs, one has to load the Python interpreter into the Apache parent process and first initialise it there. That way the Python interpreter is ready for use immediately after the Apache child server processes are forked. Having the Python interpreter linked in and initialised at such an early stage though prevents forked daemon processes from using different versions of Python.

Next issue is that mod_wsgi as it is implemented now, uses the Apache parent process as the monitor process. That is, it is the Apache parent process from which daemon mode processes are directly forked. As such, it was also benefiting from Python being preinitialised in the Apache parent process.

What we would now need to do instead, is to separate out from the Apache parent process the function of creating and monitoring the daemon processes. To that end, a separate monitor process would be forked from the Apache parent process and it would be that process which would then in turn create the daemon processes.

Now that we have a separate monitor process, the whole linking in of and intialising of the Python interpreter can be delayed until the monitor process has been created. To support multiple versions of Python at the same time, we just create multiple monitor processes, one for each version of Python to be supported and with each loading up a specific instance of the mod_wsgi code for that version of Python.

It is just then a matter of specifying which version of Python one wants to use for a specific daemon process group, and the appropriate monitor process would take on the responsibility for ensuring the daemon process is created and subsequently managed.

Because it is problematic to support embedded mode at the same time as supporting the use of multiple versions of Python, the intent would thus be that mod_wsgi 4.0 deliver two distinct Apache module variants. These will be the existing mod_wsgi module and a new mod_wsgid module.

The mod_wsgi module would provide the same level of functionality as is currently provided, but with the addition of dynamically created daemon processes as explained above. It would be bound to a specific version of Python.

The mod_wsgid module would only support daemon mode. Thus, no embedded mode and no ability to implement Apache authentication providers or group authorisation in Python. Using this version one would be able to use different versions of Python at the same time for different WSGI applications. This will be possible because the mod_wsgid module wouldn't actually utilise any part of Python directly. Instead there would be companion plugin modules for each different version of Python. It would be these companion modules which would be loaded by the monitor process corresponding to the desired version of Python. Since nothing to do with Python would be done in the Apache parent process, the Apache child server processes would thus be a bit slimmer as a result. A default configuration would also exist which would automatically create a daemon process group for each virtual host using the version of Python designated as the primary version to be used. 

Note that mod_wsgid and mod_wsgi would not be able to be loaded into Apache at the same time, nor would mod_python be able to be used at the same time as mod_wsgid.

Because use of embedded mode is the more specialised case and daemon mode the preferred deployment scenario, the mod_wsgid module would actually likely become the recommended module to use.

Now, if you think that is about as far as one could take mod_wsgi then you would be wrong. For mod_wsgi 5.0 would like to revisit embedded mode and look at adding in support for some of the features of mod_python that mod_wsgi doesn't provide. This would include looking at support for Apache input and output filters implemented using Python, plus exposing the internal Apache APIs for use by Apache style handlers.

A reasonable amount of work has already been done on creating SWIG bindings for Apache APIs but so far it looks like one would be better off using hand crafted bindings instead. This is what mod_python does, but mod_python doesn't really follow that closely the Apache APIs. Would prefer that the hand crafted bindings be SWIG like in the sense of being a much closer mapping of the actual C APIs. By doing this it would be much easier for people to apply any knowledge they have of the C APIs into their Python equivalents. At this point it is also unknown what is going to happen with SWIG and Python 3.0. It may just be easier to support hand crafted bindings for Python 2.X and Python 3.X at the same time.

So, that be the current vision of where mod_wsgi is heading. This is by no means final and always open to suggestions. If you really want to get into a discussion about it, do suggest though using the mod_wsgi mailing list hosted on Google Groups for that, rather than trying to use comments to this post to carry out a conversation.

Monday, March 9, 2009

Load spikes and excessive memory usage in mod_python.

A common complaint about mod_python is that it uses too much memory and can cause huge spikes in processor load. Fact is that this isn't really caused by mod_python itself, but indirectly by virtue of how, or more so how not, Apache has been configured for the type of web application that is being run.

Where the problem stems from is the choice of Multi-Processing Module (MPM) chosen for the Apache installation and the default settings for that MPM.

On UNIX systems there are two main MPMs that are used. These are the prefork MPM and the worker MPM. The prefork MPM implements a multi process configuration where each process is single threaded. The worker MPM implements a multi process configuration but where each process is multi threaded.

Which MPM is used is a compile time option and not something that can be changed dynamically at run time. Thus, your decision has already been made by the time you have installed Apache from source code or from the binary operating system package. Often the choice is already made for you by what the operating system supplied as the default.

Traditionally the MPM used for an Apache installation has been prefork. This is because that is all the older Apache 1.3 supported, but also partly because modules for web development, such as PHP, were not generally multi thread safe and so required that prefork MPM be used.

With the MPM having being selected, either explicitly or through ignorance of there being a choice, that is where the majority of people stop. What most do not realise is that the default settings for an MPM will generally need to be modified based on what you are using Apache for and how much memory your system has available. Customising these settings is even more important for Python web applications as I will explain.

Lets first look at the default settings for the prefork MPM. The values for these as shipped with the original Apache source code is:
# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_prefork_module>
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 150
MaxRequestsPerChild 0
</IfModule>
What this all means is that when Apache starts up it will create 5 child server processes for handling of requests. The number of child server processes used isn't a fixed number however. Instead, what will happen is that Apache will dynamically create additional child server processes when the load increases. Exactly when this occurs is dictated by the setting for the minimum number of idle spare servers. Such additional child server processes may be created up to a number determined by the maximum number of allowed clients. In this case, because each child server process is single threaded, that means a maximum of 150 child server processes may be created.

This is actually quite a lot of child server processes that can be created. If Apache is being used only to serve static files then this number is quite reasonable however. This is because each child server process should be quite small in size. Even when using PHP the server child processes shouldn't grow to be overly large. This is because PHP is CGI like in the sense that each application script is reconstructed on each request and then thrown away at the end of the request. Thus, nothing of the application persists between requests and thus any memory use is always transient.

The other key aspect of PHP which means that memory use of the individual child server processes is kept down, is that the extensions available to the PHP user is fixed when PHP is initialised. Further, all the PHP internal libraries and any optional extensions are preloaded from shared libraries/objects in the Apache parent process before any child server processes are even created from it. Thus, all the code which a PHP application uses is not only preloaded, but shared between all child server processes and isn't counting as private memory to the child server processes. This is significant when one considers that the PHP library alone, not counting optional extensions, can be about 7MB in size.

We now need to contrast what happens with PHP to what happens with mod_python and Python web applications.

When using mod_python the only thing that happens in the Apache parent process is that the Python interpreter is initialised. There is no preloading of any modules which a Python web application may want to use. This is the case as Python works the opposite way to PHP in that it does as little as possible up front, only importing specific modules when actually used by an application.

The next difference with Python web applications is that once application code is loaded it remains loaded for the life of the process. That is, unlike PHP which throws away the application between requests, everything persists between requests in Python web applications. If a Python web application spans a large set of URLs, the application code may not even all get loaded upon the initial request. Instead it may only get progressively loaded as different URLs are accessed.

The important thing to realise here is that all this loading of Python application code is occuring in the child server processes which handle the requests and not the Apache parent process. Except where Python modules are implemented as C extension modules, all the code that is loaded is going to use up private memory of the process. It is not unheard of for even small to moderate sized Python web applications to consume 30MB or more of private memory in each child server process.

It is this significant amount of memory per process which is where problems start to occur. If you remember, the default settings for the prefork MPM were such that up to 150 child server processes could be created. This means that for such a small to moderate sized Python web application, if Apache decided to create up to the maximum number of child server processes, you would need in excess of 4GB of memory.

If you are running a small VPS system with an allocation of only 256MB, you can see that it just isn't going to work very well. You might just squeak by with having all the initial 5 child server processes having loaded up your application, but as soon as you get a sudden increase in requests and Apache decides that it needs to create more child server processes, your system will quickly run out of memory.

So, although the default settings for the prefork MPM may be reasonable for handling of static file requests or PHP, they are going to be completely inapproriate for any sizable Python web application, especially if running a system with only limited memory.

This then addresses one of the main complaints one often sees made against mod_python. That is that it consumes huge amounts of memory. In reality it isn't mod_python at all here which is the problem.

First off the memory is being consumed by the Python web application and not mod_python. If you ran the same Python web application in a standalone process on top of a Python web server, that single instance of the application would still use about the same amount of memory.

The real problem here is that so many instances of the application have been allowed to be created by not changing the default settings for the prefork MPM. Specifically, the number defined for the maximum clients should be dropped comensurate with the amount of memory available to run it and how big the application gets.

A very crude measure for determining the maximum number of clients, and therefore how many child server processes will be created, is to divide the maximum amount of memory you want to allow the web server as a whole to use, by the amount of memory a single instance of the Python web application consumes.

Do note though that this is a very crude measure, things are in practice a bit more complicated than that. One thing that complicates the issue is whether keep alive is enabled for connections and what the keep alive timeout is set to.

Whether keep alive is enabled or not isn't going to change what the maximum number of clients should be set to, but it does in practice limit how many concurrent requests you will be able to effectively handle. This is because the Apache request handler threads will be busy waiting to see if a subsequent request is going to arrive over the same connection. Eventually the request handler thread will timeout, but during that time it will not be able to handle completely new requests.

If keep alive is a problem, one course often taken which can help out is to offload serving of static media files to a separate web server. Keep alive can then be turned off for the Apache instance running the Python web application where it generally isn't as beneficial as for static file requests. Web servers such as nginx and lighttpd are arguably better at serving static files anyway, and so you will actually get better performance when serving them that way. Offloading the static files also allows you to configure Apache properly for the specific Python web application being hosted, rather than having conflicting requirements.

As to the load spikes which can occur, what this comes down to is the startup costs of loading the Python web application being run. Here the problem is that Apache will create additional child server processes to meet demand. Because Python web applications these days generally have a lot of dependencies and need to load a lot of code they will not start up quickly. That startup is costly actually serves to multiply the severity of the problem, because although the additional processes are starting up, if they take too long, Apache will decide that it still doesn't have enough processes and will start creating more. In the worst case this can snowball until you have completely swamped your machine.

The solution here is not to create only a minimal number of servers when Apache starts, but create closer to what would be the maximum number of processes you would expect to require to handle the load. That way the processes always exist and are ready to handle requests and you will not end up in a situation where Apache needs to suddenly create a huge number of processes.

The catch here to watch out for is that the startup cost of the Python web application is simply transferred to when Apache is being started in the first place. If you find that even when a larger number of processes are created at startup, the initial burst of traffic and the subsequent loading of the actual Python web application strains the resources of your system, then you need to seriously look at whether you are creating many more processes than you need anyway.

First off, don't run PHP on the same web server. That way you can run worker MPM instead of prefork MPM. This immediately means you drop down drastically the number of processes you require because each process will then be multithreaded rather than single threaded and can handle many concurrent requests. To see how this works one can look at the default MPM settings for the worker MPM.
# worker MPM
# StartServers: initial number of server processes to start
# MaxClients: maximum number of simultaneous client connections
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_worker_module>
StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
</IfModule>
The important thing to note here is that although the maximum number of clients is still 150, each process has 25 threads. Thus, the maximum number of processes that could be created is 6. For that 30MB process that means you only need 180MB in the worst case scenario rather than the 4GB required with the default MPM settings for prefork.

Keep that in mind and one has to question how wise the advice in the Django documentation is that states "you should use Apache’s prefork MPM, as opposed to the worker MPM" when using mod_python.

All well and good if you run your own computer with huge amounts of memory and little traffic, but a potential recipe for disaster if you don't know that you should be changing the default MPM settings and you are using a memory constrained VPS and your site becomes popular or becomes subject to the Slashdot effect.

With Django 1.0 now believed to be multithread safe, which was in part why prefork was recommended previously, that advice should perhaps be revisited, or it made obvious that one would need to consider tuning your Apache MPM settings if you intend using prefork MPM.

Now, it needs to be stated that all of the above about mod_python equally applies to embedded mode of mod_wsgi. Thus, using mod_wsgi isn't necessarily some magic pill which will solve all your problems overnight.

Most people who change to using mod_wsgi don't actually have a problem though, but that that is the case is usually more by accident rather than design. This is because they see the additional benefits they get from using daemon mode of mod_wsgi and choose to use it over embedded mode. By this simple decision they have escaped the main issue with embedded mode, which is that Apache can lazily create processes and that for prefork MPM the maximum number of processes is excessive.

Some have realised that mod_wsgi daemon mode seems to offer a more predictable memory usage profile and performance curve and as a result fervently recommend it, but at the same time they still don't seem to understand what the problems with embedded mode, as outlined above actually were. So, hopefully the explanation above will help in clearing up why, not just in the case of mod_wsgi daemon mode vs mod_wsgi embedded mode, but also for the much maligned mod_python.

So, what should you be using? The simple answer is that if you don't understand how to configure Apache and see it as some huge beast then you should certainly be tossing out mod_python. Instead, you would be much better off using mod_wsgi daemon mode.

Should one ever use embedded mode? Technically running embedded mode with prefork MPM should offer the best performance, especially for machines with many cpus/cores. If however you don't have huge amounts of memory, don't dedicate the system to just the dynamic Python web application and you don't change the default MPM settings for Apache, then you are potentially setting yourself up for disaster.

In practice one also has to realise that the underlying web server is never usually going to be the bottleneck. Instead the bottleneck will be your Python web application and the database it uses. So, just because mod_wsgi embedded mode and prefork MPM may be the fastest solution out there for Apache and saves you a few milliseconds per request, that gain is going to be completely swallowed up by the overheads elsewhere in the system and not end up giving you any significant advantage.

You see a lot of people though still obsessing about the underlying raw performance of the web server. Frankly, you are just wasting your time. You will get greater benefits from concentrating on the performance of your application and using techniques such as caching and database query optimisation and indexing to make things run faster.

The final answer? Stop using mod_python, use mod_wsgi and run it with daemon mode instead. You will save yourself a lot of headaches by doing so.

A Python interpreter is not created for each request in mod_python.

There are various myths about mod_python floating around the net and it gets a bit annoying when one keeps seeing them posted again and again, especially when used to support some conjecture that some other system is so much better as a result. It gets more annoying though when they are posted on sites whose commenting feature doesn't work, or you have to jump through hoops to register for the site. As such, often one can't even correct the misconceptions and so the misinformation just keeps propagating and you can never kill it.

The latest instance of this is at www.pypi.info, and it isn't the first time I have seen this person making incorrect statements about mod_python and how it works.

So, let me set the record straight on one such myth about mod_python. That myth is that mod_python creates a separate Python interpreter instance for each request. This is just not the case.

What actually happens is that the first time a request arrives for an application which hasn't yet been loaded, and which hasn't been marked to run in the main Python interpreter, is that a new sub interpreter is created. That sub interpreter is then persistent within that process though, not being destroyed until the process itself is destroyed. Any subsequent requests for that application are then handled within the context of that same sub interpreter which has already been created, so no new sub interpreter is created. Even when multithreading is used, all the request handler threads execute within the context of that one sub interpreter. So, neither is a separate sub interpreter created for each distinct request handler thread in the Apache thread pool, as is also sometimes claimed.

As to why that poster had the problems he had with mod_python, they are more likely because of which Apache MPM was selected when Apache was compiled, as well as not changing the default MPM settings used. Especially with prefork MPM, the default settings are more appropriate for static file serving and PHP. If you do not change the default MPM settings as well as tweak HTTP keep alive settings to make them more appropriate for a large persistent Python web application, you will see load spikes and memory issues, especially when a system is put under load.

From what I have seen, the majority of people setting up mod_python don't understand the need to change the default Apache configuration. It may well have been the case that back in time when mod_python was somewhat newer that you didn't have to, but this is because back then Python web applications were typically much smaller and so didn't have significant start up times or use large amounts of memory. These days though they do, so you need to change how Apache is configured to have it perform adequately.

Now, can we please stop trying to do performance and memory usage comparisons if you don't know how to setup the systems being compared properly. :-(

BTW, everything above also applies to mod_wsgi when using embedded mode, as it is implemented in a similar manner to mod_python. Because I'd rather not see these misconceptions about mod_python transfers onto mod_wsgi, I will blog later more about the real source of these load spikes and memory issues. In the mean time, if using mod_wsgi use daemon mode and you will not have to think about it, as these issues in the main only relate to embedded mode.

Sunday, March 8, 2009

Final chance for providing input into mod_wsgi 3.0.

I am starting to get close to winding up development work on mod_wsgi 3.0. If you had any thoughts about changes or new features you would like to see, now is your last chance.

If you want to know what changes have already been made for mod_wsgi 3.0, then see:


As well as still having to fix a few issues, am also giving consideration to X-Sendfile support for daemon mode, support for fixing up WSGI environment based on X-Forwarded-? header information, better integration with Python logging module and a few other things.

In respect of documentation, yes I know it still needs more work, so no need to ask for better documentation. :-)

BTW, I am also slowly preparing a mod_wsgi 2.4 release with various back ported fixes as well.

If you have any quick feedback, then post a comment here. If what you want is a bit more complex, then use the mod_wsgi group on Google Groups.

Thanks.