I have been holding off releasing mod_wsgi 3.0 due to uncertainties in how the WSGI 1.0 specification should be implemented in Python 3.0. Despite a number of discussions on the Python WEB-SIG about it, there has never really been any final consensus on the issue. This isn't helped one bit by the fact there is no formal process for agreeing, by way of vote or otherwise, on changes or amendments to the WSGI specification. Thus, all the conversation just amounts to a lot of hot air at the end of the day as nothing ever gets agreed upon.
As a result, I am going to move ahead with releasing mod_wsgi 3.0, but am going to disable support for Python 3.0. I will not remove the code entirely, but will make it necessary to jump through some hoops to build mod_wsgi with Python 3.0, with suitably large disclaimers that if you use what is there, you will likely have to change your code at a future date when any amendments are agreed upon. In the meantime I will not be supporting the code related to Python 3.0.
It has come to this as it appears that other WSGI adapters attempting to support Python 3.0 are not implementing it the same way. This isn't a complaint against the authors of those other WSGI adapters as they are in the same boat as I am. The real problem is that there is no WSGI specification which covers Python 3.0. It is already bad enough that the WSGI 1.0 specification has various areas that aren't well defined or which are too restrictive, to the extent that many of the major frameworks possibly don't even adhere to it. So, one can see this as being my protest at the lack of any formal processes for the development of the WSGI specification.
And before you have a go at me and say that I should then instigate such a formal process, let it be known that I have tried that already and there was no interest. The so-called consensus was that consensus was sufficient.
Some have suggested to me that I would be effectively setting the standard with whatever I released in mod_wsgi. Well, I don't want that to be the case. Although I wrote mod_wsgi I don't write web applications myself and am not across all the nuances of what would or wouldn't work for Python 3.0. Thus, I rely on the experience of others in helping define what the WSGI specification should be and merely implement that specification. So, no specification, no support.
Tuesday, April 28, 2009
Monday, April 20, 2009
Accreditation For Python Web Hosting
I have already described the need for better Python web hosting. Some of the comments in that discussion make me wonder if we need some sort of accreditation for Python web hosting. After all, there is a huge difference between a web hosting company who only offers CGI and whose only goal is to maximise profit by cramming as many unsuspecting users into one machine as possible, and a web hosting company who consciously regards quality of service as being equally or more important, and as such offers a much higher quality of hosting than just CGI, with a ratio of users to machines which also benefits the users and not just themselves.
So, maybe one of the things that could come out of any project to improve the quality of Python web hosting, is a checklist of what would be regarded as the minimum criteria to be regarded as a provider of quality Python web hosting services. If the Python community saw this as worthwhile, maybe the Python Software Foundation itself might want to give it its blessing. In order to get the accreditation, as well as satisfy the criteria, part of the deal could be that web hosting companies give some sort of donation back to the Python Software Foundation for the right to carry the accreditation.
Obviously trying to run such a program could be fraught with danger and maybe accreditation should be only for some set period before being reviewed, but maybe something to think about.
Friday, April 17, 2009
Improving Commercial Python/WSGI Hosting Options
I'd like to think that through my work with mod_python and mod_wsgi, Python web hosting options have improved, but the truth is that neither mod_python nor mod_wsgi (at this stage) is really suitable for mass virtual hosting. As such, for low cost commodity Python web hosting the only real options are still CGI and FASTCGI.
In the case of FASTCGI this usually means mod_fastcgi or mod_fcgid under Apache, and although many web hosting companies do use these modules and so can provide support for Python, they often don't, or the support provided is less than ideal.
In taking the view that support for Python isn't very good, one does have to be careful however. This is because when you read support forums and irc channels, you obviously are only going to see the complaints and the calls for help to get things working. It may well be the case that this is an outspoken minority and the bulk of people are having no problem at all. Either way, there is still a perception that the Python community isn't being well serviced by web hosting companies and that something better is required.
As I have previously described in the mod_wsgi roadmap, the intention is to support features that would allow mod_wsgi to be used in mass virtual hosting, but there is a lot more to it than just providing yet another option that they might be able to use. In fact, there is no real reason why good Python web hosting couldn't be offered using FASTCGI right now.
I tend to think that the real problem is in part one of education. That is, a lack of good documentation on how to set up FASTCGI for running Python within a commercial web hosting operation, and of a clear indication of what the Python community's expectations are as to what should be available.
Some of the problems which arise are web hosting companies that provide only woefully out of date Python versions, no easy ability to install Python modules/packages, and in the case of FASTCGI, not even providing flup or some other FASTCGI bridge. The end result is that although one may be able to use Python, it isn't necessarily easy and a lot of the hard work is pushed onto the user, rather than the web hosting company providing an environment which is easy to use to begin with.
With that in mind I am currently contemplating whether to start up a distinct uber project which has the specific goal of improving commercial Python/WSGI hosting options. This would not be done with the intent of just pushing my separate mod_wsgi software, but would look at all available software and come up with guidelines and other documentation on how best to use whatever is available, including CGI and FASTCGI.
I can also see this going beyond just documentation, with it also producing code libraries and applications. For example, at the moment for someone to host a Python WSGI web application under CGI they need to know what CGI/WSGI adapters are available. Similarly for FASTCGI you need to know what FASTCGI/WSGI adapters are available. That, or the Python web application being used needs to somehow support CGI or FASTCGI directly itself.
Frankly, with WSGI, these days it is pretty stupid for Python web applications themselves to be worried about CGI or FASTCGI. At the same time, the user should not need to know about them either. What would be much better is that no matter what underlying Python hosting mechanism is used, the web hosting company provides a means of hosting WSGI applications themselves.
As an example, when using mod_wsgi all you need to do is provide a WSGI script file which contains an 'application' object as the entry point for the WSGI application. That WSGI script can also include any other code required to set up the environment for the WSGI application. There is no reason why this couldn't also be applied to CGI and FASTCGI.
So, instead of a user having to provide a .cgi or .fcgi file, they would provide a .wsgi file. It would then be up to the web hosting company to automatically ensure that the right thing happens.
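To make that concrete, a WSGI script file can be as small as the following. This is only an illustrative sketch: the file name hello.wsgi and the response text are made up, and the only thing a hosting mechanism following this convention would rely on is the module-level name 'application'.

```python
# hello.wsgi (hypothetical file name): a minimal WSGI script file.
# The hosting mechanism looks up the module-level 'application' object.

def application(environ, start_response):
    body = b'Hello World!\n'
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    # A WSGI application returns an iterable of byte strings.
    return [body]
```

Any code needed to set up the environment, such as adjusting sys.path, would go at the top of the same file before 'application' is defined.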
Obviously, web hosting companies are going to be clueless as to how to make that work, and this is where one product of the project would be a small set of Python wrapper applications which perform that mapping, along with instructions on how a web hosting company would integrate them into their systems. This would therefore need to include guidelines on how to set up Apache, including how to integrate it with suexec or cgiwrap as appropriate.
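For CGI, a wrapper of the kind described might look something like the following. Treat it as a rough sketch only: load_wsgi_script and run_as_cgi are names invented for illustration, and how the wrapper learns the path of the user's script file is left to however the hosting company wires it into Apache. The CGI side is handled by CGIHandler from the wsgiref.handlers module in the Python standard library.

```python
from wsgiref.handlers import CGIHandler


def load_wsgi_script(path):
    """Execute a WSGI script file and return its 'application' object."""
    # '_wsgi_script_' is an arbitrary module name chosen for this sketch.
    namespace = {'__file__': path, '__name__': '_wsgi_script_'}
    with open(path) as handle:
        code = compile(handle.read(), path, 'exec')
    exec(code, namespace)
    return namespace['application']


def run_as_cgi(path):
    """Serve a single CGI request using the application in the script."""
    CGIHandler().run(load_wsgi_script(path))
```

A FASTCGI variant would keep load_wsgi_script as is and swap run_as_cgi for a call into whichever FASTCGI/WSGI bridge the hosting company has installed.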
One of the problems that this wrapper application can solve is fixing up WSGI variables like SCRIPT_NAME and PATH_INFO. At the moment Python web applications often have hacks in them, or users themselves are forced to put hacks in the WSGI script file, to adjust these variables where they aren't passed through correctly from the web server.
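As one example of such a fix-up, consider the case where the server passes the right overall path but splits it between SCRIPT_NAME and PATH_INFO in the wrong place. The sketch below assumes exactly that situation and assumes the wrapper has been told where the application is mounted; fix_script_name and mount_point are hypothetical names, and other kinds of breakage would need different handling.

```python
def fix_script_name(application, mount_point):
    """Wrap a WSGI application so SCRIPT_NAME matches mount_point."""
    mount_point = mount_point.rstrip('/')

    def wrapper(environ, start_response):
        # Rejoin the path as the server split it, then split it again
        # at the known mount point.
        full_path = (environ.get('SCRIPT_NAME', '') +
                     environ.get('PATH_INFO', ''))
        if (full_path == mount_point or
                full_path.startswith(mount_point + '/')):
            environ['SCRIPT_NAME'] = mount_point
            environ['PATH_INFO'] = full_path[len(mount_point):]
        # Requests outside the mount point are passed through untouched.
        return application(environ, start_response)

    return wrapper
```

Because the wrapper application loads the WSGI script file itself, it could apply something like this automatically, so neither the application nor the user's script needs to carry the hack.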
Another problem that can be solved here is ensuring that logging from Python web applications ends up somewhere the user can actually see and make use of it. One often sees instances where people are having trouble with something like FASTCGI, but due to how the system is set up, any error messages output when the FASTCGI script is being started disappear, making it really hard to debug problems. Because the wrapper application is in control of loading the WSGI script file, it can ensure that any log files are set up properly. It could even provide a feature to capture the errors and return them in an error page to the browser rather than them going to the log only.
So, that is the dream. In part I need to indirectly do some of the ground work for this in order to work out what features I need to add to make mod_wsgi more useful in a mass virtual hosting setup. It would be nice though if others out there who have some measure of passion for seeing Python web hosting options improved would contribute as well. Most of all, I would dearly like to get the web hosting companies themselves directly involved.
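The error page idea could be done as a small piece of WSGI middleware that the wrapper application puts around whatever it loads. The following is a sketch under a few assumptions: show_errors is a made-up name, the whole response is buffered so that failures during iteration are caught too, and the code is written for Python 3 string handling. Showing tracebacks to the browser is obviously something a user would want to be able to turn off.

```python
import sys
import traceback


def show_errors(application):
    """Log any exception from the application and return it as a page."""

    def wrapper(environ, start_response):
        try:
            result = application(environ, start_response)
            try:
                # Buffer the response so errors raised while iterating
                # over it are caught here as well.
                return list(result)
            finally:
                if hasattr(result, 'close'):
                    result.close()
        except Exception:
            text = traceback.format_exc()
            # Always log, falling back to sys.stderr if the server did
            # not supply a per-request error stream.
            environ.get('wsgi.errors', sys.stderr).write(text)
            payload = text.encode('utf-8')
            start_response('500 Internal Server Error', [
                ('Content-Type', 'text/plain; charset=utf-8'),
                ('Content-Length', str(len(payload))),
            ], sys.exc_info())
            return [payload]

    return wrapper
```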
In respect of dealing with web hosting companies, to date my experiences in dealing with them have not been very inspiring. Where I have actively tried to contact them to try and learn how they run things, so I can work out what features mod_wsgi should provide to make it easy for them to use, they have been quite unwilling to give up any information. Even when web hosting companies have contacted me about mod_wsgi, it seems the contact is coming from managers or sales people and not the technical people. Even at the requests of these same people, their own technical people aren't necessarily forthcoming with the information I really need. Overall it has been quite frustrating to say the least.
Hopefully then if this project were to get off the ground and were seen to have active backing from the Python community, we might be able to make some progress. We may even be able to make web hosting companies see that there is more than just PHP out there.
Right now any feedback you may want to give on the whole idea and whether there is a need for it would be most helpful. Maybe I am barking up the wrong tree and all is actually wonderful after all. As much as I may believe there is a problem here needing to be solved, I am sure that existing mod_wsgi users would prefer I concentrate on just mod_wsgi and not worry about all this other stuff. :-)
Saturday, April 11, 2009
Version 2.4 of mod_wsgi is now available.
Version 2.4 of mod_wsgi is a bug fix update. The most important of the bug fixes addresses a response data truncation issue when using wsgi.file_wrapper extension on UNIX with keep alive enabled in Apache.
A number of other issues are also addressed, including memory leaks, configuration corruption and request content truncation. A small number of other minor improvements have also been made.
Because of the issue related to truncation of response data, it is highly recommended that you upgrade if you are using any prior version of mod_wsgi 2.X with a web application that makes use of the wsgi.file_wrapper extension, such as Trac.
A description of changes in version 2.4 can be found in the change notes at:
http://code.google.com/p/modwsgi/wiki/ChangesInVersion0204
If you have any questions about mod_wsgi or wish to provide feedback, use the Google group for mod_wsgi found at:
http://groups.google.com/group/modwsgi
Friday, April 3, 2009
WSGI and printing to standard output.
If you use WSGI on top of CGI, the WSGI adapter communicates with the web server using standard input (sys.stdin) and standard output (sys.stdout). Available WSGI adapters for CGI do not do anything to try and protect the original sys.stdin and sys.stdout. This means that if you use 'print' to output debug messages for your application, without redirecting 'print' to sys.stderr explicitly within your code, then you will actually screw up the response from your WSGI application.
Although CGI may not be the most popular platform to host WSGI applications, with the intent of trying to promote the cause of writing portable WSGI application code, in mod_wsgi the decision was made to restrict access to sys.stdin and sys.stdout to highlight when non portable WSGI code was being written.
The result of doing this is that when 'print' was used in a WSGI application hosted by mod_wsgi, a Python exception would be raised of the type:
IOError: sys.stdout access restricted by mod_wsgi
This was all done with good intention, but what has been found is that people can't be bothered reading the documentation which explains why it was done, and even when they do, they still can't be bothered fixing up the code not to use 'print'. It seems the convenience of using 'print' outweighs the ideal of writing code that may actually work across different WSGI hosting mechanisms.
More annoying is that whenever questions arise about this error on the irc channels, rather than people being told to read the documentation and/or fix their code not to use 'print', voodoo is summoned and they are instead told to use the magic incantation of:
sys.stdout = sys.stderr
Yes, this is given as one of the workarounds in the documentation, the other being to disable the restriction using the configuration directive specifically for the purpose, but the only reason the workaround is given is for where you have no choice because you cannot change the code to remove the 'print' statement. People aren't told this though. All they are told is to make that change and effectively ignore the whole issue.
The whole mythology that is developing around this is now getting to the extent that some have been saying that neither 'sys.stdout' nor 'sys.stderr' works in mod_wsgi. The suggestion is starting to come out now that if you want to get any debug output from your WSGI application, you have to use a separate log file of your own creation, optionally hooked up to the 'logging' module. In one case, a BuildOut recipe explicitly provides an option to define the separate log file that its authors believe has to be used to replace 'sys.stdout' and 'sys.stderr'.
So, what is the real answer? Well, if you care about writing portable WSGI application code, then do not use 'print' by itself, instead redirect it to 'sys.stderr' by writing:
print >> sys.stderr, 'message ...'
This is especially important if you are writing framework libraries or plugins to be used in some other application or by other users. You shouldn't be making an assumption that 'sys.stdout' can always be used. If it is a debug or error message, then use 'sys.stderr' as it is meant to be.
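For code that uses 'print' as a function (Python 3, or Python 2.6 and later with 'from __future__ import print_function'), the equivalent is to pass the 'file' argument. The sketch below also shows the 'wsgi.errors' stream which the WSGI specification requires every server to supply in the environ. The application itself and its messages are made up for illustration.

```python
import sys


def application(environ, start_response):
    # Debug output goes to standard error, never to standard output.
    print('handling', environ.get('PATH_INFO', ''), file=sys.stderr)

    # The WSGI environ also carries a per-request error stream.
    print('per-request message', file=environ['wsgi.errors'])

    body = b'done\n'
    start_response('200 OK', [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ])
    return [body]
```

Neither call touches 'sys.stdout', so the same code behaves the same way under CGI, where standard output is the response stream.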
If for some reason you really don't want to care about the issue, then rather than use the magic voodoo above, you should simply disable the restrictions that mod_wsgi puts into place altogether. This is done by putting in the main Apache configuration file:
WSGIRestrictStdin Off
WSGIRestrictStdout Off
Anyway, because of all the contention arising over all of this, in mod_wsgi 3.0 I will be giving up and making the restrictions off by default. If you want to write a non-portable WSGI application, you can quite happily do so. If you do care about portable WSGI application code, then you will be able to optionally re-enable the restrictions by setting the same directives above to On.