Google makes me think about lots of stuff. Today I'm thinking about the state of web frameworks.
This is a plea for help. Please educate me!
After years of resistance, I'm finally finding myself building a web application again. I think the last time I did that was the "faq wizard" which still lives in Python's Tools directory. It was a CGI script using nearly standard Python string interpolation as a templating language, environment variables (plus cgi.py) and print statements to talk to the server, and the file system plus RCS for persistency. It was used for years to maintain the Python FAQ on python.org, but was eventually retired. (Is anybody still using it?!?)
My new project (my so-called "starter project" at Google) is an internal tool for Google developers. It will never be used outside Google. I don't want to have to explain what it does, but I'll hint that it is a fairly standard database-backed dynamic application with authenticated users. About the only slightly unusual characteristic is that it talks to another internal tool in addition to the database. It will eventually be used by thousands of developers (millions if Google's exponential growth doesn't stop soon :-), so there are some performance needs, but nothing serious compared to the typical xyzzy.google.com property. (This is also the reason that there aren't a ton of Google-developed frameworks that I can use -- there's a ton of reusable web server code, but it's mostly geared to the needs for massively parallelized servers where each box handles 1000s of hits per second, and consequently not really good for rapid prototyping. And of course it's all C++ code.)
Knowing myself, I'd happily go off and build my own web framework at this point, based on exactly the requirements of this particular application, but I figure that a framework written to serve the needs of a single target application wouldn't necessarily be better than some of the web frameworks that already exist for Python. So where to start?
I took a brief look at Django, and while I like their website (pretty and easily navigable and chockfull of useful information), I'm not keen on the particular tools they provide (it doesn't help that they begin every example with "from mumble.something import *"). For example, Django's templating language is rich and powerful, but it doesn't look very Pythonic to me -- in fact, it's so rich and powerful that it might as well be PHP. Similarly, I'm not keen on their object-relational mapping approach. There's too much magic based on name correspondence, and the automatically generated APIs feel a bit unpythonic (e.g. lots of getter and setter methods where a normal Python object would use public attributes and perhaps properties). I imagine that it works best if you know exactly how it is mapped to SQL.
One thing in Django that I like: the URL mapping API. You specify a bunch of regular expressions, and for each regex you specify the function to be called. Groups in the regex become arguments; named groups become keyword arguments. Very simple and clean. I'm not sure that I like having to put quotes around the function (path) names; but I can see how this actually saves typing because you won't have to write an import statement for it, and in rare cases it can save loading stuff you never use.
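The regex-to-function mapping described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Django's actual code: the names `URLPATTERNS`, `dispatch`, `article_detail` and `archive` are made up for the example, and the rule "use keyword arguments if there are named groups, positional ones otherwise" is an assumption of this sketch.

```python
import re


def article_detail(request, year, slug):
    # Receives named groups as keyword arguments.
    return f"article {slug} from {year}"


def archive(request, year):
    # Receives an unnamed group as a positional argument.
    return f"archive for {year}"


# (compiled regex, view function) pairs, tried in order. Hypothetical URLs.
URLPATTERNS = [
    (re.compile(r"^articles/(?P<year>\d{4})/(?P<slug>[\w-]+)/$"), article_detail),
    (re.compile(r"^archive/(\d{4})/$"), archive),
]


def dispatch(path, request=None):
    """Call the first view whose regex matches the path."""
    for pattern, view in URLPATTERNS:
        match = pattern.match(path)
        if match is None:
            continue
        kwargs = match.groupdict()
        # Assumption of this sketch: positional groups are only used
        # when the pattern has no named groups at all.
        args = () if kwargs else match.groups()
        return view(request, *args, **kwargs)
    raise LookupError(f"no pattern matches {path!r}")
```

Note that the sketch references the view functions directly; a string-based variant like the one described above would import the named function lazily, at the first matching request.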
Then I decided to have a look at Ruby on Rails, just to see what I could learn from the competition. I watched two fascinating movies, but they went a bit too fast to really understand what was going on, and there seemed to be a fair amount of sleight of hand in the examples (a lot of default behavior that just happens to do the right thing for the chosen demo). Again, the templating language seems a weird mixture of HTML and Ruby, and I find Ruby's syntax grating (too many sigils). I believe I heard Greg Stein say recently that if you are really good in Ruby, CSS, HTML and SQL, you can produce great websites quickly with Rails -- but if you aren't, you produce lousy websites quickly (just like with PHP).
For a bit I pondered Quixote. I used it to write a prototype application at Elemental two years ago, and I liked it fairly well. I remember that setting it up was a bit weird (some strange config file that you had to get just right) but I like its approach to templating: instead of inventing a brand new templating language, it makes one tiny modification to Python so that you can use bare string literals (and expressions) instead of print statements to produce HTML. It also has a really cool trick, due to Neil Schemenauer, that avoids the security issues that are so common in naively written PHP applications (just read BUGTRAQ for a while and you'll know what I'm referring to): by default string expressions are automatically HTML-encoded, except string literals, which are assumed to contain valid HTML. This means that you can write '<h1>' + title + '</h1>' where title is some variable that you just received from the user, and HTML punctuation in title will be encoded, but the <h1> and </h1> tags will be passed through unchanged. But (as far as I recall) it doesn't have an interpolation strategy that's much more sophisticated than standard Python.
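The escaping trick can be approximated in ordinary Python with a string subclass that marks text as trusted markup. This is a sketch of the general technique only, not Quixote's implementation: the names `Markup` and `escape` are hypothetical, and where Quixote treats template literals as trusted automatically, this version requires wrapping them by hand.

```python
import html


def escape(value):
    """HTML-encode a value unless it is already trusted markup."""
    if isinstance(value, Markup):
        return str(value)
    return html.escape(str(value))


class Markup(str):
    """A string that is assumed to contain valid HTML already."""

    def __add__(self, other):
        # Anything joined onto trusted markup is escaped first.
        return Markup(str(self) + escape(other))

    def __radd__(self, other):
        # Handles plain_string + Markup(...) as well.
        return Markup(escape(other) + str(self))


# A value as it might arrive from a user (made up for the example).
title = "<script>alert(1)</script>"
page = Markup("<h1>") + title + Markup("</h1>")
```

Here `page` keeps the `<h1>` tags intact while the angle brackets inside `title` come out as `&lt;` and `&gt;`.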
Next I took a quick look at Michelle Levesque's PyWebOff blog. It's nearly a year ago that she last did much about comparing Python web frameworks; I saw the last entries about Nevow (a templating system in Twisted) and it scared the hell out of me. It takes 8 lines of inscrutable Python code and 12 lines of template HTML to produce a list with text in alternating colors. The template uses XML namespaces. I happen to know a lot about those (I was at Zope when we designed Zope's TAL templating language) but it is and will always be my opinion that XML was not intended for humans to be edited (except very occasionally as part of bootstrapping or debugging). And that goes doubly so for XML with namespaces. (Here I have to contend that Rails has the right idea -- "no XML sit-ups" is a great slogan!)
I should probably read Michelle's blog for her experiences with other frameworks; but I got distracted and tried to figure out what the Python web-sig is up to. There I found frequent mention of something called WSGI -- there's even a PEP, PEP 333! I should definitely study that. Although I fear that it's too low-level to really help me much; from the intro it appears to be more of a (very useful!) standard middleware API for Python web frameworks than something that provides much functionality that I could use right away in my application. And the word middleware (just like much of Phillip Eby's work, alas) scares me.
Before I post this, let me attempt a brief classification of the features that every web framework needs.
Independence from web server technology.
You should be able to run the same application under Apache, as a CGI script,
as a stand-alone server (e.g. BaseHTTPServer or Zope's or Twisted's built-in server),
etc. (The Java Servlet API does this really well IMO -- I used it at Elemental.)
This should include logging and basic error handling (an API to generate
any HTTP error, as well as a try/except around application code that returns
a 500 error code if the application code fails).
Templating with reuse. Every web application needs to mix computed data
(in which category I include data retrieved from a database) with HTML mark-up,
and often a lot of the HTML markup is common for many pages (e.g. global navigation).
Cookie handling. For authentication, preferences, sessions, etc.
Query parsing. The bread and butter of form handling.
URL dispatch. You've got to be flexible in how URL paths are mapped to callables.
Zope's URL-to-object mapping is extremely flexible. Django's approach is nice too.
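For the cookie handling and query parsing items in this list, the Python standard library already covers the basics. The following is a minimal sketch using `http.cookies` and `urllib.parse` from current Python; the helper names and the cookie name `session` are assumptions made for the example, not part of any framework mentioned here.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs


def parse_query(query_string):
    """Map each form field to the list of values submitted for it."""
    return parse_qs(query_string, keep_blank_values=True)


def read_cookies(cookie_header):
    """Turn the value of a Cookie request header into a plain dict."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return {name: morsel.value for name, morsel in cookie.items()}


def session_cookie_header(session_id):
    """Build the value of a Set-Cookie header for a session id."""
    cookie = SimpleCookie()
    cookie["session"] = session_id  # hypothetical cookie name
    cookie["session"]["path"] = "/"
    cookie["session"]["httponly"] = True
    return cookie["session"].OutputString()
```

A framework would typically wrap these in a request object, but the parsing itself is no more than this.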
I expect everything else is optional. You can write your own SQL (as we did at Elemental), use an object-relational mapping library (like Django or RoR), or use an object database like Zope. You can even persist things directly to the filesystem (just make sure it's being backed up :-). While every dynamic website eventually develops authentication needs, there are many different existing approaches to authentication, and I suspect that it's not particularly hard to do this as part of the application. Some frameworks go wild on predefined CSS and HTML templates. (I believe Plone does this -- if you see a site with frequent use of 1-pixel rectangular borders and a calendar widget in the margin, you can bet it's somebody's first Plone project.)
Please set me straight. What did I miss? Where is the WSGI standard implementation?
I am primarily a Python developer, but decided to use Ruby on Rails for a project on my dayjob (I was very lucky that they let me choose my own tools) and I'm really satisfied with it. I still prefer Python to Ruby as a programming language, tho.
If I were to start a personal project today I would probably pick webpy. It's very nicely written and concise (not to mention it's written by Aaron Swartz, whose coding skills are very trustworthy), and doesn't get in my way. It uses Django-style URL routing and SQLObject (which is very mature and the only ORM to date that hasn't given me headaches about UTF-8 handling). Other than that, it just routes requests to classes and lets you structure the application in your own way. It does have a set of libraries it's supposed to work with (such as Cheetah, for templating, which I rather like) but I don't think they're a requirement.
There's no "standard" WSGI implementation. And since it really is a protocol more than an API, there doesn't need to be a standard. PJE wrote the wsgiref library, but it's just a very simple framework for making WSGI compatibility a bit easier to handle, and to suggest some particular styles. But I'm not sure it's actually that much easier to use than to not use.
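To make the "protocol more than an API" point concrete: a WSGI application is just a callable taking `environ` and `start_response`, and anything that calls it that way counts as a server. The sketch below defines a tiny application and drives it by hand; `application` and `call_app` are names invented for this example, and `wsgiref.util.setup_testing_defaults` is used only to fill in a plausible environ.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A complete WSGI application: a callable with two arguments."""
    body = f"Hello from {environ.get('PATH_INFO', '/')}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path):
    """Play the server's role: build an environ and collect the response."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

The same `application` object could be handed unchanged to `wsgiref.simple_server.make_server` or to any other WSGI-capable server.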
Paste (http://pythonpaste.org) provides several WSGI tools, but you wouldn't use all of them, and it isn't a "framework" at all. It's tools to build a framework. Though for some kinds of software it might actually be a good level to work on, if you want to stay close to HTTP and aren't looking for MVC style programming. But that doesn't sound like what you are doing.
One thing Paste gives you is that you can build your own framework without feeling guilty, because you are really only building the "framework" part of it, you aren't building infrastructure. And a lot of the work you do can be factored out into something that any framework can use.
For instance, Julian Krause -- who is pretty new to Paste -- was able to make a framework with a CherryPy interface in the space of a week with a little help from me (http://rhubarbtart.org). Paste handles things like configuration, hooking up a server, catching exceptions and providing interactive debugging, scaffolding for getting a new working app up quickly, and other typical framework features. In about 300 lines of code. It's not a "full stack" framework like Django or TurboGears, but you might not be of the personality to be happy with full stack. I think some people will never be happy with a full stack.
Another framework built on Paste that is closer to a full stack -- and integrates several projects besides just Paste -- is Pylons (http://pylons.groovie.org/). For URI resolution it uses Routes (http://routes.groovie.org/ -- same author but fully decoupled packages), which is similar to the regular expression resolution except using a pattern language more targeted at matching paths than just arbitrary strings, and a resolution that is reversible. The other core piece of Pylons is the Myghty templating language. The full stack that Pylons adds from there -- forms and validation and persistence and all that -- is really more about suggesting a good layout and complementary packages, not providing tight integration.
For someone who's willing/ready to understand WSGI and the kind of things middleware can do (I don't like the "middleware" term either), Paste can help you build a very pleasing architecture. I think WSGI is formal in the right ways, and agnostic on the right matters, that it is the right basis for building a foundation for web programming. Foundations aren't suitable alone -- I sleep on a soft bed, not on concrete, no matter how strong the concrete is. A solid foundation in programming provides the tools to move forward in a deliberate and consistent way, building a body of code that you can rely on and build into ever higher abstractions without creating something hopelessly confusing.
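As an illustration of the kind of thing middleware can do, here is a sketch of a wrapper that sits between the server and an application and turns uncaught exceptions into a 500 response. It is not Paste's exception catcher; `ErrorCatcher` is a hypothetical name, and the sketch simplifies by reading the whole response body into memory.

```python
import sys
import traceback


class ErrorCatcher:
    """WSGI middleware: looks like an app to the server, like a server to the app."""

    def __init__(self, app, log=sys.stderr):
        self.app = app
        self.log = log

    def __call__(self, environ, start_response):
        try:
            result = self.app(environ, start_response)
            try:
                # Consume the body here so that errors raised while
                # iterating are caught too (a simplification).
                return list(result)
            finally:
                if hasattr(result, "close"):
                    result.close()
        except Exception:
            traceback.print_exc(file=self.log)
            body = b"Internal Server Error\n"
            start_response(
                "500 Internal Server Error",
                [("Content-Type", "text/plain"),
                 ("Content-Length", str(len(body)))],
                sys.exc_info(),
            )
            return [body]
```

Because the wrapper is itself a WSGI application, it can be stacked with other wrappers and used under any framework that speaks WSGI, for example `app = ErrorCatcher(my_app)` for some application `my_app`.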
I'm personally not involved in this project. And my experience with it is still at an evaluation level. Too bad it is still below an official release 1.0. The SVN version is recommended.
At my work we use Zope2 and are planning to migrate to Zope3, eventually. Z3 has many nice features. It prevents TTW scripting. Allows for reuse of partial components, classes and interfaces. But we find Z3 very hard to master. So, we echo your cry for a Pythonic web framework!
If you liked Django's URL mapping, you might have a look at Ben Bangert's Routes project (http://routes.groovie.org/). It's a nice system (closely modeled on a portion of Ruby on Rails), and is independent of any web framework. I'm using it with CherryPy.
I like the fact that it associates two-way URL/code mapping with named patterns. This means I can easily map a URL to some code, AND generate URLs from within code or templates. Since you don't construct URLs using simple string manipulations (or equivalent), it's easy to re-arrange things within a site.
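The two-way mapping idea can be sketched independently of any library: register each route once under a name, then use the same pattern both to match incoming paths and to generate URLs. The class below is an illustration only and is not the Routes implementation or its API; the `{name}` placeholder syntax and all names are assumptions of the example.

```python
import re
from urllib.parse import quote


class Router:
    """Named URL patterns that work in both directions."""

    def __init__(self):
        self._routes = {}

    def connect(self, name, template, handler):
        # Split "/blog/{year}/{slug}" into literal text and placeholder
        # names; odd-numbered pieces are the placeholder names.
        pieces = re.split(r"\{(\w+)\}", template)
        regex = "".join(
            re.escape(piece) if i % 2 == 0 else f"(?P<{piece}>[^/]+)"
            for i, piece in enumerate(pieces)
        )
        self._routes[name] = (re.compile(f"^{regex}$"), template, handler)

    def match(self, path):
        """URL -> (handler, parameters), or None if nothing matches."""
        for pattern, _template, handler in self._routes.values():
            found = pattern.match(path)
            if found:
                return handler, found.groupdict()
        return None

    def url_for(self, name, **params):
        """Route name + parameters -> URL, using the same template."""
        _pattern, template, _handler = self._routes[name]
        quoted = {key: quote(str(value), safe="") for key, value in params.items()}
        return template.format(**quoted)
```

Since templates and application code ask for URLs by route name, moving a page to a different path means editing one `connect` call rather than hunting for string concatenations.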
I find myself in a similar position (the difference is that I'm also fairly new to Python, I gather that's not the case for you!). I'm having some good success with http://turbogears.com which seems to take a similar approach to RonR but with Python (i.e. combining best-of-breed toolkits with some additional utility code to speed development).
Pros: Built on established tools: kid, cherrypy, and sqlobject; so there is loads of documentation out there for these. I especially like kid templates as they feel very flexible should you ever need to support Rest style XML over HTTP interfaces.
There's also a nice "screencast" in the style of the rails guys.
Cons: Most of the actual TurboGears documentation seems to refer to the CVS version rather than the latest release, but in practice I haven't found this to be too much of a problem as long as I don't get distracted by the "ooh, shiny things!" factor that some of the upcoming features have!
> Again, the templating language seems a weird mixture of HTML and Ruby, and I find Ruby's syntax grating (too many sigils).
It doesn't seem to be, it is: RoR just uses Ruby as a templating language, inserted with standard CGI tags (<% %> and <%= %>) and linked to the content of the controller associated with the template.
Other than that, I'd give you the same advice as RR Nederhoed: take a look at TurboGears. It's not a web framework in the regular sense (it's more a matter of combining what the maintainers consider "best of breed" in the current Python world into a single tool), and I especially like its templating system (Ryan Tomayko's Kid, which I also use as a standalone tool). It's extremely clean (you just write pure HTML/XML and add control attributes in Python for loops and conditions), very easy to call from code, and allows rapid generation of anything based on HTML or XML (both the template itself and the template output have to be valid XML, which is not a bad thing in this case).
TurboGears is in active development with good momentum. It's on the way to a polished 0.9 release, and much has been done since the last 0.8 release. To get all the new goodies, working from SVN is unavoidable at the moment.
About the raised requirements in regard to TurboGears
> Independence from web server technology. You should be able to run the same application under Apache, as a CGI script, as a stand-alone server (e.g. BaseHTTPServer or Zope's or Twisted's built-in server), etc. (The Java Servlet API does this really well IMO -- I used it at Elemental.) This should include logging and basic error handling (an API to generate any HTTP error, as well as a try/except around application code that returns a 500 error code if the application code fails).
TurboGears is based on CherryPy, which has WSGI support. It is also deployable on mod_python and lighttpd. There are some bugs in the WSGI adaptation, but these are being ironed out.
> Templating with reuse. Every web application needs to mix computed data (in which category I include data retrieved from a database) with HTML mark-up, and often a lot of the HTML markup is common for many pages (e.g. global navigation).
TurboGears primarily supports Kid templating, but can support any number of other template systems via plugins. So far Stan (from Nevow), ZPT and Cheetah have been provided as plugins.
> Cookie handling. For authentication, preferences, sessions, etc.
CherryPy gives you the means to do fairly good cookie handling. It has built-in functionality for sessions over cookies or other forms of cookie storage, and it can store the session data on different storage backends. TurboGears also implements an identity framework that lets you authenticate users and maps them, based on configurable credentials (cookie by default, afaik), to users in the database.
> Query parsing. The bread and butter of form handling.
CherryPy follows the approach of translating URL parameters to named arguments.
> URL dispatch. You've got to be flexible in how URL paths are mapped to callables. Zope's URL-to-object mapping is extremely flexible. Django's approach is nice too.
CherryPy is not entirely dry in this department, but some fairly sophisticated dispatching can either be hacked with __getattr__ or done by patching the easily modifiable path traversal function.
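The style of dispatch described in the last two answers, where path segments walk an object tree and query parameters become named arguments, can be sketched as follows. This is a simplified illustration and not CherryPy's code: `expose`, `dispatch`, `Root` and `Blog` are hypothetical names, and real frameworks handle many more cases (defaults, trailing slashes, POST bodies).

```python
from urllib.parse import parse_qs, urlsplit


def expose(func):
    """Mark a method as reachable from a URL."""
    func.exposed = True
    return func


class Blog:
    @expose
    def index(self):
        return "blog index"

    @expose
    def entry(self, year, title="untitled"):
        return f"{title} ({year})"


class Root:
    blog = Blog()

    @expose
    def index(self):
        return "home"


def dispatch(root, url):
    """Walk attributes along the path, then call the handler found."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    node = root
    while segments and not callable(node):
        name = segments[0]
        # Never expose private attributes through the URL.
        child = None if name.startswith("_") else getattr(node, name, None)
        if child is None:
            raise LookupError(f"nothing at {url!r}")
        node = child
        segments.pop(0)
    if not callable(node):
        node = getattr(node, "index", None)
    if not getattr(node, "exposed", False):
        raise LookupError(f"nothing at {url!r}")
    # Leftover path segments become positional arguments and
    # query parameters become keyword arguments.
    kwargs = {key: values[-1] for key, values in parse_qs(parts.query).items()}
    return node(*segments, **kwargs)
```

With this tree, a URL such as `/blog/entry/2006?title=WSGI` ends up as the call `Root().blog.entry("2006", title="WSGI")`.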
And to throw around a few buzzwords, TurboGears exclusively has:
* good Ajax integration (via MochiKit and JSON)
* CatWalk, which must be one of the slickest data administration tools I've seen, and it's all web/Ajax based
* a sweet web-based model designer for database schemas, complete with graphics generator and code generator
* an internationalization framework for multi-language applications that gets the fuss out of your code's way
* widgets for sophisticated form building
Personally I've chosen TurboGears for now, and it has worked out nicely (also because of the choice of subframeworks to stand on).
Last but not least, the subframeworks TurboGears uses all have their own development, documentation and bug trackers. The interaction between TurboGears and those frameworks is not one-directional; it goes both ways. TurboGears benefits from features, testing and documentation done independently for each subframework. The subframeworks benefit from increased development speed due to a bigger and more driven audience.
It's very pythonic, it doesn't require any templating language or learning anything new. It's just Python.
If you include Python within html, you use <% and %>. If you include html within python, you simply use quotes. You can also define functions that correspond to URLs, therefore creating a full site with only one script.
Python inside html (.pih) is good enough as a templating language (but you can use Cheetah if you want).
For something more powerful in terms of performance, I use bare-bones mod_python (psp or publisher + psp).
"The template uses XML namespaces. I happen to know a lot about those (I was at Zope when we designed Zope's TAL templating language) but it is and will always be my opinion that XML was not intended for humans to be edited (except very occasionally as part of bootstrapping or debugging)."
Given that XHTML is XML, and HTML either is, or is very close to, XML, I find it very natural to have a template system that is XML-based. At work, we use PHPTAL (a PHP version of the Zope Page Templates that you mention), and it works great. It means our templates may be viewed and edited in HTML-editors like Dreamweaver, as well, as there's nothing in there that isn't (X)HTML.
In other words, even if the template is XML, it doesn't mean you have to edit it as XML - it might be edited in a WYSIWYG-editor.
PHPTAL has also proven to be a powerful tool for general XML production, and can be used in other cases where you want to produce XML (or even text), and it works well, as both the input and output is XML, so you have "closure" (which means you can for example chain multiple transformations after each other), like XSLT. Also, PHPTAL/ZPT/etc. templates are useful in that the template (the input) has the same form as the result, which is not the case with for example constructing XML using DOM, nor using XSLT.
Django has a "magic removal" branch, which will be merged in a few weeks, I believe. Check out how things get going with Django after the merge. It's worth the shot. It's a very good thing that Python is so important in Google. When I introduced Python to my boss, he asked me who was using it. Telling him that Google uses it a lot was enough :-)