
I've watched the ecosystem from the pre-1.0 days and it is still steadily evolving. New packages are being developed (and abandoned) every day. Some of them are very mature, others are just starting, and many are just plain bad.

Implementations that stand out, like South, may get incorporated, especially when they're valuable to the core of Django. Not everyone will agree, but migrations (and class-based views) should have been in there from the start.

Others depend on your needs. Don't force me into using TinyMCE as a rich text editor; I'll regretfully make that decision myself. I want to choose the best available package for my CDN host. I'm switching thumbnail processing engines because mine hasn't been updated in over two years.

Importing all of my models into the shell is best left as an explicit modification for those who want it. If it were a good idea, why wouldn't Django do it by default?

Reasons to use Django over other, lighter frameworks:

* The ORM

* Forms

* contrib.admin

* Authentication

* Extendable command line interface




It's interesting: I've used Django for several apps (GFiber, various search quality prototypes) that don't at all fit the "database-backed webapp" paradigm. Obviously, the ORM, form libraries, admin, and authentication libraries didn't factor into that decision at all. What did were:

* A standardized, battle-tested system for organizing your code.

* Lightweight view definitions. Views are functions from Request to Response in Django, which makes them very easy to define. Django, Flask, and Pyramid all get this right; Pylons, web.py, and webapp2 get it wrong. Also, Request/Response objects are very sanely designed. (A minimal sketch follows this list.)

* Very full-featured templates & helper functions. If you're going to run into a problem, chances are that Django has already solved it. Lighter-weight microframeworks like Flask don't have this property.

* The ability to swap out components you don't need, and to bring in other components as necessary. For example, none of the Django apps I've used since joining Google have used an RDBMS - they all use RPCs as the backend. Oftentimes they need to run proprietary Google code in a controller. This is all not a problem - it's just code, passed between functions as necessary. (Granted, many of them use the Django-nonrel fork.)
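
For illustration, the view shape being described above looks roughly like this (a minimal sketch; the view name and the query parameter are made up):

    # a view is just a function from HttpRequest to HttpResponse
    from django.http import HttpResponse

    def hello(request):
        # pull whatever you need off the request, return a response
        name = request.GET.get('name', 'world')
        return HttpResponse('Hello, %s!' % name)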


> Views are functions from Request to Response in Django,

The thing that irritates me about Django is that the request is passed as an argument, and if you need it further down the chain, you have to pass it around. I want the request object to be available throughout the request cycle without it being explicitly passed. I like the Flask model of importing request when you need it. In Django, I can have a middleware handle it for me:

    from threading import local

    _locals = local()

    class ThreadLocalMiddleware(object):
        def process_request(self, request):
            # stash the current request in thread-local storage
            _locals.request = request

        @classmethod
        def request(cls):
            # fetch it back from anywhere later in the request cycle
            return _locals.request

        @classmethod
        def locals(cls):
            return _locals
But then there are edge cases. What if the threads are being reused? What about greenlets? I'm thinking of lifting the locals and LocalManager implementation out of Flask/Werkzeug.
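
Roughly what that might look like (a sketch only; the middleware name is made up, and the cleanup still has to be hooked into both the response and exception paths):

    from werkzeug.local import Local, LocalManager

    _local = Local()
    _local_manager = LocalManager([_local])

    class WerkzeugLocalMiddleware(object):
        def process_request(self, request):
            _local.request = request

        def process_response(self, request, response):
            # releases this thread's *or* greenlet's slot when the request ends
            _local_manager.cleanup()
            return response

        def process_exception(self, request, exception):
            _local_manager.cleanup()

        @classmethod
        def request(cls):
            return getattr(_local, 'request', None)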

> Very full-featured templates & helper functions.

Django templates are nice; Jinja2 is nicer:-)

> Lighter-weight microframeworks like Flask don't have this property.

I think it's not just about Flask being lightweight. It's more a function of Django having existed longer and being more popular. Granted, Django itself includes a lot of batteries that Flask doesn't, but the "somebody has already solved it" part is mostly either an extension or a forum/Stack Overflow solution.


That's one of the things about Django that I really like and one of the things about Flask that really scares me.

I'm writing a retrospective now about things I've learned in almost 5 years of working on Google Search, and the entry I just finished was "In almost every case where we allowed non-local effects, it has bitten us. Badly." The cost is in understandability, and is usually not apparent when the choice to make it a global is made. But 2 years later, things inevitably grind to a halt because bugs are being shipped and nobody has a handle on the code any more.

I've cursed out having to explicitly pass arguments around many, many times before (not just with Google work; when I was doing Haskell stuff I used to curse how inconvenient it was that I needed to explicitly pass around stuff, and often wrote a state monad to hide it from myself). But I've found that over time, those apps never hit the brick wall where I'm like "I cannot deal with this anymore"; they remain maintainable, if ugly, indefinitely.


> "In almost every case where we allowed non-local effects, it has bitten us. Badly."

But the request object in Flask is not global. If the guarantees break (request locals leaking to other requests), it will be a problem, but I have never had a request local leak in Flask. On the other hand, I have run into cases where I need the request object in Django signals. A simple task like getting the full URI requires either a request object (I would say that's a bad API, but that's how it is in Django) or configuring the Sites framework. Bad API or not, if I am in a request, I should have access to it, and Django makes it hard. Dress it up if you want to comply with "no globals" - RequestManager.get_current_request(). Though I don't see how that's any better than `from flask import request` when you know it does the equivalent of RequestManager.get_current_request() internally.


> But the request object in Flask is not global.

So you're saying that if I don't use multithreading my global variables are not, in fact, global?


> So you're saying that if I don't use multithreading my global variables are not, in fact, global?

The request object is context local. If there are multiple concurrent requests, each request has its own request object. I don't know what definition of global variable you have, but this is not it.


> The request object is context local.

The request object is a threadlocal (or, if greenlet is installed, a greenlet-local).

Which is exactly the same as a global variable in a single-threaded program.


> The request object is a threadlocal (or, if greenlet is installed, a greenlet-local).

Request objects aren't stored in Python's thread-local storage. If they were, thread re-use would leak the context local. Werkzeug uses the thread/greenlet id as the key in its storage and cleans up when the request completes.
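
In rough outline, that storage works something like this (a simplified sketch, not Werkzeug's actual code):

    try:
        from greenlet import getcurrent as get_ident  # greenlet id when greenlets are in play
    except ImportError:
        from threading import get_ident               # plain thread id otherwise

    class SimpleLocal(object):
        def __init__(self):
            object.__setattr__(self, '_storage', {})

        def __setattr__(self, name, value):
            # one dict per thread/greenlet id
            self._storage.setdefault(get_ident(), {})[name] = value

        def __getattr__(self, name):
            try:
                return self._storage[get_ident()][name]
            except KeyError:
                raise AttributeError(name)

        def release(self):
            # called when the request finishes, so a reused thread/greenlet starts clean
            self._storage.pop(get_ident(), None)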

> Which is exactly the same as a global variable in a single-threaded program.

In a single-threaded server, the context locals won't leak from one request to another. If they were global, they would. I don't see how the request object has anything to do with global variables.


> Request objects aren't stored in Python's thread-local storage. If they were, thread re-use would leak the context local.

No more so than Werkzeug, since it works the same way.

> Werkzeug uses the thread/greenlet id as the key in its storage

Which is exactly how Python's threadlocals work.

> and cleans up when the request completes.

You do realize that nothing stops you from "cleaning up" your threadlocal "when the request completes", right? Yes, Werkzeug's locals and LocalManagers conveniently provide hooks for that cleanup, but that's what it is: convenient. The primary purpose, and the one explicitly mentioned in Werkzeug's own documentation, is having a single system for both greenlets and threads (the stdlib's threading.local obviously doesn't work with greenlets, just as Werkzeug's locals don't work with non-greenlet coroutines).

> In a single-threaded server, the context locals won't leak from one request to another. If they were global, they would.

Not if you cleaned them up after the request has executed. And apparently that's all you need to make a global variable not-global-after-all.

> I don't see how the request object has anything to do with global variables.

It's an object available from all scopes. That's pretty much the definition of a global variable.


> No more so than Werkzeug, since it works the same way.

I don't know if you are being intentionally dense. Werkzeug doesn't work the same way. Thread locals restrict themselves to the thread's lifecycle; Werkzeug locals restrict themselves to the request's lifecycle. They are two very different things.

> Which is exactly how Python's threadlocals work.

Except for different lifecycle.

> You do realize that nothing stops you from "cleaning up" your threadlocal "when the request completes", right?

That "cleaning up" makes it request local, not thread local. They are global for the duration of the request and one request scope is separate from another request which makes them request local(how it is done is immaterial). Spare me the definition of global. There is no point in continuing this discussion any further since I don't seem to have any epiphany from your insights; and doesn't seem like you are going to have one either.


Leakage between requests is only one of the problems with globals. There's also poor testability, inability to re-use functions that reference the request internally, inability to statically analyze dependencies (less of an issue in Python, since the lack of types makes it hard to analyze them anyway), additional context that the developer must keep in his head, and inability to cross-reference parameters & call-sites and identify where a given variable is coming from in code browsers or debuggers.

Singletons are no better - if you dress it up as RequestManager.get_current_request(), it's just as bad. If you don't understand why something is a bad idea, complying with the letter of the law without complying with the spirit doesn't really get you much.

I think this is the way in which my programming style has changed the most over the last 10 years. I used to want my programs to do everything and have access to everything. So I'd write PHP scripts that would do "SELECT * FROM table INNER JOIN table2 WHERE ..." and pass around the whole result set to every function and every template in the app in case they wanted to access some data in it. That way, I only had to change the DB schema and the template when requirements changed. (Anyone who's thinking of doing this, you're introducing massive data leakage and security issues into your app.)

Now I want my programs to do as little as possible and have access to as little as possible. Usually I end up pulling out the values I want inside the view function and calling any helpers using only those values, so I don't pass the request around anyway. The actual work of the program happens on strings, or ints, or structs, or other values specific to the domain.

I think what changed, for me, were a few things. One is that I got better at visualizing the operation of a large program as a whole, so instead of thinking "Hmm, I need a request here, how can I get access to one", I started thinking in terms of "This is how data enters the system, this is what we need for each computation it performs, and this is how we combine those results to render a page - how can I change it so the system still hangs together but does what we want now?" Another was that I kept running into instances where I could almost re-use a function, but was blocked because of hidden dependencies that it didn't really need. A third was testability - I learned how crucially important it is to be able to test parts of the system in isolation, and that's really hard when you have to setup the whole system context to run any function. A fourth was that I worked a bunch with both systems and found that adding a parameter to each function along a call path is easy (but tedious - if only there were a refactoring tool to do this for us), while disentangling a tightly-coupled function that depends upon a lot of implicit state is quite challenging.


> There's also poor testability, inability to re-use functions that reference the request internally, inability to statically analyze dependencies (less of an issue in Python, since the lack of types makes it hard to analyze them anyway), additional context that the developer must keep in his head, and inability to cross-reference parameters & call-sites and identify where a given variable is coming from in code browsers or debuggers.

I don't know. Most of them seem like a non-issue to me. Shouldn't the API cover the testability (of the request)? At least, it does in Flask. Re-using a function which uses the request won't be any different from re-using a function which uses datetime. I do get your point about the difficulty of analyzing Python owing to dynamic typing and monkey patching, but I couldn't connect it to context locals. As for cross-referencing, you just treat request as you treat datetime.

> Singletons are no better - if you dress it up as RequestManager.get_current_request(), it's just as bad. If you don't understand why something is a bad idea, complying with the letter of the law without complying with the spirit doesn't really get you much.

I won't do that. That was just a tongue-in-cheek comment to appease the "death to globals" police.

> Now I want my programs to do as little as possible and have access to as little as possible.

I don't disagree with that, but assuming that the request object is available for the request's lifecycle is pretty reasonable.

> Usually I end up pulling out the values I want inside the view function and calling any helpers using only those values, so I don't pass the request around anyway.

You don't pass the request around in a general workflow (why would you?). But consider https://github.com/tschellenbach/Django-facebook. It fires a post_save signal when a new user registers, and I need to send a welcome mail asking the user to activate his account. I need request.build_absolute_uri to construct the URI, and guess what, I don't have the request in the callback and I don't have a way to obtain it. Now there are other ways I can solve it - configure the Sites framework and generate the URL from that; simply have the base URL in settings and construct the URL with a urljoin (sketched below); maybe in some alternate universe, Django's reverse generates an external URL. But the thing is, I have needed access to the request object on more than a few occasions. Granted, if the callback passed me the request object, it wouldn't have been an issue, but it doesn't, and I had to resort to thread-local middleware to save the request object.
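
For concreteness, the settings + urljoin fallback looks roughly like this (Python 2 / old-style import paths to match the era; SITE_BASE_URL and the 'activate' URL name are placeholders of mine, and the signal wiring only loosely mirrors what django-facebook triggers):

    from urlparse import urljoin  # urllib.parse.urljoin on Python 3

    from django.conf import settings
    from django.contrib.auth.models import User
    from django.core.mail import send_mail
    from django.core.urlresolvers import reverse
    from django.db.models.signals import post_save
    from django.dispatch import receiver

    @receiver(post_save, sender=User)
    def send_activation_mail(sender, instance, created, **kwargs):
        if not created:
            return
        # no request here, so no request.build_absolute_uri; fall back to settings
        path = reverse('activate', args=[instance.pk])
        send_mail('Activate your account',
                  'Visit %s to activate your account.' % urljoin(settings.SITE_BASE_URL, path),
                  settings.DEFAULT_FROM_EMAIL,
                  [instance.email])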

> A fourth was that I worked a bunch with both systems and found that adding a parameter to each function along a call path is easy

The issue is APIs which you haven't written. I can always change the packages myself, but I don't want to sync up with upstream for something I see as a small change.



