Abusing Python can really save your life sometimes. I think my favorite recent hack has been accessing and MODIFYING a caller's variable, because I couldn't get it passed through the Django module to the callback.
I love how everything is available at runtime. Python's "we are all consenting adults here" philosophy has been super useful in a pinch.
I was monkey patching an external library in JavaScript today, rather than having to fork the thing for one slight modification, and thought about this philosophy again.
Monkey patching is something that you should know how to do. And then learn not to do.
Monkey patching is how you get into situations where the order in which libraries get loaded causes software to unpredictably work or break. Or, worse yet, to work reliably on the developer's machine and break in the wild. (Because the external library on someone else's server loads slower for the developer than their local library, and faster for some users.)
While actually using it in live code is definitely a bad idea, I have to say it makes debugging and prototyping a lot easier, and doable interactively.
One of my biggest pet peeves is how, because popular media hyped Y2K up as a potential world-ending disaster and then nothing happened, people decided the whole thing was a myth. It was not a myth (though it was overblown); it was avoided because people engineered around it. I can't wait to see how people handle the Year 2038 problem.
Also I've seen managers pushing people to use prototype code directly for the production app, because "why fix what's not broken" and "we need this last week". So I write my prototypes as if they are going to be shipped.
I am curious how this affects the compiler in particular? Does the compiler measure this in some way and respond with different behavior? Or is it already designed less efficiently to account for these kinds of operations? Or a mixture of both and/or more nuanced hurdles? Thanks for any info you can provide.
Python 3.6 added a version field to dictionaries. Since all variables are looked up in dictionaries (aka namespaces), a compiler can replace a call to a builtin with a test of the version of the builtins dict (a guard), which chooses between an optimized version and the "naive" code.
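A rough sketch of the idea in plain Python - the real mechanism (PEP 509) keeps an internal integer version tag on the dict, so the snapshot comparison below is only a stand-in for that cheap guard:

    import builtins

    # Specialize len() once, and remember what the builtins namespace looked like.
    _fast_len = builtins.len
    _builtins_snapshot = dict(vars(builtins))   # stand-in for the hidden version tag

    def guarded_len(obj):
        # Guard: has anyone shadowed or replaced a builtin since we specialized?
        if vars(builtins) == _builtins_snapshot:    # the real guard is one integer compare
            return _fast_len(obj)                   # optimized path
        return builtins.len(obj)                    # fall back to the naive lookup

    print(guarded_len([1, 2, 3]))   # 3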
The effort required to unwind these technical debts is actively inhibiting our ability to deliver real value. That's the reality we've got.
Sloppy Python is particularly bad, because Python gives you a massive footgun to play with. Like you could write functions that radically alter their behaviour depending on where they're called from - that's similar to the hack the parent post was talking about.
It's easy to build systems that are coupled in really weird and unexpected ways, and that makes reasoning about code much harder than it needs to be.
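For the curious, a tiny sketch of that kind of footgun (names are made up): a function that changes its result based on what happens to be defined in its caller's frame.

    import inspect

    def context_sensitive():
        # Peek at the caller's local variables and change behaviour accordingly.
        caller_locals = inspect.stack()[1].frame.f_locals
        return "verbose result" if caller_locals.get("debug") else "normal result"

    def call_site_a():
        debug = True                    # merely *defining* this changes the callee
        return context_sensitive()

    def call_site_b():
        return context_sensitive()

    print(call_site_a())                # verbose result
    print(call_site_b())                # normal result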
That's not really the language's fault though. I've seen some amazingly bad Java, Python, JavaScript, PHP and bash. Be cross that technical debt was introduced and not paid back, rather than at the tools used to do it.
Most languages don't facilitate directly accessing the variables of other stack frames. Of the languages you listed, only Python does that (though bash has equivalent terrible bits), so that in particular _is_ Python's fault.
The reality is that Python provided you with a paying job, because it carried you to where you are. The effort and sheer dumb luck involved in keeping a business alive to the point that some dev can complain about technical debt is a very underappreciated luxury.
Oh yeah, the first time I monkey patched a method, it was an "aha" moment. I really understood the 'self' variable, and classes as variables, and Python just clicked all of a sudden for me.
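Something like the following toy example (not the code from back then) is the essence of it: a method is just a function stored on the class, and 'self' is just its first argument.

    class Greeter:
        def greet(self):
            return "hello"

    def shouted_greet(self):                 # a plain function; self is an ordinary argument
        return Greeter._original_greet(self).upper() + "!"

    Greeter._original_greet = Greeter.greet  # keep a handle to the original
    Greeter.greet = shouted_greet            # patch the class attribute at runtime

    g = Greeter()
    print(g.greet())                         # HELLO! -- instances pick up the patched method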
It is also the main reason Python is so hard to speed up and optimize; JavaScript is not quite as dynamic, so it can be sped up more (though super-dynamic constructs like "with" do kill all optimizations and hinder JITting).
A caveat: this example shows that you can read the value of a local variable, not that you can modify it. If you try actually assigning to an element of f_locals, as opposed to just dereferencing it, you'll find that the modification doesn't take effect due to optimizations in CPython's bytecode interpreter.
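A minimal sketch of the difference - note that the behaviour of writes depends on the CPython version (PEP 667 in 3.13 makes f_locals write-through), so treat this as illustrative:

    import sys

    def callee():
        caller = sys._getframe(1)            # the frame of caller()
        print(caller.f_locals["secret"])     # reading works: 41
        caller.f_locals["secret"] = 42       # on older CPythons this write is silently dropped

    def caller():
        secret = 41
        callee()
        print(secret)                        # still 41 on older CPythons; 42 on 3.13+

    caller()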
But then again, I was playing around with the function bytecode to transfer functions between instances in some of my for-fun hacks (you can access the bytecode of a function and manually create functions from bytecode in Python through "code objects").
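A small sketch of that kind of trick (made-up names), shipping a function around as its raw code object:

    import marshal
    import types

    def greet(name):
        return f"hello, {name}"

    # Serialize just the code object -- not the function, not its globals.
    payload = marshal.dumps(greet.__code__)

    # ...elsewhere, rebuild a callable from the raw bytes.
    code = marshal.loads(payload)
    rebuilt = types.FunctionType(code, globals(), "greet")
    print(rebuilt("world"))                  # hello, world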
What security holes can pickle cause? We used it for a while while training and saving our ML models which otherwise would have taken a lot of time to retrain each time the system starts.
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
from pickle import loads; loads(payload) # don't do it...!
File "...\lib\pickle.py", line 1388, in loads
return Unpickler(file).load()
File "...\lib\pickle.py", line 864, in load
dispatch[key](self)
File "...\lib\pickle.py", line 1139, in load_reduce
value = func(*args)
ValueError: bad marshal data (unknown type code)
Well, like eval(), it's only a security issue if you're reading pickled files from untrusted sources (that is, anywhere an attacker could have modified them). If you just ship them along with your Python source files, then it's a moot problem, since the attacker could just edit the source files.
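To make the risk concrete, here is the classic demonstration (a harmless payload, but the same mechanism runs arbitrary code):

    import os
    import pickle

    class Evil:
        def __reduce__(self):
            # Unpickling will call os.system("echo pwned") instead of rebuilding Evil.
            return (os.system, ("echo pwned",))

    payload = pickle.dumps(Evil())
    pickle.loads(payload)     # executes the shell command -- never unpickle untrusted data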
In that example, subclasses of Runnable can be transferred. However, only the body of the "execute" method and the "properties" object is transferred. If you need to access modules, you'd have to import them inside the execute method.
Can you give an example? It's a callback passed to the decorator implemented by the library. The library calls my callback with just the arguments it wants ;(
I think the parent misunderstood "inability to get it passed through the Django module to the callback". At first it sounded to me like you were trying to access one of your own values.
But if you actually plan to use this code in production, I would strongly advise against it :). There is a better solution available that will survive django upgrades:
Add a middleware that stores the session object in a thread-local variable for the duration of that request. You can then access that variable from your event handler.
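Roughly like this (a sketch in the modern Django middleware style; the names are made up):

    import threading

    _state = threading.local()

    class CurrentSessionMiddleware:
        def __init__(self, get_response):
            self.get_response = get_response

        def __call__(self, request):
            _state.session = request.session       # visible only to the current thread
            try:
                return self.get_response(request)
            finally:
                del _state.session                  # don't leak state into the next request

    def get_current_session():
        # Call this from signal handlers / callbacks that don't receive the request.
        return getattr(_state, "session", None)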
> Add a middleware that stores the session object in a thread-local variable for the duration of that request
This has come up many times, and I've never been able to figure out how to do this without something like Redis. Do you know of any open source examples of this?
Well, the idea is that I would have to store a variable and access it without races, so I would use an external storage library. This is pretty cool though, I've never used python threading for state, thanks a lot!
This doesn't use threading, just a thread-local variable (i.e. a variable that points to different memory locations for each thread) to make the whole thing threadsafe.
Yeah, that's why I said I wasn't sure of the context. :) I guess the middleware suggestion someone else posted is probably the idiomatic Django way, then.
In general this is why I don't like the direction Django is going. The Python idiom for this would be to create the callback as a closure after the request is available, and then pass it into the library function that calls the callback. I suspect you might still be able to do this, but you'd have to use a function-based view, and that's no longer idiomatic Django.
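A sketch of the closure approach, where `library_call` and the dict-shaped request are stand-ins for the real library and framework objects:

    # Stand-in for a library that only lets you register a callback and then
    # calls it with whatever arguments *it* chooses.
    def library_call(callback):
        callback("some payload")

    def my_view(request):                   # a function-based view gets the request directly
        def on_event(payload):              # closure: captures `request` without framework help
            print(request["user"], payload)

        library_call(on_event)

    my_view({"user": "alice"})              # a dict standing in for a real request object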
It seems like what Django folks want is to write a configuration markup language that is a subset of Python classes and not actually ever write any function bodies. The problem with this approach is that you're limited to the configuration points the framework gives you (in this case, the arguments passed to the callback). They can give you workarounds (i.e. middleware) but this is still fairly hacky. It would be better, IMHO, to stick with function-based views and provide library functions to do what the framework does, which can be called through function-based views in a way that's idiomatic with the host language.
But Django has gone too far down this path to change now, so I guess I just have to accept it. :)
> The web devs tell me that fuckit's versioning scheme is confusing, and that I should use "Semitic Versioning" instead. So starting with fuckit version ה.ג.א, package versions will use Hebrew Numerals.
There was a (proposed? I don't remember) library or language that deleted source lines that caused errors. Repeated invocations would eventually yield source that did not produce any errors. Of course, it might be an empty file...
"On Error Resume Next specifies that when a run-time error occurs, control goes to the statement immediately following the statement where the error occurred where execution continues. Use this form rather than On Error GoTo when accessing objects."
The point of "On Error Resume Next" is that you can check the error code on the next line. It changes the error handling from exception-style to error-return-style.
This brings me maniacal joy :D. One time, I had to make the opposite functionality: a streaming module was eating all errors inside a while True loop. This was making debugging really aggravating. So I made a DieWithHonor exception that caused the program to call process.kill() on its own PID >:D
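Roughly like this (a sketch from memory, not the original code - and note that SIGKILL is POSIX-only):

    import os
    import signal

    class DieWithHonor(Exception):
        def __init__(self, *args):
            super().__init__(*args)
            # A bare `except Exception: pass` can swallow the exception,
            # but it cannot swallow a signal sent to our own PID.
            os.kill(os.getpid(), signal.SIGKILL)   # POSIX-only; use SIGTERM on Windows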
(Python's LOAD_FAST bytecode does `fastlocals[i]`: how can we abuse the lack of bounds checking on this array access?)
(We've also discussed potential extensions to this approach to "lift" C-extension code in bytestrings into interpreter objects. This would be useful to escalate existing interpreter attacks in environments that try to lock things down.)
On a somewhat similar note, dfsch implements tail calls by essentially throwing a C-level exception (the VM contains somewhat ugly C macrology that implements exception handling in C, in a way similar to WinAPI exceptions; this is then used to implement TCO, tagbody, and the condition system). The implementation is arguably ugly, but the end result is that anything you can do in Scheme code you can also do in C native functions.
I've got a sneaking suspicion this is one of those clever, elegant, kind-of-awesome-thanks-python, yet entirely evil things you can, but never actually should do in a production app.
Yes, I can see how this makes sense, but I wouldn't say it is easier than recursion! Maybe a good exercise, but when traversing a tree the recursive solution is very easy to understand. The iterative solution ...
Then you have other problems where it is actually not super easy to immediately understand why the recursive solution "works" (think balanced-paren generation), and going on to create an iterative solution, while not insanely hard, is not something super obvious... what's your condition to stop iteration, again?
Maybe I'm thinking about that the wrong way, I'd love to learn more about it!
That problem is closely related to tree traversal, anyway, I should fool around with those two ideas a bit more to properly understand.
I never said it was easier, I just said it was not much harder. :)
Like I said, it is sometimes a bit less readable (tree-walking is a good example of such a case), but at the same time, what the code actually does is much clearer. You have no hidden costs, and the code is easier to optimize this way.
You mention a case where you do not fully understand why the recursive solution works, in which case you obviously can't easily write an iterative solution. However, in this case, you are poorly equipped to make any implementation, recursive or iterative.
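For what it's worth, here is the tree-walking case both ways, with a made-up minimal Node type, so the trade-off is concrete:

    class Node:
        def __init__(self, value, children=()):
            self.value, self.children = value, list(children)

    def walk_recursive(node, visit):
        visit(node.value)
        for child in node.children:
            walk_recursive(child, visit)

    def walk_iterative(root, visit):
        stack = [root]
        while stack:                                  # stop condition: nothing left to visit
            node = stack.pop()
            visit(node.value)
            stack.extend(reversed(node.children))     # reversed so children pop in order

    tree = Node(1, [Node(2, [Node(4)]), Node(3)])
    walk_recursive(tree, print)                       # 1 2 4 3
    walk_iterative(tree, print)                       # 1 2 4 3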
We started out trying to write a general method to convert top-down dynamic programming to bottom-up and this was as far as we got. Working around finite stack sizes and generating the topological sort of the function states are the only practical uses that I can think of.
This is a cool hack. Our project (Stopify.org) can do this transformation on source JS programs and automatically give you heap bounded stacks transparently.
A cool result of doing this in JavaScript is that any language that compiles to JavaScript (Python, OCaml, Scala, C++, Clojure...) can transparently get heap-bounded stacks by just using our compiler.
For some more examples, look at pyret (pyret.org) that also supports heap bounded stacks. (In fact, we were able to strip out the Pyret compiler and just use Stopify to get all these benefits)
In general, `f()()` means call the function returned by f() with no arguments. So f()()() means call the function returned by f()() with no arguments. In this case recurse() returns itself, so
recurse == recurse()()()()...
It's more of an interesting observation than anything useful
The syntax can occasionally be useful if you have a function that generates functions, but then you'd be calling the initial function with some argument, like this:
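(Something along these lines, with made-up names:)

    def make_adder(n):          # the "initial function" -- it returns another function
        def add(x):
            return x + n
        return add

    make_adder(3)(4)            # 7: call the returned function immediately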
well "recurse" returns an object, "()" executes the returned object. As that execution returns an executeable object as well, you can just start to chain the "()"s ad infinitum.
It's a little more devious than that. The key is that the decorator maintains a stack of call parameters, and a map from parameters to results.
When a recursive call happens, it aborts the top-level call, runs the recursive call directly, memoizes the result, and then re-executes the original call. Hence the warning about this only being usable for pure functions.
EDIT: also I just noticed that the map keys aren't the parameters themselves, but their string representations. So heaven help you if you try this with a data type whose str() method isn't one-to-one.
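The decorator described above also juggles a call stack and re-executes aborted calls; stripped down to just the string-keyed memo table (names made up), the pitfall looks like this:

    memo = {}

    def cached(f):
        def wrapper(*args):
            key = str(args)                 # keys are string representations, not the values
            if key not in memo:
                memo[key] = f(*args)
            return memo[key]
        return wrapper

    class Blob:
        def __init__(self, data):
            self.data = data
        def __repr__(self):
            return "<Blob>"                 # not one-to-one: every Blob prints the same

    @cached
    def size(blob):
        return len(blob.data)

    print(size(Blob("ab")))                 # 2
    print(size(Blob("abcd")))               # also 2 -- a stale result pulled from the memo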
My internship (trying to compile graphs into COBOL) involuntarily ended up abusing Java exceptions, because I didn't know how to write graph rewriting systems, induction, and backtracking... I think my employer never ever used that.