Run, Python, Run (hbfs.wordpress.com)
84 points by jemeshsu on Sept 17, 2011 | 8 comments



RunSnakeRun is a nice tool for viewing the pstats data the Python profiler generates. But for me, the squaremap view is hard to read and doesn't really add any value. There's also no way to 'watch' a pstats file, or to combine many files into one. Generally I prefer working from the command line, even for starting GUI applications.

My current setup for profiling a Python CGI application consists of a set of scripts that mock the CGI runtime, invoke the application under cProfile, and accumulate the profiling data in a single pstats file. To get a grasp of the hot spots, I use the magnificent gprof2dot script (http://code.google.com/p/jrfonseca/wiki/Gprof2Dot) to generate a call graph annotated with profiling data (example: http://wiki.jrfonseca.googlecode.com/git/gprof2dot.png). Gprof2dot hides all unimportant calls (by default, those under 1% of the runtime) and colors functions according to the time spent executing them. To see absolute numbers, I wrote a small Python script that prints the first few entries of a pstats file to the terminal, and I run it under watch (http://linux.die.net/man/1/watch) to keep an eye on absolute runtimes.
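
In case it's useful, here's a minimal sketch of that kind of script (the default filename and the ten-entry cutoff are placeholders; any pstats file will do):

    #!/usr/bin/env python
    # Print the top entries of one or more pstats files; run it under
    # watch(1), e.g.  watch -n 2 ./top_stats.py profile.out
    import sys
    import pstats

    files = sys.argv[1:] or ["profile.out"]   # placeholder default filename
    stats = pstats.Stats(*files)              # multiple files get merged
    stats.sort_stats("cumulative")            # sort by cumulative time
    stats.print_stats(10)                     # show only the first ten entries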

Overall this is a nice setup, and I'm very impressed by the rich set of tools available for Python, though some of the APIs feel a bit strange to me (cProfile and pstats, for example).


For profiling web requests for a WSGI app, I like to use the Scotch WSGI recorder [1]: you set it up to record one request, make that request with your browser, then replay it over and over for profiling. (This is probably equivalent to your CGI mocking.)

[1] http://darcs.idyll.org/~t/projects/scotch/doc/
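
For anyone who doesn't want another dependency, the same idea can be sketched by hand: capture one request's WSGI environ, then call the app with it repeatedly under cProfile. This is not Scotch's API, just a rough illustration; the app import and the environ contents are placeholders:

    import cProfile
    import sys
    from io import BytesIO

    from myapp import application   # placeholder: your WSGI callable

    environ = {                     # a minimal hand-"recorded" GET request
        "REQUEST_METHOD": "GET",
        "PATH_INFO": "/slow/page",
        "QUERY_STRING": "",
        "SERVER_NAME": "localhost",
        "SERVER_PORT": "80",
        "wsgi.version": (1, 0),
        "wsgi.url_scheme": "http",
        "wsgi.input": BytesIO(b""),
        "wsgi.errors": sys.stderr,
    }

    def start_response(status, headers, exc_info=None):
        pass

    def replay(n=100):
        for _ in range(n):
            body = application(dict(environ), start_response)
            list(body)              # consume the response iterable

    cProfile.run("replay()", "profile.out")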


Just to be clear: RunSnakeRun is a viewer that presents data from the standard Python profiler package. You can use the profiler without it; see http://docs.python.org/library/profile.html
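
For example (the workload function here is just a stand-in), either from code or via python -m cProfile on the command line:

    import cProfile
    import pstats

    def work():
        return sum(i * i for i in range(10 ** 6))   # placeholder workload

    # equivalent to: python -m cProfile -o work.pstats yourscript.py
    cProfile.run("work()", "work.pstats")
    pstats.Stats("work.pstats").sort_stats("time").print_stats(5)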

Also, when you're using a new API, it's worth paying attention to the default arguments of functions. Since .pop() defaults to index -1, you shouldn't be surprised that popping from the end of a list is the most efficient case (this is related to the O(n^2) vs. O(n) issue mentioned in the article).
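
A quick (unscientific) way to see the difference, popping from the end vs. the front of a 100,000-element list:

    import timeit

    # pop() defaults to index -1, the last element: O(1) per call.
    # pop(0) has to shift every remaining element, so the whole loop is O(n^2).
    setup = "items = list(range(100000))"
    print(timeit.timeit("while items: items.pop()", setup, number=1))
    print(timeit.timeit("while items: items.pop(0)", setup, number=1))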


I've been very happy with yappi: http://code.google.com/p/yappi/ . The output is a little clunky to work with at first, but it tells you what you need to know.
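
Roughly, usage looks like this (based on yappi's start/stop interface; older releases expose slightly different function names, and the workload here is a placeholder):

    import yappi

    def work():
        return sum(i * i for i in range(10 ** 6))   # placeholder workload

    yappi.start()                          # profile all threads from here on
    work()
    yappi.stop()
    yappi.get_func_stats().print_all()     # the "clunky" but complete output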


Since the post mentions it, can someone provide a summary of the asymptotic runtime of the different Python list operations? I have no idea how Python lists are implemented.


They're implemented like C++ vectors or Java ArrayLists. a[x] is O(1), and so is a.append. del a[x] is O(n) unless x == -1, since deleting from the end doesn't require shifting any elements.


CPython lists are self-resizing arrays, IIRC. That makes lookups O(1), insert/remove at the end O(1) amortized, and insert/remove anywhere else O(n). In short: append, extend, and pop (from the end) are efficient.
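
A quick illustration of the difference (timings will vary by machine, but the shape is what matters):

    import timeit

    # append is amortized O(1); insert at the front is O(n) because every
    # element after the insertion point has to shift.
    print(timeit.timeit("a.append(0)", "a = []", number=100000))
    print(timeit.timeit("a.insert(0, 0)", "a = []", number=100000))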


They're described here: http://wiki.python.org/moin/TimeComplexity

(As others have said, they're what you'd expect from an amortized dynamic array that grows in proportion to its contents.)



