Hacker News
FunctionTrace: Graphical Python Profiler (functiontrace.com)
118 points by alex_hirner on July 5, 2023 | hide | past | favorite | 31 comments



It's really cool to see the Firefox Profiler UI reused in new projects. I'm biased having worked on it, but it's a very powerful visualization tool for complex multi-threaded performance data.


Is there something about the actual profiler that differs from existing tools like pyinstrument [1] or py-spy [2]? I know pyinstrument has various output options and I wonder if it could potentially output something readable by the Firefox Profiler tool.

[1] : https://github.com/joerick/pyinstrument [2] : https://github.com/benfred/py-spy


It uses FEE (function entry/exit) tracing, so it generates a full execution trace of every function called during your run, like “History” [1], the C/C++ tracing profiler it was based on.

You can then trivially post-process that to get the low-resolution aggregate information a normal statistical/sampling profiler generates.

[1] https://www.ghs.com/products/MULTI_IDE.html
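A rough sketch of that post-processing step, using a made-up event format (FunctionTrace's actual trace format will differ): collapse entry/exit events into per-function exclusive time and call counts, the kind of summary a sampling profiler only approximates.

```python
from collections import defaultdict

def aggregate(events):
    """Collapse a full entry/exit trace into per-function totals.

    events: iterable of (kind, name, timestamp) tuples,
    where kind is "enter" or "exit" (illustrative format only).
    """
    totals = defaultdict(float)   # exclusive time per function
    counts = defaultdict(int)
    stack = []                    # [name, entry_time, time_in_children]
    for kind, name, ts in events:
        if kind == "enter":
            stack.append([name, ts, 0.0])
        else:  # "exit"
            fname, entered, child_time = stack.pop()
            elapsed = ts - entered
            totals[fname] += elapsed - child_time  # exclusive time only
            counts[fname] += 1
            if stack:
                stack[-1][2] += elapsed            # charge parent's child time
    return dict(totals), dict(counts)

# Synthetic trace: main() calls helper() twice.
trace = [
    ("enter", "main", 0.0),
    ("enter", "helper", 1.0), ("exit", "helper", 2.0),
    ("enter", "helper", 3.0), ("exit", "helper", 5.0),
    ("exit", "main", 6.0),
]
totals, counts = aggregate(trace)
print(totals)  # main: 3.0 exclusive, helper: 3.0 total
print(counts)  # main called once, helper twice
```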


Compare to py-spy, which runs out of process and reads process memory to reconstruct stack frames as a sampler. That means marginal overhead for the program, so it's safer to use in prod. It would be interesting to see function entry/exit tracing on Linux via eBPF attached to USDT probes.


They claim tracing overhead is <10%. This is very believable, and frankly even seems kind of high: you can do a full execution trace of a C program for a similar low double-digit percent overhead, and Python is usually on the order of 10x slower, so naively you'd expect something like 1% overhead.

I assume the hooks Python makes available are just not great for this use case, and they don't have access to a proper high-speed log.
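For reference, the long-standing hook CPython exposes for this is `sys.setprofile` (C extensions can use `PyEval_SetProfile`, and 3.12 added `sys.monitoring` as a lower-overhead alternative). A minimal sketch of an entry/exit hook, just to show the per-event callback that every tiny function call has to pay for:

```python
import sys

calls = []

def profiler(frame, event, arg):
    # The interpreter invokes this on every function entry and exit;
    # this per-event cost is where tracing overhead comes from.
    if event in ("call", "return"):
        calls.append((event, frame.f_code.co_name))

def tiny(x):
    return x + 1

sys.setprofile(profiler)
total = sum(tiny(i) for i in range(3))
sys.setprofile(None)

print(total)
print([c for c in calls if c[1] == "tiny"])  # 3 call/return pairs for tiny()
```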


Yeah, it tends to be low single-digit percents, but you can get some pessimal behavior if you have enough tiny functions, because the overhead of what CPython exposes starts becoming large. Our "benchmark" where the 10% comes from is TensorFlow, which is overwhelmingly tiny functions with no sustained I/O or long-running individual functions.


The license is a bit perplexing at https://gitlab.com/mbryant/functiontrace/-/blob/master/LICEN...

It says it is licensed under the Apache License 2.0, but also under the "Prosperity Public License 3.0.0", which limits commercial use to 30 days.


I'm not sure why the GitLab UI shows Apache 2.0, but PPL 3.0.0 is the correct license (and is what's in LICENSE.md).


I don't see any pricing page or anything. So if I'm a business and after the 30 days decide I'm sold on it, what exactly do I do?


The website (https://functiontrace.com/) mentions that dual licensing is available at the bottom of the page, so my best guess would be that you need to contact the author for pricing details.


ah, that makes sense. Thanks!


Related:

Building FunctionTrace, a graphical Python profiler - https://news.ycombinator.com/item?id=24175395 - Aug 2020 (4 comments)


https://github.com/benfred/py-spy is also really nice, and has an actual OSS license.


If I'm understanding this correctly, you need to use Cargo to install part of this. That makes it a lot harder to start using as someone in the Python ecosystem (who is not also in the Rust ecosystem).


Python packaging is painful.

They will probably get a build system going so wheels can be used eventually.
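For anyone curious what the two-step install looks like, this is my understanding from the project's docs (commands are assumptions; check functiontrace.com for the current instructions):

```shell
# The Python-side tracer installs via pip as usual:
pip install functiontrace

# The trace server is a Rust binary, hence the Cargo step:
cargo install functiontrace-server

# Then run a script under the profiler and open the output
# in the Firefox Profiler viewer:
python -m functiontrace your_script.py
```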


It's nice to see how many different approaches to profiling there are these days in Python. I work on another (commercial but with free plan) Python profiler, Sciagraph: https://sciagraph.com.

The main use case is data science and other long-running batch jobs. Some differences:

1. It does memory profiling at basically no performance overhead; sounds like for FunctionTrace it's high overhead so off by default. And it catches _all_ memory allocations, not just Python API ones. This is based on using sampling, so it's not useful for profiling tiny functions (but for data science/scientific computing it'll work just fine).

2. Uses sampling for performance profiling, unlike FunctionTrace. Again, perfectly fine for any non-micro-benchmark data science program.

3. Also has a timeline view, without having to upload your data anywhere.

4. No native stacks yet.

5. Shows you if you're using CPU or I/O for every particular sample.


You should really not randomly speculate when doing adversarial comparisons. Note: I am not an author of FunctionTrace.

1. You claim ~5% overhead [1] and say it's not useful for profiling tiny functions. They claim ~10% overhead when handling tiny functions, and someone below, who I believe is the author of the package, said it averages low single-digit percents (i.e., <5%) in normal, non-worst cases. That's comparable to or less than the overhead you claim.

2. Sampling is so much worse than a full execution trace from a performance-optimization and observability perspective that it's ridiculous. Yes, a magnifying glass and a microscope are both okay for looking at ants, but only one is good for looking at cells. People prefer sampling because it was traditionally lower overhead, but at the same cost a full trace is so much better that it's not even worth comparing; the result is patently obvious.

3. No “upload” is done. You are not sending data to them, everything is local. They are just opening the profiling data using Firefox to use it as a viewer like how you might open a PDF in Firefox.

[1] https://www.sciagraph.com/docs/understanding/fil/


FunctionTrace is very good! I've used it a lot for hammering down bottlenecks, and it's easy to drop into a script.


Looks like it supports native stacks as well, great! Most bottlenecks in seriously performance-sensitive programs are in native code, which makes built-in tools like cProfile hardly useful.


It's a shame there isn't any support for coroutines. I'm really hoping a profiling tool comes along that handles them plus threads/processes.


Is this something that could be made to work with MicroPython?


Most of the work is done via a C extension, which I suspect would be incompatible with MicroPython. The same techniques are presumably applicable, depending on how many interesting hooks MicroPython exposes.


Came here to ask the same thing.


Functiontrace is awesome. It's easier to use than most other Python profilers in my experience, with a great viewer through Firefox too.


How does this compare to snakeviz?


snakeviz, tuna, pyprof2calltree, etc. use data collected by cProfile, which is probably the most flawed profiler anyone has shipped for a long time, because cProfile only records the caller instead of the full stack. Strangely enough, the documentation for cProfile has an entire "Limitations" section which does not mention this.

Recording only callers instead of stacks means you literally cannot distinguish different control flows going through a common function. All tools, like the ones mentioned, that pretend to reconstruct a flame graph or similar visualization from this data don't actually work. A common source of degenerate cases is a function decorator shared by multiple functions, especially if the decorator appears multiple times on the stack.

So for many practical programs the output of these is simply wrong and misleading.
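A small synthetic illustration of the caller-only problem (function names are made up): when two paths reach a shared helper, full stack records keep the attributions separate, while cProfile-style (caller, callee) pairs merge at the common function and lose the split.

```python
# Two paths reach helper(): a -> common -> helper (7s)
# and b -> common -> helper (3s).
full_stacks = [
    (("a", "common", "helper"), 7.0),
    (("b", "common", "helper"), 3.0),
]

# Full-stack view: helper's time is unambiguously split by root caller.
by_root = {}
for stack, t in full_stacks:
    by_root[stack[0]] = by_root.get(stack[0], 0.0) + t
print(by_root)  # {'a': 7.0, 'b': 3.0}

# Caller-only view: all that survives are per-edge totals.
edges = {}
for stack, t in full_stacks:
    for caller, callee in zip(stack, stack[1:]):
        edges[(caller, callee)] = edges.get((caller, callee), 0.0) + t
print(edges)
# edges[('common', 'helper')] == 10.0 — from these pairs alone there is
# no way to tell how much of that 10.0 flowed through a vs. b, which is
# why flame graphs rebuilt from cProfile data can be wrong.
```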


Whatever happened to scalene?


Nothing at all, I was using it just fine earlier today :)


How the heck do you use the fancy web interface?

Whenever I go to the demo it just loads forever :(


How about Jupyter Notebooks?


Looks great. Wish JetBrains implemented this.



