Hacker News
Profilerpedia: A Map of the Software Profiling Ecosystem (markhansen.co.nz)
52 points by todsacerdoti on Oct 4, 2021 | 10 comments



> Build converters between popular profilers and analysis UIs to unlock better analysis.

This would definitely be handy: a tool to convert among the profiler formats. Subsets of this exist -- tools that map one or two formats to one or two others -- but nothing very general.
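As a sketch of what one small corner of such a converter could look like (the input format is Brendan Gregg's folded-stack format; the output field names follow the Chrome Trace Format spec, but the sequential timing layout here is invented for illustration, since folded stacks carry counts, not timestamps):

```python
import json

def folded_to_chrome_trace(folded_lines, usec_per_sample=1000):
    """Convert folded stacks ("main;foo;bar 42") into Chrome Trace
    Format complete ("X") events, laying the samples out one after
    another so nested frames render as a stacked flame chart."""
    events = []
    ts = 0  # microseconds; CTF timestamps are in us
    for line in folded_lines:
        stack_str, _, count = line.rpartition(" ")
        dur = int(count) * usec_per_sample
        # Each frame in the stack becomes one "X" event spanning the
        # same interval; deeper frames nest inside shallower ones.
        for frame in stack_str.split(";"):
            events.append({"name": frame, "ph": "X", "ts": ts,
                           "dur": dur, "pid": 1, "tid": 1})
        ts += dur
    return {"traceEvents": events}

trace = folded_to_chrome_trace(["main;work;inner 3", "main;idle 2"])
with open("trace.json", "w") as f:
    json.dump(trace, f)
```

The resulting trace.json loads in chrome://tracing or ui.perfetto.dev.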

> Update profiler documentation to point to great analysis UIs that can accept the output for that profiler.

The Chrome Trace Format is versatile, and Perfetto [1] is a great visualizer for it. It's mentioned several times in the spreadsheet, but I wanted to endorse it because I've had some positive experiences lately. There are some subtleties to how the trace is rendered that don't seem to be fully captured by the trace format spec, though.

[1] https://ui.perfetto.dev/


Author here. Yeah, Chrome Trace Format + Perfetto is great and so easy to output. Anyone wanting to trace software could get some very quick wins by outputting Chrome Trace Format.
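To illustrate how little it takes (a minimal sketch, not the output of any particular profiler; the `span` helper and event layout are my own invention, following the Trace Event "B"/"E" duration-event convention):

```python
import json
import os
import time
from contextlib import contextmanager

_events = []

@contextmanager
def span(name):
    """Emit a Chrome Trace Format duration pair: a "B" (begin) event
    on entry and a matching "E" (end) event on exit, with timestamps
    in microseconds."""
    pid, tid = os.getpid(), 1
    _events.append({"name": name, "ph": "B", "pid": pid, "tid": tid,
                    "ts": time.perf_counter_ns() // 1000})
    try:
        yield
    finally:
        _events.append({"name": name, "ph": "E", "pid": pid, "tid": tid,
                        "ts": time.perf_counter_ns() // 1000})

with span("load"):
    with span("parse"):
        time.sleep(0.01)

with open("trace.json", "w") as f:
    json.dump({"traceEvents": _events}, f)
```

Open the resulting trace.json in ui.perfetto.dev and the nested spans show up as a flame chart, no converter needed.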

For converters, I'm trying to map those too, here's the tab of the spreadsheet tracking converters: https://docs.google.com/spreadsheets/d/1cVcHofphkQqk1yGeuBPV...


What would be cool is if the profiler and compiler teams could put their heads together and make a profile format that A) is easy to generate (with `perf record` or whatever) and B) can be used as a compiler input without conversion (as in `clang -fprofile-use=perf.data`).


Author here. You're probably already aware of this, but you can convert the `perf record` format to a format Clang understands: https://clang.llvm.org/docs/UsersManual.html#using-sampling-...

I expect the problem is somewhat political: profiler and compiler teams ship different features at different cadences and don't want to be tied together.


Free performance tools (tracing, profiling, visualization) from the HPC world, but more widely applicable, some with decades of R&D behind them, include:

BSC's https://tools.bsc.es/

TAU https://www.cs.uoregon.edu/research/tau/

The Scalasca set http://www.scalasca.org/ https://score-p.org https://github.com/orgs/score-p/repositories

HPCToolkit http://www.hpctoolkit.org/

Open|SpeedShop https://openspeedshop.org/

MAQAO (only partially free) http://www.maqao.org/

LIKWID https://hpc.fau.de/research/tools/likwid/

Ravel https://github.com/LLNL/ravel

ViTE https://solverstack.gitlabpages.inria.fr/vite/

and assorted other things, but that should be enough to be going on with.


Author here; thanks for sharing. The HPC world is a real blind spot for me, and I think we're probably missing some of its insights when profiling regular applications. I'll research these and add them to the database.


You are very wise. Visualization and analysis in particular, but other facilities too, from what I've seen. I couldn't find anything in the Python world that can measure a Python application calling native code (like linear algebra), but at least Score-P, Extrae, and TAU can, including for parallel applications.


Thanks, I've gone through and added these HPC tools to the spreadsheet.

FYI I'd consider py-spy for profiling Python+C extensions: https://github.com/benfred/py-spy#can-py-spy-profile-native-...


There's also Prodfiler from Halvar Flake and friends.

https://prodfiler.com/


AMD hardware is fast, but I miss VTune :(





