I've done some Xperf/ETW, and saw many similarities to Linux perf: various events to trace, in groups, and a trace->dump->post-process workflow.
Linux perf has good support for profiling performance monitoring counters, from instructions per cycle to uops events. I haven't seen them instrumented on Windows, other than by Intel VTune. Linux perf can also do kernel & user-level dynamic tracing; I've heard such support exists on Windows, but not seen it done yet.
The event documentation with Windows ETW looked better. On Linux, some of the events (tracepoints) have no documentation other than the kernel source. (But, at least you have kernel source. :)
perf, and Xperf, are pretty good, but what I think they really need is more kernel-level programmability, to do (more) custom in-kernel filtering and summaries. DTrace has shown the value that this can bring. It should lower the overhead involved, and allow these tools to be used more often. Linux perf may get there with eBPF (although it does currently have in-kernel counts and custom filters)...
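To make the overhead argument concrete, here is a toy Python sketch. It is not real tracing code: the event fields, PIDs, and function names are all made up for illustration. It compares copying every event out for post-processing against filtering and summarizing at the source, so that only a small histogram has to be handed over.

```python
from collections import Counter
import random


def make_events(n, seed=0):
    """Generate n fake events (hypothetical fields: pid, latency_us)."""
    rng = random.Random(seed)
    return [
        {"pid": rng.choice([101, 202, 303]), "latency_us": rng.randint(1, 5000)}
        for _ in range(n)
    ]


def bucket(latency_us):
    """Power-of-two bucket index, like a log2 latency histogram."""
    return latency_us.bit_length()


def dump_then_postprocess(events, pid):
    """Copy every event out first, then filter and summarize afterwards."""
    trace = list(events)  # stands in for the full trace dump
    hist = Counter(bucket(e["latency_us"]) for e in trace if e["pid"] == pid)
    return hist, len(trace)  # second value: records copied out


def summarize_in_place(events, pid):
    """Filter and aggregate at the source; only the histogram is handed over."""
    hist = Counter()
    for e in events:
        if e["pid"] == pid:
            hist[bucket(e["latency_us"])] += 1
    return hist, len(hist)  # second value: records copied out


events = make_events(10000)
full_hist, full_copied = dump_then_postprocess(events, 202)
small_hist, small_copied = summarize_in_place(events, 202)
print("same answer:", full_hist == small_hist)
print("records copied:", full_copied, "vs", small_copied)
```

Both paths give the same histogram; the difference is how much data has to leave the source to get it.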
I do some profiling on Linux, although nowhere near as much as on Windows. My observations of the differences are:
I love how easy it is to measure CPU performance counters on Linux, whereas this requires VTune or similar on Windows and is generally painful. I've heard that ETW does support these performance counters but that Microsoft has yet to expose them, which is sad.
I love how easy it is to get counters like context-switch counts on Linux. Just run your command under perf and the results are available the instant it finishes.
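As a rough illustration of how little work it takes to consume those counters, here is a Python sketch that parses the comma-separated output of `perf stat -x,` (which perf writes to stderr). The sample text is made up, and the field order (value, unit, event name, ...) is my understanding of the CSV format described in the perf-stat man page; check it against your perf version. The function name is hypothetical.

```python
import csv
import io

# Made-up sample in the shape printed by something like:
#   perf stat -x, -e context-switches,instructions,cycles <command>
# Real output may carry extra metric fields and different values.
SAMPLE = """\
1523,,context-switches,2001234567,100.00,,
8123456789,,instructions,2001234567,100.00,,
4061728394,,cycles,2001234567,100.00,,
"""


def parse_perf_stat_csv(text):
    """Return {event_name: count} from `perf stat -x,` style output."""
    counts = {}
    for row in csv.reader(io.StringIO(text)):
        # Skip blank lines, comments, and anything too short to be a counter row.
        if len(row) < 3 or row[0].startswith("#"):
            continue
        value, event = row[0], row[2]
        # Skip rows such as "<not counted>" or "<not supported>".
        if not value.isdigit():
            continue
        counts[event] = int(value)
    return counts


counts = parse_perf_stat_csv(SAMPLE)
ipc = counts["instructions"] / counts["cycles"]
print("context switches:", counts["context-switches"])
print("instructions per cycle:", round(ipc, 2))
```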
perf top is also a wonderful thing for real-time investigations.
So, Linux wins for quick access, access to real-time information, and CPU performance counters.
xperf/ETW/WPA wins a lot of points for the UI. This allows exploring details. For instance, on Linux I can trivially see how many context switches there are, but zooming in on an individual context switch to find out when and why it happened... I wouldn't know how to begin.
Brendan said:
> The event documentation with Windows ETW looked better.
At least they have tooltips now -- that's progress.
So, Windows needs access to CPU performance counters, kernel-level programmability, and the ability to quickly get back statistics. Linux needs a UI for exploring the data from "perf record", and Linux needs a better story for retrieving related symbols and source automatically.
In short, both platforms should try to copy the best features of the other because they are both missing vital items.
BTW, nice job on the idle-time flame graphs, but I'd definitely still give xperf/ETW/WPA the advantage, especially when tracing wait-chains that span processes.
Another general perf question: How do you generally explore perf problems? I find that I can narrow down performance differences to the point of "we spent more time in function foo" fairly easily. Understanding why can be difficult (was it an intentional change? Should we have been in there at all? Did some data structure change this time around?).
All in all, I feel like a detective, and don't have a rigorous method to apply when looking at problems. Just wondering if anybody feels similarly, or if clarity will come with more knowledge and understanding.
I've got no silver bullet for that. Recently I found a bunch of functions that were running slower on Windows. I eventually proved that their code had not changed and they were being run the same number of times as before. I had to use perf on Linux to prove that the i-cache miss rate was elevated due to changes elsewhere, making the identical code run slower.
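The comparison in that kind of investigation boils down to normalizing misses by instructions retired. A minimal Python sketch, with entirely hypothetical counter values (not the numbers from the investigation above) and assuming perf's `L1-icache-load-misses` event is available on the CPU in question:

```python
def mpki(misses, instructions):
    """Misses per thousand instructions."""
    return 1000.0 * misses / instructions


# Hypothetical counts from two runs of the same unchanged functions.
# Equal instruction counts stand in for "same code, run the same number of times".
before = {"instructions": 1_000_000_000, "L1-icache-load-misses": 2_000_000}
after = {"instructions": 1_000_000_000, "L1-icache-load-misses": 6_000_000}

before_mpki = mpki(before["L1-icache-load-misses"], before["instructions"])
after_mpki = mpki(after["L1-icache-load-misses"], after["instructions"])

print("i-cache MPKI before:", before_mpki)
print("i-cache MPKI after: ", after_mpki)
print("ratio:", after_mpki / before_mpki)
```

If the instruction counts match but the miss rate has jumped, the slowdown is coming from somewhere other than the functions themselves.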
Having a good sense of how long something "should" take is certainly important, but investigating small performance regressions is always hard.
That said, there are often a lot of big performance problems, and those are easier to find and often easier to understand, and more important to fix. Hey, maybe we shouldn't have a 50 MB global variable in each server instance -- whaddya think?
Being a detective is good. Apply the scientific method, form hypotheses, test and reject them, rinse, repeat. That's certainly way better than the oft-applied "random guess based on Internet search" technique.
Removed licensing restriction: Zoom is transitioning into a free software product and is no longer available for sale
I used Zoom as a commercial product on Linux and found it helpful, although licensing bugs eventually made it work poorly and I have not used it since changing jobs. It looks like I may be using it again.
Symbols are also often more difficult to get going on Linux. Bruce Dawson (who can probably answer this Q better than anyone) has written about it in the past: https://randomascii.wordpress.com/2013/02/20/symbols-on-linu...