Hacker News
Speedometer 3.0: A shared browser benchmark for web application responsiveness (browserbench.org)
241 points by cpeterso 3 months ago | 168 comments



This is fantastic! Speedometer 1.0 was a breath of fresh air, and 2.0 was a much-needed refresh, but it's really been showing its age in recent years. 3.0 looks like a solid upgrade with many new kinds of sub-tests, contemporary frameworks, etc.

I'm looking forward to sucking at this, and then slowly and systematically improving. :^)


Good luck! FYI, there's a hidden developer menu at https://browserbench.org/Speedometer3.0/?developerMode that's handy for browser developers to change the number of iterations, select specific tests, etc., and ?startAutomatically avoids needing to click the button to start the tests.


I'd love some videos seeing you try to improve the score or trying to get it to run :^)


Related announcement post:

Improving Performance in Firefox and Across the Web with Speedometer 3

https://hacks.mozilla.org/2024/03/improving-performance-in-f...


Another related announcement, with a bit more detail on specifics of the benchmark changes (and some history of Speedometer): https://webkit.org/blog/15131/speedometer-3-0-the-best-way-y...


Pretty neat:

> This is the first time the Speedometer benchmark, or any major browser benchmark, has been developed through a cross-industry collaboration supported by each major browser engine: Blink/V8, Gecko/SpiderMonkey, and WebKit/JavaScriptCore.


[flagged]


Are there any small engines that have enough of the web implemented that they can run it?


The only other (semi) alive browser engine today is Servo, originally by Mozilla (and the reason Rust was created), which is these days a Linux Foundation project funded by Igalia.

There are no small web engines anymore. Every other one, from KHTML to Presto to Trident, is dead.


Opera?


Opera uses Chromium so it's covered under Blink/V8.


Presto hasn't been used for years


Try it here: https://browserbench.org/Speedometer3.0/

Very unscientific results using a Mac Studio - Chrome: 20.4, Safari: 17.9, Firefox: 20.1.

Safari on an iPhone 13 Pro Max - 16.5.


On my Early 2015 MBP

Safari 17.4. : 5.52

Chrome 122 : 6.25

Firefox 123 : 7.26

The results pretty much confirm my general feeling about how each browser behaves on my machine, with Firefox being the fastest.

And 22.5 on my iPhone 14 on iOS 17.4

That is, my smartphone is ~4x faster than my laptop.


What Safari version are you using? For me, with 17.4, Safari is ahead of Chrome and Firefox, though it is close if you use dev channel.


macOS 14.4 for the Mac Studio tests, and iOS 17.4 for the Safari-on-iOS test.


I'm getting 27.7 in Safari 17.4 on an M1 Pro MacBook. I'm puzzled how you got so low on a Mac Studio.


Another Mac Studio M1 with Safari 17.4:

  22.7 with Content Blocker
  28.7 without Content Blocker


That was it, thanks! I thought I'd controlled for this by using a Private Window, but I'm getting 25.9 with extensions disabled (compared to 17.9 in my original post).


PC i7 13700k (chrome) - 29.8

iPhone 15 Pro (safari) - 23.1

MacBook Air M1 (chrome) - 25.9

MacBook Air M1 (safari) - 18.0


[flagged]



> Is it though?

In my experience it's the buggiest browser out of the big three, and is often missing basic features, e.g.:

https://caniuse.com/?search=opus

Supported in Firefox for *12 years* now, in Chrome for 10, still no support in Safari.

They only "support" Opus audio in their special snowflake '.caf' container, which is super buggy and the last time I checked no open source program could even generate Opus '.caf' files that could be played by Safari on all Apple platforms. I ended up writing a custom converter which takes a standard '.opus' file and remuxes it on-the-fly (I only store '.opus' files on my server) into Safari-compatible '.caf' files, taking special care to massage it so that it avoids all of their demuxer/decoder bugs. You shouldn't have to do this to have cross-browser high quality audio!


> You shouldn't have to do this to have cross-browser high quality audio!

You don't, because HE-AACv2 is as universal as MP3 and better than Opus at low bitrates.

That said, Safari for macOS and iOS plays all the examples at https://opus-codec.org/examples/ except the last.


> That said, Safari for macOS and iOS plays all the examples at https://opus-codec.org/examples/ except the last.

That's because none of those samples are Opus files, except the last one. It even says so on the page.

> You don't, because HE-AACv2 is as universal as MP3 and better than Opus at low bitrates.

No. I did evaluate it before picking Opus. It only beats Opus at very low bitrates, and open source encoders for AAC suck.


> That's because none of those samples are Opus files, except the last one.

Ooof, I didn't even imagine that the official examples were WAV files. Here's an Opus audio file that plays fine in Safari on macOS and iOS: https://kur-static.biblica.com/audio/GEN_001.webm (Note: I have no idea what this content is, but could not find any English Opus content in the wild.)

> …and open source encoders for AAC suck.

Yeah, the disparity is a real bummer.


> Here's an Opus audio file that plays fine in Safari on macOS and iOS

Yeah, that has Opus packed into a Matroska container (which people usually use only for videos and not pure audio). I suppose that's another good way of getting around the problem!


Just go to the home page https://wpt.fyi/ and see the chart "Browser-specific failures are the number of WPT tests which fail in exactly one browser." Safari leads by a long shot, with over 3,800 tests failing only in Safari. Firefox has 1,700 and Chrome fewer, which kinda correlates with my own personal development experience.


Interop is only a tiny subset of the entire suite of WPT tests, and it only contains tests that all vendors agreed upon, so no browser will look bad in Interop.

If you look at the full WPT test suite [1], you'll see that Safari is by far the one failing the biggest number of tests, i.e. the most buggy browser.

The Safari team likes to use Interop to trick people into thinking Safari is as good as the others. It's just a PR play.

[1] https://wpt.fyi/results/?label=experimental&label=master&ali...


For a less biased result, use Stable: https://wpt.fyi/results/?label=master&label=stable&aligned

> If you look at the full WPT test suite [1], you'll see that Safari is by far the one failing the biggest number of tests, i.e. the most buggy browser.

In Safari's case, most WPT test failures mean "hasn't been implemented yet".

> Interop is only a tiny subset of the entire suite of WPT tests, and it only contains tests that all vendors agreed upon…

Exactly. If you're happy building "Works with Chrome" web apps, Safari is not for you.


"Browser-specific failures are the number of WPT tests which fail in exactly one browser." From wpt.fyi

In other terms, WPT test failures for Safari means Safari has bugs or unsupported features that both Firefox and Chrome do not have.

As for Interop, it focuses on specific, very limited areas, like "scrolling" or "subgrid", and is in no way representative of the overall feature set of a browser.

So no, contrary to what you're implying, it's not that Chrome is too advanced, or doing too much, it's really Safari that is buggy and lagging behind both Chrome and Firefox (by a lot).


> In other terms, WPT test failures for Safari means Safari has bugs or unsupported features that both Firefox and Chrome do not have.

Yep! Safari is not the browser for people who need cutting-edge features, especially not for ones still at the proposal stage.


> Nevertheless, it’s still a garbage browser.

That seems like quite an absurd statement. Why do you think so?


On my M2 MBA (16GB RAM, 256GB SSD):

REGULAR MODE

Safari 17.3.1: 27.1

Safari 17.3.1 (Private): 26.2

Firefox 123.0.1 (uBlock Disabled, Enhanced Tracking Protection Disabled): 29.5

Firefox 123.0.1 (w/ uBlock Enabled, ETP Enabled): 27.1

LOW POWER MODE

Safari 17.3.1: 17.37

Safari 17.3.1 (Private): 16.99

Firefox 123.0.1 (uBlock Disabled, Enhanced Tracking Protection Disabled): 20.0

Firefox 123.0.1 (w/ uBlock Enabled, ETP Enabled): 17.8

Safari has 0 extensions installed, and Firefox 0 extensions installed besides uBlock Origin. Benchmarks were run with each browser as the sole application open and plugged in to a power supply.


For a more diverse view, as seen from my M2 MBA 8gb RAM:

Speedometer 3.0: Arc: 22.6, Orion: 19.6, Safari: 19.0, Chrome: 22.6, Firefox: 20.7

Speedometer 2.1: Arc: 408, Orion: 467, Safari: 481, Chrome: 404, Firefox: 478

No changes beyond stock browser. No extensions beyond stock install. Battery.


I get 31 in Safari, 20 in Firefox (M2 Mac Mini base model, Safari no extensions, Firefox with Ublock and some about:config adjustments)



On Firefox iOS: open in a new window because you won’t be able to press back to come back to this discussion easily.


This collaboration is pretty exciting. I would expect that the teams of all three rendering engines (WebKit, Blink, Gecko) have done whatever they could to improve performance for the launch and that there won't be any outliers at the beginning with all of them having similar performance.

But the title of future performance king is up for grabs! And now we have a de-facto standard for browser performance benchmarking.


This may be a dumb question, but what do the scores even mean? Is this explained anywhere? Neither https://browserbench.org/Speedometer3.0/about.html nor https://browserbench.org/Speedometer3.0/instructions.html appear to explain it. Are lower scores better, or higher scores?


Higher is better. The analogy is speed. You want more speed.

It's not a physical speed, just a benchmark number. Think of it as arbitrary units, which allows you to compare different versions of browsers on the same machine.


If the analogy isn't working for you, you can see the actual durations when you click on "Details".


> You want more speed.

On the other hand, premature optimization is the root of all evil.

> Think of it as arbitrary units, which allows you to compare different version of browsers on the same machine.

That's precisely the problem. It's arbitrary, meaningless. Without any physical units, I don't know what's good or bad, fast or slow. And why do the scores go from 0 to 140 when the web browsers are all getting approximately 20?


> On the other hand, premature optimization is the root of all evil.

The web ecosystem is extremely mature and widely used. The workloads are fairly well understood. It is a magic unit, but the factors that go into it have a lot of thought from real-world scenarios. Bringing up "premature optimization" is completely irrelevant because that's not what this is, it's about as far as you can get from that.


> that's not what this is, it's about as far as you can get from that.

I don't know what it is. How exactly does the score relate to the experience of the web browser user?

I'm a browser extension developer, and I've occasionally had people ask me about Speedometer scores, but I have no idea what they're supposed to mean or what to tell these people.


Speedometer measures web app responsiveness. Roughly, it simulates a series of user operations on web apps built with various frameworks (as well as vanilla JS), and measures the time it takes to complete them and paint the results to the screen.

The score is a rescaled version of inverse time - if it goes up, that implies the browser can handle more user operations per second, or alternately, it takes fewer milliseconds to complete a user operation in a complex web app.


> Speedometer measures web app responsiveness.

We know that, but you haven't said anything specific about scores other than higher scores are faster, in an abstract sense, which has already been established.


"The score is a rescaled version of inverse time" is the key here.

If you run all the tests in half the time, your Speedometer score will double. If your score improves by 1%, it implies that you are 1% faster on the subtests.

(There are probably some subtleties here because we're using the geometric mean to avoid putting too much weight on any individual subtest, but the rough intuition should still hold.)
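The "rescaled inverse time" idea can be sketched in a few lines. This is an illustrative model only, not the actual Speedometer source; the scale factor here is arbitrary:

```javascript
// Geometric mean of per-subtest times, so no single subtest dominates.
function geometricMean(values) {
  const logSum = values.reduce((acc, v) => acc + Math.log(v), 0);
  return Math.exp(logSum / values.length);
}

// Toy score: invert the mean time and apply an arbitrary scaling factor.
function score(subtestTimesMs, scale = 1000) {
  return scale / geometricMean(subtestTimesMs);
}

// Halving every subtest time doubles the score.
const base = [200, 300, 250];
const halved = base.map((t) => t / 2);
console.log(score(halved) / score(base)); // → 2 (up to floating-point rounding)
```

This also shows why a 1% score improvement corresponds to roughly 1% faster subtests: the score is just a constant divided by a mean time.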


(I work on SpiderMonkey.)

Benchmarking is hard. It is very easy to write a benchmark where improving your score does not improve real-world performance, and over time even a good benchmark will become less useful as the important improvements are all made. This V8 blog post about Octane is a good description of some of the issues: https://v8.dev/blog/retiring-octane

Speedometer 3, in my experience, is the least bad browser benchmark. It hits code that we know from independent evidence is important for real-world performance. We've been targeting our performance work at Speedometer 3 for the last year, and we've seen good results. My favourite example: a few years ago, we decided that initial pageload performance was our performance priority for the year, and we spent some time trying to optimize for that. Speedometer 3 is not primarily a pageload benchmark. Nevertheless, our pageload telemetry improved more from targeting Speedometer 3 than it did when we were deliberately targeting pageload. (See the pretty graphs here: https://hacks.mozilla.org/2023/10/down-and-to-the-right-fire...) This is the advantage of having a good benchmark; it speeds up the iterative cycle of identifying a potential issue, writing a patch, and evaluating the results.


This doesn't say anything about what the scores mean.

21 is apparently better than 20, but how much better? You could say "1 better", tautologically, but how does that relate to the real world?

Driving a car 1 mile per hour faster may be better, in a sense, but even if you drove 24 hours straight, it would only gain you 24 total miles, which is almost negligible on such a long trip. Nobody would be impressed by that difference.


It means it is 5% faster. You are overcomplicating it.


Percentages are rarely informative without an absolute reference.

A 5% raise for someone who makes $20k per year is $1k, whereas a 5% raise for someone who makes $200k is $10k, which would be a 50% raise for the former.


You've demonstrated you understand how to use the score to compare both inter-browser performance (analogous to the amount each makes per year) as well as individual browser performance improvements (analogous to the amount of the raise). Seems pretty informative to me?


Iain explained that in a reply to your other comment: https://news.ycombinator.com/item?id=39672279

> "The score is a rescaled version of inverse time" is the key here.

> If you run all the tests in half the time, your Speedometer score will double. If your score improves by 1%, it implies that you are 1% faster on the subtests.

> (There are probably some subtleties here because we're using the geometric mean to avoid putting too much weight on any individual subtest, but the rough intuition should still hold.)



That's irrelevant. The speedometer reading is an absolute reference. The percentages being discussed are simply comparisons, and they're only being discussed to say "they behave like you'd expect."

To directly answer your original question: a reading of 21 is 5% better than a reading of 20 because 21 is 5% greater than 20, and this means that a 21 speed browser should do things 5% faster than a 20 speed browser.

TL;DR: They behave like you'd expect.


> The speedometer reading is an absolute reference.

To what?

I talked about driving a car. Miles and hours are an absolute reference. We still have no absolute reference for Speedometer.


To... itself? Go measure something. You now have a reference!

If you scratch out the labels of your car speedometer and forget which is which, it still measures speed. 80 is still 33% faster than 60, regardless of the units.


This reply is ridiculous. I'm done.


Suit yourself!

I suspect your questions would be answered better by playing around with the tool in question for a few minutes anyway, as you seem to be asking about capabilities the tool does not purport to have.


I guess that’s why it’s fairly interesting to see scores thrown out in this thread on random hardware. It’s anecdata, but gives a sense of the spread/variance of scores for common platforms. I don’t think this is a number that is ever going to make much sense for consumers to use, because without this sort of context it’s just going to be the Spinal Tap ‘this one goes to 11’ sort of problem.


They say something about the speed of the browser, so it doesn't really make sense to ask extension developers about it, I don't think. Possibly that your extension might make the browser slower, so you could compare scores with and without the extension and see whether it negatively affects performance? (Although I'm not sure it can necessarily tell you anything about to what extent it affects performance, only that it does.)


> Although I'm not sure it can necessarily tell you anything about to what extent it affects performance

Exactly.


The score goes from 0 to 140 so that there's some room for when computers get faster. When we started working on this, all browsers were maxed at 140, so the computation got changed.


I thought the front page goes to 140 just because it is modeled after actual GM dashboard speedometers produced ~1960-1990, sometimes with a range of 0-85 mph, or 0-140 km/h in metric markets.


Yes.

The speedometer graphic was inherited from Speedometer 2. When Speedometer 2 was released, scores were in a reasonable car-speed range. The combination of hardware and software improvements meant that early versions of Speedometer 3 (which includes a subset of Speedometer 2 tests) were consistently scoring above 140, so we adjusted the scaling factor (IIRC, by ~20x) to give plenty of room for future improvements.


Nothing actually stops the score from going higher than 140, it will just max out the visual dashboard at that point. On Speedometer 2, Safari on M3 Macs ended up over 500. At scores that high it’s harder to have intuition, thus the changed scale of the new test.


The detailed results screen shows actual times.


Hopefully at some point actual click latency will be fixed in general after some dark decades.

Still incredible that a gameboy or an 80's computer with a CRT feels more responsive than most devices these days.

Bring back tactility. I'm convinced the choppiness and weird waits are actually psychologically stressing us out. That's why good keyboards + old low latency OS'es or typewriters are so soothing to use.


What is the latency being referred to here?

I don’t see noticeable lag on pressing/tapping buttons or other ui components in day to day browsing, even on my quite old iPhone.

There are obviously ways to make delays in web content anyway (user action->synchronous network request being the canonical one), but assuming there’s nothing silly like that lag isn’t an issue I’ve noticed.

Actual execution latency is something I worked on for many years in JSC, so there are a lot of engine optimizations to reduce that latency as much as possible (the interpreter itself and its performance, bytecode caches, hilarious amounts of lazy parsing and source skipping, etc.), so even the first time a UI element triggers code there shouldn’t be any significant delay.

Obviously if a developer makes poor choices there’s only so much you can do, but by and large there aren’t that many bad things a web developer can do that a native dev can’t also do (and devs in both environments frequently do :-/).


Latency will always be an issue as long as developers use web technologies. Nothing beats native.


> Hopefully at some point actual click latency will be fixed in general after some dark decades.

Meaning, the 300ms delay (if the site developer does no optimization) with mobile browsers?




Platform Wars Results

Samsung S24 Galaxy - 13.8 (very new, with the good processor)

iPhone 13 Mini - 22.3

iPhone 15 Pro - 23.1

MBA M2 - 24.2 (Safari)

Win 11 i9-13950HX - 26.8 (Edge)


Wow. An S24 is that far behind? Does the browser make a difference?

I figured the iPhones would be faster but not 2x.


My iPhone 11 scores 16.8, which really highlights how much of a lead Apple opened up during Qualcomm’s complacent decade.

I think that matters less for benchmark wars between tribes, fun as those always are, than as a stark reminder that any of us building for the public should remember that an S24 is _really_ fast for an Android phone and your median user probably bought whatever was on sale a couple years ago. That means that if you’re one of the many developers using an iPhone which isn’t roughly a decade old, you have no idea how your app feels to the median user because your device can run so much more code before it feels sluggish.


Yeah this tracks, Apple clearly have sold their souls to the devil to get the performance they have on iOS. It's basically the sole reason I have an iPhone.


> Samsung S24 Galaxy

Curious: Is it the Exynos version (Rest of the world) or the Snapdragon version (US, China)?


They appear to be in the United States


It crashes on my iPhone 14 Pro.


Comparing a 3 year old iPhone mini to the current Galaxy is a Platform Warcrime.


> The primary goal of Speedometer 3 is to reflect the real-world Web as much as possible, so that users benefit when a browser improves its score on the benchmark.

As with any other benchmark, its results will be interpreted incorrectly and will have little effect on the real world.

Google already has vast amounts of real-world data. The end result? "Oh, you should aim for a Largest Contentful Paint of 2.5 seconds or lower" (emphasis mine): https://blog.chromium.org/2020/05/the-science-behind-web-vit... Why? Because in real world the vast majority of sites is worse.

Browsers are already optimised beyond any reasonable expectation. Benchmarks like these focus on all the wrong things with little to no benefit to the actual performance of real-life web.

Make all benchmarks you want, but then Google's own Youtube will load 2.5 MB of CSS and 12 MB of Javascript to display a grid of images, and Google's own Lighthouse will scream at you for the hundreds of errors and warnings Youtube embed triggers.

Edit:

Optimise all you want, and run any benchmarks you want for the "real world", but performance inequality gap will still be there: https://infrequently.org/2024/01/performance-inequality-gap-...

Optimise all you want, and run any benchmarks you want for the "real world", but Lighthouse will warn you when you have over 800 DOM nodes, and will show an error for more than 1400 DOM nodes (which are laughably small numbers) for a reason: https://developer.chrome.com/docs/lighthouse/performance/dom...


Speedometer 3 is designed to handle the real world you describe. Edge’s post has some details - https://blogs.windows.com/msedgedev/2024/03/11/contributing-...



The last two versions have led to demonstrable speedups in major browsers.

Why wouldn’t this?

Bad developers (or management dictates) will be bad no matter what. That’s not a reason to give up.


The problem is complex and has many sides.

1. These companies themselves don't practice what they preach. No matter how fast Speedometer 3 is, Google's own web.dev takes three seconds to load a list of articles, and breaks client-side navigation. Google's own Lighthouse screams at you for embedding Youtube and suggests third-party alternatives [1]

2. The DOM is a horrendously bad, no-good, insanely slow system for anything dynamic. And apps are dynamic.

There's only so much you can optimise in it, or hack around it, until you run into its limitations. The mere fact that a ToDo app with a measly 6000 nodes is called a complex app in these tests is telling.

And the authors of these tests don't even understand the problem. Here's Edge team: "the complexity of the DOM and CSS rules is an important driver of end-user perceived latency. Inefficient patterns, often encouraged by popular frameworks, have exacerbated the problem, creating new performance cliffs within modern web applications".

The popular frameworks go to extreme lengths to not touch the DOM more than it is necessary. The reason the DOM and CSS end up being complicated is precisely because apps are complex, and the DOM is ill-equipped to deal with that.

This only goes to further show that browser developers have very little understanding of actual web development. And this is on top of the existing problem that web developers have very little understanding of how fast modern machines are and how inefficient web tech is.

This brings us neatly to point number 3:

3. Much of the complexity on the web in the modern web apps is due to the fact that the web has next to no building blocks suitable for anything complex.

https://open-ui.org was started 3(4?) years ago by devs from Microsoft, and you can see from the sheer number of elements and controls just how lacking the web is.

So what do you do when you need a proper stylable control for your app? Oh, you "use inefficient patterns by modern frameworks" because there's literally no other way.

And even if all of those controls do end up being implemented in browsers, it will still not be enough because all the other things will still be unavailable: from DOM efficiency to ability to do proper animations to ability to override control rendering to...

[1] I'm not kidding. Here's the help page it links: https://developer.chrome.com/docs/lighthouse/performance/thi...


This _already_ proved effective, as outlined in this post: https://hacks.mozilla.org/2023/10/down-and-to-the-right-fire...

Of course website authors should also do their part :-)



I see really insane outliers on some tests, some of the time, and this seems to kill the score on my platform (Firefox 122 on Linux x86_64).

As an example "NewsSite-Next" has 8 out of 9 repetitions between 215 and 236 ms, but 1 out of 9 (iteration 5) is 1884 ms. This is such a radical outlier I have trouble believing it could be a browser bug. Visually, the interface seems to get "hung" when switching between tests sometimes. I don't have a great explanation for that.

The specific issue in this case is with NewsSite-Next/NavigateToUS, which reports a 144.5% variance as a result of this one outlier.

I see several others like this in the results, although none quite as extreme.
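For anyone who wants to sift their own "Details" output for spikes like this, a simple median-based filter does the job. This is a hypothetical helper, not part of the benchmark:

```javascript
// Flag any repetition whose time exceeds the median by a chosen factor.
// The median is robust against the very outliers we're hunting for.
function findOutliers(timesMs, factor = 3) {
  const sorted = [...timesMs].sort((a, b) => a - b);
  const mid = sorted.length / 2;
  const median =
    sorted.length % 2
      ? sorted[Math.floor(mid)]
      : (sorted[mid - 1] + sorted[mid]) / 2;
  return timesMs
    .map((t, i) => ({ iteration: i, time: t }))
    .filter(({ time }) => time > factor * median);
}

// Eight runs in the normal range plus the ~1.9 s spike described above.
const runs = [215, 220, 222, 225, 228, 230, 233, 236, 1884];
console.log(findOutliers(runs)); // → [ { iteration: 8, time: 1884 } ]
```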


On Firefox mobile I got a score of 3.

So while Firefox is now super fast the performance of webapps might still be very bad on mobile.

And I think this applies to all modern browsers: they are fast at rendering very slow webapps and websites.


Probably hardware dependent too as I get 8.11 on my fairly old Samsung S21. My newer laptop gets 14.9 so I'm not sure whether I agree performance is "very bad" on mobile -- it may be less performant but that is surely to be expected given the hardware constraints on mobile. YMMV.


Got 7.89 on laptop with Core Ultra 155H in battery mode, 17.9 in AC mode. Zenfone 10 shows 7.75. All in Firefox.


If you’re on iOS, Apple gates Firefox from using JIT JS compilation which massively hinders performance.

E: I was wrong/extremely out-of-date - it does have JIT but relies on the Safari/Webkit implementation. In ancient versions of iOS, the WebView widget that third-party browsers were forced to use had JIT disabled, but that’s long since changed.


For this benchmark I get 12.0 in FF and 13.9 on safari. I’m glad it’s not so big of a gap as I already pay a penalty for not wanting to use safari on iOS (in terms of integration with iOS and usability from Apple’s artificial limitations on third party browsers)


It's more than that right? They have to use all of webkit. So it's pretty much a reskinned Safari in terms of layout/rendering/JS.


That’s not accurate. Firefox and all third-party WebKit apps get the same JIT as Safari.


Right, but expecting the same behaviour from "Firefox" on iOS as on desktop is just not going to happen, since they have no control over the core engine. It's why, in general using iOS devices for cross-browser testing is pretty useless.


This is a fair point, though it is possible for app-level things that the browsers do to regress performance from the baseline pure engine level.

In this case, I think the 3 score must be either very old/low-end Android hardware or a measurement error. I don’t think any iOS browser gets 3.x scores, on even remotely modern hardware.


10.9 on Win10 2017 Xeon @ 4.3GHz w/ 64GB. This instance of Ffx has had ~50 tabs (across 5 containers) open for a couple of weeks.

What do these values represent? I can guess the last. Unsure of the first 2.

     96.84 ±(5.0%) 4.87 ms


Approx 6.3 and 6.7 in chrome and Firefox respectively on my low-end Pixel 6a


Weird, I have the Google Pixel 8 Pro and get a score of 4.83 in Fennec and 6.61 in Vanadium (hardened Chromium fork).


11.6 on s23

Sounds like old hardware or some other issue


Are you on iOS or Android?


4.7 brave on a pixel 7


Why is the scale 0-140? My modern Windows 10 desktop using the latest Firefox gives 15.0/140 with no other programs running besides FF and Discord. Surely 15 is horrible in that context? I have one extension, uBlock Origin, allowed to run on the site by default.

I have never felt performance has ever lacked, outside of a few outlier sites (youtube, facebook, twitch). But those are tightly coupled with their (crappy) implementations.


15 is quite correct, definitely not horrible, so don't worry too much about that. It's not a rating, more a way to compare how browsers and hardware improve over time.

The scale goes up to 140 so that there's some space for software improvements as well as future hardware.


Ryzen 5600H + RX 5500M laptop, AC plugged in with power mode in Windows set to "Best performance", Windows 11 Home 23H2

Chromium (122.0.6261.112) without any extensions:16.7 ± 0.34

Brave Beta (122.0.6261.111): 12.8 ± 0.62

Brave Beta (122.0.6261.111) Private: 15.9 ± 0.58

Floorp (11.10.5 based on Firefox ESR 115): 7.40 ± 0.20

Floorp (11.10.5 based on Firefox ESR 115) Private: 7.91 ± 0.19

Librewolf (123.0-1): 8.41 ± 0.19

Librewolf (123.0-1) with uBlock Origin disabled: 8.86 ± 0.17


By far the biggest speed complaint I have about Firefox is not in everyday use, but whenever I restore a previously saved session - it basically stops reacting for a few minutes(!) before I can eventually use it again. I suppose it's due to the antivirus interfering with some kind of memory image, but whatever it is, it's so annoying.


> but whenever I restore a previously saved session - it basically stops reacting for a few minutes(!)

I haven't run into anything like this; you may be an outlier. I interact with ~7 Firefox instances (Win) each week. Each has diff configs and plugins.


Yeah, I've used on linux/win/mac restoring sometimes 20+tabs and never had that issue.


If anyone was going to, it'd probably be me and my ~300 tabs, but I haven't run into this, either. My phone currently has 435 tabs open in Firefox but it's just as responsive as ever.


Can you please file a bug in https://bugzilla.mozilla.org/enter_bug.cgi?product=Core&comp..., and fill in some of the information in the template? Thanks!


I would if I didn't have to first create an account. Cross-logging in with an existing account from a different website isn't an option for me either, unfortunately.


Are you a tab hoarder by any chance? I get 2-3s of lag when restarting with a few dozen tabs open, but nowhere near "a few minutes".


I've probably got 60+ tabs open at any given point in time, spread out over multiple windows.


I've the same problem on Linux with no antivirus installed


Posting just because nobody else has posted one this high:

Mac mini M2, macOS 14.4, Chrome 122: 30.2 ± 1.6

Good ol' Apple Silicon.


I got 35.7 ± 2.3 on a MacBook Pro M3, Chrome 122.


33.3 with an Intel 14900k and Ubuntu. And people will sit there and try to tell you that computers aren't getting faster any more.


Got "Infinity" after testing my Firefox Dev Edition 123b9. Is this because of my FF config because my browser is perhaps blocking something (e.g. canvas, fingerprint, etc) or any result north of 140 is considered infinity?


>because my browser is perhaps blocking something (e.g. canvas, fingerprint, etc) or any result north of 140 is considered infinity?

I vaguely remember there's a privacy protection that rounds timer information, e.g. all timers get rounded to the nearest 100ms. If you have a bunch of tests that take less than 100ms to complete, those tests might seemingly complete at the same time they start, which gives them an infinite score.
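The arithmetic behind an "Infinity" result can be sketched like this (a hypothetical simulation, not Speedometer's actual scoring code; `quantize` and `scoreFor` are made-up names):

```javascript
// Toy illustration of how coarse (privacy-rounded) timers can yield Infinity.
// If timestamps only come in multiples of some granularity, a test faster
// than that granularity appears to take 0 ms, and 1000 / 0 is Infinity in JS.

function quantize(ms, granularityMs) {
  // Simulates a timer that only reports multiples of `granularityMs`.
  return Math.floor(ms / granularityMs) * granularityMs;
}

function scoreFor(actualDurationMs, granularityMs) {
  const start = quantize(0, granularityMs);
  const end = quantize(actualDurationMs, granularityMs);
  const measured = end - start; // 0 when the test beats the granularity
  return 1000 / measured;       // arbitrary "runs per second"-style score
}

console.log(scoreFor(250, 100)); // 5 — coarse timer, but still measurable
console.log(scoreFor(60, 100));  // Infinity — test finished "instantly"
```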


Do you see anything (errors or something else) in the web console?


Please file a bug there if necessary => https://github.com/WebKit/Speedometer/issues/new :-)


I also got Infinity, on Firefox 123.0.1, with a bunch of privacy extensions.

There is only one warning in the Console: Ignoring ‘preventDefault()’ call on event of type ‘wheel’ from a listener registered as ‘passive’. react-dom.production.min.js:29:112


HP EliteBook 850 G8 Notebook PC, i7-1185G7 @ 3.00GHz, Windows, plugged in:

Firefox 123.0: 12.1 ± 0.62

Edge 122: 14.8 ± 0.68

Google Pixel 7a, GrapheneOS, on battery:

Fennec 123 with Dark Reader: 2.91 ± 0.066

Fennec 123 without Dark Reader: 5.28 ± 0.089

Vanadium 122: 6.96 ± 0.39


And yet no one cared to make the site's CSS responsive. Try reading it on a phone.

This kind of thing drives me nuts. And it's just text; it's not like it's rocket science.


Edge results on my laptop: 13 (all extensions disabled); Private: 20


Crashes, or gets killed, by Safari on iOS 17.3 on iPhone 15 Pro.


Try it with a Private tab. Mine did the same until I did that, after which I got 16.5 on an iPhone 13 Pro Max.


Nice, an update well received by enthusiasts.


iPhone 15 Pro Max; iOS 17.4; Safari: 26.2 in a private tab

Pixel 8 Pro; Android 14 QPR2; Chrome: 9.47 in an incognito tab


I was debating between a Galaxy S24 and a Pixel for an Android test device. 13.8 vs 9.47 on a Pixel is not a good result for Google's custom chip.


No, it's definitely not. I was pretty surprised...


In iOS, all browsers (at the moment) use Safari under the hood. Imagine my surprise to see these noticeable differences in some of them.

Vivaldi: 12.2

Brave: 18.1

Safari: 18.2

Chrome: 19

Firefox Focus: 21


This is because browsers on iOS do not 'use Safari' but use WebKit, and there is a huge amount of browser app software built on top of it, which contributes to variance in benchmarks (and also makes these ultimately very different browsers).


13.1 on MBA 15" M2.


In descending order....

MacBook Pro, M2 Pro, 16GB, plugged in, external display: Safari=31.2 Chrome=29.4

iPhone 12 mini, plugged in: Safari=19.4

HP Z2 mini (i7): Edge=15.9

Panasonic Toughbook CF19 (win 10): Edge=4.7 Chrome=5.6

Galaxy Tab S5e: Chrome=2.2

Oculus Quest 2: browser crashed

Tizen TV: displayed, wouldn't run

Nintendo 2DS: displayed, no css, wouldn't run


> Nintendo 2DS

You brave fool. I love that you tried it.


It was the tv that took the most effort.


Huh. I would have thought that was easier for some reason.


On my machine, Firefox got 12.3, and Edge (Chromium) got 12.8. I don't believe that the performance characteristics of these two are that close unless I'm missing something. For example, audio players on Edge stutter a lot while Firefox plays them smoothly. An example is: https://deepsid.chordian.net/ I believe Edge is slower not because Chromium is slow, but because of Microsoft's overreaching efforts on energy conservation.

Machine: AMD 5950X, 32GB RAM, 3080 GPU, Windows 11 Pro 23H2

Firefox v123.0.1 Edge v122.0.2365.80

EDIT: Interesting, I tried both in private windows later to bypass extensions, and Edge got 10.8 this time, Firefox got 16.9. I now have more questions.


I ran v3 on my machine while listening to "Voyage" by "Yahel & Eyal Barkan" in Chrome and doing a bunch of background stuff. The background stuff took up about 20% of my CPU. While testing, the music played perfectly without any buffer underrun pops.

Ran it in each browser one at a time while the music played in Chrome.

Chrome 122.0.6261.112: 21.3 +/- 0.64

Edge 122.0.2365.80: 20.1 +/- 0.78

Firefox 121.0.1: 18.5 +/- 0.75

Machine specs: Intel Core i9 12900k (24 core) / 64GB RAM / 3080Ti / Windows 11 Pro 23H2

After finishing the tests, I played that same song on Firefox and Edge. Both Firefox and Edge played it perfectly.

> audio players on Edge stutter a lot while Firefox plays them smoothly

I'm curious about what could be leading to this inconsistency as I use Web Audio for a number of projects, so I have a bit of a vested interest. It is notoriously easy to do WebAudio wrong or to do just a bit too much computation which leads to buffer underruns (pops). It also may have a lot to do with specific tracks on DeepSID, could you share some tracks that perform inconsistently for you?


Any track plays completely garbage on Edge. The beat skips, the sound cuts off. This one for instance: https://deepsid.chordian.net/?file=/MUSICIANS/F/Fate/World_R...

I think Edge's problems come from some kind of power efficiency setting, not necessarily performance-related. (Like a low-granularity JS timer, or something like that)

EDIT: Turning off all efficiency settings on Edge didn't make any difference: 11.0


Do you have your Windows power plan on Performance? Maybe thats the difference?

Definitely would be a bummer if web audio didn't work reliably on the default power plan (which is Balanced iirc)


It’s on High Performance, yes.


Other interesting notes:

- Firefox on WSL2 on the same machine gets 10.1, even though rendering must be horribly slow since it goes through the Remote Desktop layer.

- Firefox gets a 25% speed boost when I disable the 1Password extension. Disabling it on Edge makes no difference.


Audio wouldn't be going via the DOM or JS, right? I know that Firefox has its own codec support and that Safari on Mac uses different AV stuff than other browsers.

I don't think that AV stuff would be tested by Speedometer.


It's more than just channeling audio files to the browser's codecs, though. A SID player, for example, runs a 6502 CPU emulator and a SID chip emulator in the browser using JS. So it's problematic in such scenarios. Otherwise, I can watch Youtube videos or listen to Internet radios without issues.


> don't think that AV stuff would be tested by Speedometer

It probably isn't, but fwiw yes web audio is controlled by JavaScript. Doing it right means using web audio worklets, which is a special purpose JS context that has no access to your main page context.
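The underrun mechanics behind those "pops" can be modeled outside the browser (a hedged sketch with illustrative numbers; `countUnderruns` is a made-up helper, not a Web Audio API):

```javascript
// Toy model of audio buffer underruns: the audio hardware asks for a block of
// samples every `quantumMs`; if producing the block takes longer than that
// (e.g. main-thread JS busy with emulation work), the output glitches ("pops").
// This is why moving the producer into an AudioWorklet, off the main thread,
// helps. Numbers here are illustrative, not measured from Edge or Firefox.

function countUnderruns(callbackCostsMs, quantumMs) {
  // Each entry is how long one render callback actually took.
  return callbackCostsMs.filter((cost) => cost > quantumMs).length;
}

const quantumMs = (128 / 44100) * 1000; // ~2.9 ms per 128-frame quantum at 44.1 kHz

// A well-behaved worklet: every callback finishes well inside the deadline.
console.log(countUnderruns([0.5, 0.6, 0.4, 0.5], quantumMs)); // 0

// A throttled main thread: some callbacks blow the deadline → audible pops.
console.log(countUnderruns([0.5, 12.0, 0.6, 9.0], quantumMs)); // 2
```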


Some of my results:

Desktop Firefox: 25

Desktop Chrome: 26

Laptop Firefox: 16

Laptop Chrome: 20

Laptop Safari: 21

Phone Firefox: 12

Phone Chrome: 10

---

Desktop: 5900X, 3090, Linux

Laptop: M1 Pro 14"

Phone: S24 Ultra

Ran all tests in private window to avoid extensions, and gave a minute to cool between tests. Laptop/phone was plugged in.


M1 MacBook Air:

Safari: 24.1

Firefox: 21.7

Extensions can really slow things down:

Safari w/ Ghostery: 6.83

Safari w/ AdBlock: 13.0


chrome 27.4

firefox 26.3

desktop: 7800X3D, 2060, Linux


I tested this with Firefox stable release and Brave stable release, 3 runs on each. Same exact extensions across both.

Highest scores across tests: Firefox 6.34 ± 0.31; Brave 11.3 ± 0.37, on a Ryzen 9 7940HS + RTX 3060 mobile.

Which really sucks, since I much prefer Firefox, but this past week I've been trying out Brave and it's noticeably faster and smoother to me. Even with the reduced speed, I'm still swayed toward Firefox for the customization you can achieve with the userChrome.css file.


The 'same' extensions on each browser control for your experience, but not for the browsers' performance:

The browsers have the same or very similar APIs for the extensions but that is just the interface; each browser executes the extension's instructions differently (a lot or a little - I don't know the browsers' code). The same extension will impact Brave's performance differently than it will impact Firefox's. In other words, the same extension is not, in this sense, the 'same' on each browser.

In this sense, an extension is part of the user experience, like a website. The Speedometer test suite doesn't include those extensions (I assume) and that is the experience the browsers are optimized for.

The parent's test doesn't represent that; it does represent their desired experience, of course.


What's your score in Firefox with extensions disabled?


Firefox, extensions disabled: 16.8 ± 0.59; Brave, extensions disabled: 19.0 ± 0.88

Interesting because I have only 5 extensions. The heaviest extension seems to be Dark Reader which causes over 5 point changes.


Yeah, Dark Reader is known to tank Firefox performance.

Not sure if it's a coding issue in the Firefox version of Dark Reader, or if it's hitting some slow path in Firefox itself.


Even if you have the exact same extensions the fact that you have an old Firefox profile may be hindering the results. Try comparing with a fresh Firefox profile with the same extensions.


The Firefox profile I'm using is no more than 3 weeks old. Fresh install of Windows was done around that time.


We should stop "speed-shaming" browsers and focus on websites, which negate all performance improvements made by browser developers by adding more useless features.


I'm more worried about Firefox's stability these days...


It helps if you file bugs in Bugzilla. You can link to them here, there's a good chance a developer will find them.


[flagged]


When that metric is "performance in real workloads", I can't imagine it ever becoming irrelevant. Just look at their new tests:

> In particular, we added new tests that simulate rendering canvas and SVG charts (React Stockcharts, Chart.js, Perf Dashboard, and Observable Plot), code editing (CodeMirror), WYSIWYG editing (TipTap), and reading news sites (Next.js and Nuxt.js).

> We’ve also improved the TodoMVC tests: updating the code to adapt to the most common versions of the most popular frameworks based on data from the HTTP Archive. The following frameworks and libraries are included: Angular, Backbone, jQuery, Lit, Preact, React, React+Redux, Svelte, and Vue; along with vanilla JavaScript implementations targeting ES5 and ES6, and a Web Components version. We also introduced more complex versions of these tests which are embedded into a bigger DOM tree with many complex CSS rules that more closely emulate the page weight and structure from popular webapps today.

Improving these benchmark results will at least partially make those libraries faster in the real world, and most likely also many additional libraries and workloads.


The article even links to real-world performance measurements that are completely separate from the benchmark. I have no idea what the purpose of the OP's comment was. I guess they just wanted to blurt out the first cynical thing that came to mind. Awesome contribution.


I love the fact that an honest question is met with such hostility. I knew there was a quote, but was unable to think of it well enough for a proper search. It is easier to find an answer from other people with just a shard of detail that search engine will not find as there's no SEO for the broken fragment.

I'm so happy to see this place is alive and well with the attitude to support a curious mind. What a tosser


I've probably spent too much time on the internet, then, because I definitely wouldn't have interpreted your original post as an honest question without having seen the clarification in your followup comment. Probably a defense mechanism built from past pain caused by my assuming good faith and then being ridiculed by the non-honest-question-asker.

But with that nastiness out of the way - I looked back at your original question and thought "seems like the kind of query Kagi would eat for breakfast". Here's what it responded to your question: https://kagi.com/search?q=What%27s+the+quote+about+optimizin...

Specifically, the "what" at the start and the "?" at the end triggered the LLM-powered quick answer at the top (and it passes the smell test for correctness).

To Google's credit, it also returns reasonable results for this question.


So, why are you bringing up the quote in this context, then?


Because it's directly what spurred the thought? What, I'm supposed to post it randomly?


Probably Goodhart's law: when a measure becomes a target, it ceases to be a good measure.

To my mind, it means that once the metric becomes the main focus, it's easy to forget the original goal and even to work against it. That is not the case here.

https://en.m.wikipedia.org/wiki/Goodhart's_law



