Please, using a few more CSS selectors isn't going to kill a site. In fact, it isn't even close to being a bottleneck. We are talking milliseconds. If you want speed gains, properly compress your images!
I've known about how selectors are processed for some time; I discovered it while looking into optimization strategies for JavaScript and figured I should look into how CSS is processed as well. I made a variety of tweaks, including finally removing nearly all the CSS, and (somewhat disappointingly) saw no visible change in how fast the browser rendered the page. Not even on IE6/7. Anecdotal evidence only, but to Jim72's point, I wouldn't waste time on it.
Wait, are you saying that a computer capable of executing millions of instructions per second doesn't take very long to do a few comparisons on a dozen or so elements? It takes me a long time to do by hand, so I figured the computer would be really slow too!
The Core i7 can issue 4 instructions and execute 6 micro-ops per cycle. As long as your code is not one huge dependency chain, you are mostly reading data out of L1 cache, and you don't use complex instructions like division, you can expect to get at least 2 instructions per clock. I have profiled real code (crypto stuff) that got a tad over 3 instructions per cycle. That's 10 billion instructions a second on the processor it ran on.
Most instructions have not taken over 2 cycles on x86 since P4. This is a mesofact, update your worldview. :)
Well, it doesn't happen in real life, but an Intel i7 is capable of issuing 4 instructions per clock cycle (sometimes 5, with a little microcode cheating). A 3.2 GHz quad-core could issue 51.2 billion instructions per second.
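For anyone who wants to check the arithmetic in the two comments above, here's a trivial sketch. The 3.2 GHz quad-core figures mirror the comments; the ~3.33 GHz clock for the crypto example is my assumption to make the "10 billion" number work, not something stated above:

```ts
// Peak issue rate for a 3.2 GHz quad-core issuing 4 instructions/cycle:
const clockHz = 3.2e9;
const cores = 4;
const issueWidth = 4;
console.log(clockHz * cores * issueWidth); // 51,200,000,000 -> "51.2 billion"

// The profiled crypto code: ~3 instructions/cycle on a single core.
// At an assumed ~3.33 GHz that is ~10 billion instructions a second.
console.log(3 * 3.33e9); // ~9,990,000,000
```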
Presumably any modern processor uses some form of pipelining, which I think would take instructions back up into the billions. (The Pentium 4, if I recall correctly, had a 20+ stage pipeline.)
Exactly. I have an obsession with "optimized" markup, which means avoiding unnecessary elements, ids, and classes. Let's not forget that all this stuff sits in the markup and gets downloaded on each request. Even if that doesn't add up to much, I still want to keep my code "clean". There is one particular site that inspires me: http://camendesign.com/ — take a look, not a single id or class.
He did this, however, at the expense of convoluted selectors, pseudo-elements, and pattern matching. Personally, I don't find his implementation very clean. I guess it's better to have the mess in your closet than out in the open, though.
It's not every day you can experience all the astonishment of overturning a fundamental assumption without any of the pain of forced changes. I enjoyed that conversation. (I'm Alex K, btw.)
Thanks for all the comments, everyone! Very cool to have that post picked up by Hacker News.
Most of the time this certainly falls under pointless micro-optimisation, but if you are building a complex web application with CPU-intensive operations, it can be really handy to know how CSS is applied (and how reflow/redraw works) to help the user experience.
Can anyone point to where this is required by the CSS specification? Or is this an implementation detail? The reality may be that all the browsers do it this way, but why couldn't someone add some more smarts and choose selectors based on selectivity, just as a SQL query planner would?
Surely the intelligent thing for the CSS engine is to look for the one and only #alert, rather than work backwards from all the <p>s? (I am assuming that nodes with ids have their own fast index.)
It's a little trickier when you do it left to right.
Suppose you have a DOM tree like this:
div#alert
  - p, p, p
  - div#alert
      - p [lots of children]
  - p
      - div#alert [lots of children]
[right to left]
In order to find all Ps with an #alert ancestor, you first get a set of all paragraph elements. Most likely a list of all paragraphs is kept in memory, so it can be enumerated quickly. Since DOM trees are generally well balanced, the number of ancestors of any one paragraph is going to be roughly O(log2(|DOMSIZE|)), and the walk happens in C, so this is basically nothing (compared to how grossly inefficient JavaScript is). Total complexity: O(|P NODES| * log2(|DOMSIZE|)). Memory complexity: one linked list the size of the result set.
[left to right]
If we were to look at it from left to right, it gets trickier. You have to find all #alert nodes (easy and fast), but then you have to take all children of all #alert nodes. Every subtree of the DOM can be a significant fraction of the entire DOM tree, or perhaps the _entire_ DOM tree. Not only that, but you CANNOT filter for just the "p" children, because then you'd need lookup tables for every level of the tree (hugely inefficient). So you're left with one solution: enumerate all children (and a DOM node can easily have a few thousand child nodes). But wait, one #alert contains another #alert. We're already looking at O(|DOMSIZE| * |DOMSIZE|), and now we have to consider uniqueness as well. We cannot allocate a hash map to keep track of uniqueness, because that would completely fragment your memory space; you'd end up allocating and deallocating hash maps for every selector call. So you end up traversing the result-set linked list for every node you add, to see if it isn't already in there. This, needless to say, is another O(N^2) detour. Total worst-case complexity for M selectors:
O( M * |DOMSIZE|^2 + M * |RESULTSET|^2 ).
Memory complexity: M + 1 linked lists.
The difference becomes more pronounced as selector complexity increases:
#uniq p p span b
(R2L) find all "b" nodes, append those with matching parents to linked list. Return linked list
(L2R) find all #uniq nodes, enumerate all "p" children. Make sure all "p" children are unique. Take all children of these "p" nodes, filter for "p" children, make sure result set is unique. Take all children of these "p" nodes, filter for "span", make sure.... and so it goes.
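To make the R2L walk concrete, here's a minimal TypeScript sketch of matching a pure descendant-combinator chain like "#uniq p span" right to left. The names and the reduced tag/id selector test are mine, not any engine's actual code:

```ts
type SimpleSelector = { tag?: string; id?: string };

function matchesSimple(el: Element, s: SimpleSelector): boolean {
  if (s.tag !== undefined && el.tagName.toLowerCase() !== s.tag) return false;
  if (s.id !== undefined && el.id !== s.id) return false;
  return true;
}

// "chain" is ordered left to right, e.g. [{id:"uniq"}, {tag:"p"}, {tag:"span"}].
function matchSelector(doc: Document, chain: SimpleSelector[]): Element[] {
  const key = chain[chain.length - 1]; // the rightmost ("key") selector
  const result: Element[] = [];
  // Start from candidates for the key selector only...
  for (const el of Array.from(doc.getElementsByTagName(key.tag ?? "*"))) {
    if (!matchesSimple(el, key)) continue;
    // ...then walk UP, consuming the remaining selectors right to left.
    // Each node has exactly one parent, so this path is bounded by the
    // tree depth -- no subtree enumeration, no duplicates, no hash map.
    let i = chain.length - 2;
    for (let p = el.parentElement; p !== null && i >= 0; p = p.parentElement) {
      if (matchesSimple(p, chain[i])) i--;
    }
    if (i < 0) result.push(el); // every left-hand selector found an ancestor
  }
  return result;
}
```

Note that each candidate is visited exactly once, so the result list is duplicate-free by construction, which is exactly the uniqueness headache the L2R walkthrough above runs into.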
Your examples aren't valid HTML, so that approach is fine for quirks mode. But what about valid pages that declare a DOCTYPE? Shouldn't those be parsed from left to right, so the first matching ID that is encountered wins (with any other matching IDs considered errors and disregarded)?
OT, this was one of the problems XHTML was supposed to solve, since it was originally assumed by many that pages that weren't well-formed XML would not be rendered at all, forcing developers to fix their code. This didn't happen, and developers are still at the mercy of how individual browsers implement quirks mode.
You have to write R2L parsing logic anyway, because it's what you want most of the time (e.g. "div span a.quote"). Yes, in the special case where #uniq identifiers are used, you know the page to be valid (X)HTML, and you know you get a substantial speedup by filtering on the #uniq node first, L2R parsing is preferable. But how realistic is that, really?
Isn't it just easier to teach web developers to write their DOM selectors in a specific way? The R2L approach (a) is easy to understand, (b) has predictable (and stable) performance, (c) doesn't malloc, and (d) is easy to implement. I see this as a simple case of "good enough".
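For what it's worth, the special case discussed above is easy to sketch. Here's a hypothetical L2R fast path; queryWithIdFastPath is my own helper name, a sketch of the idea rather than how any engine actually implements it:

```ts
// Hypothetical fast path for selectors shaped like "#id rest",
// using only standard DOM APIs.
function queryWithIdFastPath(doc: Document, id: string, rest: string): Element[] {
  const scope = doc.getElementById(id); // O(1): browsers keep an id index
  if (scope === null) return [];        // no #id node, so nothing can match
  // Only scope's subtree can match "#id rest", so search just that subtree
  // instead of filtering every candidate in the whole document.
  return Array.from(scope.querySelectorAll(rest));
}

// e.g. queryWithIdFastPath(document, "alert", "p") for "#alert p"
```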
id attributes are unique in well-formed pages, but then a lot of HTML is not well formed. Browsers do maintain a hashmap (or equivalent) of id to node, though, so in that case you're right, that should be faster. Maybe the overhead of parsing and choosing between multiple code paths is slower than just walking the tree?
A DOM node can have multiple children but only one parent. To go up the tree and check for a hit you simply call (in pseudocode) node.parent.isOK?.parent.isOK? etc. To go down the tree you have to not only choose your search strategy (depth-first, breadth-first) but also check a whole lot more nodes in the hope that you’ll find what you want.
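That asymmetry is easy to see in code. A rough sketch (function names mine; the predicates stand in for the simple-selector test):

```ts
// UP: follow the single parent pointer; at most tree-depth checks.
function hasAncestorMatching(el: Element, pred: (e: Element) => boolean): boolean {
  for (let p = el.parentElement; p !== null; p = p.parentElement) {
    if (pred(p)) return true;
  }
  return false;
}

// DOWN: there is no single path to follow; a depth-first search must
// visit every node in the subtree, which can be most of the document.
function findDescendantsMatching(
  el: Element,
  pred: (e: Element) => boolean,
  out: Element[] = []
): Element[] {
  for (const child of Array.from(el.children)) {
    if (pred(child)) out.push(child);
    findDescendantsMatching(child, pred, out);
  }
  return out;
}
```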
From what I can tell, it looks like OE nouns could have either "strong" or "weak" declensions, and this is different from the dual form. Someone on another forum claims that "brethren" is another example.
May well be true; my source is an old State of the Onion talk from Larry Wall, who, although a linguist, is probably not the most qualified in the world :)
Take Alex's, Alexi, Alexim, Alexen, and boxen out back to be euthanized along with the awful virii and anything else that would lead Cory Doctorow to chortle while typing.
Hackanonical, however, is without flaw. It should be in the name of this site!
John Resig (of jQuery fame) talks about why this approach was taken in the design of the Sizzle engine that powers jQuery. It's towards the end of this talk (which is quite good, by the way). I would find the exact time, but my work internet thinks it's inappropriate content and won't let me watch the video :(
It's Chrome-only but you can sometimes get cross-browser insights. It's also a lot easier to conduct a cursory investigation using this tool than it is to start changing your selectors.
FWIW, the post is a slightly cleaned-up version of an actual conversation I had with my husband. (I'm K, he's M.) It's not a format I'm likely to reuse (or have used before), but it fit the occasion and captured very well our surprise at learning how browsers process CSS selectors.
Plus, it's always fun to try out new formats, even -- or maybe especially -- if they don't stick.
11:44:33 AM Alex K: that makes it sound like it may not be worth huge optimization?
11:44:46 AM Alex M: no, not really
11:44:56 AM Alex M: I’m a purist though
11:45:18 AM Alex M: so this makes my world
I don't want to read several pages of badly written chat English to find the conclusion. Could someone be so kind as to explain what the shocking truth is?
The "Two Alexes" are just discovering (Feb 2010) what Steve Souders explained in detail at http://www.stevesouders.com/blog/2009/06/18/simplifying-css-... [which is probably the site you want to visit on "How CSS strategies affect site performance"] (June 2009).
I don't do those things, but that's not because it's a bad method, but because I don't care. If there was some bug in some system call, or if I needed to investigate why hardware is misbehaving, of course I'd read the Linux source. That's one of the great things about open source software; you don't have to treat it like a black box. You can just look at the internals and see how they work.
(And I have proof that I actually do this: I had a misbehaving joystick, and the Linux fork on my GitHub contains the fix.)
Anyway, in conclusion: if you want to know what algorithm Firefox uses to process CSS selectors, just read the Firefox source code. If you don't give a damn, then don't. But if you care and you do nothing, that's just silly.