A big step towards Firefox generational and compacting GC (blog.mozilla.org)
139 points by cpeterso on Jan 20, 2014 | 29 comments



From Terrence Cole:

>Exact rooting is required for us to use GC algorithms that relocate objects in memory. It will allow us to implement compacting GC to save memory and generational GC to push our performance to the next level.

For a comprehensive overview of a good open source GC, check out Mono's description of theirs:

http://www.mono-project.com/Generational_GC

Browser GC is really tricky. The browser provides all sorts of complicated native services, like a flexible rendering engine (you know, HTML), recording audio, LocalStorage—really complex stuff implemented efficiently in C++ and with cross-platform support. All of these native objects can be accessed/instantiated from Javascript. Hence the GC has to be especially well tuned, on top of the security requirements that come with untrusted code driving trusted code in general. It's a big undertaking for sure.


Please excuse my naivety, but I have only recently gotten interested in compilers, memory management, etc. [1]

Can someone please explain to me why GC isn't implemented as a library that can be used by any runtime/VM? We recently had a generational GC land in Ruby 2.1, and a few languages out there specifically target the JVM because of its GC story. Wouldn't everyone benefit from a common code-base with rock-solid GC algorithms & implementations? Analogous to crypto libraries.

[1] After getting sick of the fact that all the "new age" languages use shit-loads of resources for, essentially, achieving trivialities. We were able to do the same things using 386 PCs with a few megs of RAM, for which, today, we need 2 GHz multi-core CPUs and 8 GB of RAM.

PS: Not trying to downplay the work done in bringing better memory management to Firefox. I would really appreciate Firefox not using up 1.3GB of memory -- that was one of the reasons I had to fork out INR 5,000 for an extra 8GB of RAM recently.


The hard part is not the high-level algorithm - most of the common ones can be prototyped for toy systems in a trivial number of lines.

The hard part is system interactions. E.g. the GC needs a way to identify all the "roots" (references from outside the heap, such as registers, the stack, or pointers that may have been retained by external libraries etc.).

It also needs a way to identify which parts of an object are pointers, and this varies with the language. "Generic" GCs like Boehm's solve this by assuming you've done nothing to obscure the pointers, and then treating everything that could be a pointer into the heap as one. But this fails miserably for any language implementation that uses bits for type tagging, for example (see the sketch below), and it retains too much memory in many other situations because it isn't precisely determining which fields are pointers.
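
To make that concrete, here is a minimal sketch of NaN-box-style type tagging (the layout and tag values are illustrative, not SpiderMonkey's or any particular engine's): the word actually stored in memory is not the raw object address, so a conservative scanner comparing words against heap addresses will miss the reference.

    #include <cassert>
    #include <cstdint>

    struct Object { int field; };              // stands in for a heap-allocated JS object

    constexpr uint64_t TAG_SHIFT    = 48;
    constexpr uint64_t TAG_OBJECT   = 0xFFFCull << TAG_SHIFT;   // hypothetical tag pattern
    constexpr uint64_t PAYLOAD_MASK = (1ull << TAG_SHIFT) - 1;

    struct Value {
        uint64_t bits;

        static Value fromObject(Object* obj) {
            // The stored word is NOT the raw address: tag bits are OR'd into it.
            return Value{TAG_OBJECT | reinterpret_cast<uint64_t>(obj)};
        }
        bool    isObject() const { return (bits & ~PAYLOAD_MASK) == TAG_OBJECT; }
        Object* toObject() const { return reinterpret_cast<Object*>(bits & PAYLOAD_MASK); }
    };

    int main() {
        Object* obj = new Object{42};
        Value v = Value::fromObject(obj);

        // A conservative scanner that looks for raw heap addresses in memory
        // will not recognize v.bits as a reference to obj:
        assert(v.bits != reinterpret_cast<uint64_t>(obj));
        assert(v.isObject() && v.toObject()->field == 42);
    }

A tagging scheme like this is exactly why precise information about which words are pointers has to come from the language implementation itself rather than from a generic library.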

Then you have issues like threading models. Threading and GC are intimately linked: if you don't block all threads during marking or moving/collection, you have all kinds of additional exciting issues to deal with to ensure you don't break things.
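
One common way to get every thread stopped at a well-defined point is a safe-point poll that the compiler inserts at loop back-edges and call sites. A minimal sketch, assuming a simple flag-plus-condition-variable design (not any particular engine's implementation):

    #include <atomic>
    #include <condition_variable>
    #include <mutex>

    std::atomic<bool> gcRequested{false};
    std::mutex gcLock;
    std::condition_variable worldResumed;

    // The compiler emits a call to this (or an inlined equivalent) often
    // enough that every thread reaches one quickly when a GC is requested.
    void safepointPoll() {
        if (!gcRequested.load(std::memory_order_relaxed))
            return;                                   // fast path: nearly free
        std::unique_lock<std::mutex> lock(gcLock);
        // Here the thread would publish its stack and register roots so the
        // collector can scan them, then wait until the collection finishes.
        worldResumed.wait(lock, [] { return !gcRequested.load(); });
    }

The polls have to be generated wherever the compiler decides, which is one reason a drop-in GC library can't easily do this on its own.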


Thank you for the replies. A follow-up question: how does any GC/runtime/VM know which parts of memory are actual data as opposed to pointers to other parts of memory? Some sort of global pointer table, perhaps? If so, why can't the hypothetical GC lib define an API to share said pointer table?


> pointers that may have been retained by external libraries

Oh, if only this were easy...


There are several issues in implementing GC as a library. The first thing a concurrent GC must do to make progress is identify the stack roots. This involves stopping all the threads and reading their stacks and register values. To stop all the threads safely you have to generate safe points in the code, which requires tight integration with the compiler.

The second issue is that in a concurrent mark-sweep collector for a language with mutable data (so everything except Haskell), you must put write barriers on all your writes. If you have object A that has been scanned and object B that hasn't been scanned, and you make object A hold the only reference to object B, the write barrier will make sure that B is known to be referenced and that the things B references get scanned (a sketch of such a barrier is below). Emitting write barriers, as well as implementing them, also requires tight coupling between the compiler and the GC. Write barriers can alternatively be implemented with memory protection and signal handlers, but usually with a performance penalty.
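
A minimal sketch of that kind of barrier (a Dijkstra-style insertion barrier for tri-color incremental/concurrent marking; the names and the single-field object are illustrative only):

    #include <vector>

    enum class Color { White, Grey, Black };   // tri-color marking state

    struct GCObject {
        Color     color = Color::White;
        GCObject* child = nullptr;             // one pointer field, for illustration
    };

    std::vector<GCObject*> markStack;          // grey objects still to be scanned

    void shade(GCObject* obj) {
        if (obj && obj->color == Color::White) {
            obj->color = Color::Grey;          // keep it visible to the marker
            markStack.push_back(obj);
        }
    }

    // The compiler must emit a call to this (or an inlined equivalent) for
    // every pointer store while marking is in progress.
    void writeChild(GCObject* holder, GCObject* newValue) {
        shade(newValue);                       // barrier: the marker will still visit newValue
        holder->child = newValue;              // the actual store
    }

Shading the new target unconditionally is the simplest correct version; real collectors add filters to cut the cost, but the compiler/GC coupling remains.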

Generational GC introduces more problems. To be able to scan the young generation in isolation, you must either know that there are no pointers from the old generation into the young generation, or know precisely which objects in the young generation are referenced from the old generation. The first is the approach taken by Haskell, but unfortunately languages with mutable data can't take advantage of that, so they must use the second approach, which is implemented with more write barriers (and more compiler-GC coupling); a sketch follows.
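
A sketch of that second kind of barrier, assuming a simple remembered-set design (real engines often use card marking or store buffers instead):

    #include <unordered_set>

    struct GCObject {
        bool      inNursery;                   // young (nursery) vs. tenured (old) space
        GCObject* child = nullptr;
    };

    // Slots in old objects that currently point at young objects.
    std::unordered_set<GCObject**> rememberedSet;

    void writeChild(GCObject* holder, GCObject* newValue) {
        // Record only the interesting case: an old object gaining a pointer
        // to a young one.
        if (!holder->inNursery && newValue && newValue->inNursery)
            rememberedSet.insert(&holder->child);
        holder->child = newValue;
    }

A minor GC then scans the usual roots plus every slot in the remembered set; nothing else in the old generation needs to be visited.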

Copying collectors are generally either stop-the-world or generational/thread-local, and they run in that thread while collecting. Stop-the-world again requires safe points, but thread-local doesn't. You still can't really do the thread-local GC in a library, because you have to be sure that the compiler isn't keeping pointers only in registers while you scan, and you have to emit write barriers to know what's referenced in your thread-local generation from older generations.
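
For reference, the core of a copying (semispace/Cheney-style) collector is small. A minimal sketch of the copy-and-forward step, assuming object headers carry a size field (an illustrative layout, not any engine's):

    #include <cstddef>
    #include <cstring>

    struct Header {
        Header* forward = nullptr;   // set once the object has been copied
        size_t  size    = 0;         // total object size in bytes, including header
    };

    char*  toSpace;                  // assume this points at the to-space bump area
    size_t toOffset = 0;

    // Copy one object into to-space (or return its existing copy), leaving a
    // forwarding pointer behind so other references can be updated later.
    Header* copyObject(Header* obj) {
        if (obj->forward)
            return obj->forward;     // already copied: follow the forwarding pointer
        Header* newObj = reinterpret_cast<Header*>(toSpace + toOffset);
        std::memcpy(newObj, obj, obj->size);   // header plus payload that follows it
        toOffset    += obj->size;
        obj->forward = newObj;
        return newObj;
    }

This is also why every pointer the mutator holds, including ones the compiler has kept only in registers, must be found and updated: after the copy, the old address is garbage.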


There are some general GC libraries, e.g. Boehm's. However, there is a lot of detail specific to a language and its allocator, which means you may be able to get better performance with a tailored GC.


Wow, that must have been a lot of freaking work. Testing this must be a nightmare: if you have one native pointer left unrooted, the browser could essentially randomly crash.

Pretty cool stuff. I really want to see the performance changes the GC can get.


Work on generational GC has been proceeding in parallel with the exact rooting work. You can see what the current generational GC performance is on standard benchmarks like Octane by looking at the "Firefox (Ion, GGC)" line on http://arewefastyet.com/


Isn't that problem right in the domain of automated testing and static code analysis?


We do both, but it's still very tedious. After a while the dynamic analysis only caught false positives (something on the stack looked like a pointer). The static analysis had to be updated when new potentially hazardous cases became obvious. https://wiki.mozilla.org/Javascript:SpiderMonkey:ExactStackR...
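
For readers wondering what such a hazard looks like, here is a self-contained, hypothetical illustration (the names are made up, not the SpiderMonkey API): a raw pointer into the GC heap held live across an allocation that may run a moving GC.

    // Toy stand-ins for engine types. In a real engine, allocObject() may
    // trigger a garbage collection that relocates already-allocated objects.
    struct JSObject { int value = 0; };
    struct Context  { /* engine state lives here */ };

    JSObject* allocObject(Context* cx) {
        return new JSObject{};          // pretend this can collect and move things
    }

    JSObject* makePair(Context* cx) {
        JSObject* a = allocObject(cx);  // raw, unrooted pointer into the GC heap
        JSObject* b = allocObject(cx);  // HAZARD: if this call GCs and moves 'a',
                                        //         'a' is now a stale address
        a->value = b->value + 1;        // may read or write freed/moved memory
        return a;
    }

    // Exact rooting fixes this by keeping 'a' in a rooted slot the collector
    // knows about, so the pointer is updated if the object moves.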


You've got a typo in your URL there - missing 'n' in 'rooting'



No, Oilpan is different; it's a garbage collector for C++ objects, while exact rooting is for JS objects. Oilpan has more in common with the XPCOM cycle collector in purpose (although the approach is different), which Firefox has had for several years now but which WebKit has had no equivalent to.


Right, and on the flip side, V8 has had a generational, incremental JS collector for quite a while :)


For what it's worth, it looks like V8 has had incremental GC since 2011 ( http://blog.chromium.org/2011/11/game-changer-for-interactiv... , and generational GC for longer), while Firefox has had IGC since 2012 ( https://blog.mozilla.org/javascript/2012/08/28/incremental-g... ).


Can anyone comment on what this will mean for Firefox in rough terms? Less bloat in long-running processes? Less memory usage in general?


The eventual introduction of compacting and generational collection will:

Dramatically reduce the cost of GC for short-lived objects (to the point of producing something like 5-10x wins on some of the stupidly tuned microbenchmarks typically used to compare JS runtimes)

Reduce typical heap fragmentation (essential for long-running processes - imagine a facebook/gmail tab left open for days)

Potentially improve computational performance (compacting your GC heap will improve memory locality in some scenarios, which means the CPU cache can be used more efficiently, which will make JS code manipulating small objects much faster)

Reduce average GC pause duration and reduce the frequency of GC pauses

In summary, generational collection is a big win for the things users tend to care about, if you look at what the advantages mean for applications. Compacting is less important, but tends to be nice too.


Still waiting for e10s M2. I hope after this is finished and done they can move on to it.

Firefox really needs some help in its snappiness area.


> Firefox really needs some help in its snappiness area.

While I won't mind optimizations to the general browser runtime and everything it powers, I don't think Firefox "really needs" help in the snappiness department.

Firefox today is a unique browser with several unique qualities.

1. It's 100% founded on open source, unlike certain other supposedly-open browsers created by ad companies.

2. It's extremely extensible and is the only browser I feel I can make "mine".

3. It has a clear stance on the value of the open web. No DRM shall pass. All other big browser vendors are peddling one DRM or another, and they are doing so on the open web, with no shame, no regret. Using those browsers implicitly makes you support Web-DRM, and to me that is completely unacceptable.

4. Where it tries to evolve standards, it does so in a compatible way, and is open to community feedback. Compare this to, for instance, Google, which develops a product in house, client and server code, and in a quick turn releases it, as is, on the open web with some light docs saying "It's an open standard, anyone can use it. Just don't ask us to do things differently". See Google's own attempt at ActiveX-like technology (NaCl: "this website is incompatible with your CPU architecture") and how they've completely hijacked HTTP 2.0 with a new protocol which honours nothing of what made HTTP a success.

There are no browsers currently out there which give me any incentive to switch from Firefox, and doing so for a few milliseconds of improved loading times seems like complete madness.

From a freedom point of view all other browsers are compromised. Only Firefox remains.

What happened to the world of techies with principles? Did you throw them all out when you bought those laptops from that Cupertino based patent-troll and launched a Web 2.0 URL-shortener written in node.js compiled with emscripten?


It is a different group of people that are working on multiprocess Firefox. There was an update about the status a month ago: http://billmccloskey.wordpress.com/2013/12/05/multiprocess-f...

Making e10s work is a long process of converting existing code over to working when webpages are running in a separate process. For instance, if you do "copy", then the actual copying is occurring in the child process, but of course you want the parent process to know what chunk of text you copied so you can, say, paste it into the URL bar.


To be honest I haven't noticed it being sluggish, except when loading 10 tabs when waking from hibernate. These days Chrome chokes on the same number of tabs that FF deals with easily.

Chrome does sometimes win with the JS demos (mainly that 3D periodic table demo).


Chrome wins on psychology: it's uber-fast on start and with a few tabs, then it gets a bit slower than Firefox, which is relatively constant in its non-snappiness (it's not slow, but you notice tiny lags in the UI regularly).


To be honest I haven't noticed that it's non-snappy (I am on Nightly but use Regular at home). I'm a bit biased towards Mozilla, but I don't think it lacks much polish. Chrome dev tools are better, but the gap is closing in that area.


Well, my laptop is from 2007, so I may "suffer" more from the sum of tiny tiny delays that I don't perceive in Chrome. I also use Nightly, but I was under the impression that Stable releases were often a tiny bit faster (maybe some debug features disabled...), except for major feature upgrades like GC, JIT, etc.

PS: About devtools and overall catch-up, Mozilla is really impressive; they've been pushing hard and regularly on difficult topics (memory, JavaScript, standards, devtools). So much so that they suffer from under-publicity: I often realize their devtools could do X many weeks after it's been released.


Fanboys will try to tell you otherwise, but let's just admit Firefox is slower than Chrome. It's a fact, even though I have been a Firefox user for longer than a decade.

Try opening "Tab from Other devices": Firefox will stall. Try using Feedly and opening a link in a new tab: Feedly will suddenly respond very slowly. There are also other areas of page-loading performance where Blink/WebKit leads, not to mention Google's advanced preloading techniques, which seem to win as well, even though Firefox has something similar.

Yes, I understand that geeks and techies care about DRM. Yes, I understand about the implementation of web standards and freedom. But most users don't. Whenever I see users asking why not switch to Chrome on their computer, I just say no. I FORCED the usage of Firefox on 100s of computers in the office. If not, I bet users would choose Chrome any day.

So the point is, Mozilla needs to seriously think about its execution model. It needs to be efficient and not spread itself thin.


I admit I am a bit of a fanboy.

However, I recall trying FF on my laptop and it being slow, then switching to Chrome and it being super fast. When I updated to Nightly, the speed was about the same as Chrome's.

DRM and other things be damned, I can't notice a difference between Firefox and Chrome in performance. And I bought my computer 3 years ago.

I can't remember when Chrome subjectively outperformed Firefox. And the objective measures are usually about the same.


Maybe you used a new, fresh and bloat-free profile when running the Nightly?

Could perhaps explain the speed difference.


No, both were fresh installs without plugins. Nightly ran at significantly faster speed than Regular on that laptop.

On my regular comp the difference was not noticeable, even with all the plugins.



