Lazy deserialization (v8project.blogspot.com)
65 points by AndrewDucker on Feb 12, 2018 | 20 comments



A snapshot contains everything needed to fully initialize a new Isolate, including language constants (e.g., the undefined value), internal bytecode handlers used by the interpreter, built-in objects (e.g., String), and the functions installed on built-in objects (e.g., String.prototype.replace) together with their executable Code objects.

This sounds suspiciously like the Smalltalk image. The Smalltalk undefined value was involved in some paradoxes, so it couldn't be completely defined or instantiated declaratively.

I wonder if the (de)serialization mechanism used here could be re-targeted in a manner resembling the Parcel technology developed for VisualWorks Smalltalk? Basically, the runtime state of an application could be serialized into a "parcel," which could then be rapidly deserialized and more or less directly injected into the runtime image.


> This sounds suspiciously like the Smalltalk image.

Lars Bak was a major contributor to both V8 and Strongtalk.


> The Smalltalk undefined value was involved in some paradoxes

Could you elaborate on this? Or provide some references? This sounds interesting, but when I tried DDG and Google for "smalltalk undefined paradoxes" the results were not exactly satisfactory.


nil was the sole instance of the class UndefinedObject, which was a subclass of Object, whose superclass evaluated to nil.

Another paradox: every object has a class. A class is itself an instance of a class (its metaclass), that metaclass in turn has a class, and so on. It was turtles all the way down.
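The same regress is easy to observe in Python, whose object model borrowed the metaclass idea from Smalltalk; the chain terminates only because `type` is an instance of itself. A small illustration (Python used for familiarity; Smalltalk's actual `Metaclass` machinery differs in detail):

```python
# Every object has a class...
x = "hello"
print(type(x))              # <class 'str'>

# ...and a class is itself an instance of a class (its metaclass).
print(type(type(x)))        # <class 'type'>

# The regress bottoms out because type is an instance of itself:
print(type(type) is type)   # True

# Python's analogue of "whose superclass evaluated to nil":
# the root class has no superclass at all.
print(object.__bases__)     # ()
```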


hrmmm... sounds like a coinductive description https://en.wikipedia.org/wiki/Coinduction

(i've done a bit of modelling using coinduction, and it has a very funky but precise relationship with immutable object oriented program descriptions)
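Coinductive structures are characterized by the observations you can make of them rather than by finite construction; the classic example is a lazy, self-referentially defined infinite stream, which can only ever be inspected a finite prefix at a time. A flavor of this in Python (an illustrative sketch, not tied to any particular coinduction framework):

```python
from itertools import islice

def ones():
    # Coinductively, ones = 1 :: ones -- the stream is never "finished",
    # we only ever observe finite prefixes of it.
    while True:
        yield 1

print(list(islice(ones(), 5)))  # [1, 1, 1, 1, 1]
```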


Good work - but can we just take a minute to look at this...

> Over the past two years, the snapshot has nearly tripled in size, going from roughly 600 KB in early 2016 to over 1500 KB today.

1.5MB per snapshot... per tab effectively! It's crazy how wasteful we've become.


I have thought for a while, and continue to think, that developers avoid "premature" optimization too fiercely. Yes, there are diminishing returns on optimization effort, but too often people interpret "avoid premature optimization" as "never optimize unless it feels slow, and even then only if it feels slow when it's the only thing running. Otherwise, blame everything else that is running first!"


Not to mention that the "never optimize until it feels slow" judgement calls are usually conducted on MacBook Pros with fast internet.


Oh absolutely. I have been noticing this starting to happen with SSDs as well. A lot of modern games run awfully on mechanical drives.


that's unfortunately because we are pushing fancier and fancier models with higher and higher resolution textures


Indeed, Knuth's quote on this is immediately followed by a warning not to neglect that critical minority of code where it matters.

Sadly, the promise of zero cost abstraction is a huge siren call. And not likely to change anytime soon.


"Zero cost abstraction" is like a sarcastic joke at this point in web development. If you pull up performance comparisons for web backends, a lot of the popular ones based on interpreted languages are absolutely abysmal when compared to c++ or Java code (Node is a good example). Many definitely have streamlined development workflows and have nice, high-level abstractions, but at a cost to performance. I don't mean that Node is useless or something, sometimes it makes sense, but it still forces you to compromise.

Front-end frameworks are even worse. A lot of older (but not "ancient") PCs are unusable on the modern web because of poorly-optimized JS or Adobe Flash (a decent portion of this is also due to the inherent inefficiency of JS and Flash themselves). Fortunately, Google has been making strides with V8, Mozilla did awesome with Firefox "Quantum" and everyone is slowly ditching Flash, but performance still seems to be an ever-present issue.


Fully agreed. I might extend it beyond web.

It does seem to be getting comical.


I vaguely remember Lars Bak saying ~8 years ago that the snapshot size was 50 KB (at the time).


Couldn’t they do this with a copy-on-write initial heap that all the engine instances clone?


I think that's pretty close to what they're doing, really. Though usually COW refers to doing things on a page-by-page basis, which probably would not be as effective (since it would copy entire pages when only a little was needed, so unless you were careful to put related functions close together, you'd copy a lot more of the heap than you really want).

You're probably right though. I bet they end up closer and closer to that. Possibly even sharing pages that can't be modified, like code objects. That's what the last sentence or two seems to allude to. Eventually, they'll pretty much be reinventing shared libraries for the JavaScript world. Not that that's a bad thing.
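For intuition, the scheme the post describes (materialize each builtin from the snapshot only on first use, instead of copying everything up front) can be sketched in a few lines. Everything here, including the names and the pickle-based "snapshot", is an invented stand-in for V8's internal machinery, not its actual API:

```python
import pickle

# A fake "snapshot": each builtin serialized independently,
# so they can be deserialized one at a time.
SNAPSHOT = {name: pickle.dumps(value) for name, value in {
    "undefined": None,
    "String.prototype.replace": "<code object>",
    "Math.pow": "<code object>",
}.items()}

class LazyIsolate:
    """Deserializes a builtin only on first access, then caches it."""
    def __init__(self, snapshot):
        self._snapshot = snapshot
        self._heap = {}          # only ever holds what was actually used

    def lookup(self, name):
        if name not in self._heap:            # first use: deserialize now
            self._heap[name] = pickle.loads(self._snapshot[name])
        return self._heap[name]

iso = LazyIsolate(SNAPSHOT)
iso.lookup("Math.pow")
print(len(iso._heap))    # 1 -- untouched builtins were never materialized
```

The win is the same as in the post: an isolate that never calls a given builtin never pays the memory or deserialization cost for it.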


Shared libs with security related side-channel leakage?


@dang Could this be replaced with the non-mobile link?

https://v8project.blogspot.ca/2018/02/lazy-deserialization.h...


Updated. Thanks!


Whoops! Forgot you're sharing the load with dang now. Thanks!



