Chrome Binary Size (neugierig.org)
116 points by mcantelon on Sept 20, 2010 | 65 comments



I do have to point out that this visualization represents the binary executable size of Chrome, not the source tree. (According to another comment, it's generated with http://github.com/martine/bloat .)

Noticed a lot of folks linking to directory treesize visualizations.

This, breaking down a compiled binary to show the size of its constituent components, is a ton more interesting to me.


...and yet it includes header files. curious.

Looking at what's there, it looks like it can compile and run C (in nativeclient?).


It's possible to define symbols in header files (for example, inline functions).


I thought the way the C preprocessor worked, those ended up getting literally copy and pasted into the .c file that #include'd them, and should be indistinguishable from the case where you wrote them directly in the .c file? Or does gcc keep track of where #include'd files came from, and write that info into the binary?


Yes, otherwise you'd never find syntax errors in the header files.
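To illustrate how the compiler keeps track: GCC's preprocessor output (`gcc -E`) carries linemarkers of the form `# <line> "<file>" <flags>`, so the origin of each pasted line is not lost. A minimal sketch follows, assuming a simplified, made-up sample of preprocessed output (the file names `main.c` and `util.h` are hypothetical, and real output contains extra markers such as `<built-in>`):

```python
import re
from collections import Counter

# Hypothetical, simplified `gcc -E` output for a main.c that includes util.h.
PREPROCESSED = '''\
# 1 "main.c"
# 1 "util.h" 1
static inline int twice(int x) { return 2 * x; }
# 2 "main.c" 2
int main(void) { return twice(21); }
'''

# GCC linemarker: '# <line> "<file>"' optionally followed by numeric flags.
LINEMARKER = re.compile(r'^# (\d+) "([^"]+)"')


def lines_by_origin(text):
    """Count non-blank source lines attributed to each original file."""
    counts = Counter()
    current = None
    for line in text.splitlines():
        marker = LINEMARKER.match(line)
        if marker:
            # A linemarker switches the file that following lines belong to.
            current = marker.group(2)
        elif line.strip() and current is not None:
            counts[current] += 1
    return dict(counts)


origins = lines_by_origin(PREPROCESSED)
print(origins)
```

The inline function is attributed to `util.h` even though it was textually pasted into `main.c`, and that is the information that can later end up in debug data.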


But that's during the compilation phase - this is the compiled binary. Binaries that are compiled with debugging information turned on will include this information for a similar reason: so that when you're debugging a program, or when it breaks, you can map that to places in your source file.


Right - and `nm` uses the debugging information to identify the file:line where each symbol was defined. The original scripts used to generate the treemap use `nm`.
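A rough sketch of that aggregation step, assuming input shaped like `nm -C -S -l` output (address, hex size, type, symbol name, then a tab and `file:line` when debug info is present). The sample lines, paths, and sizes below are invented for illustration, and this is not the actual bloat script:

```python
import re
from collections import defaultdict

# Hypothetical sample of `nm -C -S -l` output; paths and sizes are made up.
NM_OUTPUT = (
    "0000000000401000 0000000000000040 T main\t/src/chrome/app/main.cc:12\n"
    "0000000000401040 0000000000000010 W twice(int)\t/src/base/util.h:3\n"
    "0000000000401050 0000000000000020 T Parse(char const*, int)\t/src/base/parse.cc:40\n"
    "0000000000401070 0000000000000008 t helper\t/src/base/parse.cc:9\n"
    "0000000000601000 0000000000000004 B counter\n"
    "                 U printf\n"
)

# address, size, one-letter symbol type, then the (possibly demangled) name.
SYMBOL = re.compile(r"^([0-9a-f]{8,}) ([0-9a-f]{8,}) (\S) (.+)$")


def sizes_by_file(nm_text):
    """Sum symbol sizes (bytes) per defining source file."""
    totals = defaultdict(int)
    for raw in nm_text.splitlines():
        head, _, location = raw.partition("\t")
        match = SYMBOL.match(head)
        if not match:
            continue  # skip lines without address and size, e.g. undefined symbols
        size = int(match.group(2), 16)
        # Strip the trailing ":<line>"; symbols without debug info go to a bucket.
        path = location.rsplit(":", 1)[0] if location else "(unknown)"
        totals[path] += size
    return dict(totals)


totals = sizes_by_file(NM_OUTPUT)
for path, size in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{size:6d}  {path}")
```

Note that the symbol defined in `util.h` is charged to the header, which is one way header files can show up in a binary-size breakdown.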


Yes and no. Whole-program optimization kinda changes that. So do precompiled headers, etc. Also, Chrome is C++.


Could be precompiled headers, inline functions, etc.



I patched it to support generation on 64 bit systems: http://github.com/martine/bloat/issues#issue/1


This is an amazing visualization, but I don't like using the term "bloat." What would you take away? Keep in mind that Chrome's architecture is purposefully approaching that of an operating system.


> I don't like using the term "bloat."

It's an extremely non-specific term that people use to describe software they don't care for, much as people cast arguments they don't like as "FUD".

I was in a conversation with someone who said he felt IE 9 was bloated. I asked him why. "The right-click context menu is too big."

"Uses too much RAM", "requires too much disk space", and "has too many dependencies" can all be valid complaints. But it seems that in most uses, "bloat" = "has one or more features I don't use."


Having too many features is a valid usability complaint. In fact, it probably harms the user quite a bit more than "requires too much disk space".


In practice, not really. People just ignore what they don't use.


In practice, yes really. Simplicity is one of the basic principles of good Human Interface design. User testing bears out that users find it hard to do even basic tasks with a more complicated program.

Now, sometimes it is possible to add features in a way that is unobtrusive enough that it doesn't affect usability for basic tasks. But it's a tough design challenge to do this, not a given.

By the way, I think Chrome overall does a pretty good job of design simplicity. But the "just ignore the features you don't want" argument is not a good basis for HI design.


> But the "just ignore the features you don't want" argument is not a good basis for HI design.

It is a good basis if you can find out (as the HI designer) which features most people don't want: Then you can put them into places for advanced users, where they wait for the moment people need them. IMO the "simplicity" of many programs is just a crutch for HI designers unable to design a usable interface with more features.


Do you really believe that it is possible to create arbitrarily simple interfaces for programs with arbitrarily many features? That seems unlikely to me. It seems much more reasonable to me that there is a correlation between the number of features and the complexity of a program.


They also don't use what they can't fathom. A plethora of options can make it difficult for a new user to find the features they want.

Yes they do ignore the unwanted features, but they also ignore features they would use if they knew about them. In that respect the notion of bloatedness is even amplified because a larger portion of the software is provided but idle.


Why do you believe this?


It's useless to have N versions of the same library on your system. I'd rather have my favorite distro update one particular library when there is a security bug than wait for Google to take care of it. That's what Gentoo does: http://phajdan-jr.blogspot.com/2010/08/www-clientchromium-no... Currently, it uses the system libraries for:

• bzip2,

• codesighs,

• cros,

• icu,

• jemalloc,

• lcov,

• libevent,

• libjpeg,

• libpng,

• libxml,

• libxslt,

• lzma_sdk,

• molokocacao,

• ocmock,

• pyftpdlib,

• simplejson,

• tlslite.

Still on the TODO list is zlib, due to an issue during compilation.

It would also be great to have V8 as a different project so that it could be used outside of chrome with much less pain.


> It would also be great to have V8 as a different project so that it could be used outside of chrome with much less pain.

http://code.google.com/p/v8/source/browse/trunk/ ?


or type about:credits in your chromium address bar


That chunk of "third_party" stuff is a pretty tempting target.

Instead of using shared libs - like those that you've probably got on your system already - Google appear to like slurping everything into the one blob to solve dependency/versioning issues.


> That chunk of "third_party" stuff is a pretty tempting target.

Not all of it. Part of the third_party code is a modified version of bsdiff, and as the bsdiff author I specifically endorse the shipping of modified versions of bsdiff with the software which required the modifications.


It's probably not a terrible default, given their cross-platform development. Slurping everything into your app bundle is the official way of doing it on OS X anyway, and more or less the only sane way of doing it on Windows, whose DLL-versioning is notoriously bad. So if they were doing any factoring out of shared libraries, it'd be solely for the Linux version, which I could see them not considering worth the hassle.


Please stop repeating that canard, it got old years ago: http://news.ycombinator.com/item?id=297297


I thought that analysis was ridiculous the first time I read it. I read Chrome's design docs when they were released. Chrome's design is approaching that of an operating system. Considering that it has to manage processes, there is no alternative.


But that's the thing: it's only managing processes, not scheduling them or managing the minutiae of their stacks. It devolves that responsibility to the real kernel by using 'dumb' native processes instead of inventing its own fucked coroutines with a browser-wide GIL for all the logically independent JavaScript environments.

It is less OS-like than any other major browser!


Correct, it is not an operating system. Just as the Java Virtual Machine is not an operating system. But when you start using principles and common designs from operating systems, you are approaching the design of an operating system.



I'm actually waiting to buy a Netbook until it comes out, in the vain hope that someone will ship a decent ARM-based system using it.

But it is just slightly-extended Chrome running directly in X unparented without the standard Gnome Desktop BS on otherwise stock Ubuntu.


tmux+vim+chrome is all i really need. (and a full posix system down below, of course)


What I want is NaCl implementations of urxvt+SSH and VNC/RDP.

I don't need yet another local posix system to maintain, I want remote access to my real server where my main screen session lives, and I want that access to live in a tab.


You can get VNC over HTML5, though it may not be as fast as you'd like.


The term "bloat" is not used anywhere other than in the URL.


And it looks to me as if the only reason it's even in the URL is that that's the name of the tool he wrote to extract and display the information.


What does Chrome use Speex (a low-bitrate voice codec) for?

edit: answering my own question, a speech-to-text in HTML proposal from Google

https://docs.google.com/View?id=dcfg79pz_5dhnp23f5

http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2010-May...

http://www.jeremyselier.com/entry/speech-attribute-demo


Chrome also incorporates libjingle, Google's library for ICE-like RTP handling, used in Google Talk / Google Video, which also uses Speex for the voice part.


This is a pretty awesome way to visualise filesizes of various things... never seen it before..



...And for the Windows community: http://www.sixty-five.cc/sm/

Is there a Linux equivalent?



Someone is working on a kde4 port, see comments at http://kde-apps.org/content/show.php/KDirStat?content=10159

Gnome also comes with something similar.


You can get a pretty interactive treemap of the Linux filesystem using http://gdmap.sourceforge.net/

[screenshot] http://gdmap.sourceforge.net/img/gdmap-preview.png

You can also see a treemap of the kernel although it is static http://www.cs.umd.edu/hcil/millionvis/Treemap_Visualization_...


In addition to the standard 'du' command, which you can sort and munge with the standard unix stuff, Gnome has bundled the 'baobab' utility for the last few releases (so if you're on a relatively recent Gnome release, you probably already have it installed). It has an interactive "rings" view and a "treemap" view, which are both pretty nifty.


In Ubuntu, the default "Disk Usage Analyser" (the project name is "Baobab") under Applications -> Accessories does this.

By default it's a ring chart but there's a menu to select treemap instead. The authors seem to believe that treemaps are good for comparing only size, but the ring chart shows hierarchy well too.

http://www.marzocca.net/linux/baobab/baobab-ringschart.html

http://www.marzocca.net/linux/baobab/baobab-treemaps.html


I use WinDirStat for Windows http://windirstat.sourceforge.net/


I've always stuck with du | xdu. Except for the primitive rendering, it is rather reminiscent of webtreemap (or vice versa, I guess): click to dive down, top to go back up.


du -hs /home/* | sort -h

/edit typo
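The same idea as the `du | sort` pipeline, sketched in Python with an explicitly numeric sort so that human-readable suffixes cannot scramble the order. This is only an approximation: it sums apparent file sizes rather than the disk blocks `du` reports, and the function names `dir_size` and `largest_children` are made up for this example:

```python
import os


def dir_size(path):
    """Sum apparent file sizes (bytes) under path, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total


def largest_children(parent):
    """Return (size, name) for each subdirectory of parent, largest first."""
    sized = [
        (dir_size(entry.path), entry.name)
        for entry in os.scandir(parent)
        if entry.is_dir(follow_symlinks=False)
    ]
    return sorted(sized, reverse=True)
```

Calling `largest_children("/home")` would give roughly what the one-liner is after, as raw byte counts.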


Hmm, that and the Mac and Windows utilities linked don't seem to be doing what's needed to generate this Chrome view, though. The Chrome binary is a single file on disk, and this treeview is digging into it to figure out which components contribute to the final size of the statically linked binary.
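To make the difference concrete: once per-file sizes have been extracted from the binary, they can be folded into a directory hierarchy where each node carries the sum of everything beneath it, which is the shape a treemap needs. A minimal sketch, with made-up paths and byte counts and a hypothetical `build_tree` helper:

```python
def build_tree(sizes):
    """Fold {path: size} into nested nodes {"size": int, "children": {...}}."""
    root = {"size": 0, "children": {}}
    for path, size in sizes.items():
        node = root
        node["size"] += size
        # Every directory along the path accumulates the file's size.
        for part in path.strip("/").split("/"):
            node = node["children"].setdefault(part, {"size": 0, "children": {}})
            node["size"] += size
    return root


# Hypothetical per-file contributions to a binary, in bytes.
SIZES = {
    "chrome/browser/tabs.cc": 300,
    "chrome/renderer/view.cc": 200,
    "third_party/speex/codec.c": 120,
    "third_party/bsdiff/patch.cc": 80,
}

tree = build_tree(SIZES)
for name, child in sorted(tree["children"].items(), key=lambda kv: -kv[1]["size"]):
    print(f"{name}: {child['size']} bytes")
```

A disk-usage tool builds the same kind of tree from file sizes on disk; the binary-size view differs only in where the leaf sizes come from.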


For Gnome, it's called baobab (in Applications -> Accessories -> Disk Usage Analyzer).



For displaying files on the disk it's actually patented.

http://www.patentstorm.us/patents/5987469.html

Estimated Expiration Date: May 13, 2017


Interesting. My first thought on finishing your post was "How rude; you've tainted us all with the knowledge of that patent." Evidence of manners in the making I suppose; adaptation of custom in the face of seemingly immovable, shared burdens.


"How rude; you've tainted us all with the knowledge of that patent."

What is required to prove willful patent infringement? Is "Defendant's web history shows (s)he accessed a web page on which someone claimed the patent existed" sufficient?

EDIT: "Defendant", not "plaintiff".


In most development scenarios you actually want to know which patents could get you sued, so please explain how exactly awareness of the patent's existence is a disadvantage to you.


Wikipedia (http://en.wikipedia.org/wiki/Patent_infringement_under_Unite...):

> If an infringer is found to have deliberately infringed a patent (i.e. "willful" infringement), then punitive damages can be assessed up to three times the actual damages. Legal fees can also be assessed.

If it cannot be proven that an infringer had knowledge of the patent, though, the damages that can be assessed are substantially lower.


Thank you.

I'd still like to know how you as the developer, and your software, can survive in either scenario (with or without knowing that you infringe) if the patent covers the basic functionality of your software.


Sequoiaview and Treesize are two good, free Windows options. Treesize, in particular, has a prettier output and a ton more features. We install it on all our Windows servers because it makes disk cleanups so much easier.

Here's the free version: http://www.jam-software.com/treesize_personal/


Imagine a tool that would take the output of a profiler and render an interactive graphic like this for both: (A) data structure sizes and (B) run-time performance. That would be incredibly useful.



Cool, thanks! Too bad there's not a Windows version, though.



Seems to be PHP-specific: it processes the output from something called "xdebug 2", which is a PHP extension that dumps detailed information about what your code is doing.

Not quite the same target audience as kcachegrind...



