Google HTML/CSS style guide suggests omitting head and body tags (google.github.io)
113 points by soheilpro on May 21, 2019 | 111 comments



"Omit head and body tags, & type attributes to reduce filesize..."

I'm pretty sure you could remove all the head and body tags from the entire internet and it would still make less of a performance impact than if every WordPress blog stopped using Google fonts.


Yeah, or if people returned to using HTML the way it was meant to be used instead of using a JavaScript framework on every page.


One of them is a functional change and one of them isn't. And they are not incompatible changes, so you can do both if you really care.

But there's no reason to ignore literally free improvements just because there are bigger, non-free improvements to be had as well.


A single analytics loading script and the tracking payload is larger than the gains from the removal of these tags. Most pages have more than one script and payload too.


This reminds me of a job I recently did for a company that had duplicate Google Analytics tags. I removed one thinking I was helping only to be yelled at when their numbers were cut in half the following month.

After I explained that they were getting incorrect information I was told "I know this isn't exactly ethical ..." yada yada. I fired that client.


Firefox has a tracker blocking mode that shows you all of the trackers. I counted 25 different tracker services being blocked on a single news article yesterday.


Below is the recommended Google Analytics loader

    <!-- Google Analytics -->
    <script>
    (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;    i[r]=i[r]||function(){
    (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new     Date();a=s.createElement(o),
    m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;        m.parentNode.insertBefore(a,m)
    })(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
    
    ga('create', 'UA-XXXXX-Y', 'auto');
    ga('send', 'pageview');
    </script>
    <!-- End Google Analytics -->
That's 455 bytes, even if you have an adblocker and don't load the external javascript. No amount of removing html, head, and body tags will overcome this.


Removing ads and tracking scripts will be more effective in reducing page size.


And Web pages will load a lot faster. The whole Internet would speed up!


Removing a fuel tank from a plane would be an efficient way to save fuel.


Wouldn't this metaphor be more like "removing paying passengers"?


Removing ads is more like removing the power-hungry android tablets and wifi systems from the plane.


Tablets and wifi systems don't pay for the plane's flight, so that analogy doesn't hold.


I'll consider following that advice the day https://google.com obeys.


This part of the guide does say it's optional:

    Omit optional tags (optional).
And google.com used to do this, along with other ugly but valid byte-saving hacks, but as the page has gotten richer they've loosened up a bit.


I vaguely remember years ago google.com omitting tags to strip file size and there being some mild controversy around it. Pretty sure it was before HTML 5 adoption, and I don't think the tags were optional back then, just that browsers try to interpret whatever mess is thrown at them.


Do as we say, not as we do. all the more reason not to trust them.


This is an internal style guide, not external recommendations, as the title would suggest. Comments like these are really making me wary of the level of groupthink on here.


Now this makes sense. While on my one-page website such an omission would save the user something like 10 bytes of extra data, across the surface area of all Google properties those 10 bytes add up over billions of page views.


Are you referring to the group think that Google is full of evil hypocrites?


>Do as we say, not as we do. all the more reason not to trust them.

---

>Are you referring to the group think that Google is full of evil hypocrites?

I'm confused, what are you trying to say?


Previous discussion of this from 2016: https://news.ycombinator.com/item?id=12520674

(Seems like a terrible idea to me - complex rules to follow for a really minimal saving...)


True, for source code it might be better to leave those in. However, the fact that omitting them is standard-compliant is interesting for automated minimizers.


Yep I agree with that top comment. Explicit is far more understandable and maintainable than implicit.


I tried this for a summer, it really did lead to easier to read HTML... although now when I go back I’m not sure which tags can be omitted and which might be errors. I think it’s probably a net negative trade off but still not certain.

These days I'm trying out pug instead, which is similar in a sense: more concise and readable at the expense of other people being certain of its correctness and myself remembering the syntax months later.


In my reading of this, I wouldn't say they "recommend" it, but that they recommend considering it. Their link is quite clear that HTML 5 makes these tags optional and gives an unambiguous mapping onto the traditional structure.


It's optional under certain conditions but those conditions are given as you correctly point out in the link to the HTML standard. While many seem to be focused on what Google says, it's nothing that those of us who read the actual spec didn't already know.

And every web developer references the spec when they have questions...don't they?


This seems like insane advice. Even if it's technically permitted, the HTML5 spec page they themselves link to shows a bunch of caveats and exceptions that make it risky.

And for what - to save literally 10 bytes?


Multiply those 10 bytes by all the pages Googlebot spiders and that's a significant saving on bandwidth & storage for big G.


We're talking about the folks who host YouTube, right?

The savings from this won't be the slightest bit significant to them.


OK, thanks, Warren Buffett


I really doubt they avoid data fragmentation well enough for this to make a difference.


Omitting tags is something we can get used to, but just by looking at the spec I'm kinda scratching my head.

Eg. why can't the closing tag for title be omitted? Tags are not allowed inside it.

I guess it's because you should be able to type tags as the title, but having to think about stuff like that makes omitting more of a hassle than it's worth to me.

And what if templates are used and put together? Then you suddenly need to know how every combination of templates is going to be in the future to avoid unintended side effects.

Saving a bit of data on HTML isn't my top priority, and why would it be? Just by not using a bunch of css/js-libs/fonts I'm able to have a smaller footprint than 99 percent of all pages.


I don't understand: since when are those structure tags optional, and in what browsers and browser versions would this work?


I believe they've been optional in every version of HTML.

The example document in the HTML+ spec (1993) did not use them: https://www.w3.org/MarkUp/HTMLPlus/htmlplus_7.html

The HTML 2.0 spec gives this as its first example:

    <!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <title>Parsing Example</title>
    <p>Some text. <em>&#42;wow&#42;</em></p>


The very first draft HTML spec, https://www.w3.org/MarkUp/draft-ietf-iiir-html-01.txt (June 1993), is somewhat interesting here, because the only element with optional start and end tags is the html element (head and body require both start and end tags, per that spec).

Certainly by the point of HTML 2.0 (November 1995; the first non-draft spec for HTML) all three elements have start and end tags all optional.

The more general ability to omit start/end tags from given elements is a feature of SGML (October 1986), and HTML 2.0 to HTML 4.01 were defined as SGML applications (though approximately nothing ever used an SGML parser for HTML).


The HTML5 specification says where they're optional, so any browser that properly implements the spec should handle it.


So I'm supposed to assume a browser properly implements a specification?


Yes, that's the point of having a set of standards. Otherwise you end up with an IE situation where everyone just develops for a single browser.


Otherwise you end up with a Chrome situation where everyone just develops for a single browser.

Your comment needed to be modified for current times. ;-)


Chrome is based on Webkit which is free and open source so that is a terrible analogy with IE that was proprietary.


Its being open source isn’t mutually exclusive with its not adhering to the standard.


Chrome uses its own Blink and V8 now, and there are many cases where developers only design for Chrome and small differences in implementation can be a huge pain in Firefox/Safari.


Blink is also under a FOSS license so I am not sure what the problem is. Competing browsers use Blink as well.


If it was that simple there would be no reason for Safari to lag behind.


No professional developer targets a browser when developing a web site unless it's a captive audience with no choice. Professional developers follow the specification. Those who do otherwise are only mentioned in Reddit headlines and other hobbyist sites.


I understand that's the intent of standards, but rarely are there not limitations or bugs in implementations.

If this was truly the case, we wouldn't need things like Applitools to view our HTML documents in various browsers to scan for differences.


That's what we do every day we use any browser out there.


Well, that is literally the point of a specification.


You're missing my point. Assuming that all browsers correctly implement the specification is naive. I don't have any issues with standards themselves existing.


Isn't this what you'd be doing in any case?


They are implied. I believe this dates back to when browsers supported non-standard HTML and the behavior of doing it was cemented in HTML5 (like tag soup and so forth.) Since it was based on existing behavior it should work even in old IE, just like the doctype.


Since pretty much forever. HTML 4 for sure, probably even earlier.


From the slug of this page it seems since 2011? https://www.w3.org/TR/2011/WD-html5-20110525/syntax.html#opt...

Note though that there are some conditions.


It's been allowed since long before then, all the way back to the og html 4.01 spec in 1999


All the way back to HTML 2.0, which was the first non-draft HTML spec.


I see. Today I learnt something new!


Good. I hate writing those useless tags over and over and having all the actual content deeply indented from the start. (seriously).


I disagree, having these tags gives the document a structure that makes it easier for machines to understand. As for indenting, I usually break the consistency and put html, head and body at column 1, starting the indenting for anything inside head and body.


I can't think of why it would be easier to write software to parse

    <html><head><title>foobar</title></head><body><p>Hi</body></html>
vs

    <title>foobar</title>
    <p>Hi
Number two looks way easier. If you were to write code to parse #1 you would just need extra code to ignore the useless tags.

Anyway, parsing HTML is a nightmare, and HTML documents are usually broken in the wild, and browsers still, amazingly, manage to render almost anything you throw at them. I'm sure they all already handle it.


> I can't think of why it would be easier to write software to parse

Maybe you haven't considered it closely enough?

If you look at the specifications, there is a complex set of rules governing the conditions under which a tag may be omitted. These naturally complicate the syntax. Consistency is preferable because it typically results in a simpler syntax.

Consider the following language:

    document ::= tags
    tags     ::= tag { tag }
    tag      ::= '(' symbol [ tags ] ')'
    symbol   ::= 'x' | 'y' | 'z' | 'w'
It encodes a document like the following:

    (x (y) (z) (y (w (x))))
    (z (w (z) (y)))
i.e. a list of trees of tags named by symbols.

Now, try to describe the language that's identical except that the closing bracket is optional if it belongs to a 'w' tag that precedes an 'x' tag. Try even to describe a language where they are not optional, but must be omitted. They'll both be more complex than the language I described above. Now imagine that there are about a hundred such exceptions, as in HTML. Chances are that rather than encoding them as syntax for the parser, you'll complicate the lexer to automagically insert the optional tags, blurring the boundary between the lexer and parser.
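
For reference, the strict grammar above fits in a short recursive-descent parser, roughly one function per rule. This is only a sketch in Python: the function names (`tokenize`, `parse_document`, `parse_tags`, `parse_tag`) are made up for illustration, and it assumes whitespace between tokens is insignificant, as the sample document suggests.

```python
SYMBOLS = {"x", "y", "z", "w"}


def tokenize(text):
    """Split the input into '(' / ')' / symbol tokens, skipping whitespace."""
    tokens = []
    for ch in text:
        if ch.isspace():
            continue
        if ch in ("(", ")") or ch in SYMBOLS:
            tokens.append(ch)
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return tokens


def parse_document(text):
    """document ::= tags"""
    tokens = tokenize(text)
    trees, pos = parse_tags(tokens, 0)
    if pos != len(tokens):
        raise ValueError("trailing input after the last tag")
    return trees


def parse_tags(tokens, pos):
    """tags ::= tag { tag }"""
    tree, pos = parse_tag(tokens, pos)
    trees = [tree]
    while pos < len(tokens) and tokens[pos] == "(":
        tree, pos = parse_tag(tokens, pos)
        trees.append(tree)
    return trees, pos


def parse_tag(tokens, pos):
    """tag ::= '(' symbol [ tags ] ')'"""
    if pos >= len(tokens) or tokens[pos] != "(":
        raise ValueError("expected '('")
    pos += 1
    if pos >= len(tokens) or tokens[pos] not in SYMBOLS:
        raise ValueError("expected a symbol")
    symbol = tokens[pos]
    pos += 1
    children = []
    if pos < len(tokens) and tokens[pos] == "(":
        children, pos = parse_tags(tokens, pos)
    if pos >= len(tokens) or tokens[pos] != ")":
        raise ValueError("expected ')'")
    return (symbol, children), pos + 1
```

Making the closing bracket optional for one symbol in one context would mean `parse_tag` can no longer decide locally whether a missing ')' is an error, which is the extra complexity being described.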

> Anyway, parsing HTML is a nightmare, and HTML documents are usually broken in the wild

Exactly because the syntax is needlessly complicated.


> having these tags gives the document a structure that makes it easier for machines to understand

What is a machine supposed to do with the <html> tag? It's entirely useless and tells you nothing. At best it could be used to identify file type, but the doctype has long since replaced that usage and that still wouldn't justify any reason for the end tag.

Similarly for <head> and <body>. They don't really do anything. There's no machine-useful structure to be had from that.

For some of the optional end tags sure, that definitely helps make things easier for machines to understand. That's why XHTML exists. It's not really used, though, but if you really want easy to understand document structure for machines you wouldn't go anywhere close to HTML in the first place and instead use XHTML.


Which machines have trouble, here?

Sounds superstitious.


Man, me too. Indenting the meat of the page that heavily seems wrong, as does not properly indenting it.


It's funny that the company that tries to "accelerate" web pages by directing webmasters to build their websites using a 270k AMP script internally guides its developers to size optimize by omitting the spoonfuls of bytes for optional tags—in a document that in itself doesn't follow these guidelines and is possibly the simplest page that Google has ever produced. What an absurd state of affairs.


To be clear, for anyone confused about the "main HTML tag" in the title, this recommendation is essentially to eliminate <html><head></head><body></body></html>.


I was slightly confused.

I thought it was referring to the <main> tag.

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/ma...


From https://google.github.io/styleguide/:

> This project (google/styleguide) links to the style guidelines we use for Google code. If you are modifying a project that originated at Google, you may be pointed to this page to see the style guides that apply to that project.

These are just Google's style guides for their own code. They're not saying you should do this in your own code.


Of course they don't put their money where their mouth is and the document that describes this guideline itself doesn't follow it:

    <!DOCTYPE html>
    <html lang="en">
    <head>
    <meta charset="utf-8">
    <title>Google HTML/CSS Style Guide</title>
    <link rel="stylesheet" href="javaguide.css">
    <script src="include/styleguide.js"></script>
    <link rel="shortcut icon" href="https://www.google.com/favicon.ico">
    <script src="include/jsguide.js"></script>
    </head>
    <body onload="initStyleGuide();">
    [...]
    </body>
    </html>


>For file size optimization and scannability purposes, consider omitting optional tags.

Is there much of a saving in omitting body and head tags?

This seems like excessive optimization. Having the tags present in one file because someone left them in, but missing from the next because someone else removed them, seems like it could be confusing; a few brain cycles might be more valuable than the possible savings.

Granted I'm not Google so maybe there are some great savings to be had there, I'd like to know what they are.


> This approach may require a grace period to be established as a wider guideline as it’s significantly different from what web developers are typically taught.


Note that the guidelines, including this note, are at least 5 years old now.


Interesting... I have never done this except on small test/toy projects at home. I think the main downside today is probably just that the structure of the document is sort of implied, and it may be unclear to new developers why it’s possible to style html/body when they don’t exist. It does work consistently across browsers last I checked, though.


That problem already exists with tables, doesn't it?


Yes, it certainly does. I wouldn’t view it as a huge positive, though; it certainly confused me for a while, especially since synthetic nodes actually exist in the DOM.


Note that this is decidedly not new; there are references to this advice from Google at least back to 2014.


What about meta tags (viewport, description, robots), Open Graph tags, and inline styles?

Is it correct to put a title element outside of a head element?

"Permitted parents: A <head> element that contains no other <title> element."

"The <title> element is always used within a page's <head> block."

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/ti...

Google's style guides aren't very good in general, including Python, which doesn't follow PEP8.


Put them before the start of the body and they'll go into the head. In

    <!DOCTYPE html>
    <title>title</title>
    <style>body { color: #111; }</style>
    <p>content
the title and style elements are both inside the head element.
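
A rough way to see the rule in action without a browser is to approximate it with Python's standard `html.parser`. To be clear, this is a deliberately simplified sketch and not the real HTML tree-construction algorithm: the `HEAD_TAGS` set is an assumed subset of the elements the parser keeps in the head, and the names `ImpliedSections` and `split_sections` are invented for this example. It only captures the basic idea that metadata elements seen before the first body content land in the implied head.

```python
from html.parser import HTMLParser

# Assumption: a subset of the elements that stay in <head> when they
# appear before any body content.
HEAD_TAGS = {"base", "link", "meta", "title", "style", "script", "noscript", "template"}
VOID_HEAD_TAGS = {"base", "link", "meta"}


class ImpliedSections(HTMLParser):
    """Record which start tags would fall into the implied head or body."""

    def __init__(self):
        super().__init__()
        self.head = []
        self.body = []
        self.in_body = False
        self._open_head_tag = None  # head element whose content we are inside

    def handle_starttag(self, tag, attrs):
        if tag in ("html", "head"):
            return  # explicit wrappers do not change the outcome here
        if tag == "body":
            self.in_body = True
            return
        if self._open_head_tag is not None:
            return  # content nested inside e.g. <title> or <noscript>
        if not self.in_body and tag in HEAD_TAGS:
            self.head.append(tag)
            if tag not in VOID_HEAD_TAGS:
                self._open_head_tag = tag
        else:
            # The first non-metadata start tag implicitly opens the body.
            self.in_body = True
            self.body.append(tag)

    def handle_endtag(self, tag):
        if tag == self._open_head_tag:
            self._open_head_tag = None

    def handle_data(self, data):
        # Non-whitespace text outside a head element also opens the body.
        if self._open_head_tag is None and not self.in_body and data.strip():
            self.in_body = True


def split_sections(markup):
    parser = ImpliedSections()
    parser.feed(markup)
    parser.close()
    return parser.head, parser.body


doc = """<!DOCTYPE html>
<title>title</title>
<style>body { color: #111; }</style>
<p>content
"""
head, body = split_sections(doc)
print("head:", head)
print("body:", body)
```

For the example document this reports `title` and `style` under the head and `p` under the body. A `<style>` that appears after the `<p>` would be counted under the body instead, which is the ordering pitfall with omitted tags.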


It doesn't seem like a good idea to make HTML more complicated just to save 13 characters per page. It's just one more unclear thing that beginners would have to understand. "These elements go in the <head>" is easier to learn than "Here's a list of elements that have to be put in a specific order, but there is no visual indication that they are different." They will tend to get copied/pasted out of tutorials in the wrong order.


They did get this correct:

> Use UTF-8 (no BOM).

I was mystified the first time I read about "byte order mark" and I'm still confused. I mean, I know that if you want to waste space on 16-bit characters, it might make some very small amount of sense to have a byte-order mark, but for crying out loud, just standardize that order and live with it. We don't need more parts of a text file that are interpretable and cause both usability problems and security problems.
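
For context on the "no BOM" rule: UTF-8 has a fixed byte order, so the mark carries no ordering information there. It is just the three bytes EF BB BF at the start of the file. A small Python sketch of what that looks like and how one might strip it; the helper name `strip_utf8_bom` is made up for this example.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


with_bom = "<!DOCTYPE html>".encode("utf-8-sig")  # this codec prepends the BOM
without_bom = "<!DOCTYPE html>".encode("utf-8")   # plain UTF-8, no BOM

print(with_bom[:3])                       # the three BOM bytes
print(len(with_bom) - len(without_bom))   # size difference in bytes
print(strip_utf8_bom(with_bom) == without_bom)
```

Decoding BOM-prefixed bytes with the plain `utf-8` codec leaves a U+FEFF character at the start of the string, which is the kind of invisible surprise the guideline avoids; the `utf-8-sig` codec removes it on decode.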


Aren't there lots of scripts that depend on the head, body, and even html tags? The HTML tag in particular is used for a lot of feature detection, and the body is used as the root element for a lot of SPAs.

That's not to say that abolishing those tags isn't a great idea, but it would be harder than it looks for projects with an appreciable number of dependencies. I'm definitely in support of never writing that stupid <html> tag again.


I think those elements still end up in the DOM regardless of whether you type them out explicitly. I might be misremembering though.


You are correct. The browser fills in missing tags, closes elements, etc. Try opening a blank page in any browser and inspecting it: you'll find html, head and body elements already placed in the DOM.


Ah yes, you're right. I forget a lot of those browser behaviors that haven't much consequence on my work.


There's a difference between a tag, which is a bit of concrete syntax like "<b>" or "</html>", and an element, which is basically a node in the DOM.


From the article:

    <!-- Not recommended -->
    <!DOCTYPE html>
    <html>
      <head>
        <title>Spending money, spending bytes</title>
      </head>
      <body>
        <p>Sic.</p>
      </body>
    </html>

    <!-- Recommended -->
    <!DOCTYPE html>
    <title>Saving money, saving bytes</title>
    <p>Qed.


> In addition, keep the contact area as small as possible by linking as few style sheets and scripts as possible from documents and templates.

Good to know that Google recommends HTTP/1.1 optimization practices. It seems like everyone is ready to spew dozens/hundreds of tiny files over HTTP/2, though the performance rationale is dubious at best.


I don't use body or head tags, but I use the html tag so there's a place to put the lang attribute.


I think this sort of thing should be done by an HTML minifier, not by a human, because the omission rules are not trivial and removing these tags doesn't really help readability.


It helps a tiny bit sometimes. The tag clutter does make it messier. It is not enough to justify the memory burden of the edge cases is all, imho


Do all other browsers behave the same way when dealing with the omission of these basic tags?

Because if not, then by following Google's advice here we are creating the new Internet Explorer.


It's written in the specification going back to HTML 2.0


Would love to see someone ask how you integrate Google Analytics or GTM with this, given they both have instructions relative to <head> and <body> tags...


I can’t stop looking at the missing </p> tag in the recommended example.


How much of a difference would this really make?


It's like not using semicolons in Javascript: it doesn't necessarily need to be about shaving bytes but opting out of optional things that you don't need, especially when the machine adds them back for you.

For example, it keeps your indentation level flat here.


Confusing title; I assumed it meant they were recommending avoiding the <main> tag, which seems weird but OK, it’s a new one and we’ve seen document outline tags get deprecated before. But no — they’re recommending avoiding the <head> and <body> tags! These are literally the first things you are taught when you learn HTML. This seems like a pretty fundamental change to have slipped in here without further explanation. What’s the reasoning?


It's explained.

> For file size optimization and scannability purposes

Omitting tags means fewer transferred bytes, which means faster websites. If the browser can forgivingly parse HTML like this, why not take advantage of it?

Of course, it doesn't sit well with me either. This'll take some time to digest, but it is interesting to consider.


I know it all adds up, but stripping <html><head></head><body></body></html> seems like a minuscule saving compared to the huge amounts of cruft stuffed into most websites.
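
To put a number on it: the six tags in question come to 39 bytes before compression, not counting any newlines or indentation around them. A back-of-the-envelope sketch in Python; the function names and the one-billion-views figure are illustrative assumptions, not measurements.

```python
OPTIONAL_TAGS = ["<html>", "<head>", "</head>", "<body>", "</body>", "</html>"]


def wrapper_bytes(tags=OPTIONAL_TAGS):
    """Uncompressed size in bytes of the tags that could be omitted."""
    return sum(len(tag.encode("utf-8")) for tag in tags)


def total_saved(page_views, per_page=None):
    """Bytes saved across a number of page views, ignoring compression."""
    if per_page is None:
        per_page = wrapper_bytes()
    return page_views * per_page


print(wrapper_bytes())             # bytes per page
print(total_saved(1_000_000_000))  # bytes over a hypothetical billion views
```

At a billion views that is about 39 GB of uncompressed markup in total. With gzip or similar compression on the wire the real difference is likely smaller still.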


That's true. Presumably this is for extreme cases when you really need that extra bit of performance.

Perhaps HTML could have been better designed with performance in mind. Every tag you open (i.e. <foo>) you're supposed to close (i.e. </foo>) which is quite redundant in terms of bytes when you think about it. It is easy to imagine markup languages that minimize the number of bytes the client has to transfer.


Chrome does, what about the other smaller browsers?


I think Google has a set of very specific aims, that aren't so relevant to a lot of other organisations.

Cutting out optional tags will improve page size, download speed etc; something that's obviously important when you're serving so many billions of users.


[flagged]


Could you please review the guidelines and then not post like this here?

https://news.ycombinator.com/newsguidelines.html


It makes sense to me that if those tags are doing nothing, removing them is a legitimate micro optimization.


TL;DR

HTML has been deprecated in favor of Google Markup Language (GML).


This is all standard HTML5.


> For file size optimization and scannability purposes, consider omitting optional tags

They're seriously going to say that removing `<html></html>` is a file size optimization? Why does this make it easier to scan?


For visual inspection removing html, head and body also gives you 2 additional indentation levels (of course this depends on your indentation style).


for googlebot, naturally


Funny that the page recommending that we not use <head> still uses <head>. "Do as we say. Not as we do."

You can view source, and they post it to GitHub.

https://github.com/google/styleguide/blob/gh-pages/htmlcssgu...


Google has been anti-Web for a long time. So I couldn't care less about their "ideas" other than being informed of what populates the head of these people.

On a more objective note, if you need to omit the head and body tags to get more performance, something is very wrong with your page/tech whatever.

Maybe consider removing all those Google Fonts, all those Google tracking scripts (FB too), and while you are at it, consider deleting Google AdSense if you can, since those $4.84 per month are costing you and us a whole lot more than the billions in revenue Google gets.

Sorry for the saltiness, but I'm really getting fed-up with the arrogance of these childish oligarchies.



