HTTP/2: The Long-Awaited Sequel (msdn.com)
142 points by mattparlane on Oct 12, 2014 | 71 comments



Check out the IETF draft[1] and this awesome book[2] for more details on HTTP/2.

Some of the coolest stuff I saw was streams and server push. Streams allow multiplexing multiple logical streams of data onto one TCP connection. So unlike the graphs you typically see in the Chrome network inspector, where one resource request ends and another begins, frames (the unit of data) from multiple streams are sent in parallel. This means only one connection is needed between server and client (connections are persistent by default), and there are ways to prioritize streams and control flow, so it gives devs more opportunities for performance gains.
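
To make the framing concrete, here's a minimal Go sketch (my own illustration, not from the draft) of the fixed 9-byte header that every HTTP/2 frame starts with; the stream identifier is what lets the receiver demultiplex interleaved frames back into logical streams:

    package main

    import (
        "encoding/binary"
        "fmt"
    )

    // parseFrameHeader decodes the 9-byte HTTP/2 frame header: a 24-bit
    // payload length, an 8-bit type, 8 bits of flags, then one reserved
    // bit and a 31-bit stream identifier.
    func parseFrameHeader(b [9]byte) (length uint32, ftype, flags byte, streamID uint32) {
        length = uint32(b[0])<<16 | uint32(b[1])<<8 | uint32(b[2])
        ftype = b[3]
        flags = b[4]
        streamID = binary.BigEndian.Uint32(b[5:9]) & 0x7fffffff // drop the reserved bit
        return
    }

    func main() {
        // A hypothetical HEADERS frame (type 0x1, END_HEADERS flag set)
        // on stream 3, carrying a 16-byte payload.
        hdr := [9]byte{0x00, 0x00, 0x10, 0x01, 0x04, 0x00, 0x00, 0x00, 0x03}
        l, t, f, id := parseFrameHeader(hdr)
        fmt.Printf("len=%d type=%#x flags=%#x stream=%d\n", l, t, f, id)
    }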

Also, headers are only sent as deltas now. Client and server maintain header tables with previous values of headers (which persist for the connection), so only updates need to be sent after the first request. I think this will be a consistent 40-50 bytes saved per request for most connections where headers rarely change.
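
Here's a toy Go sketch of the delta idea (illustration only; real HPACK uses indexed static/dynamic tables and Huffman coding rather than string maps):

    package main

    import "fmt"

    // delta returns only the headers whose values changed since the last
    // request on this connection, updating the shared table as it goes.
    func delta(table, headers map[string]string) map[string]string {
        changed := map[string]string{}
        for k, v := range headers {
            if table[k] != v {
                changed[k] = v
                table[k] = v // remembered for the rest of the connection
            }
        }
        return changed
    }

    func main() {
        table := map[string]string{}
        req1 := map[string]string{"user-agent": "demo", ":path": "/a"}
        req2 := map[string]string{"user-agent": "demo", ":path": "/b"}
        fmt.Println(delta(table, req1)) // everything is new on request 1
        fmt.Println(delta(table, req2)) // only :path changed on request 2
    }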

[1] http://tools.ietf.org/html/draft-ietf-httpbis-http2-14

[2] http://chimera.labs.oreilly.com/books/1230000000545/ch12.htm...


I don't get this.

TCP has streams. TCP has connection mux. TCP has flow and congestion control. HTTP has keepalive. Why build another stack on OSI layer 7?

Also now we have to keep state to work out what the diffs are. State is evil.

Whilst I'm sure this will bring some minor performance improvements, I'm not sure they justify a new protocol stack.

Not sending 2 MB of JavaScript and crappy HTML down the connection to display the front page probably has higher gains.


You'll need to call a meeting of all the internet's firewall administrators who block TCP ports by default but allow 80 and 443 through. If you can get them to agree to stop breaking the internet then we can use TCP. Until then we will need to build a new internet on top of HTTP, inside encryption so they can't meddle with it.


I don't understand your point.

80 and 443 are "well known ports"[1], which is fine.

What does this have to do with ports? TCP is connection based so a client can create as many connections as it likes to a port on a host.

If someone does indeed build a new "internet" on top of HTTP, tunnelling different services through well-known ports with the intention of circumventing the firewall, then it will not be allowed through my firewall at all.

[1] https://www.ietf.org/rfc/rfc1700.txt


The problem is that opening new connections is horribly inefficient. The 2-3 round trips (TCP + SSL) required to set up a new connection and the ensuing slow start phase significantly delay the request, and thus the response. It is much more performant to use a single, well-utilized connection in congestion avoidance. The only way to avoid the ugliness of repeated flow control is to build on top of UDP (see QUIC), but there are practical issues with network connectivity and firewalls there.
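
A rough worked example with assumed numbers, just to show the scale:

    package main

    import "fmt"

    func main() {
        rtt := 100              // ms, an assumed round-trip time
        tcpHandshake := 1 * rtt // SYN / SYN-ACK before any data can flow
        tlsHandshake := 2 * rtt // a full, non-resumed TLS handshake
        fmt.Printf("setup before the first request byte: %d ms\n",
            tcpHandshake+tlsHandshake) // 300 ms, paid per connection
        // ...and slow start then limits throughput for the first several
        // round trips on top of that.
    }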

EDIT: why do you want to block HTTP/2 by the way? You know that HTTP/1.1 can be used to tunnel other protocols too, right?


That's not horribly inefficient. That's the cost of doing business with HTTP.

In fact you're going to have to go to the same effort to establish a TCP connection that your HTTP/2 is going to run over, then still have to do a key exchange. That channel then has the same advantages of a persistent HTTP/1.1 channel plus the ability to provide multiple streams.

The need for multiple streams can be met simply by making more than one connection to the server. Perhaps a mechanism to schedule that client-side would work. Oh wait, we already have one (connection limits and keep-alive).

Then again, all of this is moot: once you've loaded the static resources (images/CSS/JS etc.) via HTTP, you should only be seeing one request periodically when an operation takes place, at an interval if polling, or a connection kept alive for server push; so at most two connections from a client to a server.

If you need to do anything more than that, you're probably using the wrong technology both on the server and client.

HTTP/1.1 tunnelling I understand. In fact I use it most of the day (RDP over Terminal Services Gateway), which is RPC over HTTP.

The rationale I have is that effectively managing HTTP/2.0 at the firewall requires packet and protocol inspection, rather than merely understanding what connections have been made and where from and where to. That has a significant cost in complexity, tooling and effort. Plus there is a significant opportunity to mask illegitimate traffic as legitimate traffic. For those of us who deal with end-user network security, this is a major problem.


That's not what HTTP/2 does. If what you want is to tunnel past a firewall, establish an SSL connection and run any number of existing VPN protocols over it, and you can continue to run TCP/IP just fine.


You can run TCP inside a TCP connection, but your bandwidth throttling gets a bit strange: your inner TCP sees delays instead of packet losses, and that isn't how TCP is built to throttle.

As a practical matter, the percentage of customers who will put up a VPN to use your service is vanishingly small.


Trust me. The only thing that will come from this is even more broken firewalls, more complex software, a gazillion new categories of bugs, and more vulnerabilities than are currently contained in all of PHP's codebases combined.

And then we'll be back at square one, ready to make this mistake all over again.


All true, but TCP has head-of-line blocking, which means that even if resources are requested in parallel, they can only be returned in the order they were requested.

In an ideal world we could switch to using something like the SCTP networking protocol with HTTP, which would solve a lot of issues. Unfortunately we are stuck with TCP, so the application protocol (HTTP) now has to implement transport-level features itself so we can multiplex over a single connection.

At least people won't have to inline resources, sprite images, or concatenate CSS and JavaScript anymore. And header compression is a small upgrade to the spec.


A couple of follow-ups on this one:

SCTP is message oriented rather than stream oriented, so this isn't really useful. The chunk length field is also only two bytes, meaning that all your messages have to be less than 64k or you have to implement packet reassembly and stuff. Oh look, back at TCP again.

We must do nothing.

I suspect this entire SPDY/HTTP/2 reengineering effort is a 1000% complexity and risk increase for a 2-5% gain in performance. That is not a trade-off I, as an engineer, could accept.

90% of the inefficiency of web applications is down to the application stack, not the protocols. Fixing, for example, the practice of sending hundreds of KiB of uncompressed text down the wire rather than compressed abstract or native virtual-machine instructions would be a bigger win.


Boy, it was dumb of them to put stream in the name (Stream Control Transmission Protocol) if it wasn't capable of acting in a streaming manner.

Oh wait, SCTP can act in an ordered-with-congestion-control mode (aka stream-oriented), and the userland interface to it (the most basic form of which is just plain old Berkeley sockets) does in fact implement packet assembly (of course, no matter what, if you want packets bigger than the MTU something's gonna have to disassemble and reassemble them on some level of the stack anyways).

Not to say that SCTP is a practical solution given the glacial pace of acceptance of any new network protocol at its level, but let's not start spreading FUD about its capabilities.


Yes, network protocols like IPv6 have a glacial deployment speed, because all the network equipment has to support them.

But it isn't so for transport protocols like SCTP. Only the endpoints using it need to support it. So a transport protocol that provides a real benefit could be deployed relatively quickly.


I am not trying to extol any virtues or negatives of SCTP, just to comment that for HTTP/2 to have multiplexing over a single connection without the head-of-line blocking problem, it has to implement messages too. Seems wasteful.


    TCP != HTTP
TCP is a transport layer protocol (OSI Layer 4). HTTP is an application layer protocol (OSI Layer 7).

https://en.m.wikipedia.org/wiki/OSI_model


That's my point.

HTTP/2 is implementing TCP's responsibilities. Again. Badly.


The same criticism could be leveled with equivalent merit against TCP, because it lives as far from the Physical Layer (OSI Layer 1) as HTTP lives from the transport layer... e.g. "TCP is implementing Ethernet's responsibilities."

But of course, though we rarely worry about Token Ring these days, we do run TCP over IEEE 802.11 all the time.

Likewise, HTTP is run over other Transport Layer protocols, even if it is less common; e.g. UPnP uses HTTP over the UDP transport layer protocol. http://en.wikipedia.org/wiki/Universal_Plug_and_Play#Protoco...

OSI's higher levels are abstractions. As is the case with all useful abstractions, they serve to implement the functionality of lower levels without requiring attention to their actual implementation. Not having to manage TCP allows a lot of useful JavaScript to be easily written.


That's a disingenuous and self-contradictory description of how the OSI stack works.

There are guarantees that each layer of the stack makes upwards. All implementations within a layer must look the same to the layer above, even if one of those implementations provides capabilities that belong to higher layers. Nothing is said, however, about adding further guarantees in layers 8, 9, 10, 11, 12... and so forth, because they have already been made.

I suppose I shouldn't use parity bits on serial connections then?

"Not having to manage TCP allows a lot of useful JavaScript to be easily written"

That's absurd. It makes no difference.

As for UPnP, which I know well having written an entire UPnP stack, it's a broadcast messaging layer, not a connection based protocol. All the HTTP messages stay within the size of a UDP datagram and it is expected to be wholly unreliable. Even though it's ugly, it's hardly a comparison.

I get the feeling a lot of people here are web developers with little experience of protocol stacks and not system programmers!


> I suppose I shouldn't use parity bits on serial connections then?

If you're writing at the level of serial connections and parity, by all means pay attention to those details. If you're writing at higher level, consider abstracting away such details in an interface, library or module.

I miss Wildcat BBS as much as the next person: by which I mean, not very much. HN is full of really fucking smart people, not the idiots implied by your comment.


Huh? I think you missed something.

This has nothing to do with BBSs or code abstractions. On the former, there is no OSI stack; it's terminals down serial connections. On the latter, it's datagrams or sockets. It's about the guarantees that the link layer makes or doesn't. Parity doesn't pass up the layers because those guarantees are made further up (TCP).

You can still run token ring, serial, thick ether, thin ether, paper aeroplanes thrown between buildings. It doesn't matter above the data link layer.

Yes, there are really fucking smart people here, as you put it, but it appears there is a normal distribution of people as well.


If you call opening a new secure stream in one RTT rather than N RTTs "badly".


Where's the research that says that the connection overhead is destroying humanity?

Back when I had a 14.4k SLIP dialup and an RTT of 200 ms+, connection overhead and TCP channel overhead were a major drag on throughput, but it's not like that now. I'd be surprised if there was a tangible difference to the end user.


> Where's the research that says that the connection overhead is destroying humanity?

It's destroying big businesses who push lots and lots of resources to the browser:

- From an admin POV, you have to shard your domain => more work, more maintenance.

- From a browser POV, you have to open multiple TCP connections => you take slow start and TLS handshake in your face for each connection + the connections have to fight each other because the OS wants to be fair among TCP connections

- From a web admin POV, you want to inline your content to reduce round trips => you have more work to do on your resources

SPDY is certainly not necessary for everyone (it mostly benefits those who push lots of different resources), that's true. We're talking about businesses who lose a month's worth of revenue if the latency to their site explodes from 50 ms to 500 ms.

But it still is interesting because the actual usage _on top of HTTP_ doesn't change: you still have your websockets or your Server-Sent events, you still have your keepalive, you can do a simple-stupid "one HTTP call per resource" and it will be handled efficiently, sometimes SPDY will work underneath to push content so that the next HTTP call will actually hit the cache without you knowing about it... all at the cost of changing (or updating) your library. Because you certainly don't write HTTP text directly to your TCP socket.

The interesting point will be for those library developers. The added complexity will certainly make things harder, but on the other hand the binary format and strict rules will make the messages easier to parse... I'd like to see where it goes.


But not quite. HTTP/2 is like a thread and TCP is like a process. The priority of a process can be raised or lowered, affecting all threads in it.


The issue is that HTTP/2 is basically working around deficiencies in TCP, and doing it badly, because it appears to be easier to get buy-in for that than for fixing TCP or deploying alternatives.


> So unlike the graphs you typically see in chrome network inspector where one resource request ends and another begins, frames (the unit of data) from multiple streams are sent in parallel.

Connections are already processed in parallel whenever they can be. That is, when the browser knows what to request, and it fits in the execution model. If there's a huge number of assets on a single hostname, this has been a limiting factor, because browsers cap the number of concurrent requests to a single hostname to avoid overloading the server. But that will remain an issue even if the requests are multiplexed over a single connection.

Most of the time when I see graphs in the network inspector that aren't massively parallel, it's because nobody has spent time optimizing where/how assets are requested, in ways that will make them just as bad with connection multiplexing.

There certainly can be benefits to reap from it, but the worst offenders are already ignoring best practices.


Multiple parallel connections increase the likelihood of packet loss due to network congestion, and also impose a larger load on servers and intermediate proxies.

A TCP handshake has to take place for each connection, and this isn't cost-free; there's also the SSL negotiation on top (though techniques like OCSP stapling help).

Going massively parallel isn't free. Will Chan of Chrome did a good write-up here: https://insouciant.org/tech/network-congestion-and-web-brows...


True, but my point is that unlike what zaptheimpaler seemed to imply, if you don't see massively parallel downloads in the network tab already, you are not going to see massively parallel downloads just because you get multiplexed streams, and the sites that do see parallel downloads are already for the most part going to do well.

So the real world impact for users is likely to be small:

Lowering the cost of multiple streams will likely give you decent percentage-wise improvements on page download times that are already so good that the absolute improvements are small, and likely minimal to no improvement on the pages that are actually slow.


The real world impact is incredible, not small. Opening new connections is horribly, horribly bad for internet performance. http://www.chromium.org/spdy/spdy-whitepaper


Akamai report a 5-20% improvement, with some sites seeing none.

With the advent of the browser pre-loader, as long as the resources are declared in the markup, the browser should discover them and issue the requests.

Currently, browsers often seem to block waiting for a connection to come free.


HTTP/2 is certainly not a clean separation of concerns like HTTP/1.x was, but it's something of a pragmatic approach to protocol design.

HTTP/1.x was neatly layered on TCP with an easy-to-parse text format. This in turn ran neatly on IP4/6, which ran on top of Ethernet and other myriad things. This separation of concerns gave us the benefit of being very easy to understand and implement, while also allowing people to subvert the system, adding things like half-baked transparent proxies to networks that would munge streams and couldn't agree where HTTP headers started. We ended up having to design WebSockets to XOR packets just to fix other people's broken deployments.

HTTP/1.x also became so pervasive that it was the overwhelmingly most popular protocol on top of TCP, even to the point where a system administrator could block everything but ports 80 and 443 and probably not hear anything back from their userbase. This is the reason we ended up with earlier monstrosities like SOAP and XML-RPC: by that point HTTP had become so prevalent a transport that it was assumed, incorrectly in many cases, that it was the only transport.

Perhaps the IETF should be pushing a parallel version of HTTP that pushes many of these concerns into SCTP. The problem here is that it'll take forever to get that rolled out and we need something to improve things now. Look at how long it's taking to roll out IPv6: something we actually need to fix now.


>We ended up having to design WebSockets to XOR packets just to fix other people's broken deployments.

I was unaware of this and became intrigued. If anyone else is curious, this is the explanation from the RFC: http://tools.ietf.org/html/rfc6455#section-10.3

Basically it's to prevent an attacker from cache poisoning an HTTP proxy (like one on a corporate network) that doesn't properly support WebSockets. WebSockets look a lot like HTTP over the wire, so without masking the wire data in some way a proxy could be tricked into believing a faked "HTTP"-looking request and response are real, and thus cache whatever an attacker supplies.
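
The masking transform itself is tiny. A Go sketch of the RFC 6455 rule, where each payload byte is XORed with byte i mod 4 of a random per-frame key:

    package main

    import "fmt"

    // mask applies the RFC 6455 client-to-server transform. Because the
    // key is chosen randomly per frame, an attacker can't control the
    // bytes that appear on the wire, so a broken proxy can't be fed a
    // fake "HTTP request" through a WebSocket. Unmasking is the same XOR.
    func mask(payload []byte, key [4]byte) []byte {
        out := make([]byte, len(payload))
        for i, b := range payload {
            out[i] = b ^ key[i%4]
        }
        return out
    }

    func main() {
        key := [4]byte{0x37, 0xfa, 0x21, 0x3d} // the example key from the RFC
        fmt.Printf("%x\n", mask([]byte("Hello"), key))
    }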

This would technically be a bug in the proxies, but it's nice to see IETF accounted for this and put in countermeasures before it inevitably became a DEFCON talk.


> and we need something to improve things now

I disagree. Nothing is needed now; HTTP/1 is not broken, and it works well enough.

There should be time enough to come up with a clean design. Even if it requires designing a new transport protocol.

Rolling out a new transport protocol like SCTP takes a lot less time than rolling out a new network protocol like IPv6: transport protocols only run on the endpoints, not on the routers in the network.

Except for firewalls and NAT'ing home routers; but if HTTP/1 over SCTP resulted in a faster, better browsing experience, that problem would solve itself.


> Why is Internet Explorer leading with HTTP/2 implementation?

Leading? Firefox and Chrome already support HTTP/2 (and SPDY, the basis for HTTP/2, for a long time now), just not enabled by default.


You are right, and given that it's a tech preview, this is Microsoft catching up.

Their real problem of course is IIS. We'll probably have to wait for IIS 9, which I cannot see happening for another two years; IIS 8.5 appeared 12 months ago in Windows Server 2012 R2.


Also, Chrome has experimental support for HTTP/2 in Canary[1], as does Firefox since version 34 (if I'm reading [2] correctly).

It seems unusual for Microsoft to disable SPDY support entirely, at least until support for HTTP/2 is more widely deployed...

[1]: http://www.chromium.org/spdy/http2

[2]: https://wiki.mozilla.org/Networking/http2


HTTP/2 is based on SPDY: http://en.wikipedia.org/wiki/HTTP/2#Genesis_in_and_later_dif...

So if they leave SPDY in place along with HTTP 2.0, they could wind up with strange incompatibilities occurring or site operators feeling like they need to support both SPDY and the HTTP 2.0 standard (rather than just the HTTP 2.0 standard).

Looking at it, it actually seems more progressive to dump SPDY and move to the SPDY-based HTTP 2.0 at this stage. Then ten years down the road hopefully SPDY will be dead and there will just be HTTP/1.1 and HTTP/2.0.


You can probably get a comparable, if not greater, improvement in performance by using ad and tracker blocking. Most of those extra TCP streams opened when displaying a web page are for ads and trackers. Those are the ones opening a TCP connection to send their one-pixel GIF.


A modern web browser will not block page loading when making a request for an image, though. I don't think blocking ads will necessarily improve perceptible page load time, though obviously it will reduce network traffic.

This does not apply to ad code that's implemented as <script src="..."></script>, which will indeed block page loading.


Try it, the change can be dramatic. Ads have a lot of JS these days.


>Ads have a lot of JS these days.

A lot of the ad JS I see is in the form of inline script tags, which generally should not block anything (the JS usually asynchronously constructs another script tag, which shouldn't impact performance).

>Try it, the change can be dramatic.

I've been using AdBlock Plus, and now uBlock, for at least 8 years. So I'm definitely not arguing against it.

It's just that in theory, an ad tracker (like a 1 pixel image) does not necessarily have to impact performance. Also note that some ad blockers add performance overhead themselves.


>A lot of the ad JS I see is in the form of inline script tags, which generally should not block anything (the JS usually asynchronously constructs another script tag, which shouldn't impact performance).

Well, it does impact performance, even if it's async. The pipe is only so wide (especially on mobile).



Will this affect the way we do AJAX requests? Or the speed of them? Or has this no impact on websites talking back to the server? My knowledge of networking at the HTTP level is limited and I am trying to find some context.


The JavaScript programmer sees no change, but things work faster. Multiplexing allows many requests in parallel to the same server over a single socket, with the requests completing in the order they are ready, not the order they were requested. This should reduce latencies, but it might lower your effective bandwidth if you only got that bandwidth because your browser opened many separate connections to the server.
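
As a rough sketch of what that looks like outside the browser (Go's net/http negotiates HTTP/2 over TLS automatically since Go 1.6; the URLs are hypothetical), all of these parallel requests share a single connection when the server speaks HTTP/2:

    package main

    import (
        "fmt"
        "net/http"
        "sync"
    )

    func main() {
        client := &http.Client{} // HTTP/2 is negotiated via ALPN if available
        urls := []string{
            "https://example.com/app.js",
            "https://example.com/app.css",
            "https://example.com/logo.png",
        }
        var wg sync.WaitGroup
        for _, u := range urls {
            wg.Add(1)
            go func(u string) {
                defer wg.Done()
                resp, err := client.Get(u)
                if err != nil {
                    fmt.Println(u, err)
                    return
                }
                resp.Body.Close()
                fmt.Println(u, resp.Proto) // "HTTP/2.0" when negotiated
            }(u)
        }
        wg.Wait()
    }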


From the source:

"What does this mean for developers?

HTTP/2 was designed from the beginning to be backwards-compatible with HTTP/1.1. That means developers of HTTP libraries don't have to change APIs, and the developers who use those libraries won't have to change their application code. This is a huge advantage: developers can move forward without spending months bringing previous efforts up to date."


Why is it limited to operating system version? Shouldn't it be a browser feature?


(Disclaimer: I don't work for MS, so this may not be entirely accurate anymore.)

It's probably because IE is really just a UI wrapper around system libraries[0]. The changes for HTTP/2 would be made not in iexplore.exe, but instead in WinInet.dll (and possibly URLMon.dll).

This is because IE isn't the only application that will use these new features.

EDIT: I should add that you don't just go changing system libraries in a Patch Tuesday; you'd wait and ship them in a new version, hence the Windows 10 preview.

[0] http://msdn.microsoft.com/en-us/library/aa741312.aspx


I'd guess because it's a low level optimization, but I'm not sure.


Because they want to entice people to use the Windows 10 preview.


Future DDoS blackmailers are happy: new leverage for amplification :)

I want that so bad. Coding is hard, DDoSing is so easy.

Thank you, architects, for making black hats' lives so easy. HTTPS by default? YEESS, even more leverage.

I love progress.

Next great idea: implementing ICMP, UDP and routing on top of an OSI layer 7 protocol, because everybody knows religion forbids opening the firewall for protocols that do the job, or we could even create new protocols that are not HTTP. But HTTP for sure is the only true protocol, since devs don't know how to write 3 lines of networking code and sysadmins don't know how to do their jobs.

And HTTP is still stateless \o/ wonderful, we still have this wonderful hack living: cookies, OAuth and all this shitty stuff. Centralized certificate authorities are now totally discredited, but let's advocate broken stuff even more.

Why not implement a database agnostic layer on top?

When are we gonna stop this cowardly headless rush of stacking poor solutions and begin solving the root problems?

We are stacking the old problems of GUIs (async + maintainability + costs) on top of the new problem of doing it all over HTTP.

I have a good solution that now seems viable: let's all code in vanilla Tk/Tcl. It has a GUI, it can do HTTP and all, it works in every environment, and it is easy to deploy.

Seriously, Tk/Tcl now seems sexy.


It is interesting to see Microsoft adopting standards as one of the earliest players in the field.


Could somebody elaborate how server push relates to web sockets (if at all)? Are they completely independent and will both be supported or does one build on the other?

Given that the web is becoming more and more real-time this seems pretty interesting.


Server push just means that when a server sees a request for index.html, it can serve index.html and also index.js and index.css without those being requested. When your browser parses the HTML and discovers it needs the JS and CSS, they are already in cache and fresh enough to use, which saves the round-trip latency and might let a mobile radio go to sleep earlier.
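
For example, with Go's net/http (Go 1.8+) a handler can push assets alongside the HTML; a sketch with hypothetical paths:

    package main

    import "net/http"

    // indexHandler pushes the CSS and JS that index.html will reference,
    // so they're already in flight by the time the browser's parser
    // discovers it needs them.
    func indexHandler(w http.ResponseWriter, r *http.Request) {
        if pusher, ok := w.(http.Pusher); ok { // only available over HTTP/2
            pusher.Push("/index.css", nil) // errors ignored in this sketch
            pusher.Push("/index.js", nil)
        }
        http.ServeFile(w, r, "index.html")
    }

    func main() {
        http.HandleFunc("/", indexHandler)
        // HTTP/2 (and thus push) requires TLS; cert paths are placeholders.
        http.ListenAndServeTLS(":8443", "cert.pem", "key.pem", nil)
    }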


What if the browser already has those in its client cache? Will it have to abort the pushes, and will it even be able to do so in time on a high-bandwidth, high-latency network like 3G/4G?

Is there a risk that cellular data usage will increase from this?


Yes there is. Knowing when a client already has a resource cached is an important part of server push. There is indeed a risk of over-pushing.


Is there an HTTP/2 test page out there that shows if you are connecting with it?

Found this project, but nothing live:

https://github.com/molnarg/http2-testpage


So this terrible NIHy Rube Goldberg contraption does actually get to see the light of day.

I'm saddened. The days of good internet protocols are clearly behind us.


What in particular is bad about HTTP/2? Complexity can arise in protocols for multiple reasons. In this case, security and correctness are more culpable than anything else. That is, if you want to design something similarly performant, you are going to run into a lot of the same issues (flow control and priorities for fairness, issues with gzip compression, and so on).


> What in particular is bad about HTTP/2?

At the risk of sounding too blunt: Everything? All of it? Its mere existence?

It fucks up responsibilities by addressing network-layering issues at the application layer. It takes a simple and stateless text-mode protocol and converts it into a binary and stateful mess.

It has weird micro-optimizations designed to ensure that Google's front page, and any Google request with its army of 20,000 privacy-invading tracking cookies, fits within one TCP packet at American ISPs' MTU size, so that people are not inconvenienced while their privacy is being eaten away. Which I'm sure is useful to Google, but pretty much nobody else.

The list goes on.

It does a lot of things which are neither needed nor asked for by the majority of the internet, and yet the rest of the internet is asked to pay the cost through a mind-boggling increase in complexity, which I'm sure will be the source of a million future vulnerabilities.

I'm not aware of a single thing in there that I want, and if I'm wrong and find one, I'm unwilling to accept that this is the cost I have to pay for that feature.

Any web-browser I will use in the future will be one where HTTP/2 can be disabled.


> It takes a simple & stateless text-mode protocol and converts it into a binary & state-full mess.

That seems better than the current situation, which often ends up doing the exact same thing, but in an ad-hoc way that gets reimplemented every time.


> It has weird micro-optimizations decided to ensure that Google's front-page and any Google-request with its army of 20000 privacy-invading tracking cookies should fit within one TCP-packet using American ISPs MTU packet-size, to ensure people are not inconvenienced when their privacy is being eaten away at. Which I'm sure is useful to Google, but pretty much nobody else.

Could you elaborate on this, please?


I assume josteink is decrying the fact that header compression is used, which is a pretty ridiculous complaint.


Unfortunately yes.

However, there were many other bad protocols that died through lack of use. You can still vote with your feet. A vendor will not maintain a protocol stack if people don't use it.


Hey, don't look so down... Poettering might be shipping an HTTP/2.0 library soon!

And yeah, I'm with you--I think that a lot of this tail-wags-dog stuff is going to come back and haunt us, but we as an industry fucking suck at being conservative when it makes sense.


It's way more complicated. But I guess it has to happen.


A lot of work for just half of HTTP.


No, it really doesn't have to happen. Some people are just pushing it through because ~reasons~.


Reasons being increased performance and security.


Security? You mean, like the trusted proxy stuff? Or the increased performance, which doesn't benefit you so much if you include lots of random shit from other domains (like most people do anyways)? C'mon.


Do not want



