Stop worrying about Time To First Byte (TTFB) (cloudflare.com)
60 points by jgrahamc on July 5, 2012 | hide | past | favorite | 37 comments



I'd guess this minutia was written in response to some unreferenced critique of their service; in reality, nobody with sense would use TTFB as a meaningful measure of application performance.

In fact, measuring performance anywhere near the HTTP layer is often pointless when such metrics are so disconnected from what a user actually experiences. Perhaps a better test would be something like "perceptible latency", taking into account factors like the browser's ability to progressively render partially transmitted HTML, or large hangs caused by external objects (e.g. web fonts, GL models, etc.), especially in a world with increasingly complex front-end JavaScript blocking the browser's UI thread (syntax highlighters on blogs, anyone?).

As Apple demonstrated on the iPhone, it's more than possible to hide multi-second delays using tricks as simple as showing a facsimile image of what will shortly be loaded. Similarly, a web app that animates a form submit with an unobtrusive 400ms animation (hiding a 500ms response time) may appear snappier from a user's perspective than an app with no animation but a 300ms response time.

Actually, CloudFlare's own signup process is exemplary in this respect: they bury half-minute-long DNS queries in the background while the user continues filling out forms.


On the subject of perceptible latency, supposedly this is why Google's Chrome iPad app feels fast yet is actually slower than Mobile Safari at JavaScript. If you do the UI right, you can hide latency.

And Microsoft spoke a lot with Metro about using animations to hide latency. By animating everything, you can hide most of the delay, and only show loading screens when it's a long wait.


Do you have a link to the MS talk/article about their work with Metro? I'd be very interested.


Oh, er, I don't have a link, but I believe it was from a talk they did at //build about Metro. It shouldn't be too hard to find.


supposedly this is why Google's Chrome iPad app feels fast yet is actually slower than Mobile Safari at JavaScript

The JavaScript performance fetish has shockingly little relevance to actual runtime performance, beyond being a loose correlation with "focus on performance". Aside from most JavaScript still being relatively trivial, benchmarks focus on the same code running thousands of times, whereas in reality a given piece of code is often executed once or a handful of times, annihilating the JIT benefits.

Safari on Windows is a horribly, horribly slow, sloggy browser, yet it has amazing JavaScript numbers.


Yep. Safari also performs great in some ridiculous benchmarks (like a personal CSS 3D transforms abuse benchmark), but it's not a very nice browser in real use.


>>increasingly complex front-end Javascript blocking the browser's UI thread

I agree with you; before optimizing performance somewhere near the HTTP layer, there is often much that can be done at the front-end or back-end level. The loading of web fonts can actually block a browser until the font is loaded (and often makes the page much harder to read). It is not just that the user has to wait for the loading to finish: the browser sometimes freezes completely (this happens occasionally with Opera, for example), which is a bad user experience.

"Perceptible latency" requires that we have some people who will load the page and measure if is was fast or slow or sluggish. In my experience this is often said to be too expensive or too much work. Actually, it requires just a few people with different browsers and internet connections to get a good idea if the website is fast or slow - BUT: this is nothing that is measured precisely.

The TTFB reported by the tests mentioned in the article, on the other hand, looks like hard data, measured exactly in fractions of a second. It can be plotted nicely in a graph, for example.

Another way to see whether a website is fast or slow is to ask the visitors. On one large website I created, there was a survey with some open questions about the new site. One open question was: what do you like about the website? More than 90 percent of the visitors who filled out the survey answered that they had noticed how fast the website loads.


TTFB is terrible data for anyone trying to optimize a website's page load. There are a dozen other easily quantifiable metrics one can gather:

- page render time, server side
- time until the user sees something meaningful (you ought to be able to measure this with a headless browser if you have some idea what the first thing you want the user to see is); this will get you what people might think they're getting out of the TTFB metric
- full page load latency
- data transfer latency

TTFB might be useful if you have some data that suggests your web server is a significant bottleneck, but I wouldn't gather it as a matter of course in trying to optimize page load times.


I'm shocked at how Cloudflare is downplaying the importance of TTFB just because their service happens to INCREASE it.

What I would have liked to see was them showing commitment to how they will be IMPROVING the TTFB.

I love Cloudflare but seriously, saying TTFB is not important is just nonsense. Whoever says TTFB doesn't mean anything either doesn't know what they're talking about or is just buying into the hype of Cloudflare/CDN services.

Take a look at a "typical" magento store for example: http://www.hybrid-racing.com/store

It has a 1.5-second TTFB. This site is slow, and no matter what CDN or WPO (front-end optimization) you use, it will always take 1.5 seconds page to page.

The only way to speed up this site is switching to faster hosting, something like Linode's SAS 15K5 RAID10 hosting, and/or adding some caching (e.g. full-page caching via Varnish).


Completely ignoring TTFB would be a BAD idea. There is no single metric that conveys the user experience (or performance).

Certainly optimizing for 1ms because of the overhead for gzip compression isn't where you should be spending your time but I have dozens of cases in the WebPagetest forums where users have had 5+ second TTFB times and needed help figuring out what was going on. I have even seen cases where it was over 20 seconds (search for TTFB in the forums and you'll be shocked). It is usually a combination of shared hosting and excessive database queries by their CMS but it is common enough that completely ignoring it would be a very BAD idea.

Companies like Cloudflare can help the real TTFB (without cheating) by optimizing the connection between the end user and the origin site. Most CDNs call it DSA; they maintain a persistent connection back to the origin and eliminate some of the round-trip times. Cloudflare's recently-launched Railgun feature should have a similar benefit.

It's actually a pretty well-known best-practice for web performance to "flush the document early" if you have any expensive back-end processing to do. This isn't cheating and involves sending as much of the HTML content as possible to give the browser a head-start downloading external resources (css, js, etc).

Doing it with just the HTTP headers in order to cheat the metric itself is not in anybody's interest.
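To make the "flush early" idea concrete, here is a minimal sketch as a Python WSGI app (the handler and resource names are hypothetical): the server emits the <head>, with its external resource references, before doing the slow back-end work.

```python
import time

# Illustrative WSGI app: yield the <head> immediately so the browser
# can start fetching external resources while the "expensive" back-end
# work is still running. All names here are made up for the example.
def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    # Flush the head first: the browser gets a head-start on /style.css.
    yield b"<html><head><link rel='stylesheet' href='/style.css'></head><body>"
    time.sleep(0.1)  # stand-in for slow rendering or database queries
    yield b"<p>expensive content</p></body></html>"
```

Per the WSGI spec (PEP 3333), servers must not delay transmission of yielded chunks, so each yield reaches the client as the back-end work proceeds.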


    > There is no single metric that conveys
    > the user experience (or performance)
True. But if you need to pick one, TTFB isn't as good as something like Speed Index [1]. Getting your TTFB down will usually help you get your content in front of the user faster by, as you say, giving it a head-start on external resources, but that will also show up in the Speed Index.

[1] https://sites.google.com/a/webpagetest.org/docs/using-webpag...


I am glad that TTFB worries have been debunked by cloudflare. However, more issues still linger:

1) Response times, in general, are terrible with Cloudflare enabled. This chart [http://goo.gl/JX1v6] shows how Googlebot responds when cloudflare is enabled and disabled. With Cloudflare enabled in June, response times are above 1000ms. I have since disabled it in the middle of June and response times returned to normal.

2) Cloudflare is getting worse. Look at the chart again. Notice the elevated response times in April? I also had cloudflare enabled then. They hit around 400 - 500 ms in response time. Then I disabled it again at the first of May.

3) Ajax requests, most notably POSTs, have noticeably increased latency. Some days are better than others.

I do not have anything "special" enabled. Minimize HTML, JS, CSS is not enabled; nor is Rocket Loader.

I'd also like to add that when "cache everything" + Minimize HTML, JS, CSS + Rocketloader is enabled, then response times seem to be perfectly normal on a different site hosted on the same box. Googlebot: [http://goo.gl/DRDnF]. However, most of us cannot use this feature unless our content is static.

Everything posted here is based off a free account. I will also be cross-posting this on the cloudflare blog post that this links to.


It might be better to put in a support request on this rather than posting on the blog. I've highlighted your comment on our internal chat.


There's an important optimization around real-world TTFB that Nginx is missing, one that would benefit CloudFlare users.

Even though it is streaming compression, zlib normally packs compressed content into larger than 1500-byte blocks. Browsers can start parsing gzipped content as soon as they can decompress a block, but they have to wait for a whole block to arrive, which means for gzipped content they will often have to wait for more than the first packet to arrive. (Due to TCP slow-start, this may be another round-trip.) This means they are also waiting longer to start fetching <head> contents.

There's a simple fix for this, which is to call deflate(..., Z_SYNC_FLUSH) on a chunk that is likely to compress to under 1500 bytes, like maybe all of <head> or the first 4k. The total compressed size will be slightly larger, but the tradeoff is usually worth it. Nginx doesn't currently do this, but it would be a nice optimization to make available.
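The effect can be demonstrated with Python's zlib bindings (a sketch of the technique, not nginx code): everything compressed before a Z_SYNC_FLUSH is decodable on its own, without waiting for the rest of the stream.

```python
import zlib

head = b"<head><script src='/app.js'></script></head>"
body = b"<body>" + b"x" * 10000 + b"</body>"

# Compress the head and force a sync flush: zlib emits a complete,
# byte-aligned block that the receiver can decompress immediately.
comp = zlib.compressobj(wbits=31)  # wbits=31 selects the gzip container
flushed = comp.compress(head) + comp.flush(zlib.Z_SYNC_FLUSH)
rest = comp.compress(body) + comp.flush(zlib.Z_FINISH)

# A decompressor fed only the flushed prefix recovers the whole head,
# so a browser could start fetching /app.js without waiting for more data.
decomp = zlib.decompressobj(wbits=31)
assert decomp.decompress(flushed) == head
```

Without the sync flush, the bytes of the head could sit in zlib's pending buffer until enough further input arrives to fill a block.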

It's possible to work around this in Nginx by using the embedded perl module and SSI. After </head>, I use SSI to call perl's $r->flush, which Nginx correctly translates into Z_SYNC_FLUSH. The improvement is small, but measurable.


Thanks for that suggestion. I will look into altering nginx to do that.


"At CloudFlare TTFB is not a significant metric." People might think that this article is written to educate masses. The thing is - if you take webpagetest.org and run speed tests with and without Cloudflare, in MANY cases 'Load time' will actually be slower WITH CloudFlare. And the biggest difference will be in TTFB.

Why does it happen? Cloudflare works as a proxy, taking HTML from your server and returning it to a user in optimized form. Because all requests go through their servers - there's one additional hop. With this hop TTFB increases significantly.

Although I agree that TTFB is actually not that important from end user's perspective, the reason why Cloudflare wrote this article is strictly marketing.


The hop is not the problem. In fact, an extra hop can actually be a benefit, due to various TCP and connectivity factors. The issue might actually be "CloudFlare buffers all the content and does complex operations on it before returning any of it", or maybe is simply a poor CDN (not really being very distributed, for example).

For more information, see a long comment I wrote on this a while back (note: while reading, remember that a CDN typically has a server very near the end user but not so near as to be on the other end of the last mile of their network connectivity; in essence, they are a reverse proxy solution positioned in the network where a forward proxy would normally go), as well as the response someone left:

http://news.ycombinator.com/item?id=2823268


The small table comparing a large Wikipedia page is really the main point, a relative comparison of TTFB.

The mistake is making a relative comparison on such super small numbers. If your TTFB is less than 2ms, then you are fine. You are better than fine.

If your TTFB is 1.5 seconds, then you are struggling.

In determining whether you have a TTFB issue, comparing the relative improvement of super small numbers isn't likely to help you much.


This is just silly. Time to first byte does matter, when you know what to put in those first bytes.

https://plus.google.com/114552443805676710515/posts/GTWYbYWP...


I've replied on G+ but I think it's worth pointing out that I don't disagree that time to the first useful byte matters. The issue is that what's being reported isn't that and the time of header generation is not dependent on the time to the first useful byte (as the gzip example indicates).


Your post explains time to first full packet. Yes, that matters; the first byte alone does not.


But if you have a dynamic webpage, TTFB measures how long it took you to process the page and start outputting results. The rigged server in this test doesn't seem relevant to that case, or am I missing something?


The article examined what some of the page load speed tests measure as TTFB: the time until the arrival of the first character of the HTTP headers.

When using a dynamically generated page it is possible to send the HTTP headers, then maybe send some content, then do some server-side calculations or database access, and finally send the rest of the page content. The TTFB measured by the tests mentioned in the article will not reflect the time needed for the server-side calculations.

When using a server-side scripting language there may be some kind of output buffering, which has to be deactivated first.
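The mismatch is easy to reproduce with a toy server (all names hypothetical): it sends its HTTP headers at once, simulates slow server-side work, and only then sends the body. A header-based TTFB measurement sees almost no delay, while the content arrives half a second later.

```python
import socket
import threading
import time

# Illustrative server: headers go out immediately, the body only after
# a simulated half-second of server-side work.
def slow_server(listener, delay):
    conn, _ = listener.accept()
    conn.recv(1024)  # read (and ignore) the request
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\n")
    time.sleep(delay)  # stand-in for database queries, rendering, ...
    conn.sendall(b"hello")
    conn.close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]
threading.Thread(target=slow_server, args=(listener, 0.5), daemon=True).start()

client = socket.create_connection(("127.0.0.1", port))
start = time.monotonic()
client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
headers = client.recv(1024)        # arrives almost immediately
ttfb = time.monotonic() - start
body = client.recv(1024)           # arrives only after the "work"
time_to_content = time.monotonic() - start
client.close()

# The header-based TTFB looks great; the content lags well behind it.
assert time_to_content - ttfb > 0.4
```

A test that stops its clock at the first header byte reports the small `ttfb` value and never sees the much larger `time_to_content`.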


Not many web frameworks are configured to start writing the response before the response is actually ready.

This is because when sending the response you need to know the HTTP status of that response. Is it a redirect? And if you're querying the database lazily, after your page started to render, what about DB errors that could happen? Then you'd need to send an HTTP 500.

This is the drawback of using this feature in Rails 3. You have to ensure that there's no way something unpredictable happens. And you have to activate the feature explicitly in your controllers. And after you did that, you're probably going to be aware that TTFB is not that relevant.


That's exactly their point. They claim most measurement tools will give false results if your web server outputs the headers before processing the actual request data.


In other words, web servers can cheat and look good, so we should just ignore TTFB. If you're not a web server developer, TTFB is quite useful. It tells you how quickly request processing happened. Progressive images can get started quickly.

The problem is not with TTFB -- it's that tests that succeed even with "cheating" header responses need to be fixed, if possible, to fail the cheating server.

WebPageTest could modify their TTFB test to require a complete header, so the single-character cheating wouldn't work.


But sending the header first isn't cheating. What you need to do is wait for the first byte of actual page content to show up. That will show you how long initial request processing took.


Didn't TTFB become popular because, back then, the start-up/handover time of the request from the server to the URL handler (CGI, FastCGI, etc.) was significant depending on the handler's language? It's far less significant now. Cutting TCP packets with compression is much more useful.


One thing that's interesting is that the Navigation Timing API available in modern browsers varies from being wildly incorrect to accurate, depending on the browser. In a quick test on my Mac, Firefox 13 showed an incorrect result (basically, the same issue as WPT and Gomez - almost immediate TTFB), while Chrome 20 correctly detected the 10s TTFB.


One more data point: On Windows, both IE 9 and Chrome 20 measure TTFB correctly via NavTiming, while Firefox does not.


To me it looks like Cloudflare are going to use their position in the request/response cycle to measure server response times and inject javascript to enable real user monitoring. It should let them build performance metrics similar to New Relic's.


I don't understand the point.

Surely TTFB is used to measure server latency (basically a ping). Why would TTFB be used as a proxy for page load speed when you could just use...page load speed?


I believe they're referring to using TTFB as a performance test, to measure how long the server took to process your request.


This is a metric that is commonly used in the popular webpagetest.org performance tests.


Actually, HTTP headers are not that easy to send. How do you plan to create the "Content-Length" header before rendering the whole response?


This is why nginx gzips the body before writing the headers out.

You can use 'chunked_transfer_encoding' to enable chunking, combined with 'proxy_buffering off', your backend can stream the body and nginx will gzip the chunks. Disabling buffering has other consequences, so be sure to read up and experiment before you go to production.
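As a rough sketch (directive names as documented by nginx; the upstream name and location are placeholders), the relevant configuration might look like:

```nginx
location /app/ {
    proxy_pass http://backend;
    # Stream the upstream response instead of buffering it fully;
    # nginx then gzips and emits each chunk as it arrives.
    proxy_buffering off;
    chunked_transfer_encoding on;  # already the default for HTTP/1.1
    gzip on;
}
```

Note that with buffering off, a slow client ties up an upstream connection for the duration of the response, which is one of the consequences worth testing before production.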


Transfer-Encoding: chunked



