Enabling Secure HTTP for BBC Online (bbc.co.uk)
68 points by edward on July 14, 2016 | hide | past | favorite | 62 comments



> There are always practical limitations to site-wide technical changes, and HTTPS Everywhere is no different. Sites and content we consider ‘archival’ that involve no signing in or personalisation, such as the News Online archive on news.bbc.co.uk, will remain HTTP-only. This is due to the cost we’d incur processing tens of millions of old files to rewrite internal links to HTTPS when balanced against the benefit.

Not to be snarky, but haven't people written tools to help with this? This seems like a common issue. I mean, there's `sed` and similar tools, obviously, but something that could go, validate that the link works over https://, and update it. I don't see why that would need to be some monumental amount of work.
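A minimal sketch of the tool this comment imagines (helper names are mine, and the replies below explain why real sites need much more than this): check whether each http:// link answers over HTTPS, and rewrite only the ones that do.

```python
import re
import urllib.request

def https_equivalent_ok(url, timeout=5):
    """Return True if the same resource appears reachable over HTTPS."""
    target = "https://" + url[len("http://"):]
    try:
        req = urllib.request.Request(target, method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    except Exception:
        return False

def upgrade_links(html, check=https_equivalent_ok):
    """Rewrite http:// links to https://, but only when the check passes."""
    def repl(match):
        url = match.group(0)
        return "https://" + url[len("http://"):] if check(url) else url
    return re.sub(r"http://[^\s\"'<>]+", repl, html)
```

Even this toy version has to touch the network once per link, which over tens of millions of archived files is exactly the processing cost the BBC is weighing.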

HTTPS is more than just privacy. See https://certsimple.com/blog/ssl-why-do-i-need-it and https://www.troyhunt.com/ssl-is-not-about-encryption/


> haven't people written tools to help with this?
Let's say you have a web page with a javascript slippy map that imports openlayers from a CDN; and openlayers then retrieves map tiles from openstreetmap.

If you serve that page over https but the javascript CDN url is http, the javascript library won't load. And if the js CDN supports https and you switch to it, the library might still compose a http URL to retrieve the map tiles - causing some browsers to block the tiles as mixed content. Other browsers are willing to load http images on https pages and will work. Unless the tool understands how the map library composes its URLs, someone will have to fix this manually.

To detect bugs like that automatically, after changing to https you'd have to spider every page in your site with several different browsers / browser configurations looking for errors and bad links. And if your archived site had a bunch of errors and bad links to start with, you'll need some way to compare the before-and-after error reports too.
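As a toy illustration of what such a spider would look for (regex-based and deliberately naive; a real audit needs an HTML parser and an actual browser to catch script-composed URLs like the map-tile case):

```python
import re

# Tags whose insecure subresources most browsers block outright ("active"
# mixed content) vs. ones some browsers will still load ("passive").
ACTIVE = {"script", "iframe", "link"}
PASSIVE = {"img", "audio", "video"}

def mixed_content(html):
    """Return (active, passive) lists of http:// subresource URLs on a page."""
    active, passive = [], []
    pattern = r"<(\w+)[^>]*?(?:src|href)=[\"'](http://[^\"']+)"
    for tag, url in re.findall(pattern, html):
        (active if tag.lower() in ACTIVE else passive).append(url)
    return active, passive
```

Anything in the "active" bucket breaks the page outright on HTTPS; the "passive" bucket is where browsers disagree, which is why you end up spidering with several of them.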

TLDR: It can be more complicated than you think.


Plus... they don't have to do this.

They could put in place redirects, and then use HSTS to tell browsers to only visit the HTTPS links.

They could leave the old HTML unprocessed and pointing at HTTP and HSTS will fix it for modern browsers.

Only the first request would be via HTTP, and Chrome and other browsers can be told to use HTTPS when they see the links even then: https://hstspreload.appspot.com/
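The redirect-plus-HSTS idea can be sketched like this (an illustrative function, not the BBC's code): serve a 301 from HTTP to HTTPS, and send the Strict-Transport-Security header on HTTPS responses so modern browsers rewrite the archived http:// links themselves.

```python
# Illustrative sketch of serving an HTTP-era archive behind redirects plus
# HSTS, so the old HTML never has to be rewritten.
HSTS = "max-age=31536000; includeSubDomains; preload"

def handle(scheme, host, path):
    """Return (status, headers) for a request to the archive."""
    if scheme == "http":
        # 301 so browsers, crawlers and caches learn the canonical HTTPS URL.
        return 301, {"Location": f"https://{host}{path}"}
    # Once a browser has seen this header, it rewrites future http:// links
    # to this host by itself, so pages full of http:// hrefs keep working.
    return 200, {"Strict-Transport-Security": HSTS}
```

The preload list linked above closes the remaining gap: the very first request can be HTTPS too, because the browser ships knowing the host is HSTS.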


> Not to be snarky, but haven't people written tools to help with this? This seems like a common issue. I mean, there's `sed` and similar tools, obviously, but something that could go, validate that the link works over https://, and update it. I don't see why that would need to be some monumental amount of work.

Not as trivial as you'd think: if there's an HTTP URL on the page when it should be HTTPS, how did the URL end up there? Dynamically from PHP code? Dynamically from JavaScript code? Did the URL come from a database? Did the URL come from an environment variable? It can be a lot of work to track all these down and a lot of them you won't be able to find using grep/sed e.g. URLs might appear as relative URLs in code with the "http" part being added dynamically.

You'll get insecure content warnings as well if you try to load HTTP images, css, iframes or JavaScript on an HTTPS page. Likewise, the URL for these can come from lots of places.


I think this trivializes the scope of what the BBC developed. Even with well-automated processes, you'd still want a human doing light QA given the wide diversity of content. The BBC has been at it for over twenty years, building ad hoc minisites[1] so far down the long tail that, if forced to choose, they may be more prone to pull the plug than to maintain them.

[1] http://news.bbc.co.uk/nol/ukfs_news/hi/uk_politics/vote_2005...


> Sites and content we consider ‘archival’ that involve no signing in or personalisation,

AUGH! Seeing this "SSL is just for private things" mindset in 2016 is really disheartening. It's to keep people from screwing with your connection, not just snooping on it.

I really hope the browser vendors start treating HTTP the same way they treat broken certs sometime soon. This will change once users start asking, en masse, "Why am I getting all these warnings", not before.


Pretty sure a diluted form of the broken cert treatment for HTTP is available behind a flag in Chrome, so it might be in the pipeline.

Source: http://peter.sh/experiments/chromium-command-line-switches/

See:

    --mark-non-secure-as


I don't think you can just run 'sed' on any random iOS app, any random symbian app, any random smart-TV app, some other guy's service that hits your APIs and feeds, and so on... :)


Now, that could be a valid issue, indeed, though not sure for how long I care about those devices continuing to work without any valid upgrade path... Using things like HSTS and CSP's `upgrade-insecure-requests` would help here for clients that do support it.
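The CSP mechanism mentioned here is just a response header. The header name and value below are the standard ones; the wrapper function is illustrative.

```python
def add_upgrade_header(headers):
    """Ask CSP-aware browsers to fetch embedded http:// subresources via https."""
    out = dict(headers)
    out["Content-Security-Policy"] = "upgrade-insecure-requests"
    return out
```

Unlike a blocking policy, `upgrade-insecure-requests` silently rewrites the scheme before the request is made, so supporting clients never even see the mixed content.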


You might not care, but the BBC does — it's one of the issues they mention in the blog post.

If the BBC "channels" stopped working, but other providers' content continues to work, the BBC would be blamed.


> Earlier in 2016, the Chromium development team decided to implement a change to Google Chrome, preventing access to certain in-browser features on ‘insecure’ (non-HTTPS) web pages. In practice, this meant that key features of certain products, such as the location-finding feature within the Homepage, Travel News and Weather sites, would stop working if we didn’t enable HTTPS for those services.

I think this shows how valuable it is to use incentives to get people to Do The Right Thing(tm). Perhaps more things should be changed to require HTTPS.


It was gutsy (and insightful) of them to publish to the world their upgrade experience. I wish people would be a little more positive about that instead of pointing out how much they suck.


A lot of people think they know better and think it's just a case of a few webserver directives, but have no idea of the scope of the BBC content.


> The CPU overhead of TLS encryption has historically been significant. We’ve done a lot of work behind the scenes to improve both the software and hardware layers to minimise the load impact of TLS whilst also improving security.

I thought that it hasn't been significant overhead for a while now?

related: https://www.maxcdn.com/blog/ssl-performance-myth/ https://istlsfastyet.com/


> Even a 2012 MacBook Air can sign an SSL key in only 6.1 milliseconds.

The BBC has to deal with machines much older and much less powerful than that.


Every TLS speed concern I've heard has been about the server speed, not the client speed.

The servers shouldn't be running on old MacBook airs.


It is the BBC


Even if it took an ancient machine 10x longer than a 2012 MacBook Air, 61 milliseconds is really not all that much time in the grand scheme of things.

I'm sure the people using machines that are "much older and much less powerful" than a 2012 MacBook Air are not expecting sites to load as fast as on a newer machine, and probably don't care about losing less than 0.1 seconds of load time. If you're running a 6+ year old machine and expecting high performance, you'd have to be insane.

Even if BBC Online cared this intensely about performance, there are more than a few other things they could do to speed everything up: the switch from Apache to NGINX, for one. I know that this takes many more developer/sysadmin hours, but if they really cared about tens of milliseconds it is definitely something they'd invest in. NGINX has quite a lot of support and is very stable, and is generally known to be much faster than Apache in most cases [1]. It's also not like NGINX is a hipster/unused server; it has quite a respectable share of the 'market' [2].

I also noticed on this page that they document.write() a script (probably to force it to load asynchronously?). This type of 'hack' is terrible for performance [3]. They could just add the 'async' attribute to the script tag, actually move it into the HTML, and save the cycles wasted by a hacky solution.

[1]: https://www.rootusers.com/web-server-performance-benchmark/ [2]: http://news.netcraft.com/archives/2016/03/18/march-2016-web-... [3]: https://www.stevesouders.com/blog/2012/04/10/dont-docwrite-s...


I think the best thing you can do to speed up HTTPS is to move to HTTP/2. Check this out: https://www.httpvshttps.com/


I can see they've only enabled Elliptic Curve Diffie-Hellman Ephemeral (ECDHE) and RSA key exchange cipher suites. That means clients will either use ECDHE and get forward secrecy, or old clients will fall back to plain "RSA" key exchange (the client sends a premaster secret to the server, encrypted with the server's public key), which works but doesn't give you forward secrecy.

What they HAVEN'T enabled is Diffie-Hellman Ephemeral suites, which give older clients forward secrecy at a big CPU hit.

So this is an example of performance-tuning your TLS settings. There's also stuff to do with session tickets, session resumption, and eventually they'd also be served using ECDSA certs, once all clients support it, or there is at least a great way to only show the older RSA cert to old clients.
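For illustration, here is roughly what that policy looks like expressed in OpenSSL cipher-list syntax via Python's ssl module. The exact string is my approximation, not the BBC's configuration: ECDHE first for forward secrecy, plain-RSA key exchange kept for old clients, DHE excluded because of its CPU cost.

```python
import ssl

# Prefer ECDHE (forward secrecy), keep RSA key exchange as a fallback for
# legacy clients, and explicitly drop the expensive DHE suites.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.set_ciphers("ECDHE+AESGCM:ECDHE+AES:RSA+AESGCM:RSA+AES:!DHE")
names = [c["name"] for c in ctx.get_ciphers()]
```

`ctx.get_ciphers()` is a handy way to verify exactly which suites a given string enables before deploying it.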


And just yesterday I told someone to visit the BBC when trying to connect to public wifi that requires a redirect to a login page first. Guess I'm going to have to find a new go-to HTTP site now.


Space-bar heater :)

On a more serious note, I always use http://example.com. Being reserved and maintained by IANA for documentation and testing, it's the most stable site I can think of.


Be aware that plenty of ISPs sadly MITM example.com. I ran into this when our test suite that curl'ed example.com and checked its output failed when we ran our binary on a new provider.



http://something.com

or use what Google does when Chrome notifies you of a login gateway to public wifi: http://www.gstatic.com/generate_204


Wow. For those too lazy to do the extra search step: http://www.something.com/faq/


There really should be a new, better solution to captive portals.


There is! RFC 7710[1] specifies a DHCP option/RA extension that encodes the URL of the captive portal, such that when, e.g., DHCP completes, the connecting machine immediately knows the captive portal URL and doesn't have to get MitM'd to learn it.

[1]: https://tools.ietf.org/html/rfc7710
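Concretely, the option is tiny. A toy encoder/decoder for the RFC 7710 DHCPv4 layout (option code 160, one length byte, then the portal URL; the later RFC 8910 reassigned it to option 114 — helper names here are mine):

```python
CAPTIVE_PORTAL_OPTION = 160  # DHCPv4 option code assigned by RFC 7710

def encode_option(url):
    """Pack a captive-portal URL as a DHCPv4 option: code, length, ASCII URL."""
    data = url.encode("ascii")
    return bytes([CAPTIVE_PORTAL_OPTION, len(data)]) + data

def decode_option(blob):
    """Unpack the option and return the captive-portal URL."""
    code, length = blob[0], blob[1]
    if code != CAPTIVE_PORTAL_OPTION:
        raise ValueError("not a captive-portal option")
    return blob[2:2 + length].decode("ascii")
```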


You learn new things every day. I'd love to see this in greater adoption.


Perhaps zombocom? http://www.zombo.com/


Despite such a late upgrade to HTTPS, the site looks good, uses HTML5, works without Flash, and doesn't even accuse me of piracy for using an adblocker.


Although this is good news, it will stop me from injecting a rule to hide the breaking news banner and stop it popping up. I should still be able to block the domain, but that won't cache for as long when off WiFi. [^1]

At least this will stop ISPs like BT from doing deep packet inspection and serving stale pages from their cache. Once it's been rolled out to the news site over the next year, of course.

If they use ChaCha20-Poly1305 then the load on low-power devices shouldn't be much. I did a lot of reading on this for my recent book and it's pretty good for devices lacking hardware AES acceleration.

[^1]: https://unop.uk/block-bbc-breaking-news-on-all-devices


Good on BBC for coming right out and talking about their plans in public. Love reading this stuff!


Haha, here's what I got on the first page load:

string(240) "https://ssl.bbc.co.uk/dna/api/comments/CommentsService.svc/V... string(40) "Error in cURL request: SSL connect error"


I don't get it... for me their entire website is still HTTP-only. Even if I add https:// myself, I always get redirected back to HTTP.


Ah, OK, I see https://www.bbc.co.uk/travel is now HTTPS, but https://www.bbc.com still redirects me to the HTTP version. I thought when they mentioned their "domestic" website they were talking about www.bbc.com or www.bbc.co.uk... Funny that even the blog post informing us about their HTTPS support can only be accessed over HTTP ;)


Did anyone else notice the irony that this proud announcement is served over insecure HTTP?


Apologies if I'm being naive, but how does it take 3 architects a whole year to upgrade a family of websites to HTTPS? The BBC are way behind the times here, although the article alludes to issues with suppliers.


The BBC's web infrastructure is a patchwork of disparate systems, run by separate product teams, on lots of different technology dating back potentially a couple of decades in some cases, with lots of third-party dependencies. Additionally, many of the original teams developing their websites are no longer with them, and as with most systems, documentation has no doubt suffered over the years.

For each individual product, they need to figure out what modifications it needs to become HTTPS-enabled (lots of links and identifiers are hard-coded to HTTP, and third-party CDNs might not support HTTPS by default), and update their testing procedures to ensure that it remains HTTPS-compatible, before they can enable HTTPS. Given that this is the BBC (a publicly-funded entity), they also have to ensure that everything continues to be fully supported on browsers going back to IE6, Firefox 3, and Safari 3 - with partial support for some browsers older than them.

In my opinion, a year is doing pretty well.


There are probably a million different websites and applications hitting their HTTP endpoint, and likewise for scripts and assets embedded in their webpages hitting third party non-HTTPS endpoints. If you just start serving your existing HTML and APIs on HTTPS, lots of stuff will break.

They probably have more websites and hostnames than you'd realize, as well. Take a look at a site like https://dnsdumpster.com/ and search for bbc.co.uk


Well, every third-party script you use, e.g. for tracking, needs to be on HTTPS too. The CDN and all web servers need to serve traffic over HTTPS. You need a secure way to update certificates across the whole network. Of course, if you run a single website it's simple, but when you have hundreds of servers and multiple third-party tools it can get tricky.


There were four major bullet points on the challenges. I imagine they also have other things to do and need to spend time persuading people to make changes elsewhere in the org.


It just took me three hours yesterday to secure a single Wordpress install. Three hours is a lot less than a year, but on the other hand it's longer than the 10 minutes I was expecting.


That's roughly the response time for Comodo customer support.


They enabled Secure HTTP don't forget. Not the insecure kind a lot of people might be thinking about here.


BBC used to be the organization that other broadcasters followed when it came to technology. Now it's the follower.


> HTTPS has been around since 1996

A blog post about spending several years updating to a protocol that's been around for 2 decades and has been standard for full sites for years. This makes me feel like anyone who has an account on BBC should be afraid of their security practices. Calling a plaintext password leak from BBC right now.

EDIT: People are taking this comment more seriously than I intended. I don't actually think you should distrust the BBC's security practices because of this, but I do feel that major websites should have site-wide SSL by now. It's clear that a lot of people below disagree with that; that's okay, I'm glad I spawned a debate here.


Calling FUD on your comment.

It hasn't been "standard on full sites for years", and still isn't now. Only recently, with the 'HTTPS everywhere' movement, has the idea taken hold that public sites with no authentication should support HTTPS. And even now, that's not a universally supported opinion, because of its effect on caching.

The BBC has used HTTPS on pages with forms that submit secure data, as has been the historic standard.

Moving a site as massive as the BBC, which spans multiple domains and subdomains and has millions of pages is a big task. Note how you can still see news articles from the late 90s at the same URL. So, yeah, I can understand why writing a blog post about it is worthwhile.


> It hasn't been "standard on full sites for years", and still isn't now.

I'd estimate about 75% of the time I'm on an HTTPS website.

> The BBC has used HTTPS on pages with forms that submit secure data, as has been the historic standard.

This is insecure as the HTTP page can redirect to a malicious HTTPS page from a different domain.


I agree with the need for the BBC to do this. But I disagree with the OP's suggestion that, just because it took them until now to finish doing it, I should be "afraid of their security practices".


The "afraid of the security practices" was more of a joke than an actual serious jab. But I do still hold that site-wide TLS should be default by all major websites at this point in time.


Can you point out some other major sites used by the general public which have spent the last few years without site-wide SSL to back up your claim?


I can do better than that - I can give you a report published by Google in March 2016 which listed lots of them.

https://www.google.com/transparencyreport/https/grid/

For example, the following are all in the world's top 100 websites and none of them support any form of HTTPS. The link includes quite a few more.

* alibaba.com

* ask.com

* ask.fm

* baidu.com

* cnet.com

* cnn.com

* dailymail.co.uk

* ebay.com

* globo.com

* go.com

* goal.com

* goo.ne.jp

* imdb.com

* live.com

* mirror.co.uk

* naver.jp

* nytimes.com

* onet.pl

* pornhub.com

* telegraph.co.uk

* uol.com.br

* weibo.com

* wikia.com

* wikihow.com

* wp.pl

* yahoo.co.jp

* yelp.com

* youporn.com


That's scary honestly. Thanks for sharing at least.


Washington Post. Buzzfeed. The Guardian.

The New York Times still doesn't have HTTPS.


Their traffic is too high for them to afford it (and the SEO uplift probably wouldn't outweigh the cost).


I'm actually -1 on this idea. They should be able to afford terminating SSL at the load balancer, which doesn't change the architecture as long as the load balancer terminates the traffic. (Securing HTTP internally, beyond the load balancer, is the expensive move.) CPU load has always been an argument, or more like an excuse: if your load balancer is doing computation other than forwarding requests and decrypting SSL sessions, you should double-check your architecture. If a small startup with millions of views can sustain SSL traffic, why not the NYT? Most of their traffic is just serving static files, and results are usually cached anyway.

Instead, the burden is on testing and developing the migration. For example, they'd have to inventory and edit everywhere they hardcode http:// in their front-end code instead of using scheme-relative // URLs. Furthermore, they'd have to get third-party ad networks to deliver active content (like JavaScript) over HTTPS: loading HTTP resources on an HTTPS page creates mixed-content warnings, and browsers block active content outright, breaking the website.

To me, the claim that infrastructure capacity blocks a migration to HTTPS is a myth; someone would have to prove it with data.


Can I ask you why you say it would be too high for NYT to afford it, when many companies with significantly more traffic have site-wide TLS?


If your examples are Google or Amazon, then those have profits several orders of magnitude bigger than the likes of NYT.


The most commonly cited answer I've seen for these kind of questions hasn't been the cost of serving HTTPS but rather the delays in getting the ad networks to support HTTPS. Some sites reported that they had fewer bids (and thus lower revenue) for ads on HTTPS pages, which is something I'm sure the NYT management watches very closely.


MailOnline. SSL would just cost too much in Akamai billing. Otherwise they'd do it straight away for the SEO uplift.


Yup. Until very recently, you did most of your shopping on Amazon.com over insecure HTTP, right up until you clicked Checkout. Only recently did they move all of the product browsing over to HTTPS.



