Today, I am happy to report that the site is back up — this time under Project Shield, a free program run by Google to help protect journalists from online censorship.
This is the first I've heard of Google's Project Shield. Very cool project, is it relatively new? Does anyone know if this was the result of someone's 20% time or a top-down Google initiative?
I'm working for a journalistic organization that cooperated with Google to create one of the graduated Jigsaw projects (Investigative Dashboard).
We were considering using Project Shield relatively recently (as in, some time this year), but at that time the info about how it actually worked was... well, non-existent on their website. IIRC, their SSL support is also fairly new, which made it pretty unusable until very recently.
Interestingly, I've never looked deeply at what Project Shield does under the covers, but I know what all the underlying technology "must" be. The main difference between GCLB and Shield is that Shield is a free service (operated by a different group, as already mentioned) explicitly for those at risk of censorship.
As another poster indicated, if you want someone to terminate SSL for you, you're going to need to hand them a key. We encrypt ours at rest, give your secret material the same security and care as we do Google's own, and as you can see with our Customer Supplied Encryption Key support for Persistent Disk and GCS, we care a lot about letting you control access to your data. If you don't mind me asking, to whom are you comfortable uploading your keys?
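For reference, CSEK means you pass your own AES-256 key when creating the resource. A rough sketch with gcloud (the disk name, zone, and key file are made-up placeholders; check the CSEK docs for the exact JSON key-file format):

$ gcloud compute disks create example-disk \
    --zone us-central1-a \
    --csek-key-file my-csek-keys.json
# my-csek-keys.json maps the disk's resource URI to a raw,
# base64-encoded AES-256 key that you generate and keep.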
Disclosure: I work on Google Cloud, so I'm actively trying to take your money in exchange for our services.
Keyless SSL is a great thing for people who really can't convince their auditors that it's okay to share their keys. But it has its own problems, like:
> Note: Keyless SSL requires that CloudFlare decrypt, inspect and re-encrypt traffic for transmission back to a customer’s origin.
That's not particularly different (to me), but I have a different threat model. Again, it comes down to what scenarios you care about and what you're comfortable with in exchange for <something>.
Even initiating tons of sessions is likely to mean that the key server is going to be busy. But if you're really concerned about sharing your key with us, I agree CloudFlare's Keyless SSL provides a real service that does a lot for you without handing the key over explicitly (you just have to keep doing your part).
Project Shield welcomes applications from websites serving news, human rights, or elections monitoring content. We do not provide service to other types of content, including gaming, businesses, or individual blogs.
As I said down thread, we (Google) and some of our customers are routinely the target of attacks (DDoS and otherwise). We don't talk about it (sorry? yet?), but we have mechanisms to protect you as well as ourselves and other customers. My hope is we'll be talking about a particularly high profile one soon, but if you've got an account and want to talk to us privately, we're here to help.
Disclosure: I work on Cloud, but I'm not a networking expert.
The underlying technology isn't that difficult to understand: there is a web application firewall that sits in front of the web server serving the real content, and a load balancer to distribute the requests.
You can block the attack based on IP, request headers, rate control, geo, etc. The problem with some DDoS attacks is the sheer volume of request bodies; that's what usually overwhelms a server.
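To make that concrete, here's a rough host-level sketch with iptables (addresses and limits are made up; a real WAF/load balancer does this upstream with far more context):

# drop a known-bad source range (placeholder addresses)
$ iptables -A INPUT -s 198.51.100.0/24 -j DROP
# rate-limit new HTTP connections per source IP (50/s is arbitrary)
$ iptables -A INPUT -p tcp --dport 80 -m conntrack --ctstate NEW \
    -m hashlimit --hashlimit-above 50/second --hashlimit-mode srcip \
    --hashlimit-name http-flood -j DROP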
DDoS attacks do knock out servers, but since the requests are distributed across many machines, you don't see it directly. Because so many servers get knocked out, real paying customers can't serve their content. I'm guessing that's why they pulled the plug.
Google's infrastructure is so large that knocking out a few servers won't impact much; they just spin them back up.
I realize Krebs seems to be using Shield, not vanilla GCP, but I'd like to know the answer to this as well.
Krebs' article describes two attack methods, fake HTTP requests and a simple volumetric flood of GRE packets. L7 mitigation clearly isn't part of GCP's service. The GRE packets should be a different story -- if the customer isn't using GRE, they should be able to drop this traffic easily using the GCE firewall or GCE load balancer.
The vulnerability report at http://www.securityfocus.com/archive/1/532239 suggests the GCE firewall is, or at least was as of May 2014, ineffective against such an attack (even if your firewall rule set will drop the problem traffic). But it also says the GCE load balancers do not have the same problem. Krebs didn't mention how much of the 600+ Gbps was GRE, but let's say a GCE load balancer was configured to pass through only TCP/80 and received 100 Gbps of non-TCP traffic. Would the load balancer function as intended and drop all of this with no interruption to legitimate TCP traffic, would Google null route you, or something else?
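For what it's worth, expressing "only TCP/80 in" on GCE is easy to write down; whether traffic dropped that way still burns capacity is exactly the open question. A sketch (rule and network names invented):

$ gcloud compute firewall-rules create allow-http-only \
    --network my-lb-network \
    --allow tcp:80 \
    --source-ranges 0.0.0.0/0
# GCE ingress is denied unless a rule allows it, so GRE
# (IP protocol 47) never matches anything here.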
I don't think that's true for GCLB. We might rate limit packets arriving or some such, but I'm not a networking expert (as with other posts, I'm sure someone else at Google will send me an excellent email explaining a subtlety here).
We generally don't talk about DoS attacks at Google (sorry). But I can say we're familiar with them, and that Cloud uses the same front end and networking infrastructure as the rest of Google here. We do understand (per project, the unit of billing) if there's tons of ingress/egress from a single project, and your intuition is right: it would be wise to respond to "Hey, is this 1 Tbps of traffic legitimate?" quickly ;).
Disclosure: I'm an engineer in Cloud (mostly GCE), but since I'm not an account manager I refuse to reassure you ;).
I'm a game server operator. I'm not currently running anything, but I probably will in the future. We used to receive volumetric L3/L4 floods all the time from booters. The good thing is these were always just dumb kids, not sophisticated attackers, so they would be easy to filter with a simple ACL (often they were targeting some random port we don't use, so a simple rule set of "accept whatever UDP and/or TCP ports we use, drop all else" would be effective). There were a few providers who would set up these ACLs, but they were only effective up to a few Gbps and would null route past that.
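The whole rule set really is just a few lines of iptables (port numbers below are only examples):

# accept the game's UDP port and SSH for admin
$ iptables -A INPUT -p udp --dport 27015 -j ACCEPT
$ iptables -A INPUT -p tcp --dport 22 -j ACCEPT
# let return traffic for outbound connections back in
$ iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# drop everything else
$ iptables -P INPUT DROP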
If I get back into this, I will give GCE a try with GCLB in front for, effectively, ACLs. In my experience it's a near certainty that any reasonably popular game server will be attacked (very often as retaliation for banning someone for cheating). I'd imagine your mitigation capabilities for a small-time game server customer paying a few hundred a month would be different than a large corporate client paying 5 figures, but I'd still like to try GCE for this as it seems like the best bet (besides OVH).
I've always wondered just how much a network run by someone like Google or Facebook or one of the other absolute top tier providers like AWS or Azure might be able to 'handle' in terms of dealing with DDoS attacks.
Presumably these giants can easily handle such traffic as long as someone is willing to pay for the privilege? 665 Gbps seems tiny in comparison to the capacity someone like Google might have at its disposal, but I'm speculating, as I haven't seen anything detailing their network stats.
To give something of a concluding statement to this waffle: I have respect for Google for running this public-service type of protection for sites that have a strong enough 'public good' element.
Right, we don't talk (in detail) about the "capacity" of our network. However, you can see that we just added a crazy new submarine cable to Japan (FASTER, https://plus.google.com/+UrsHölzle/posts/c6xP4PGmTAz) with 60 Tbps (I think we've reported somewhere that Google will use 10). And in many ways, it's a lot easier and cheaper to do networking over land like say in the US or Europe than between the west coast and Japan ;).
So yes, 665 Gbps is well within our network capability.
Disclaimer: while I work on Google Cloud, I'm no networking expert.
For what it is worth: My GCP-hosted site (https://cloud.sagemath.com) was hit by a DDoS attack in April (a WordPress amplification attack), which was about 5GB/s at peak time. The GCP network had no problem handling the traffic, but the Linux network stack in my cluster of GCE VMs -- which were running nginx -- simply couldn't handle the load. I now use CloudFlare.
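(For anyone in the same spot: the knobs to look at first are kernel ones, not nginx ones. Something like the sketch below raises the obvious ceilings, though the values are only illustrative and no sysctl tuning will save a single VM from a really large flood.)

$ sysctl -w net.ipv4.tcp_syncookies=1          # survive SYN floods without exhausting the backlog
$ sysctl -w net.ipv4.tcp_max_syn_backlog=65535 # allow more half-open connections
$ sysctl -w net.core.somaxconn=65535           # bigger accept queue behind nginx's listen()
$ sysctl -w net.core.netdev_max_backlog=250000 # queue more packets per CPU before dropping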
It sounds like too much abusive traffic reached the VM. I guess Google doesn't have a product comparable to CloudFlare for normal sites, one that can throw up captchas, enable caching, etc. in response to an attack.
If the 665 Gbps botnet was indeed powered mainly by IoT devices, then this is only the very beginning. We're about to see multi-Tbps botnets soon, all because most IoT companies couldn't care less about security, and because most of them want to connect every IoT device to the Internet by default (rather than through a gateway, which at least could limit infections).
> and because most of them want to connect every IoT device to the Internet by default
Stuff is going to get even worse when IoT devices begin using IPv6. By design, devices are publicly reachable and not hidden behind a NAT router, which makes RCE exploits way, way easier.
I'd take a guess that loads of IoT devices have "backdoors" like open SSH/telnet with insecure default passwords, too - the same shit that hit el-cheapo routers, for example.
> Stuff is going to get even worse when IoT devices begin using IPv6. By design, devices are publicly reachable
Is that true? I just set up IPv6 at home yesterday and I don't see the difference from IPv4 in terms of reachability. The default policy on my firewall for incoming traffic, for both IPv4 and IPv6, is drop.
Yes, NAT can give you a pseudo-firewall in that LAN devices aren't given publicly routable addresses, but I have no idea why anyone would leave their IPv6 network completely open "by design".
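On Linux, the default drop you describe is a few lines of ip6tables; the only IPv6-specific wrinkle is that ICMPv6 has to be allowed, or neighbor discovery and path MTU discovery break:

$ ip6tables -P INPUT DROP
$ ip6tables -A INPUT -i lo -j ACCEPT
$ ip6tables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
$ ip6tables -A INPUT -p ipv6-icmp -j ACCEPT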
IoT devices like surveillance cams, VoIP babyphones etc. have two options:
1) Depend on third-party servers for operation, so no outside-to-device-initiated communication is needed. Downside: it costs money to run the servers, and just imagine the fallout when a camera cloud server gets hacked. Upside: you can firewall off your private network as you like, and the device will still work.
2) Allow the owner of the device to connect directly to the IoT device via IPv6. Downside: everyone and their dog can access (and exploit) your devices, provided that they know the IP address. IPv6 addresses are long and pretty random, but if, e.g., the MAC address is used to derive the IPv6 address (EUI-64), they're not so random any more. Upside: no SPOF/centralized node that turns your device into a brick if the operator shuts down.
Basically, the only way to "securely" operate IoT devices is inside a separate VLAN. FritzBox routers can do this, but not many others can, e.g. because the router can't manage its Ethernet ports independently: it's cheaper to hang them off a GBit switch chip than to choose a SoC that's beefy enough to drive four GBit ports individually. I put "securely" into quotation marks, as a hacked surveillance cam or babyphone is an open invitation to any hacker.
"Basically, the only way to "securely" operate IoT devices is inside a separate VLAN."
This is a major pain to configure, and way beyond the capacity of even most IT professionals. One of the problems is that you do want to allow some devices (phones) from the 'normal' network to (selectively) be able to connect to devices in the IoT vlan.
I spent a solid day trying to set this up once (and on a 'real' switch, not the FritzBox, which I also have but only use as a modem), and I'm not saying that I'm that good a networking guy (I mean, the fact that I couldn't get it working means I'm not), but I do know more than the average internet installation guy, who would be the only hope for 'regular' users to set up their networks properly.
Most routers nowadays seem to come with a long random password written on a sticker stuck on the bottom. That seems a reasonable compromise between being hard to hack over the internet and reasonably easy for non techie owners.
I would say making secure tunnels a default for home routers would be a better mitigation... there is no need to control my "smart home" or whatever from my neighbor's phone...
Some things just do not make much sense without being able to control them remotely. A separate VLAN for devices would only allow them to communicate with one another, which is probably not what you want. BTW, I could not set up VLANs on a FritzBox with FritzOS last time I tried. However, nearly all routers with OpenWRT will do it without a problem.
The only way to get it well-adopted would be if you could get Google, Apple, Microsoft and router manufacturers to agree on some user-friendly and secure way to set up tunnels and connect to them from your phone/devices. Good luck with that...
While a common agreement on how this should be done would be the best solution, it should be enough to have at least some option to set up such tunnels through an API. Router manufacturers could then develop an app to make that setup easier.
Although this won't be practical for any device that is not a phone or a (desktop) computer.
And then there would be malware running in your browser that uses the API to expose your IoT devices. There is precedent for that: JavaScript in the browser accessing your router's web interface (with default credentials) to change your DNS server.
That's just another target among many in the broader class of XSS attacks, and there are protections that router manufacturers (and anyone else hosting a website) are able to build in to protect against it.
Unfortunately, this is the crux of the issue with IoT. After you've bought the router, there's no reason for your router's manufacturer to keep updating the firmware and fixing bugs that allow an XSS attack, and even if the manufacturer does update the firmware, there's no way for the manufacturer to force the update onto all of the devices that have been sold. Hence installed devices with older firmware that have been exploited and are now part of the botnet used to attack krebsonsecurity.com.
AVM doesn't call it VLANs, but a "guest network". Activating the feature opens up a new SSID with internet-only access (the newest release of FritzOS can even do a captive portal!), and you can assign the LAN4 ethernet port to it, too.
Separating the IoT stuff (at least the stuff that needs connectivity from outside) into its own VLAN at least prevents a hacker from gaining access to the rest of your home network (e.g. NAS devices or your normal computers).
I can see why they would not call this VLAN functionality: although it surely makes use of it, it is not configurable individually. So let's just say I misunderstood what you meant. Setting up a guest network with internet access would still allow the same exploitation in order to participate in a DDoS attack.
While this would, as you said, protect my other networks to some extent, it does not solve the problem. I will not ever trust these devices to be secure enough to expose them directly to the internet.
This is why I think the only way to operate these devices securely is to require some kind of transport protection, in the form of an IPsec tunnel for example. This would allow me, and anyone with the right access, to control the devices without making them accessible to anyone else.
If home routers encouraged the use of such tunnels, it would be a normal thing to have a link back to your home network (or maybe a separate IoT network), which could be properly firewalled...
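IPsec would be the "proper" way; just to illustrate the idea with something most people already have, an SSH tunnel through the router gives the same reach-the-device-without-exposing-it property (host names below are made up):

# forward a local port to a hypothetical camera on the IoT network,
# via the router's SSH daemon; the camera is never internet-reachable
$ ssh -N -L 8443:camera.iot.lan:443 user@home-router.example.net
# then open https://localhost:8443 on the phone or laptop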
DDoS is exactly that: a Distributed Denial of Service attack. In other words, there are thousands of IPs popping in and out performing a variety of denial of service attacks, so it's not really a one-block-and-done type of deal.
Usually there are many IPs popping in and out, and it's often hard to tell them apart from regular clients.
Also, you still have to receive the traffic even if you then ignore it; if your pipe is smaller than the attacker's, it may be enough to overwhelm you.
The original post [1] about krebsonsecurity being kicked off Akamai (for valid reasons) had a lot (480) of reactions. I think many readers of that post will see this as a valuable follow-up.
127.0.0.1 points to localhost, which is your local machine.
However, if you query whois, you'll see that the site changed its name servers from Prolexic's to Google's today:
$ whois krebsonsecurity.com | grep "Name Server"
Name Server: NS-CLOUD-D1.GOOGLEDOMAINS.COM
Name Server: NS-CLOUD-D2.GOOGLEDOMAINS.COM
Name Server: NS-CLOUD-D3.GOOGLEDOMAINS.COM
Name Server: NS-CLOUD-D4.GOOGLEDOMAINS.COM
Such a name server change can take up to 72 hours to propagate. Meanwhile, you can edit your /etc/hosts file (on GNU/Linux) to hardcode the IP of krebsonsecurity.com:
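Something like the line below; the address is only a placeholder from the documentation range, so substitute whatever the new Google name servers return for you:

203.0.113.10    krebsonsecurity.com www.krebsonsecurity.com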
DNS doesn't "propagate" and it certainly doesn't take 72hrs either. You're simply waiting for the cache TTL to expire on the DNS server you are using, then it'll re-query the authoritative servers.
What about that doesn't sound like propagation? I guess you could try to be pedantic and claim that since there's no "push" it's not propagation, but this exact phenomenon in DNS is called propagation all the time.
Also, many ISPs' servers ignore TTLs for certain changes that happen infrequently (like NS changes) and wait a few days. So they might refresh the A record according to the TTL, but they'll still be pulling it from the wrong name server.
More often than not, it's not ISPs ignoring TTLs as you describe; it's an improperly configured SOA record. There are more TTLs that you (as the domain owner) control and that affect the caching of a record than just the TTL on an A/AAAA record.
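You can check both directly against the authoritative servers, for example:

$ dig krebsonsecurity.com A +noall +answer @ns-cloud-d1.googledomains.com
$ dig krebsonsecurity.com SOA +noall +answer @ns-cloud-d1.googledomains.com

The second column of each answer is the TTL, and the SOA's last field is the negative-caching TTL that resolvers also honor.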
As to your point about "propagation", argumentum ad populum applies.
Could you elaborate more on why propagation is not the right term? Its definition reads:
The process of spreading to a larger area or greater number; dissemination.
Which is (to my understanding) exactly what happens with DNS information while clients fetch the updated zone.
72 hours is indeed a comfortable upper bound ("up to") that we would give to customers, to make absolutely sure every cache down the chain has refreshed its records and is not the cause of the issue at hand when name servers were changed. 24 hours is a common delay.
"Propagate" is the traditional term for DNS update delays. It dates back to the era before NOTIFY and IXFR when it often took a long time for secondary servers to update their copies of a zone. Nowadays with a good setup that should happen within a few seconds.
Perhaps it is a bit of a misnomer to use it for cache timeouts, but it isn't particularly wrong or confusing unless you have a very pedantic reader.
Propagate implies that the answer isn't readily available everywhere already. Instead of saying "up to 72hrs for DNS to propagate", it's much more accurate and just as easy to say "up to 72hrs for your DNS cache to refresh".
Am I being a bit pedantic? Sure, I'll admit that. But the reason I'm being pedantic is that this mischaracterization implies that there's nothing that can be done to prevent it. That is absolutely not true: if DNS is handled properly, you can get 99.99% or more of the world resolving the correct records in under 1 hour, with most of that happening in under 15 minutes. Excusing this as DNS propagation and not handling it just looks lazy to me.
But what do I know, based on all the down votes apparently folks disagree with me so I'll gladly bow out from the conversation.
I see the same. But it is not uncommon for DNS, as it needs time to propagate globally. FWIW, no, inserting the www subdomain does not help either. Waiting will, though.
Now that he has quickly found another safe harbor, this attack may well have a sort of Streisand effect and give Brian Krebs more prominence than he already had. Would be nice to see something good come out of this and maybe even cause future attacks to be seen as counterproductive by potential perpetrators.
Krebs won't use Cloudflare because Cloudflare protects DDoS-for-hire sites from each other. He thinks that before CF offered this protection, the DDoS-for-hire services would take one another offline, and that it's an ethical problem for CF to be protecting the very people whose criminal acts create (some of) the demand for its services.
Cloudflare, in their defence, say they don't censor/check/approve sites and that's a good thing - after all, sites like wikileaks should be allowed protection.
I'm all for free speech, but protecting sites that commit criminal activities is not a "good thing". In fact, they should be partly liable for the damage if they were aware of it and did nothing to stop it.
Protecting a journalist's free speech is in line with the original "don't be evil" motto. It's also great advertising for GCE's capabilities around handling DDoS.
Google probably has excess capacity. There is nothing more to be paid if the server is sitting there doing nothing. It's actually good for Google because they get to analyze a huge amount of traffic which they can use to protect paying clients.
Unfortunately I can't get to the new site (without changing my DNS servers) because Verizon is resolving krebsonsecurity.com to loopback. Presumably doing it for (poor) DDOS mitigation, but this sort of censorship is ridiculous.
The krebsonsecurity.com site was really pointed to 127.0.0.1 by its owner recently (see https://twitter.com/briankrebs/status/779144394360381440), so it's probably just a DNS cache at Verizon which hasn't expired yet. Give it some time.
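An easy way to check, without touching your resolver settings, is to compare your ISP's cached answer with the authoritative one:

$ dig +short krebsonsecurity.com                                  # your ISP's cached answer
$ dig +short krebsonsecurity.com @ns-cloud-d1.googledomains.com   # the authoritative answer

If the first still says 127.0.0.1 and the second doesn't, you're looking at a stale cache, not censorship.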
While in this case I was wrong, DNS poisoning is certainly not out of the realm of what Verizon will do, and when a site resolves properly on one ISP and not another, I don't think it's a "wild claim" to assume that it's the ISP's fault.
Guess with this I'll now find out, as crapping on Krebs' site is practically a rite of passage when you've got a botnet now.