I've done some googling before asking here: can anybody explain why Linode is so often targeted like this? We moved Cronitor off Linode in spring 2015. During the Christmas holiday when they suffered a two-week DDoS, I thought of the family time I would have been missing that year doing a crash migration to AWS, had we not migrated when we did. I have to imagine this has been horrible for their business.
I would use Linode if I needed to lease computational power, because it is still a great value vs AWS, but I could not run a high availability service there. It would feel like professional malpractice at this point.
I am also surprised, as this is not the first time I am reading about it on HN. Linode seems to be highly vulnerable to certain attacks, as we have seen in the past. I hope they will fix it and provide a permanent solution, as I was hoping to use them as part of my network, but I see more and more signals that they can't handle serious traffic. Hopefully they will redesign their infrastructure to handle it.
I am with all of you who are affected by this. I am looking forward to them resolving this soon.
I'd be ok with paying twice what I'm currently paying if they could solve this problem. We're with Linode because it would cost us about 10x as much to run on AWS, and we can't justify that.
My understanding is that Atlanta in particular has some poor upstreams, which is making our job pretty difficult there. Notice that it's almost always Atlanta that's getting hit. I would suggest just using other datacenters or making sure your high availability model spans several DCs (which I would suggest at any hosting provider, really).
As others have said, maybe it's to affect a customer. However, the hosting industry has a lot of roots in the adult content industry, and those guys didn't have many ethical guidelines. It was not uncommon for hosting companies to DDoS each other to drive a competitor out of business. Not saying that is the case here, but perhaps?
I've been wondering about this lately. Is it really feasible for a small (one man?) team to keep master-master MySQL replication over WAN running smoothly?
If you want things to work smoothly, dual-master, single-active is the way to go.
If you use MySQL's read_only flag and application users don't have the SUPER privilege, you can easily prevent writes to the wrong server: set read_only = 1 in my.cnf and manually set it to 0 on exactly one of the masters. Use the read_only flag to drive automation for which server to send writes to.
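A minimal sketch of the moving parts, assuming the standard read_only variable (nothing here is specific to any one setup):

    -- both masters start with read_only = 1 from my.cnf; make exactly one writable:
    SET GLOBAL read_only = 0;        -- run on the active master only (needs SUPER)

    -- application users without SUPER are rejected on the passive side, and
    -- failover scripts / health checks can poll this to pick the write target:
    SELECT @@global.read_only;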
Manual failover is: set the old server read_only, kill existing connections (the read_only state is cached per connection), wait for replication to catch up, then set read_only = 0 on the new server. You can make a script to do this with one button, but I wouldn't recommend making it autonomous: flapping between servers is disruptive and could lead to data inconsistency if you switch while replication is behind, and data inconsistency is usually way worse than write downtime until someone logs in to flip the switch.
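Roughly, the one-button script boils down to something like this (a sketch, not a drop-in script):

    -- 1. on the old active master: stop accepting new writes
    SET GLOBAL read_only = 1;
    -- 2. kill existing app connections (their read_only state was cached at connect):
    --    SHOW PROCESSLIST; then KILL <id> for each application thread
    -- 3. on the new master: wait until replication has fully caught up
    SHOW SLAVE STATUS;               -- Seconds_Behind_Master should be 0
    -- 4. then open the new master for writes
    SET GLOBAL read_only = 0;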
Try to have half your slaves off each master, so if a master is down, you still have 50% capacity. (I saw some patches from Google a while ago to keep binary logs in sync between masters and make switching masters easy; if that's available, you may be able to have slaves just follow the current active master.)
If you have budget for it, an extra slave off each master can be helpful: you can cron them to shut down MySQL, tar up the data directory, and restart. If you untar that on a new slave, it'll continue replicating from that point in time. If you rotate out the backups, you also have some ability to restore data from the past if there is a bad update.
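Something like this cron job is all it takes (paths and service names are placeholders; adjust for your distro):

    #!/bin/sh
    # stop MySQL on the backup slave, tar the data directory, restart
    BACKUP_DIR=/backups/mysql
    STAMP=$(date +%Y%m%d)

    service mysql stop
    tar -czf "$BACKUP_DIR/mysql-$STAMP.tar.gz" -C /var/lib mysql
    service mysql start              # the slave resumes replicating where it left off

    # keep a week of rotated backups for point-in-time restores after a bad update
    find "$BACKUP_DIR" -name 'mysql-*.tar.gz' -mtime +7 -delete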
I'm a one-man operation keeping a master-slave setup with manual failover, and it's been pretty smooth sailing once I got it set up. I don't know how much more complex master-master would be.
Same here, and as long as you understand how MySQL replication works, it's not too much effort to deal with. Performing the initial sync without downtime is a bit tricky, but it can be done with a well-designed database and some thought. Basically you need to at least temporarily make the bulk of your data read-only, so that you can do most of the data transfer while things are running, and then only briefly lock tables on the source server for long enough to copy the stuff that has changed since the dump and grab the binlog position. Then you copy that stuff over to the slave as well, update the slave to the correct position, and start the slave.
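If the bulk of the data is InnoDB, a rough shortcut for the same idea is a consistent dump that records the binlog position for you (this is a generic sketch, not necessarily what I did):

    # dump without long locks (InnoDB only) and embed the master coordinates:
    mysqldump --single-transaction --master-data=2 --all-databases > seed.sql
    # load seed.sql on the slave, read the binlog file/position from the comment
    # near the top of the dump, then:
    #   CHANGE MASTER TO MASTER_LOG_FILE='...', MASTER_LOG_POS=...;
    #   START SLAVE;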
That's master/slave, but to get master/master, all you need to do is start a slave on the original master and point it at the current master position on the original slave (which should be static, since it isn't yet accepting any queries directly). There are posts out there that walk through this.
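The second half is just one CHANGE MASTER TO run on the original master; something like this, with hostnames and coordinates as placeholders (the coordinates come from SHOW MASTER STATUS on the original slave):

    CHANGE MASTER TO
      MASTER_HOST='original-slave.example.com',   -- placeholder
      MASTER_USER='repl',
      MASTER_PASSWORD='...',
      MASTER_LOG_FILE='mysql-bin.000042',         -- from SHOW MASTER STATUS on the slave
      MASTER_LOG_POS=120;
    START SLAVE;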
Once it's running, as long as you're not running auto-increment inserts or other queries that can conflict on both servers at the same time without taking appropriate precautions, it should chug away without any intervention.
If something does go wrong, you can often figure it out by looking at the slave status, fixing the inconsistency manually, skipping the bad query, and then starting the slave. If not, you can always just re-synchronize from scratch. Or even better, run your databases off of an LVM volume and take regular snapshots (i.e. snapshot, make a tarball of the snapshot of the MySQL directory, then remove the snapshot). That will give you a consistent backup, even with the server running. On an SSD, the temporary added latency probably won't be noticed, especially if done off-peak. Then if anything goes wrong, you can restore from the snapshot, and it should catch back up to the master from the snapshot's position automatically (as long as your expire_logs_days setting in my.cnf is longer than the time since the snapshot was taken).
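The snapshot cycle is short; a sketch, assuming an LVM layout like /dev/vg0/mysql (names, sizes, and paths are made up, and mount options depend on the filesystem):

    # create a snapshot, tar it up, then drop the snapshot
    lvcreate --snapshot --size 5G --name mysql-snap /dev/vg0/mysql
    mount -o ro /dev/vg0/mysql-snap /mnt/mysql-snap
    tar -czf /backups/mysql-$(date +%Y%m%d).tar.gz -C /mnt/mysql-snap .
    umount /mnt/mysql-snap
    lvremove -f /dev/vg0/mysql-snap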
No need to take production offline when using the Percona XtraBackup tool to set up the slave. It's super easy to use, and I've done it multiple times on databases in the hundreds of gigabytes.
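For reference, the basic flow looks roughly like this (a sketch; paths are placeholders, and details vary between XtraBackup versions):

    xtrabackup --backup  --target-dir=/backups/base   # copy files while MySQL keeps running
    xtrabackup --prepare --target-dir=/backups/base   # apply the logs so the copy is consistent
    # copy /backups/base into the new slave's datadir, fix ownership, start mysqld,
    # then CHANGE MASTER TO the coordinates recorded in xtrabackup_binlog_info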
That does look like a great tool! If we were all-InnoDB I'd probably try switching over to it instead of LVM snapshot-based backups right now (mostly because it can do incremental backups). We have a bunch of large MyISAM tables though (MyISAM used because the tables are read-only, so read speed is the only real consideration), so those would have to be handled separately. I could always xtrabackup all the InnoDB stuff and then just file-copy the MyISAM tables separately, since I know they won't be changing.
It's pretty much the same. You almost never want writes on both sides (not in a failover plan, anyway), so as long as you have a switch for which side receives the writes, it's simple.
Or allow writes in both datacenters with randomized tokens as keys. If you need datacenter affinity for certain events, use one of the token bytes to encode the author datacenter. Updates that don't have to land in order can be written in an eventually consistent manner. Write a feed of changes in each datacenter and have the peers consume this update feed. Voilà, partition-tolerant master-master with failover.
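As a toy example of the token scheme (the datacenter-to-byte mapping here is made up):

    # first byte of the key encodes the datacenter that accepted the write
    DC_BYTE=01                                   # e.g. 01 = Atlanta, 02 = Dallas
    TOKEN="${DC_BYTE}$(openssl rand -hex 15)"    # 16 bytes total, hex-encoded
    echo "$TOKEN"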
This is necessary but not sufficient to prevent issues. Sure, it will prevent auto-increment key collisions, but unless you're using strict sessions everywhere you can run into other key consistency problems. For example, if one master deletes a row while another updates it, you'll end up with a missing key and stopped replication on the first one (it depends on the replication mode as well).
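For context, what's presumably being referred to here is the standard interleaved auto-increment configuration, which makes the two masters hand out non-overlapping IDs:

    # my.cnf on master A
    auto_increment_increment = 2
    auto_increment_offset    = 1

    # my.cnf on master B
    auto_increment_increment = 2
    auto_increment_offset    = 2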
Doesn't this work only for an append-only structure?
If I'm updating existing records, and the MySQL master at Site A gets updated, then goes down before Site B is updated.. I've got an inconsistent setup.
I've been thinking about master-master MySQL replication recently, as we have a system that's duplicated and taken offline each summer (to run a summer camp), and I'm looking for a way to sync changes in it back to the main 'live' MySQL database.
I do the same with tree nodes; it works extremely well. Two active masters and one slave configured as a master, and conflicts are non-existent since I found out about this "trick".
We've moved off Vultr. Too much network downtime for no known reason, plus hard reboots on our servers causing loss of data. A friend has had similar experiences. Their support was also poor.
We were sort of in the same boat as you last year. We were already prepping our failover and it was about 95% ready to go when the DDoS started over our Christmas break (and our senior guy who did all of our deploys was out of communication). December 23rd was a difficult 10-hour remote day, but we got things finished up and could relax afterwards.
What in that article makes you think that? I don't see it. They do say "we'll have to upgrade Xen nodes", but they don't mention the DDoS or link them.
The Xen update was a scheduled thing -- I got an email about it weeks ago (I had one Linode I hadn't moved to KVM), and it was already scheduled for this weekend.
That said, I don't disagree that the attackers might be trying to distract the team while they exploit that... though I don't see what it gets them compared to quietly exploiting the XEN issues before they were common knowledge on 9/8.
XSAs are released alarmingly frequently. I've always wondered if KVM is really more secure or if it just gets less scrutiny (certainly there's a class of qemu vulnerabilities that impact both hypervisors).
I've noted the AWS security bulletins[0] list nearly every Xen advisory with "AWS customers' data and instances are not affected by these issues, and there is no customer action required."
It would appear you'd need to go back quite a few months of being unpatched to find a genuine issue, unless something about Amazon's mitigations doesn't apply universally.
Amazon and other big cloud providers get Xen security fixes first; there was an HN discussion about it. So they've probably already implemented the fix.
I disagree with people saying these types of attacks can't be prevented if you switched hosts. I'm sure Google + Cloudflare[0] would keep your website online. AWS too, if you had the cash.
The amount of distributed traffic happening right now against linode would probably only represent a 5% increase in traffic to a popular Google product. At least you know they have the expertise. Nothing against the very smart and talented linode engineers, but the two companies are on very different levels of traffic engineering.
Attacks are rarely targeted to the hosting providers. They usually target a specific customer.
Google/AWS probably have 100 times the capacity (and redundancy and architecture reliability and failover and awesomeness) of Linode. That means that, first, they can't be taken down easily, and second, a DDoS is limited to a small subset of the infrastructure and doesn't bleed into every customer and service.
As for traditional hosting companies (OVH and the like): when you're being DDoSed, they'll null-route your IP space (i.e. they advertise your IPs as dont-exist-on-the-internet-anymore). The traffic is dropped while in transit on the internet because it can't go anywhere; it doesn't reach the hosting company anymore.
Note: being null-routed means your site and all your services are off the internet and thus effectively dead.
As for Cloudflare: they have many locations all around the world and they can absorb a lot of traffic, to the point that they themselves cannot be DDoSed. They have active monitoring and mitigation against common attacks and known malicious sources, which may stop the attack without you even knowing about it.
When you're under attack, you can block subnets/ASes/countries in the Cloudflare settings, or request a challenge/captcha from every visitor. Cloudflare will reject visitors (with or without challenging them) at their edge locations before any traffic can get to you. It is very effective in my experience.
Generally speaking, the only way to stop a DDoS is to stop it before it reaches your datacenters, so you need help from your ISP/provider/CDN.
Edit: the attack that took Linode down last Christmas was against Linode itself and not a specific customer. Part of the mitigation included Linode moving its critical services behind Cloudflare :D
> As for traditional hosting companies (OVH and the like): when you're being DDoSed, they'll null-route your IP space (i.e. they advertise your IPs as dont-exist-on-the-internet-anymore). The traffic is dropped while in transit on the internet because it can't go anywhere; it doesn't reach the hosting company anymore.
OVH hasn't been doing this for a while; they put a beefy DDoS protection setup in place for this exact reason - it was way too easy to take someone down for hours.
Online.net also includes protection (plus paid upgrades).
At least here in Europe, the big hosting providers are all switching to providing included protection for all their customers, at least for the traffic-intensive attacks which hurt everyone.
It's not hard to compare this to brush fires. They happen periodically, and only the big trees tend to survive them. Linode is getting pretty unlucky here, but I would imagine that all the small (and even medium-sized) hosting providers are going to succumb eventually. Is the end game just going to be Google vs. Amazon?
I'd love to see a network infrastructure and transport protocol that's more resistant to many (D)DoS attacks, because it seems like things will only worsen if it never becomes more difficult for people to attack others' servers online.
If application- and transmission-protocol-level DoS vectors are fixed, then you're left with just the raw "lots of traffic" volumetric attacks, which means your attacker has to have a lot of compromised hosts (or the right compromised hosts with lots of bandwidth). I'd say that's a reality that would be easier to handle, because you've raised the bar from anyone who can develop or use a script, deploy it to a few low-power systems, and exploit protocol shortcomings, to only those who have a bunch of higher-powered systems.
The smaller hosting companies may still very well go out of the game if the problem worsens, even if most DoS vectors do end up being mitigated. I don't know how I would respond to that at this moment, but hopefully it doesn't come to that. It's already tough to find a decent hosting company, in my experience.
I don't get the hate towards Linode here on Hacker News. I've been their client for a couple of years now and I find them an excellent VPS provider: excellent uptime and performance at a pretty good price. AWS has a few outages every year. Google just had one last week. Azure sucks balls. So why the hate? Is it because it competes with some Y Combinator startups?
which was caused by our own maintenance several weeks apart (the root cause description is really quite good).
I think the distinction people make implicitly is a 25 minute outage versus 8 hours. DDoS attacks suck, but they're just standard these days. As a customer though, any source of (network) outage usually has the same outcome: "my site is unreachable (and I don't care why)".
The reason we (and AWS and others) offer multiple datacenters/zones within a <1ms boundary (a compute "region") is so you can build a highly available app that can fail over with minimal degradation. For customers that were using the App Engine flexible environment with regional spreading turned on, only some of their instances were affected, but their apps shouldn't have skipped a beat.
Linode is good at what they do, but any customer in Atlanta just had to wait this entire event out.
I don't follow this too closely, so this is just wild speculation from me:
But could it simply be the severity of the attacks? I keep seeing comments about a two-week DDoS attack last Christmas - that's something I would be shocked to see Google/AWS succumb to. Not that Google/AWS attacks don't happen, I just can't imagine them being down for ~2 weeks.
(I imagine it was just one datacenter from Linode, not the entire service, fwiw)
They weren't down for the entire two weeks, but various datacenters went up and down for hours, then things quieted down for a few days, then it was back again, then another hit, stretching across two weeks.
One thing that made it take so long was that their upstream ISPs at some of the datacenters were themselves unable to handle the DDoS, so they had to switch ISPs, which took a while.
I don't see Google/AWS as easy to attack; but I'm not sure why similar tier players like DigitalOcean aren't being hit -- or maybe they're just less transparent about things, or are actually a smaller target (didn't think they were?).
No, it's because at one time the community actually liked and recommended them, then got burned, and acts accordingly.
Linode has a bunch of great features, but I've seen them get hacked a half dozen times over really silly things and suffer more DDoSes than you would be happy with, and frankly I have had interactions with their management (just online) that I was sorry to have had.
You can also read a bunch of implications from former employees about their management, but feel free to discount that given how often ex-employees are a bit pissed.
I've always wondered: while in similar cases GCE/AWS can handle the traffic, isn't that traffic billable? So while you will probably not be affected by the DDoS, aren't the costs going to cut your head off?
I'm running the maths right now.. and you could convince me to take down my side project by just having a server outside their network put wget in a loop targeted at my S3 resources.
>Update - We have been experiencing a catastrophic DDoS attack which is being spread across hundreds of different IP addresses in rapid succession, making mitigation extremely difficult. We are currently working with our upstreams to implement more complete mitigation.
Well, I had Linode shortlisted for an upcoming project. I hate to take them off the list because it is not their fault, but I don't want this kind of unreliability.
I don't understand this line of reasoning. It's not like DDoS attacks are some kind of 0-day failure mode that nobody has seen before.
Would you also say "it is not their fault" if their uplink provider had a fiber cut and they didn't have redundant uplinks? I'm guessing not: it's well understood that as a service provider you need to plan for this kind of unavailability and pay more money for redundant links. So it seems really weird to have this double standard for a different kind of availability failure mode.
Just like network availability or datacenter power availability, you need to invest technical and financial resources into DDoS defenses if you want to be resilient to incidents. If you don't do that as a hosting provider, I definitely won't feel sad for you.
There are only a handful of environments that can sustain a large, coordinated DDoS attack. Can you sink 10-20 Gb/s of traffic forever? Not cost-effectively.
At work, my www servers get short DDoSes on a regular basis; on our 10G hosts, 10G+ attacks are livable (outgoing TCP throughput goes down because incoming ACKs are part of the traffic that gets dropped when total inbound is above the NIC capacity). We have some newer boxes with 2x10G; I'd imagine those should be able to handle 20G of attack, but I haven't noticed. (I usually only check for a DDoS if external monitoring shows an unexpected failure.)
That's for volumetric attacks (UDP reflection); TLS handshaking can eat all the CPU way before we run out of network :(
Usually it is, and according to Linode it is in this case too.
Edit: not sure what alternate reality the downvoter lives in, but the vast majority of attacks these days are just "dumb" packet floods or even easier-to-filter reflection attacks. (Linode clarified on IRC that this one is a mix of DNS and NTP traffic.)
But hey, go on and find me a layer 7 attack that'll take down entire datacenters :)
One difference between Linode and providers like AWS is that the typical deployment architecture on Linode still exposes customer VPSes to direct L3 internet traffic whereas on a best-practices AWS deployment that is almost never the case.
I'd imagine it's easier to filter out bogus L3 traffic when the vast majority of your target IP space comes with explicit configuration as to what sort of L7 application traffic is acceptable.
You can't really compare AWS to Linode. AWS has hundreds of Gb/s of transit bandwidth, so they can easily absorb big attacks. They also have a backbone network, which allows them to increase the surface area of an attack, which increases the available bandwidth.
It's actually not that easy to filter "bogus" traffic. In the hosting world, especially cloud, you have thousands of customers doing whatever they want; who knows what is bogus or not. And even if you can filter it at your edge routers, your transit links are still going to be getting slammed. The filtering needs to be done upstream in the ISP network, and that is usually a manual process, as no one supports BGP FlowSpec at the moment.
RTBH is the best way to defend if you don't have the bandwidth to absorb it.
Block DNS/NTP in the security group => Problem solved (unless it only filters traffic at the instance input)
Put an ELB in front of the services, the ELB only listening to port 80/443, roll out the ELB publicly, roll out new instances only accessible privately, kill old instances being DDoSed => Problem solved => Repeat for all other services, they shouldn't be publicly accessible in the first place.
Ain't saying it's easy but there are some options to help mitigate the attack.
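For example, the security-group part is a couple of CLI calls (the group ID and CIDR here are placeholders; security groups are default-deny inbound, so "blocking" really means making sure no rule allows that traffic):

    # drop any ingress rules that allow DNS/NTP reflection traffic to reach instances
    aws ec2 revoke-security-group-ingress --group-id sg-0123456789abcdef0 \
        --protocol udp --port 53  --cidr 0.0.0.0/0
    aws ec2 revoke-security-group-ingress --group-id sg-0123456789abcdef0 \
        --protocol udp --port 123 --cidr 0.0.0.0/0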
Atlanta is their smallest datacenter IIRC with their lowest bandwidth capacity.
I would still recommend them as we have had great luck with their Dallas DC. It seems to be the most resilient of them all in terms of random network outages and also DDOS attacks.
However, the one in December did take them offline for a few hours, after which they apparently implemented more DDoS protection. Keep in mind this could happen to anyone (DigitalOcean, Vultr, etc.), and mostly their mitigation technique seems to be to kill your VM until the attack is over.
>the one in December did take them offline for a few hours
That is a bit of an understatement. Our servers were down every couple of hours for a week, and the attack bounced around to pretty much all of their datacenters.
Sorry I should have clarified. Our servers were down for a total of 4-5 hours spread across a week or so. After the first incident it only took a few minutes for them to come online each time, although I hear others had varying downtimes while they mitigated that DDOS.
Thankfully it was over Christmas time so our clients were mostly offline.
How was the recent attack in Dallas? Over Labor Day weekend, Atlanta went down for about 25 minutes until they squashed it. Then I saw Dallas came under attack. How did it handle the DDoS this time?
We had a few seconds of latency tacked onto our normal response time for about 40 mins but everything stayed up. Here is a graph from our monitoring tool (grafana with WorldPing) http://tomschlick-screenshots.s3.amazonaws.com/RKRIJQ9Q
Very interesting. Looks like Dallas can still handle a DDoS better. I got news that Atlanta is increasing its bandwidth 6x real soon; it's scheduled for this month. They handled this DDoS pretty well, although we actually had downtime. Adding 6x bandwidth should make a huge difference. Not sure which datacenter should be my primary after the 6x upgrade, though. Oh wait, Linode just responded to me: they are recommending Atlanta over Dallas for DDoS after the 6x upgrade. Hmmmmm.
I have no evidence to support this theory, but I believe that Linode is not an outlier with regard to frequent DDoS attacks. What makes this company special seems to be how it communicates with its customers when it's under attack.
This leads me to wonder: How much do other providers leave customers in the dark?
Can anyone recommend a good article that explains how attacks like these work, and what is required to stop them?
Also, we're on Heroku and they advertise DDoS mitigation as a feature, but "mitigation" sounds non-committal and I'm curious how they'd fare against a similar attack.
It's non-committal because at some point, when you have enough zombie hosts properly distributed all over the world attacking you, your only defence is to have more bandwidth than the attackers. If your peers can't filter out the traffic before it hits your network and it simply saturates your pipes, there's nothing you can do inside the company anymore.
Cisco has the easiest to grasp fundamentals [1], but to really understand DDOS attacks you'll have to dig down into learning about how TCP works and network layers.
Mitigation sounds non-committal because it is. I don't personally have any experience with mitigating a DDoS attack on Heroku, so I'm not qualified to talk about their preparedness, but DDoS mitigation varies quite a bit [2][3], running the entire gamut from "block a single IP address" to "there are hundreds of IPs rotating and attacking". So the answer to that would be: it depends on who you've pissed off or who's feeling particularly nasty that day.
Linode is a cost-effective solution compared to AWS. But the security and DDoS issues could raise eyebrows and create confidence issues with customers. This has happened even after they posted about enhanced DDoS mitigation strategies like procuring more bandwidth.
I really feel bad for these folks. Does anyone know if they have a DDoS mitigation strategy other than RTBH with their transit providers? I would have thought that after the 2015 attack they would have looked into traffic scrubbing with something like Arbor Networks or Prolexic. I understand that these are not cheap, and Linode's margins, like those of many other hosting providers, are probably thin, but I would think it would pay for itself in one or two attacks by minimizing the customer churn an event like that causes.
I remember a few years ago when I moved my Linode from Fremont to Atlanta to avoid the frequent outages. I've never had show-stopping issues with Linode and the customer service has always been fast and responsive. Now though, I'm thinking of moving to their Frankfurt datacenter.
But now I think I need to set up failover with another VPS provider. What's a recommended alternative? Is DigitalOcean the next best choice after Linode?
Yes, the FBI. They're fantastic at it and part of their job is helping businesses recover from compromise and going after the attackers. However, they're overworked government employees with not enough resources.
The FBI can probably monitor both sides of those nodes (tap the data centre?) if they're in the USA. So can't they monitor all the nodes' clients, then do something like block returning traffic and look for timing or other metadata in the re-request from the command server?
DigitalOcean has had a few DDoS attacks targeting their SFO1 datacenter over the past few months, but fortunately each one seemed to disappear in under half an hour.
When you host the DNS records, and in many cases the sites too, for some 60-70 million domains, someone on the internet wants to attack someone at GoDaddy all the time. There isn't really a time when they aren't being attacked somehow, and I'd presume it's much the same for Linode.
The means of really combating a DDoS are costly and extensive, which is why most use a service like Prolexic or Silverline, typically along with the massive infrastructure that comes with it. With anything less than n-by-40Gb disposable internet pipes, preferably as regionally distributed as you are, you can and will be smitten at will.