A couple of outages that affected my client I learned about first on HN. We were able to mobilize a team and get on top of them faster than any monitoring team at the client (a state government). I feel like HN should invoice us, haha.
Even when it's a false alarm, it's usually because something else is having a problem that affects many people and manifests itself as a particular service being down.
As always, a note to infrastructure providers: HOST YOUR LIVE STATUS DETAILS ON OTHER INFRASTRUCTURE.
Of course, they won't. If they host it on someone else's infrastructure, that might look bad (tacitly saying that a competitor is reliable and might be up when they are down), and if they hive off an extra copy of some of their own infrastructure, there will still be single points of failure, either accidentally, by human error (someone somehow messing up both segments at once), or by design (possibly management trying to save pennies when they notice this extra bit of infrastructure on the balance sheet).
I agree with your lead statement and argued as such, but was overruled. Last I knew and understood, Heroku Status is static pages pushed out to Fastly, with the internal admin site (that does that work) running in a Heroku Private Space. If you look at the DNS, it still appears to be served by Fastly, and Heroku Private Spaces are generally pretty isolated infra, so I would be curious what the failure mode was here. But ultimately this is the fire you play with when you self-host your status site...
I am pretty sure Heroku has a Linode box or something for their status page. I may be wrong though. They may also have moved it but that would seem like an incredibly dumb idea.
I worked there and I vaguely remember something like this but it's been a long time.
If I had to guess this is probably a DNS issue (it's always DNS).
Ahh... were you asleep in 2021? They went down in September, November, and December, and probably some other times earlier in the year as well. I stopped keeping track of how bad their service is because of how bad AWS's service is.
I recall at least one of them where I could not deploy new versions of my app, which is pretty bad. I don't believe I had any downtime at all due to Heroku platform outages in 2021. I believe you if you say you did, though!
This particular outage definitely seems to be of a rare level of severity.
I think apps still worked but I'm not sure. I know I couldn't create a PrivateLink to our AWS environment because the CLI kept failing, so we were locked out of a dev database. Not too bad.
Well, they aren't completely down now either. Heroku Shield is up. Dashboard is up. EU is reported as up.
As an aside, I spoke with our AE in early January about where they are going in dealing with AWS unreliability. One would think they would have a good answer. They don't.
You haven't been paying attention. We're very heavy Heroku users. Their dashboard/API has gone down a few times in the past year. Overall they've been reasonably good.
It's much better than it was 5+ years ago. Back then they had almost weekly downtime.
I am paying attention. My business runs on Heroku. I never said they don't go down... they go down quite a lot (much to my dismay), but never completely down like this one. Even their status page is down, which is a first for me. Must be bad.
Heroku runs on AWS, so it would be an AWS attack of some sort. But AWS in general seems to be up from what I can tell, so I'm feeling like it's not a cyberattack, and just a Heroku specific problem.
> Every organization in the United States is at risk from cyber threats
Heroku and AWS are organizations, no?
> While there are not currently any specific credible threats to the U.S. homeland, we are mindful of the potential for the Russian government to consider escalating its destabilizing actions in ways that may impact others outside of Ukraine.
This morning at work we had a database connection issue for a few hours. I work for a school district, which is an organization. Therefore, it was probably a Russian cyberattack.
=== Availability of Common Runtime apps 2022-02-24T16:50:47.249Z https://status.heroku.com/incidents/2402
investigating 2022-02-24T16:50:47.249Z (1 minute ago)
Engineers are looking into reports of connectivity issues to Common Runtime apps in the US and EU regions.
It was, but I don't know why. I'm curious to hear if Heroku releases any information about how this happened. Heroku's DNS was returning a single 100.64.x.x address, which is in a reserved range.
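If you want to check for this kind of thing yourself, here's a rough Python sketch: resolve a hostname and flag anything in 100.64.0.0/10, the shared address space that RFC 6598 reserves for carrier-grade NAT, which should never show up in public DNS for a hosted app. The hostname below is a placeholder, not a real app.

    # Resolve a hostname and flag addresses in 100.64.0.0/10, the shared
    # address space reserved by RFC 6598 for carrier-grade NAT.
    import ipaddress
    import socket

    CGNAT = ipaddress.ip_network("100.64.0.0/10")

    def check(hostname: str) -> None:
        try:
            infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
        except socket.gaierror as exc:
            print(f"{hostname}: lookup failed ({exc})")
            return
        for info in infos:
            addr = ipaddress.ip_address(info[4][0])
            flag = "  <-- reserved CGNAT range!" if addr in CGNAT else ""
            print(f"{hostname} -> {addr}{flag}")

    # "example-app.herokuapp.com" is a stand-in for your own app's hostname.
    check("example-app.herokuapp.com")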
What is the proof that AWS is down? Functional monitoring of AWS by metrist.io (I'm a co-founder) shows no AWS problems. Downdetector is not a reliable source.
It's based solely on social media and user reports. It's the "smoke" in the saying "where there's smoke, there's fire," with the caveat that in some cases there's actually no fire, even if there's a decent amount of smoke.
Downdetector relies on user reports, so e.g. if a user's ISP is down and they can't get to Facebook, they might report Facebook being down (or vice versa). DD spikes are typically indicative of _something_, but it's not always the actual down service.
Gotcha. Although for a spike this large (over 1000) for a tech service (AWS vs. Facebook), I'd give some credence to it. It could be that everyone who reported AWS as down is running on Heroku. Definitely possible. For comparison, Azure [0] and Google Cloud [1] have spikes under 30.
It can be useful, but you have to take it with a grain of salt. A perfect example is the recent Facebook (Meta) outage. When that happened, Downdetector showed that AT&T, Verizon, and T-Mobile all had issues. They didn't; it was just Facebook, and users mentioned or otherwise claimed that it was their mobile carrier.
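That's the gap "functional monitoring" (mentioned upthread) tries to fill: probe the service directly instead of aggregating user reports. A minimal sketch in Python; the URL is a hypothetical health endpoint, not any vendor's real one.

    # A minimal functional check: hit an endpoint directly rather than
    # inferring status from user reports. The URL is a placeholder.
    import time
    import urllib.request

    def probe(url: str, timeout: float = 5.0) -> None:
        start = time.monotonic()
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                print(f"{url}: HTTP {resp.status} in {time.monotonic() - start:.2f}s")
        except Exception as exc:
            print(f"{url}: unreachable ({exc})")

    probe("https://status.example.com/healthz")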
If the image tag is drawn dynamically, then it's actually probably less bandwidth than the Unicode character, since the image can be cached in multiple locations, including the browser.
Because if it is cached, then it is 0 bytes transferred. The first request would probably be a few hundred bytes, but it's never needed again. And once it is at a CDN, there is never another request to the origin server.
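Easy enough to verify, too: the asset is cache-friendly if the response carries a long Cache-Control max-age, or an ETag for cheap revalidation. A quick Python sketch, with a made-up icon URL:

    # HEAD the asset and print the headers that determine cacheability.
    # A long max-age (or an ETag) means browsers and CDNs stop hitting
    # the origin after the first fetch. The URL is a placeholder.
    import urllib.request

    req = urllib.request.Request("https://status.example.com/green-dot.png", method="HEAD")
    try:
        with urllib.request.urlopen(req) as resp:
            for name in ("Cache-Control", "ETag", "Expires", "Content-Length"):
                print(f"{name}: {resp.headers.get(name)}")
    except Exception as exc:
        print(f"request failed: {exc}")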
At first the Heroku status page was still all green for me... although it took 30 seconds to load. I guess when the status page takes 30 seconds to load, that's an indicator!
I did figure, okay, a 30-second-to-load status page probably means my app outage is a Heroku platform problem.
(Also an indicator that the status page is sharing too much platform with the platform it's supposed to be reporting on? Also, in this case, an indication that the platform problems are pretty deep?)
Interestingly, my app logs (via Papertrail, which is still up) show that some traffic is getting through continually during this outage, even though I can't connect (and neither can my monitoring app, which is what pinged me).
The EU outage towards the end of last year was similarly bad, but lasted much longer. I asked Heroku for at least a refund of the dyno time while our apps were unavailable and was flatly told no.
I've only ever used Heroku for free-tier personal projects, but as I understand it, they use AWS to do the actual hosting. I can understand an outage affecting their deployment process, but what could cause running servers to go down?
As I typed that out I remembered that they handle DNS, load balancing, and databases, so I guess any one of them.
It indeed behaved like a routing issue of some kind (my app was still up, and was still logging, just no traffic could get to it), and a Heroku incident status line said "Engineers are recovering affected routing components," so, yup.
However, while I could not connect to my app, nobody I know could connect to my app, and my monitoring service could not reach it for a ping, my app logs showed that some traffic continued to get through throughout the outage. So it was not entirely universal. And it was clearly a routing problem of some kind.
It may have been during an earlier part of the incident, but we just scaled multiple affected apps down to a single dyno and they all recovered. A ps:restart with multiple dynos had no effect.