
The main problem that PagerDuty solved at the time was deliverability of alerts. It was, and still is, hard to make sure the right people actually get the alerts. It was so expensive because they had redundant hardware all over the place with different providers, and they ran it all themselves, including SMS and phone gateways.

One of their key differentiators was that they were not built on AWS, so when AWS had an outage, you still knew about it. That also made it expensive.

With PagerDuty, you're mostly paying for reliability: the peace of mind of knowing that someone will get notified when there is an outage.

This looks interesting, but from your page I'm not sure how you're better than PagerDuty.

App-only notifications look like a disadvantage to me. What if push notifications are down? What if I'm on DND?

Where is your infrastructure built? Is it on a cloud provider? What happens if that provider has an outage? If you want to build a PD competitor, you have to build it on your own hardware, in multiple datacenters owned by different people, with different interconnects. If you haven't done that, how will you stay up when your customers' provider goes down, so that they know about it?




FYI, PagerDuty is built on top of AWS. They used to do some multi-cloud stuff, but that's no longer the case (too expensive, too complex, causing more issues than it solved). Source: worked there for 2 years.


Alerting or control plane or both? If alerting is AWS-only, I'm very sad. What happens if there is a global AWS outage? There's never been one yet, but never say never.


If your service needs to survive a global AWS outage, you just can't run with any SaaS. So many of these companies are single-region in AWS. Auth0, Okta, Datadog, and many others put customers in a regional box, and if that region goes down, all of those customers go down.


And yet misconfigurations still mean people don't get paged.

A simple solution is to make triggering a "fire drill" a standard part of your on-call rotation hand-off.


We solved that by just triggering an alert five minutes after the rotation changed.

This made sure the new on-call was aware they had started their shift and had access to acknowledge the page and clear it.

And if they didn't, it escalated to tier two, who would find out why tier 1 didn't clear it.
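
A rough sketch of that hand-off check, written as a small simulation rather than anyone's real setup. The five-minute delay comes from the comment above; the ten-minute acknowledgement timeout, the `HandoffPage` name, and the `"tier1"`/`"tier2"` labels are assumptions made up for illustration, and nothing here talks to an actual paging service.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

HANDOFF_DELAY = timedelta(minutes=5)   # test page fires this long after the rotation changes
ACK_TIMEOUT = timedelta(minutes=10)    # assumed: how long tier 1 has before escalation


@dataclass
class HandoffPage:
    """Hypothetical test page sent to the new on-call after a rotation change."""
    shift_start: datetime
    acked_at: Optional[datetime] = None  # when the page was acknowledged, if ever

    @property
    def fires_at(self) -> datetime:
        return self.shift_start + HANDOFF_DELAY

    def responder(self, now: datetime) -> Optional[str]:
        """Return which tier is holding the page at `now`, or None if nobody is."""
        if now < self.fires_at:
            return None  # page has not been sent yet
        if self.acked_at is not None and self.acked_at <= now:
            return None  # acknowledged and cleared
        if now < self.fires_at + ACK_TIMEOUT:
            return "tier1"  # new on-call still has time to acknowledge
        return "tier2"  # unacknowledged: escalate so someone finds out why


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    page = HandoffPage(shift_start=start)
    for minutes in (0, 5, 14, 15):
        print(minutes, page.responder(start + timedelta(minutes=minutes)))
```

The point of modelling it this way is that the test page goes through the same acknowledge-or-escalate path as a real alert, so a broken notification setup shows up at the start of the shift instead of during an outage.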



