Upptime – GitHub-powered open-source uptime monitor and status page (upptime.js.org)
301 points by fahrradflucht on Dec 27, 2020 | 82 comments



Given that in recent years GitHub has had more downtime than your usual app, the GitHub-powered part may be more of a liability than an asset ;).


Yeah, and GitHub Actions is also usually down more often than their other services.


It’s crazy to me how unreliable and slow the work on GitHub Actions has been. Their self-hosted runners didn’t even support custom labels for like 8 months. And for at least a year you couldn’t trigger actions from pull requests from forks - even for private, in-org repos.

It’s truly been one of my worst experiences with any CI.


Although if GitHub is down, there's probably little you can do to fix your site anyway :D


This is the reason why there should always be break-glass options to deploy to production for emergency circumstances, and well known protocols to follow when something like this is required.


Yep. I set up all my deployments so they can be run from anywhere (with the right deps and auth), even though I rarely need to. It makes it easier to test deployment processes too.


Maker here, I made Upptime as a way to scratch my own itch... I wanted a nice status page for Koj (https://status.koj.co) and was previously using Uptime Robot which didn't allow much customization to the website.

It started with the idea of using GitHub issues for incident reports and using the GitHub API to populate the status page. The obvious next step was opening the issues automatically, so the uptime monitor was born. I love using GitHub Actions in interesting ways (https://github.blog/2020-08-13-github-actions-karuna/) and have made it a large part of my toolkit!
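
In a nutshell, the whole loop is "run a scheduled check, open an issue if it fails, build the page from issues". As an illustration only (this is not Upptime's actual code, and the owner/repo names are placeholders), the issue-opening part looks something like this:

  // Illustrative sketch only, not Upptime's actual code.
  // Assumes Node 18+ (global fetch) and @octokit/rest with a repo-scoped token.
  import { Octokit } from "@octokit/rest";

  const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

  async function checkAndReport(url: string): Promise<void> {
    let up = false;
    try {
      up = (await fetch(url)).ok;
    } catch {
      up = false;
    }
    if (!up) {
      // The status page is generated later from open/closed issues.
      await octokit.issues.create({
        owner: "your-org", // placeholder
        repo: "status",    // placeholder
        title: `🛑 ${url} is down`,
        labels: ["status", "down"],
      });
    }
  }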


https://news.ycombinator.com/item?id=25557032 mentions "~3000 minutes per month". GitLab's new pricing structure:

  runner minutes/month    USD/month
  400                     $0
  2,000                   $4
  10,000                  $19
  50,000                  $99

You can run a self-hosted GitHub or GitLab Runner with your own resources: https://docs.github.com/en/free-pro-team@latest/actions/host...

GitLab [Runner] also runs tasks on cron schedules.

The process invocation overhead for CI is greater than for a typical metrics collection process like a Nagios check, or for a memory-resident daemon like collectd with the curl plugin and the "Write HTTP" plugin (if you're not into using a space- and time-efficient timeseries database for metrics storage).

An open source project with a $5/mo VPS could run collectd in a container with a config file far far more energy efficiently than this approach.

Collectd curl statistics: https://collectd.org/documentation/manpages/collectd.conf.5....

Collectd list of plugins: https://collectd.org/wiki/index.php/Table_of_Plugins
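
For concreteness, the collectd approach described above boils down to a config file roughly like this (a sketch pieced together from the manpage linked above, untested):

  # Sketch only - see collectd.conf(5) for authoritative syntax.
  LoadPlugin curl
  <Plugin curl>
    <Page "mysite">
      URL "https://yoursite.tld/"
      MeasureResponseTime true
    </Page>
  </Plugin>

  LoadPlugin write_http
  <Plugin write_http>
    <Node "collector">
      URL "http://metrics.example/collectd"
      Format "JSON"
    </Node>
  </Plugin>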

Is there a good way to do {DNS, curl HTTP, curl JSON} stats with Prometheus (instead of e.g. collectd as a minimal approach)?


Interesting idea, and props for well-executed landing page at Koj!

Personally I’m wary of relying on GHA too much (I have a suspicion they might start introducing restrictions), but if it works, it works.

I see you’re also using TypeScript, so I’m definitely taking a long look at your github-actions-starter (thanks for sharing).


> Interesting idea, and props for well-executed landing page at Koj!

Thanks so much! This isn't Koj's "Show HN" (expect that soon!), but we're doing some interesting stuff, both tech- and "home and living"-wise. Happy to hear any additional feedback for koj.co as well!

> I have a suspicion they might start introducing restrictions

I definitely think so, unlimited minutes for all public repos can only be sustainable with many many paid orgs buying expensive minutes. If restrictions kick in, I think I'll keep the "GitHub issues to status page" part of Upptime and have some integrations with self-hosted monitors or third-party services that then open those GitHub issues automatically.

> I see you’re also using TypeScript, so I’m definitely taking a long look at your github-actions-starter (thanks for sharing).

Enjoy! We're using Svelte and TypeScript everywhere at Koj, we also have some other starters here if you're interested: https://github.com/koj-co.


Wondering, do you have a sort of vocabulary of all the emojis you use in commit messages, or is it more inspiration-based?


Interesting project and great docs!

Can you point out the code responsible for executing the HTTP connection? If it's at all modularized, I'd like to plug in code for a different kind of check.

(Are you thinking of supporting customized checks through plugins?)


This is where the `curl` happens: https://github.com/upptime/uptime-monitor/blob/master/src/he...

It is utilized by this function here: https://github.com/upptime/uptime-monitor/blob/master/src/up...

If you're thinking about different checks, perhaps you can explore this too? I have no idea where to start... https://github.com/upptime/upptime/discussions/73
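
To give the discussion something concrete, a plugin API could look roughly like this - entirely hypothetical, nothing like this exists in the codebase yet:

  // Hypothetical plugin interface for custom checks (not in Upptime today).
  import { Socket } from "net";

  interface CheckResult {
    up: boolean;
    latencyMs: number;
  }

  interface Checker {
    name: string;
    check(target: string): Promise<CheckResult>;
  }

  // Example plugin: a raw TCP-port check instead of the built-in HTTP check.
  const tcpChecker: Checker = {
    name: "tcp",
    check: (target) =>
      new Promise((resolve) => {
        const [host, port] = target.split(":");
        const start = Date.now();
        const socket = new Socket();
        const done = (up: boolean) => {
          socket.destroy();
          resolve({ up, latencyMs: Date.now() - start });
        };
        socket.setTimeout(5000, () => done(false));
        socket.once("error", () => done(false));
        socket.connect(Number(port), host, () => done(true));
      }),
  };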


Abusing version control and a CI service as a database sounds like a way to ruin it for open source projects that actually need it. The recent Travis thing didn't happen without reason.


I second this - it sounds like a massive waste to spin up a Github Actions job in order to trigger a basic request every 5 minutes, 300 times a day.

Why not just use a dedicated uptime service with a free tier? I've been using UptimeRobot [1] in the past - they give you 50 free checks at the same refresh rate of 5 minutes.

[1] https://uptimerobot.com/


I use Uptime Robot too. Been using them for years on the free plan. It's dependable.


Uptime Robot is great. Status Cake is another excellent service with a free/hobbyist/homelab tier.

https://www.statuscake.com/


How about self hosted enterprise GitHub?


Then the issue is getting a GHE license. Maybe if there was a community edition...


You say "spin up" which in my mind implies a whole VM. I think surely actions are just run in sort of a container of some sort like a lambda.

I would think it should be rather efficient to run a github action.


Yes and no. It depends on tenancy requirements. I know that with Docker and other common Linux container strategies you would want to keep each tenant on their own VM. A container isn’t safe enough.

So if this is your org’s only action, then you’re probably spinning up a VM. If you have other actions, you’re probably not adding any overhead.

(Edit: grammar)


Ya, but I doubt that is the case with actions, because I don't think you really have full access to everything. You provide a yaml file and their software runs that yaml which could easily exclude any dangerous commands. Plus, github offers a hosted runner service where you pay for a dedicated VM to run your actions in. So that makes it seem like actions are probably run together on larger VMs by default.


It is definitely the case with Actions.

> I don't think you really have full access to everything

You do.

> their software runs that yaml which could easily exclude any dangerous commands

Categorizing dangerous commands is impossible to do accurately by just looking at a yaml file.

> Plus, github offers a hosted runner service where you pay for a dedicated VM to run your actions in. So that makes it seem like actions are probably run together on larger VMs by default.

I'm not sure what this means. The paid hosted runners are not any different from the free hosted runners, but free runners can only be used on public repos.


You can also use more secure containerization technologies than native Docker, like gVisor, to achieve both lightness and isolation.


I'm pretty sure GitHub Actions do use a whole (though probably lightweight) VM. And I think AWS Lambdas do as well, or at least used to.


Well, if lambdas are using VMs and have sub-second launch times, I don't think using a lot of github actions would cause much overhead.


This is a common misconception about Lambda and Functions. They don't always give you a fresh container/VM, but Actions does.

https://aws.amazon.com/blogs/compute/container-reuse-in-lamb...

https://docs.microsoft.com/en-us/azure/azure-functions/funct...


They support Linux, Windows, and macOS. Surely that cannot be covered just with containers. On Linux, they allow workflows with a large number of different containers involved, and I don't think GH would be happy to debug all the Docker-inside-Docker problems. So, I guess there is a control algorithm that keeps up to N (100?) VMs spun up in a free VM pool, with the KPI of VM allocation from the pool being under X s (5?).

Edit: from https://docs.github.com/en/free-pro-team@latest/actions/crea... "Actions can run directly on a machine or in a Docker container".


I'm pretty sure the reason for "the recent Travis thing" is called Idera, Inc.


Yeah but bleeding money due to abuse of a free service surely contributed to the decision to sell.

Microsoft had much deeper pockets, but if people start using GH Actions as free Lambda, basically... the gravy train can't go on forever.

I figure a better middle ground would be for GH to throttle CI workflows to something more reasonable, like at most 1/hour.


Does 1/hour really count as /Continuous/ integration/delivery?

They made 2000 action-minutes-per-month free for a specific reason. I think they can bear the cost and the potential 'abuse' exhibited by free orgs.


I doubt Microsoft (GitHub's owner) cares that much about such tiny usage. They run all of Azure, after all.

A small price to pay, it seems, to lock more people into GitHub/GitHub Actions. We're talking pennies, here.


> I doubt Microsoft (GitHub's owner) cares that much about such tiny usage.

They probably don't care about one person doing this, but if tens of thousands of repositories start doing this it becomes a problem.

> They run all of Azure, after all.

They get paid for that.


They will for this too. Github actions aren't free. They just have a generous free tier.


No. Actions is 100% free for public repos. Full stop.


Currently, of course. Travis used to be free for OSS projects too.

I suspect Microsoft is less interested in making a profit on every single part of the GitHub ecosystem, and thus less likely to change the pricing/limits for actions. But of course only time will tell.


Yeah, MS has been pretty vague so far on gha usage. Basically “have fun; don’t go crazy”. I imagine they are waiting and collecting more data before opting to impose any specific usage constraints.


They've only recently started playing nice with open source and developers. I doubt they'll do anything to jeopardise that.


Good point.


I've been using updown.io for my side projects for the last 18 months and I'm very happy with the service. One comment pointed out that status monitoring is expensive - I think I've paid a grand total of $5 for these 18 months, so I don't agree at all.


I love and would recommend updown.io. I use it for both personal and work projects. Recently, my personal account's free tier expired and I was in no hurry to upgrade. Nothing was mission critical, and I wanted to wait for the year to be over.

They dropped me a good amount of credit to keep my account running while I wait to upgrade.


updown.io is great - simple, reliable and cheap.


Throwing my old but reliable service in the mix, too: servercheck.in — I kicked it off with a Show HN about a decade ago, and it brings in about $170 MRR — enough to pay for the globally-distributed servers it runs on :)


Neat. A lot of the status page solutions are very expensive - up to $1,500 a month: https://www.atlassian.com/software/statuspage

And that doesn’t include the synthetic monitoring part.

That said, we use Site24x7 at work and it works well enough.


To be fair, $1,500 a month isn't much for the companies that would be purchasing that level of service. They would be spending more than that on the resources to maintain an internal one, via servers and people-hours.


Yeah, it starts at $29 and goes up from there; $1,500 is their maximum price...


No, there are still limitations at that tier, e.g. 50 team members.


True, I didn't put that down, my bad.


From the GitHub usage policy:

“(for example, don't use Actions as a content delivery network or as part of a serverless application, but a low benefit Action could be ok if it’s also low burden); or”

I think this would qualify as a serverless application...


As a proof of concept this is fascinating, and I'm thankful the author chose to share it. It's got me thinking of how I might otherwise leverage Github Actions in unconventional ways. Acknowledging the potential abuse implications that others have mentioned in the comments, it's still a mind-expanding PoC that's going to make me think about things a little differently in the future.


> It's got me thinking of how I might otherwise leverage Github Actions in unconventional ways.

I'm glad! I've been using GitHub Actions for many interesting things apart from building Upptime, like:

  - COVID-19 nonprofit work: https://github.blog/2020-08-13-github-actions-karuna/
  - Open-sourcing all my life data: https://github.com/AnandChowdhary/life
  - Open-sourcing my GitHub notifications: https://github.com/AnandChowdhary/backlog
  - Converting all Twitter "following" to lists: https://github.com/AnandChowdhary/twitter-list-sync
  - Writing notes and generating a summary: https://github.com/AnandChowdhary/notes


Why are there so many things these days about using some super complex cloud solution hack for something that can fundamentally be done easier and more efficiently from a cron job on a raspi zero on a home network?

  curl -s yoursite.tld | ./parse_result_and_generate_uptime_page.sh > githubpagescontent.md
  git add githubpagescontent.md
  git commit -m "$(./get_the_current_time.sh)"
  git push origin master

for $20, you can get the same thing with waaaay less stuff going on behind the scenes and without taking advantage of the ci resource for something it wasn't fundamentally meant for.


Well, outside of when GH goes down, managing this in the cloud allows you to make more stuff 'serverless' - you don't have to worry about hardware failure and the maintenance burden that comes with moving stuff [data, code, secrets, logical volumes, etc] to a new host when you need to decommission the existing hardware. As long as you commit to the repo manually every few months (GH will stop crons on repos that have no commit activity), GH will continue to run your code on good hardware.


The idea behind the "serverless" approach is a lightweight process which doesn't spend any measurable CPU / memory (at least for healthchecks). GitHub Actions is the opposite of that. After all, cloud is someone else's computer - in this case, several other projects can use this computer for something more important than health checks.


While I agree using Github is a bad idea for a status page, using any machine on a home network is an absolutely terrible idea.


What's the point of monitoring unless you can get paged about outages?


not to mention when, not if, github goes down


Might be a little off-topic, but I don't understand why monitoring solutions are both so expensive and simultaneously tend to offer such a low resolution. HTTPS-calls aren't particularly computationally expensive, so why do 10s or lower poll intervals not exist for cheap? I expect that for most of these monitoring services, redundancy and fixed development costs are much more expensive than the actual underlying infra, so the marginal cost of extra HTTP calls is probably low. Why not offer high resolution to set yourself apart from the competition?


> Might be a little off-topic, but I don't understand why monitoring solutions are both so expensive and simultaneously tend to offer such a low resolution. HTTPS-calls aren't particularly computationally expensive

This is 100% the driving force behind pricing at my SaaS (lean20.com). Requests are cheap, but from our analysis people want 1/min intervals. What is your use case for needing more than that?


Internally it's about accuracy of reporting. If you do high uptimes (say >99.95%), rounding starts to matter.

If you want the less bureaucratic version: downtime for mission-critical apps means customers call. Immediately. If your SaaS to manage a container terminal is down, that means the container terminal is down. That turns very expensive very fast. Knowing the system is down before you receive the first call is vital. Monitoring once a minute and taking 2 failing calls before reporting means you have a 2-minute delay. That is the difference between telling the customer 'yes, we've noticed and we're working on it' versus 'what downtime?' on the phone.

EDIT: I looked into your startup, and I definitely love that pricing model. That seems like the right strategy, and it beats other monitoring solutions easily at scale (>10 hosts seems to be the cross-over point, roughly). If you can report downtime within, say, 20s of it actually occurring via a webhook or Slack integration or so, I'd love to have an invite. E-mail is in my profile.


I think most monitoring services offer at most 1 HTTP request/minute checks - is there really a need for lower intervals?


Unless the alert is managed by a downstream automated system, I don't see the point of having an interval that is smaller than 1 minute -- 1 second or 1 minute won't differ much for a human interaction, right? Am I missing something?


Yes, it does differ. It takes about 20 seconds to place a call. If a mission-critical system goes down, customers will refresh a grand total of 3 times, then call. As I explained in the sibling comment, reporting downtime within a minute is the difference between 'what do you mean the system is down?' and 'yeah, we noticed and are working on it'.

The idea that polling should be done every minute or so is madness. Here are a few thoughts:

- If your app is meaningfully impacted by even 1 extra request per second, you have more serious problems to fix.

- Rounding matters if you run high uptime. If 4 minutes of downtime per month is high (e.g. ~99.99% uptime), rounding to the nearest minute could make a 50% difference in your reported downtime (worked numbers after this list).

- You can speed up your resolution of any downtime by almost a minute without any extra effort in terms of teaching people or having better processes in place. Even if the gains are small, this is the logical place to spend a tiny bit more to gain a minute on every downtime.

- For most things, standards don't need to be that high. But it doesn't hurt to have quicker feedback, and the cost is negligible.

- Some systems are quick enough in recovery that you don't notice the downtime, like when switching over to a hot replica. But your customers will notice any short downtime that occurs. Why? Scale. Your polling client is only one, but you might have thousands or millions of customers. Chances are at least one of them notices a hiccup. You should view it as your responsibility to ALSO be aware of this, even if you choose not to actively improve on it.

- To phrase it bluntly: standards are too damn low in DevOps.
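
To put rough numbers on the rounding point above (my own arithmetic, assuming a 30-day month):

  // Why 1-minute polling granularity distorts high-uptime reporting (illustrative).
  const minutesPerMonth = 30 * 24 * 60;          // 43,200
  const budget = minutesPerMonth * (1 - 0.9999); // ≈ 4.3 minutes of allowed downtime
  // A 30-second outage observed at 1/min resolution is recorded as 0 or 1 minute,
  // so a single incident can be off by up to a full minute - roughly a quarter
  // of the entire monthly budget.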


One limitation of their approach seems to be that it must be published in a public Github repo - it can't be a private repo.

This means it might not be so suitable for internal or local apps - only public-facing websites.

I wonder how difficult it'd be to make a mode for Upptime that used, say, local agents and posted the results to a local webserver rather than to public GitHub Pages.


Actually, it can! One of our users set up a proxy server for the GitHub API and used that as the source for the status website: https://github.com/upptime/upptime/discussions/54


I like using GitHub Actions for small automation tasks like this as well, but I'm curious what the execution time is for each of these runs. GH Actions has been reliable for me, but it tends to be rather slow due to the architecture. Also I feel like this would chew up all my build mins pretty quick (this is 8000+ runs/month).

Side note, I find it hilarious how many sites use those generic "modern" illustrations which have absolutely nothing to do with the service or page context. I know it's tempting to keep up with the modern design trends, but picking from one of those illustration packs almost always leads to questionable choices.


> I like using GitHub Actions for small automation tasks like this as well, but I'm curious what the execution time is for each of these runs.

I did a quick calculation and got to ~3000 minutes per month: https://twitter.com/AnandChowdhary/status/132966786702177075....
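
Roughly, the arithmetic works out like this (assuming ~20 seconds of wall time per run - an assumption on my part, so take it as a ballpark):

  // Back-of-the-envelope: one site checked every 5 minutes for a month.
  const runsPerMonth = (60 / 5) * 24 * 30;                 // 8,640 runs
  const secondsPerRun = 20;                                // assumed spin-up + check time
  const minutesUsed = (runsPerMonth * secondsPerRun) / 60; // ≈ 2,880 minutes/month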

> Side note, I find it hilarious how many sites use those generic "modern" illustrations which have absolutely nothing to do with the service or page context

Absolutely agreed, but it makes the site so much cuter! I just made a Docusaurus template and used it for a bunch of my open-source landing pages, like https://upptime.js.org and https://stethoscope.js.org, but yes, it needs updating!


Guilty as charged, as I've used those illustrations for a few personal projects as well.

For anyone interested, you can generate custom variations of them here and see more work from the original artists:

https://blush.design/collections


@anandchowdhary Can you explain the security implications of using Upptime?

Does it need a token to access all of the user's GitHub repos?

That's a little worrying because the code does not reside in the git repo; it's fetched from third-party repos by the GitHub Action. Does this mean the user has no way of auditing and locking down the code to prevent future compromises of third-party repos?

If yes, then an attacker that compromises Upptime or one of the third-party GitHub Actions could hijack all of the user's repos (including non-Upptime repos owned by the user).


This is great! Sure, even GitHub might have downtime, but for keeping track of the status of small side projects this is perfect.


I understand how people are saying "why not just do X?" but let's all try to keep in mind that this thinking goes against our ethos of invention and reinvention. As the adage goes, "if you have to ask why, you wouldn't understand".


Conversely, if you can’t explain “why?”, maybe the idea is just shit.


In case the designer of the page sees this, your dark mode needs some work: https://imgur.com/JHP9C6r


I get the sentiment of the negative comments, but the idea is clever.


In the same way a Rube Goldberg machine is clever? Sure.


In the not-too-distant past, GitHub itself was hit by some massive DDoS attacks.


Anything that is "GitHub-powered" as a requirement is not open-source.


Open-source just means "the source code is available under a permissive license"; I don't see why having a project based on GitHub infrastructure makes it not open-source. All the code here is MIT licensed...


So are youtube-dl and a thousand other API wrappers; if that's the case, you won't have the majority of the open-source community.


youtube-dl is a bad example because it isn't designed around using a site. It's designed around not using a site.


Love the name. Simple and original :)


I think I'm just lazy now... I first built Uppload (https://uppload.js.org) and now Upptime, so I guess I need a list of "up"-starting words to continue building projects!



