Panic used to have an iOS app that did this. It was called Status Board, and was magnificent.
You could put it on an old iPad on an easel on your desk and watch everything from RSS feeds to ping statistics. In an office setting, you'd hook the 'Pad up to a cheap flat screen TV so everyone could see.
Sadly, Panic discontinued it when it decided to go after the video game market.
Aside: this is one of the biggest lessons of my adult life. Just because I could make something doesn't mean I should make something. Learning to value your time is a very underdeveloped skill.
If that is sarcasm, then please keep in mind that the original comment author really got burned by this app's vendor when it stopped working. Maybe Stallman got something right after all?
In Stallman's thinking, though, he doesn't have the freedom to fix or update it himself.
(I don't think Stallman's ideas are necessarily right for everybody, but I'm glad he's doing his thing right out in his end of the bell curve to counteract the opposite end of the software philosophy craziness...)
Under Stallman's model the developer works for free, and I guess they need a second job to pay the rent. Ah, but GPL doesn't mean free as in beer? Oh yes, for practical purposes it does mean exactly that. Outlier business models excepted, of course!
Actually, no. Under Stallman's model it is perfectly okay to demand and earn payment for your work, but the results would still be free, quote, as in "free speech", not "free beer", unquote. Have you seen bounty offers in open source repos' issues?
I really don't understand; this sounds too ignorant and almost arrogant.
A web browser doesn't need to be set up; it is already included with the iPad (or any Android device, or any laptop or desktop, or your smart TV, smart fridge, or smart watch).
You need to configure your app during first launch, pointing it to your data source and maybe entering a login and password. That is exactly the same amount of hassle as opening a web site and making it your browser's default homepage.
I don't even want to talk about money, it's totally irrelevant, $10 or $0.
They added a tiny mention in the installer when I complained, but they still don’t mention how to opt out. The opt out doesn’t disable the spyware in the webpage, either.
I have a patched version (sneak/netdata) on dockerhub, if you like. The issue is that it’s just not that great of a system monitor. Looking at, say, a “last 24h” chart is difficult. It’s a good and pretty replacement for top/htop/iptraf/iotop, but that’s pretty much it. You still need a graphite or prometheus or mrtg/rrdtool ultimately for serious understanding beyond “what happened in the last few minutes”.
There are 17 lines of text output to verify the installation choices; 5 of them are devoted to opting out of the anonymous telemetry. The wording seems clear.
NOTE:
Anonymous usage stats will be collected and sent to Google Analytics.
To opt-out, pass --disable-telemetry option to the installer or export
the enviornment variable DO_NOT_TRACK to a non-zero or non-empty value
(e.g: export DO_NOT_TRACK=1).
Also worth mentioning: they are lying when they say it’s anonymous. It includes your IP address, which is a globally unique identifier. I believe it also transmits an installation ID, which persists on the machine.
I doubt that my IP address is a globally unique identifier. I can think of 15-20 devices that are sharing it now. Advertising networks jump through a lot of hoops to tie those devices together in a small cluster of user profiles.
The y/n is for install. It’s saying “do you want to install with telemetry, xor not install at all.”
That’s like those EULAs on websites that say “by continuing to use our site, you agree never to sue us, to surrender your first born to us, and to never speak ill of us for any reason, et c.”
That’s simply not affirmative consent. Imagine if you tried that in life! “Anyone who stays in this room after 5pm is consenting to be groped! Proceed at your own risk.”
That’s not how any of this works. Stating your intentions to assume and potentially violate consent is not obtaining consent.
I took another look, it's not a y/n prompt as I wrote before. It lists all the settings and asks for confirmation. Each of the other settings requires a trip to the docs to find out how to change before re-running. The telemetry is given special prominence with instructions about how to change before the prompt.
Have you found any tools for writing custom dashboards in netdata?
The docs are really light, and custom dashboards seem to involve hacking the default HTML and pulling out all the components (javascript/divs) you need. I thought about writing my own, but have too many projects already.
Set up netdata, check which plugin serves the type of chart you want, find the shortest one of that type, make a copy, and hack it until it works. I was doing this a while back so I can't remember the details, but it wasn't anything special to do... Ignore the docs; the existing plugins are all the documentation you need.
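If it helps, the smallest python.d modules look roughly like the sketch below. This is from memory, modeled on the bundled example.chart.py, so treat the import path and field layout as assumptions and diff against a module shipped with your netdata version before copying it:

    # Minimal python.d module sketch (assumed layout; compare with netdata's
    # bundled example.chart.py before relying on it).
    from random import randint

    from bases.FrameworkServices.SimpleService import SimpleService

    ORDER = ['random']

    CHARTS = {
        'random': {
            # [id suffix, title, units, family, context, chart type]
            'options': [None, 'A random number', 'value', 'examples', 'example.random', 'line'],
            # each line: [dimension id, dimension name, algorithm]
            'lines': [['my_random', 'random', 'absolute']],
        }
    }

    class Service(SimpleService):
        def __init__(self, configuration=None, name=None):
            SimpleService.__init__(self, configuration=configuration, name=name)
            self.order = ORDER
            self.definitions = CHARTS

        def get_data(self):
            # Called once per update interval; return {dimension id: value}.
            return {'my_random': randint(0, 100)}

Drop it next to the existing modules and enable it in python.d.conf - at least that was the convention when I did it.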
I know people like wallboards and monitors, but we found them to be an anti-pattern. If you find yourself looking at a wallboard/dashboard, it should already be an automated alert.
Understanding your metrics is a key part of so many roles, from devops, to product teams, to marketers...
Yes, you should be automating alerts whenever possible. Yes, you should be putting up key metrics in a visible place so everyone can see how the product is performing.
I can’t tell you how many times I caught an issue because I knew our metrics backwards and forwards, but it didn’t trip an alert threshold. Not every issue follows a pattern easily defined in a check, and human brains are incredible computers capable of helping to fill in that gap.
> I can’t tell you how many times I caught an issue because I knew our metrics backwards and forwards, but it didn’t trip an alert threshold.
So how many times was an issue missed because you weren't in the office, or because you were looking at your own screen and not dashboards at the moment?
Humans are incredibly powerful, but our whole job as SREs is to make things reliable, repeatable, and scalable. We're doing an industry-wide migration from elegantly hand-crafted LAMP stacks managed over SSH to Kubernetes and infrastructure-as-code, not because you can't fix problems with SSH (you can, and you can usually fix them faster and better) but because you can't scalably fix problems with SSH. Similarly, if a human found an issue and an alert didn't trip, I'd count that as a bug/missing feature in the monitoring.
It's valuable while you're still small and working out your monitoring to keep a human in the loop - but at some point you need to get rid of that single point of failure. By all means, rely on a human to figure out where your alerting is lacking (just like you rely on a human to write the infrastructure-as-code), but you should eventually not rely on human intervention to actually keep incidents from happening.
Instrumentation and alerts are vital - they leverage inhuman persistence, patience and low cost. But alerts do not substitute for a deep understanding of how your systems work.
A number of the more useful "pre-crime" alerts we have derived from that - if I hadn't been elbow-deep in our systems long enough to notice certain behaviors have non-obvious second- and third-order effects downstream, we wouldn't have the alerts at all.
So, I'm making a bit of a subtle claim - you should absolutely be elbow-deep in your systems, and you should be understanding things well enough to build these sorts of proactive alerts, but you shouldn't rely on people being elbow-deep for noticing problems in real time.
If you're ever at the point where you catch a problem and automated monitoring didn't, that's a bug in automated monitoring. If you are really good at finding new bugs in automated monitoring and more things to monitor because you're spending your time getting a sense of how the system behaves, that's fantastic, keep doing that. (That is one of the good reasons for dashboards IMO - a bunch of data to look at when you've already realized something's wrong. Just don't use dashboards to make the decision that something must be wrong.) If you don't improve your automated monitoring and you're worried things will start failing without humans watching dashboards, then you're not solving your existing bugs.
> but you shouldn't rely on people being elbow-deep for noticing problems in real time.
I completely and unreservedly agree.
> that's a bug in automated monitoring
As part of incident review, we explicitly added a "review monitor performance" step. My favorite part is that the number of times monitors are created, adjusted or complained about post-incident is in itself a highly useful datapoint.
> So how many times was an issue missed because you weren't in the office, or because you were looking at your own screen and not dashboards at the moment?
That's not a problem with dashboards. That's a problem with training and staffing people.
> because you can't scalably fix problems with SSH.
The number of businesses that need to worry about scalability is vanishingly small compared to the number of businesses that don't. Let's not pretend that one company's problems are the same as another's.
> you should eventually not rely on human intervention to actually keep incidents from happening.
He didn't state that the dashboard was the only way his organization kept tabs on things. He indicated that it was only one way, and specifically stated that an alert system also exists.
Why should Mike have to remember this? Why should all of your infrastructure depend on Mike not getting a text from his wife while walking to the fridge for a La Croix?
I read your comment as sincerely saying that such an arrangement would be "brutal". Looking at your downvotes maybe people think you were being sarcastic?
> That's not a problem with dashboards. That's a problem with training and staffing people.
Again, the whole point of us being computer people is that we think computers can solve problems in repeated, reliable ways. You can run a highly reliable, say, delivery-based bookstore by having a well-staffed group of well-trained human phone operators who pass messages onto human shippers. People did that (and they still do), and it worked. But we have the thesis that you can do this more efficiently and more reliably - in short, that you can deliver more business value - by using computers to automate the process.
> The number of businesses that need to worry about scalability is vanishingly small compared to the number of businesses that don't. Let's not pretend that one company's problems are the same as another's.
I do fully agree that different companies have different priorities, and in particular I think it's totally fine to rely on humans in the loop while a system is still young (or has just been redesigned) and you don't have a good codified sense of how it behaves yet. However,
1) Wall-based dashboards aren't a best practice, any more than SSHing to production servers is a best practice. It's the right thing for some cases, some of the time. I'd agree with "It's a valuable skill, and it's been useful;" I disagree with "It's so valuable you should make sure everyone does it." If you have the option of either getting good at alerts or getting good at dashboards, spend your time getting good at alerts, first. I'd say the same about infrastructure-as-code vs. SSH-to-prod (and I say this as someone who regularly SSHs to prod and is real good at single-machine old-school sysadminnery).
2) Scalability isn't about absolute size, it's about how much you can do with the resources you have. Small teams and not-yet-profitable teams need to focus more on scalability (in the sense I'm using it) because they simply can't staff enough people to cover up gaps in operability. For example, you're much better off figuring out how to set up HA and automated failover than saying "We're too small for that," setting up a weekly pager rotation with people on call 24 hours a day, and alerting them so much they can't do non-toil work (or worse, burning them out and having them find another job).
Many years ago I was on a ~4-person team at my undergrad computer club running web hosting. We ended up getting popular enough that many real university applications (course websites for submitting assignments, etc.) depended on us. Our priority was that, as students, we couldn't get paged during finals week because our academics would take priority, and yet finals week was the most critical time for the service to stay up. So we got real good at HA, at reproducible deployments and config management, etc. (I remember one time we spun up a new server during finals week - and we didn't have to do any fiddling to add it to the cluster precisely because we'd automated the provisioning process.) We had web pages with graphed metrics to inform our capacity planning, but no dashboards that anyone was expected to stare at, just alerts on full outages.
> Similarly, if a human found an issue and alert didn't trip, I'd count that as a bug/missing feature in the monitoring.
The way that I took the GP's point was that humans can find things that haven't yet been automated, while automation can't (at least not yet, but I'd argue it'll take AGI for that.)
Yes, I agree with this. But if you're relying on humans to look at dashboards to keep your actual service up in the moment, you're not seriously committing to automating (just like if you SSH to every machine you Terraform to tweak things, you're not really committed to Terraform).
What you should do is rely on automation to detect problems and alert people, and in postmortems, look at graphs and have humans say things like "Hey, this queue kept steadily climbing for three hours before the outage" or "We would have noticed it in this metric but it's so noisy so we can't alert on it" or something. Then you can write more automation (or focus on some prerequisite dev work).
I don't think anyone is arguing that, though. Lots of things humans notice, e.g. "we speculatively upped the virtual file system cache and now the service has worse throughput but better high-nines response time", are not things you can really build an alert for, and not things you really want an alert for either -- but they are absolutely things that would show up on a dashboard you're intimate with.
In other words, people are not arguing for replacing alerts with humans, but rather arguing that continuously looking at your metrics gives you a mental model for how your system's behaviour changes in response to changes in configuration, whether intentional or not.
From the very first formulation of Ubiquitous Computing, the idea of a calmer and more environmentally integrated way of displaying information has held intuitive appeal. Weiser called this “calm computing”...
When information can be conveyed via calm changes in the environment, users are more able to focus on their primary work tasks while staying aware of non-critical information that affects them. Research in this sub-domain goes by various names including “ambient displays”, “peripheral displays”, and “notification systems”...
A Taxonomy of Ambient Information Systems: Four Patterns of Design
One of the most important things a team needs over the long haul is a feel for their system. Many people refer to this as mechanical sympathy. And the way you develop that is long-term exposure to rich data.
Alerts are the red and yellow lights on your dashboard. But you get mechanical sympathy by listening to the sound of the engine, feel of the road, and the smell of things when you take a peek under the hood.
There are a lot of ways to achieve mechanical sympathy, of course. And information radiators are easily misused; you have to have the right information shown in the right ways for people to develop a correlative, intuitive understanding of what they've built. But nobody develops mechanical sympathy by looking at dashboard lights alone.
> you have to have the right information shown in the right ways for people to develop a correlative, intuitive understanding of what they've built
Lots of things have to be right for this to work, unfortunately, and company dashboards I've seen so far tend to be nowhere near it.
For instance, the dashboard refreshed $PERIOD only makes sense if you're showing data that updates $PERIOD, and if you can respond to changes in that data $PERIOD. $PERIOD = "in realtime" or "every minute" or "hourly" or whatever is relevant in a given context.
If you're looking at the dashboard much more frequently than the data changes, you're wasting time. If the data changes much more frequently than you're looking at it, you're likely to miss things, as 'geofft mentions elsewhere in the thread. And if you can't react to the data roughly as fast as it's updating, there's no point in looking at it so often. All those periods - recording, observing and reacting - must be roughly similar for the always-on dashboard to be useful, relative to generating reports every now and then.
Panels full of lights and charts work on fighter jets or on the bridge of the Enterprise, because the pilots/crew are in a tight feedback control loop with their dashboards.
(WRT. reacting in time, there are also error bars to consider. For instance, people on a diet are advised to weigh themselves weekly and not daily, because body mass varies by +/- 2kg during the day, so a naïve person checking weight daily would get fixated on those random oscillations. It's easier to tell regular people to reduce measurement frequency than to explain to them what a low-pass filter is and how is it relevant here. I have a feeling there's plenty of dashboard misuse that amounts to that too.)
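To make the low-pass filter point concrete, here's a toy sketch with made-up numbers - a simple 7-day moving average is enough to suppress the +/- 2 kg daily noise so the slow real trend shows through:

    import random

    random.seed(1)

    # Hypothetical daily weigh-ins: a slow real trend of -0.05 kg/day
    # buried under +/- 2 kg of day-to-day noise.
    days = 28
    weights = [80 - 0.05 * d + random.uniform(-2, 2) for d in range(days)]

    def moving_average(xs, window=7):
        # Crude low-pass filter: average the last `window` samples.
        out = []
        for i in range(len(xs)):
            chunk = xs[max(0, i - window + 1):i + 1]
            out.append(sum(chunk) / len(chunk))
        return out

    smoothed = moving_average(weights)
    print('raw day-to-day swing:       %+.2f kg' % (weights[1] - weights[0]))
    print('smoothed change over 3 wks: %+.2f kg' % (smoothed[-1] - smoothed[6]))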
--
Speaking of the Enterprise and "getting the feel for the system", there's something I'd like to try one day: make a monitoring tool that translates various system metrics into background sounds, creating an ambience similar to the one you hear on the Enterprise-D[0][1]. I feel a somewhat unobtrusive mix of background noises would be better for developing "the feel for the system" than a visual dashboard. Real-life examples of this are a combustion engine's RPM, or spinning-rust hard drives, if anyone still remembers those.
[1] - In the Enterprise's engineering section, there's a well-known pulsating sound of the warp core; I can't find a good enough YouTube video (whatever is there apparently got broken by YT's audio compression). This background pulsing correlated with the speed the Enterprise was traveling at.
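For what it's worth, a first stab at this doesn't need much: the toy sketch below maps the 1-minute load average to the pitch of a two-second pulsating hum and writes it to a WAV file you could loop. The 60-220 Hz range and the 0.5 Hz pulse are arbitrary choices, and os.getloadavg() is Unix-only:

    import math
    import os
    import struct
    import wave

    RATE = 44100
    SECONDS = 2

    # Map load (relative to core count) onto a pitch: idle hum at 60 Hz,
    # fully loaded at 220 Hz. Purely arbitrary mapping for this sketch.
    load1 = os.getloadavg()[0]
    ncpu = os.cpu_count() or 1
    freq = 60 + 160 * min(load1 / ncpu, 1.0)

    frames = bytearray()
    for i in range(RATE * SECONDS):
        t = i / RATE
        # Slow amplitude pulse so it "breathes" a bit like the warp core.
        envelope = 0.5 + 0.5 * math.sin(2 * math.pi * 0.5 * t)
        sample = 0.3 * envelope * math.sin(2 * math.pi * freq * t)
        frames += struct.pack('<h', int(sample * 32767))

    with wave.open('hum.wav', 'wb') as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(RATE)
        w.writeframes(bytes(frames))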
Agreed! I find dashboards at most companies disappointing. And often for the same reason I find other stuff on their walls disappointing: it's frequently irrelevant or actively unhelpful to the work actually being performed.
For me, good dashboards are like good checklists: they should be living entities owned by the team in question and regularly updated to address active concerns. And they don't even have to be complex. Back before CI was in fashion, I drove giant changes in a team's behavior just by having a single LED indicator (the now-departed Ambient Orb) show the state of the current build. Previously, the build would stay broken for weeks at a time, only converging to green around the time of release. Nobody liked that, but they were used to it, so they'd just work around it. But once it was visible and discussed, they eventually got so the build was green almost all the time. It was less painful and saved a bunch of time.
I would absolutely love to try out a set of ambient audio indicators. I suspect I'd want to try it along with a visual dashboard, because the moment I hear something anomalous, I'm going to want to look up and see the recent history, so I can correlate the audio with what it represents and what else is going on.
If you just want a nice visualization to look at some numbers, fine. But if you want to detect problems, it's ineffective. I've seen too many companies use charts, numbers, traffic lights, etc. to actually monitor the state of things and find problems.
You can do both. Especially at the beginning of a system's lifecycle, when you don't really understand its behavior yet. Lots of times, people wandering by have said hmm, that doesn't seem right… Later, as we learned more, these hunches evolved into more advanced automated alarms.
But that's my point: it isn't for alerting about problems. Some things have a 'status' that might be interesting but isn't a problem, or necessarily something to fire an alert on.
You could have unintrusive notifications (inaudible etc.) to 'alert' to such statuses I suppose, if they were kept in view and not 'dismissed' (whatever that means for the medium they came in) - but then really you're just implementing a version of something like this Monitoror in your inbox, phone notification tray, Telegram channel, or whatever.
You're not going to rip out logging, Prometheus, or the UIs of the services this connects to just because you have alerting, so I don't see why you would rip this out either. It's like Prometheus & Grafana for higher-level stuff. (Of course you could use those tools for this sort of monitoring too, but that's not really the point.)
A "nice visualization" is not necessarily just a "pretty"/"shiny" thing to show off to people. Human beings are highly visual creatures with outstanding visual pattern recognition abilities. Maybe you personally don't get anything out of them but the value of visualization is proven. Here are a few sources to get you started: https://www.csgsolutions.com/blog/15-statistics-prove-power-...
I think the fundamental question of all such tools is "Why are we watching this, and what are we looking for," and there are limited but nonzero good reasons to have a display. "Someone should look at open PRs if there are too many" is a bad one - the number doesn't tell you about the urgency of the existing PRs. If you want to respond promptly, respond to all of them promptly.
"We need to know if we're falling behind" is a possible reason to create an alert, not a dashboard. If you really want people to drop what they're doing and triage issues if there are too many, make an alert. If you don't, you'll just get a rectangle that turns red at some point and train people to ignore red rectangles on the board. (Relatedly: I added a pageable alert to my team a few years back to check whether there are a large number of non-pageable alerts, because it usually means something has gone wrong at a low level and we should investigate urgently. It's worked out pretty well, but the alert looks only at tickets created by our monitoring systems, not at tickets created by humans.)
"We need to see if we're getting worse" is a reason to have managers review graphs periodically, not a reason for anyone to stare at a single display. You can't track long-term trends from a status board.
"I need to see what to work on" is a valid reason, but much more useful in the form of a website you can visit on your own computer with links to PRs, not a raw number on a TV screen. (My team has a TV showing open tickets in our queue, both support tickets and automated alert, but we all have an equivalent link locally, too. Showing the names of tickets is useful for "Hey teammate, can you look at the second ticket there? Sounds related to a thing you were working on.")
I'd say there are roughly two useful cases for screens like this. One is to show to internal customers, so they say "oh, service X is yellow, so the slowness I"m seeing isn't just me, I'll do something else for a while." But those screens aren't primarily for the team that owns the product, they're for teams that depend on the product. (Such status boards can be either automated or manual.) The other is to show graphs of various metrics to see abnormal behavior, with the idea that no action is ever triggered by someone looking at the graph, but if you're already investigating something, it's useful to say "Hey, that's funny, this other thing spiked at about the same time even though it's within acceptable limits" and then you have a clue for investigation.
All PRs are WIP, and minimizing WIP is very valuable in product development processes. See Reinertsen's The Principles of Product Development Flow for the math, but basically high/unpredictable latency drastically limits the pace of learning and causes a lot of upstream thrash and waste.
I remember talking with one team at the bird-themed social media company that was frustrated with slow PRs; they dropped average delay from 3-4 days to under 4 hours. They said it made a huge experiential difference and they loved the change.
Yes, I understand why you'd want to focus on solving the number of open PRs. I agree that keeping that number down is good. My question is why do you want to put this on a TV screen.
If you want people to focus on open PRs, tell them to open GitHub on their computers, don't tell them to look up at a TV screen periodically. Treat it like alerts: you have a list of open things to deal with and you need to get that number to zero. There's no threshold greater than zero of a long-term acceptable number of open PRs.
If the problem is that they have other things to look at too, installing yet another TV screen won't solve that, your team needs to make the management decision of what to prioritize. Options include making a unified dashboard of incidents/alerts/PRs/support tickets (and encoding which ones sort to the top), setting up a PR review rotation (i.e., for one week, completing reviews is your top priority barring all-hands-on-deck incidents), treating open PRs as alerts and escalating them if nobody replies within 4 hours, removing other work by deciding you'll deprioritize low-impact alerts (and hope that the increased development velocity ends up solving problems), etc.
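To make the "treat open PRs as alerts" option concrete, here's a rough sketch against the GitHub REST API. It only checks age since the PR was opened (a simplification of "nobody replies within 4 hours"), and OWNER/REPO are placeholders; you'd also want a token for private repos or to stay under rate limits:

    import datetime as dt

    import requests

    OWNER, REPO = 'someorg', 'somerepo'   # placeholders
    MAX_AGE = dt.timedelta(hours=4)

    resp = requests.get(
        'https://api.github.com/repos/%s/%s/pulls' % (OWNER, REPO),
        params={'state': 'open', 'per_page': 100},
        timeout=10,
    )
    resp.raise_for_status()

    now = dt.datetime.now(dt.timezone.utc)
    stale = [
        pr for pr in resp.json()
        if now - dt.datetime.fromisoformat(pr['created_at'].replace('Z', '+00:00')) > MAX_AGE
    ]

    for pr in stale:
        print('STALE: %s (%s)' % (pr['title'], pr['html_url']))

    # Nonzero exit so whatever runs this (cron, an alerting wrapper) can escalate.
    raise SystemExit(1 if stale else 0)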
The notion with information radiators is not that you tell them to look up. The notion is that people naturally look at things while walking around or when idle, so it's valuable to make important things visible. It also serves as a way to trigger and focus discussions.
We loved having a physical map of what we were up to. We'd have our daily stand-up around the board and discuss it. You'd know when something was completed, because you'd see somebody move a card. I would often know when the product manager was thinking about something, because he'd go over to look right at it. That often sparked conversations. And we'd all have a feel for how work was flowing, something we'd talk about in our weekly retro.
Could this have been replicated with a system of alerts? No. Alerts are interruptive and necessarily threshold-driven. I don't want my people caught in a cycle of continuous reactivity to things that at some point in history were seen as important enough to configure an alert for. Except for emergencies, I want them to be serene, thoughtful, and proactive, which is very hard to achieve if you're continuously juggling alerts.
So I'd put up something with PR stats if it were something I wanted us to be aware of. Especially so if it were an item of concern in previous retros. Maybe that would eventually lead to an alert (although I'd hope not). But the first step in solving a problem is understanding the problem, and I think information radiators are great for that, especially when problems are thorny and don't have obviously correct answers.
That's fair - I think part of it is also that you don't really have a green vs. red state (which is a good part of what I object to in the demo presentation), you just have a general feel, and no specific state is defined as an actual problem. (And most of what you're trying to achieve is a shared sense of what's being done, which is very different from a shared sense of what's broken and needs fixing.)
I think wallboards can be interesting. Do you want an alert if your site is suddenly trending on Twitter? If latency and error rates are good, probably not. Would you be interested if you walked by and noticed? Probably.
Well, I have a clock on the wall in front of me that permanently displays the time. I check the time several times a day, for example to see how much time I have left to do something before lunch, going home, or a meeting.
I don't know why we are discussing the practical uses of a clock. I can't imagine a life where one is allowed to look at a clock only when an alarm or alert is triggered.
Calendars, clocks, real-time notifications, and video chats are all anti-patterns, distracting developers from their zone of genius. Just send a concise email at the beginning/end of the day/week. (One can only dream..)
I’ll chime in here to say we use both at work.
In a NOC at a medium-sized ISP, we are getting hammered with alerts 24/7. Some are not urgent, while others need to be actioned much faster - I mean, a 100G transit link being down is no good.
We’d receive an automatic email about a large circuit going down, and we’d also receive a ticket about it; sometimes people don’t look at the tickets closely enough, other times people get distracted with other topics, issues, etc. Having a large screen with interface status monitoring has proven to be effective enough; for example, someone walks by the monitor and says “why is this thing red, is it supposed to be?”... and we immediately know one of the larger interfaces is down.
In an ideal world, we would not need it because every ticket would be diligently dealt with... however, in the real world, having a big red part of the screen flashing has proved quite effective.
Well, the thing is, alerts are indeed for actionable events.
For example, many remote locations have an on-site battery backup, which would supply power in the event of losing commercial power. Those are actioned in terms of notifying field teams and deciding whether a specific location needs to be placed on a generator.
Imagine a hurricane has disrupted the commercial power grid and there are thousands of “site on battery” alerts; somewhere among them there is also an alert for OSPF down between two core switches.
Having a monitor with a large red warning saying “Link X at location Y is down!” - is a pretty effective way to not miss important notifications.
I mean, playing devil’s advocate, one might say “Then your alerts should have a better filtering system, with the important ones staying at the top of the page”... which is true. A lot of smart design features can render dashboards less relevant - however, when there aren’t enough resources in a DevOps team to implement those solutions, a simple dashboard can go a long way!
There’s a wide variety of “requires action”. It might be that it’s fine to act within 1 hour or within 10 minutes. Both deserve an alert, but only one requires you to immediately stop your coffee break...
In an ideal world, I agree. But sometimes an automated system cannot perfectly decide the severity of an alert, which leads to some alerts being ignored, which is fine.
I've spotted interesting "things" from idly looking at our dashboard while chatting with coworkers (and more than a few were interesting enough to warrant a lot of investigation and double-checking of metrics, providers and stack). They were not alert-able, or not very easily unless we wrote some complex time series analysis system for our internal metrics.
You know, for some people I think that's true and for others it's not. There is real value in making some data reactive rather than proactive in communication. Knowing current active traffic, open PRs, time til build is done, all that kind of stuff is 'I would like to see it/check it...but I do not want it to interrupt me.'
People who deal with tens of interruptions at that level are clearly not very productive.
On the other hand, for the site returning non-200 or for API issues, that should be an alert, for sure.
Kinda surprised that Slack or MS Teams isn't in this market.
By that logic a speedometer is an anti pattern and your car should just send up an alert when you're speeding... since when is getting accurate real-time information a bad thing?
I strongly disagree. I can think of a ton of reasons why a driver may need (or even be legally required) to know their speed regardless of speed limit:
* when speed restricted by equipment (trailer, temporary spare, etc)
* when observing advisory speeds
* when observing minimum speed requirements
* as a reference for judging appropriate speeds under inclement conditions
* as a reference for judging appropriate acceleration/deceleration rates when entering/exiting the roadway
Of course an alert system would have to be able to understand all those things. That's why we don't have that kind of system.
A single number in isolation is rarely useful. Graphs with trends are useful. Alerts are useful.
The only reason we don't have alert based speeds is because it can't get all the necessary information to make a useful alert, so we compromise by telling you the number.
> as a reference for judging appropriate acceleration/deceleration rates when entering/exiting the roadway
A perfect example of why a graph would be ideal here, not a single number.
> reference for judging appropriate acceleration/deceleration
> perfect example of why a graph would be ideal
A gauge chart, maybe? :D
But seriously, if we have a system that appropriately judges everything on my laundry list above, you probably won't need an alert system anymore because the cars will be self-driving.
I have that in my 2020 Ford, and you can tune the alert threshold from 0-5mph over the limit. I wouldn't even consider turning it on unless I could set it to at least 10mph, however.
The vehicle has a camera that looks out for speed limit signs, and then updates a little icon on the instrument cluster with the current speed limit. It works very well.
I drove a car like that a few times, with beeping when the limit was exceeded. Annoying, but informative - except when a road sign was limiting vehicle mass to 10t rather than speed to 10km/h and the car read it as a speed limit. It was a carsharing vehicle, so I didn't bother to turn it off.
Funny thing, this is actually a feature in Teslas. You can set it to chime once the speed limit is exceeded (in areas where it knows the limit). Although, I've never seen anyone turn that on.
Yes, which is why you shouldn't use wall boards for alerting, only for visualization.
https://demo.monitoror.com/?configUrl=https://monitoror.com/... is full of things that aren't visualizations at all (no graphs, no sense of whether things are abnormal but not past an alerting threshold, etc.) and are in fact alerts (the website is fine, one PR failed, the QA nodes are ... doing something but there isn't enough space to see what is wrong).
If you want some graphs, great. If you want your team to look up every few minutes and poll some graphs (or worse, some colored rectangles) to figure out what they're supposed to be doing, consider that polling is usually the wrong approach.
(To be clear, this is a criticism of the choice of demo data, not of the product overall. A product like this has its uses, but "our alerting system is people looking up at the TV" is not one of them.)
Getting any sort of interesting insight from that surely requires the context of historical build statuses?
How long will it be down for? When will it be down next? How likely is it that it goes down next week? Is it just me or has it been down a lot this month?
For my use case ("check when the cronjob X on this machine last ran successfully"), setting up a data ingress pipeline which I could later configure as a time series data source seems like 3 times the effort it should actually take.
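For that narrow use case, even a stamp file beats a pipeline. A sketch, assuming the cron job ends with something like "&& touch /var/tmp/jobx.stamp" - the path and the 25-hour budget are made-up placeholders for a daily job:

    import os
    import sys
    import time

    STAMP = '/var/tmp/jobx.stamp'   # hypothetical stamp file touched on success
    MAX_AGE = 25 * 3600             # a daily job gets a 25-hour budget

    try:
        age = time.time() - os.path.getmtime(STAMP)
    except OSError:
        print('CRITICAL: %s missing (job never succeeded?)' % STAMP)
        sys.exit(2)

    if age > MAX_AGE:
        print('CRITICAL: last success %.1f hours ago' % (age / 3600))
        sys.exit(2)

    print('OK: last success %.1f hours ago' % (age / 3600))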
I've been looking at different status board tools and the one thing I've always found missing is dual-stack IPv4+IPv6 tests. It'd be nice to be able to see that both protocols to a given port are working as expected.
I don't want to write my own, so I'll probably settle on one and try to offer up a PR for dual-ip stack checks. I'll take a look at this one too.
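In case it's useful to anyone sketching the same thing, the core of such a check is small: connect to the same name over IPv4 and IPv6 separately and report both. The host and port below are placeholders:

    import socket

    HOST, PORT, TIMEOUT = 'example.com', 443, 5   # placeholders

    def check(family, label):
        # Resolve the name for this address family only, then try a TCP connect.
        try:
            infos = socket.getaddrinfo(HOST, PORT, family, socket.SOCK_STREAM)
        except socket.gaierror as e:
            return '%s: no address (%s)' % (label, e)
        addr = infos[0][4]
        try:
            with socket.socket(family, socket.SOCK_STREAM) as s:
                s.settimeout(TIMEOUT)
                s.connect(addr)
            return '%s: OK via %s' % (label, addr[0])
        except OSError as e:
            return '%s: FAIL via %s (%s)' % (label, addr[0], e)

    print(check(socket.AF_INET, 'IPv4'))
    print(check(socket.AF_INET6, 'IPv6'))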
The problem with this is that any half-competent team can put something like this up in an hour or so. Wallboards are for high-level stats - like the 1 or 2 numbers the team should focus on.
Maybe the tiles are super smart and can do uptime testing, log monitoring, etc., in which case this should be positioned as an uptime tester/log monitor/etc.
Speaking from personal experience, our team originally made our own wallboard, and it was put on a big monitor in our space. Originally all was fine; the board would stop updating numbers once in a while, but nobody really cared - just SSH into our Raspberry Pi and restart the services.
Turns out that our scrum masters and product owners looked at this board when they walked by, and now they wanted to see other things as well. So they started allocating developer time to build these statistics, obviously a job nobody wanted to do. So we bought an existing solution that had all the data sources we needed, and let the business manage their stats.
So yeah, I agree anyone could build it themselves, but it rarely sticks to those 1 or 2 numbers, in which case it's cheaper to spend a couple of bucks than to have developers continue to support it.
Dashing was kinda garbage, though. There was no standard/sane way to install new plugins. I haven't checked out the currently maintained fork, but the original is dead/archived. I made the following for Dashing for tracking Seattle Transit:
Could you grab and parse content with this?
I'm not really using CI stuff, but showing events (calendar), grabbing weather data, or showing output from other simple commands (health checks) could be of use. I didn't find any of that in the example tiles.
The first UI config example has a PING tile, but the PING type seems to be disabled by default, and I can't find how to enable it in the docs. So maybe that's a good thing to make clearer for people wanting to test quickly.
It looks like it doesn't actually support changing the port currently, despite the documentation saying it is possible. I already use port 8080, so I'm kind of stuck until I can use a different port.
Turns out it's just too early in the day. I wasn't saving the variable beyond setting it, so when I switched terminals it didn't exist. Put it in my bash profile and all is well.
Nice design! It looks to be targeted at developers, but it could be good for product managers too. Some tile ideas: issue counts, PR counts, vanity metrics... plenty of room for extension :)