We've been using Sleuth at LaunchDarkly as a single pane of glass for all changes going out into production. The addition of DORA metrics is exciting-- as an engineering leader, I've found Accelerate (https://itrevolution.com/accelerate-book/) to be one of the few books I've read where the practices described meet reality. Having a tool track those metrics is a welcome addition.
One thing I'm curious about is Sleuth's approach to the "garbage in, garbage out" problem-- if I'm tracking (e.g.) MTTR, I've found that our teams aren't always perfect about tracking the start / end times of an incident. If the data's incorrect, can I modify it manually?
That's a great question. Failure rate and MTTR are the two metrics teams have the most trouble getting a real handle on. We've found that different teams define change failure and MTTR in widely differing ways: some customers just want to track incidents, while others use team KPIs as their definition of failure.
Today, you can manually update the status of a deploy to incident, rollback, unhealthy, or ailing. This lets you "correct" data that Sleuth may have gotten wrong via its integrations with Datadog or your incident management system. Right now the correction is at the deploy level, but we do have more control coming soon so you can override any period of time as having been in a specific state.
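To make the deploy-level correction concrete, here's a rough Python sketch (purely illustrative, not our actual implementation; the field names are made up) of how a manual override takes precedence over what the integrations reported, before any metrics are computed:

    # Hypothetical deploy records as ingested from integrations.
    deploys = [
        {"id": "d-101", "status": "healthy"},
        {"id": "d-102", "status": "healthy"},   # monitoring missed a regression here
        {"id": "d-103", "status": "rollback"},
    ]

    # Manual corrections keyed by deploy id, applied on top of integration data.
    overrides = {"d-102": "incident"}

    def effective_status(deploy):
        # A manual override always wins over the integration-reported status.
        return overrides.get(deploy["id"], deploy["status"])

    corrected = [{**d, "status": effective_status(d)} for d in deploys]
    # Downstream metrics (failure rate, MTTR) are computed from `corrected`.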
We've been using Sleuth for a while now, and I can say it really helps to track what was deployed, when. This is one of those tools every software team needs but most don't have, or only have in bits and pieces across disparate systems.
I've been very fortunate to work with this legendary team and lead design for this project. Working with teams that release constantly, I absolutely love that Sleuth lets me know each time the work I've contributed to gets deployed to our customers, and what its impact is. It's also super insightful to finally see the lifecycle of a deployment from my vantage point.
Hi HN, Dylan here, co-founder of Sleuth (https://sleuth.io) along with Don and Michael. Together we have over 40 years of experience building developer tools at Atlassian (Jira, Bitbucket, Hipchat, Statuspage).
Over the past 15 years we've witnessed first-hand the positive impact of frequent deployments, but watched as teams struggled to get the visibility needed to continue progressing towards Continuous Delivery.
In our experience, the larger the team, and the faster they move, the harder it becomes for engineering leaders to find "trust but verify" moments - the moments where you should dig in and ensure your team is improving or in a good place.
With Sleuth, we decided to focus solely on Accelerate / DORA metrics because 1) research has shown they're reliable measures of software delivery performance, and 2) they're project- and team-based rather than targeting individual developers. A healthy team depends on trust between team and leadership. Sleuth helps you verify without breaking trust with your developers.
Sleuth is the most accurate way to track Accelerate metrics because it integrates with sources of truth beyond source control - such as issues, builds, observability metrics, incidents, etc - and tracks deploys to all your environments, so it knows when a change actually deploys to your customers.
Deploy Frequency and Change Lead Time: Sleuth gives you a detailed breakdown of time spent by activity - coding, waiting for review to begin, reviewing, and deploying. More importantly, it shows you exactly how the metrics came to be.
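Roughly speaking, lead time is just the sum of those per-activity durations between first commit and deploy. A minimal, purely illustrative Python sketch (not our production code; the timestamps and stage names are made up):

    from datetime import datetime, timedelta

    # Hypothetical timestamps for one change moving through the pipeline.
    events = {
        "first_commit":   datetime(2021, 4, 1, 9, 0),
        "pr_opened":      datetime(2021, 4, 1, 15, 0),
        "review_started": datetime(2021, 4, 2, 10, 0),
        "pr_merged":      datetime(2021, 4, 2, 14, 0),
        "deployed":       datetime(2021, 4, 2, 16, 30),
    }

    stages = {
        "coding":             events["pr_opened"] - events["first_commit"],
        "waiting_for_review": events["review_started"] - events["pr_opened"],
        "review":             events["pr_merged"] - events["review_started"],
        "deploy":             events["deployed"] - events["pr_merged"],
    }

    lead_time = sum(stages.values(), timedelta())
    for name, duration in stages.items():
        print(f"{name:>18}: {duration}")
    print(f"{'lead time':>18}: {lead_time}")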
Change Failure Rate: we recognize that failure can mean something different to different teams. Sleuth lets teams define failure as broadly as an incident, as simply as a rollback, or as finely as a custom observability metric falling outside its normal range. To do this, Sleuth connects with observability tools like Datadog, New Relic, Sentry, etc.
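In spirit, change failure rate is just failed deploys divided by total deploys, with a pluggable definition of "failed". An illustrative sketch (again, made-up field names and thresholds, not our actual code):

    # Each deploy carries whatever signals the integrations produced.
    deploys = [
        {"id": "d-201", "incident": False, "rolled_back": False, "error_rate": 0.2},
        {"id": "d-202", "incident": False, "rolled_back": True,  "error_rate": 0.1},
        {"id": "d-203", "incident": True,  "rolled_back": False, "error_rate": 3.5},
    ]

    # The team chooses what counts as failure: an incident, a rollback,
    # or a custom metric outside its normal range (here, error_rate > 1.0).
    def is_failure(deploy):
        return deploy["incident"] or deploy["rolled_back"] or deploy["error_rate"] > 1.0

    failures = sum(1 for d in deploys if is_failure(d))
    change_failure_rate = failures / len(deploys)
    print(f"change failure rate: {change_failure_rate:.0%}")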
MTTR: Understanding failure means we have an accurate view of MTTR, and Sleuth lets you know how much time your project has spent in incident, rollback, unhealthy, or ailing states. With the instant Slack notifications Sleuth sends to developers, you can easily drive your MTTD to zero.
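And MTTR falls out of the same data: it's the mean length of the windows a project spends in a failure state before returning to healthy. A minimal sketch with made-up timestamps (illustrative only):

    from datetime import datetime, timedelta

    # Hypothetical failure windows: (entered failure state, back to healthy).
    failure_windows = [
        (datetime(2021, 4, 1, 10, 0), datetime(2021, 4, 1, 10, 45)),  # incident
        (datetime(2021, 4, 3, 14, 0), datetime(2021, 4, 3, 14, 20)),  # rollback
        (datetime(2021, 4, 7, 9, 0),  datetime(2021, 4, 7, 9, 30)),   # unhealthy
    ]

    durations = [end - start for start, end in failure_windows]
    mttr = sum(durations, timedelta()) / len(durations)
    print(f"MTTR: {mttr}")  # 0:31:40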
We want to help teams improve, not just track metrics, and the best way to do that is to empower developers! Features like deploy locking, Slack-based deploy approvals, and deploy verification help make deploys easier and less stressful for developers - and make Sleuth a tool for devs as much as it is for managers.
To date, Sleuth is used by teams at companies like LaunchDarkly, Ujet, Secure Code Warrior, Flatfile, Automox, Atlassian, and more.
You can check out Sleuth by going to our website (https://sleuth.io) or, better yet, watching us sleuth in Sleuth (https://app.sleuth.io/sleuth/sleuth/metrics/lead_time) - building dev tools is the best! Or you can try our 30-day trial and quickly find out what your Accelerate (DORA) baseline looks like.
We’re sure many HN members will have encountered similar challenges with engineering productivity and/or have expertise in this area. We’d love to hear from you: How do you track your team's performance? Do you track Accelerate metrics? What works and doesn’t? Thank you!