Writing post-mortems is generally pretty kludgy. You might have a Slack bot that records the big-picture items, but ideally a post-mortem would connect to the nitty-gritty details while maintaining a good high-level overview. The other thing most post-mortems miss is communicating the discovery process. You'll get a description of how an engineer suspected some problem, but you rarely get details on how they validated it, so others can't learn new techniques from it.

At a previous job, I worked with a great sysadmin/DevOps engineer who would go through a concise set of steps when debugging things. We all sat down as a team, and he showed us the commands he ran to confirm transport in different scenarios. It was an enlightening experience. I talked to him and other DevOps folks about Rundeck, and it was clear that the problem isn't whether something can be automated, but whether the variables involved are constrained enough to be represented in code. When you do the math, the time it would take to write code for some issues just isn't worth the benefit.
Iterating on the manual work to better communicate and formalize the debugging process could fit well into the notebook paradigm. You can show the scripts and commands you're running to debug while still composing a quality post-mortem, right as the incident is happening and things are fresh.
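To make that concrete, here's a rough sketch (in Python, since that's the usual notebook language) of the kind of cell I mean. The host names and commands are made up for illustration; the point is just that the check and its output live next to your notes:

    import datetime
    import subprocess

    def run_and_record(cmd):
        """Run a shell command and return a timestamped transcript of it."""
        result = subprocess.run(cmd, capture_output=True, text=True)
        stamp = datetime.datetime.now().isoformat(timespec="seconds")
        return f"$ {' '.join(cmd)}  ({stamp})\n{result.stdout}{result.stderr}"

    # e.g. confirming whether the app host can still reach its database;
    # the output stays in the notebook, so the post-mortem shows not just
    # "we suspected a connectivity issue" but exactly how it was checked.
    print(run_and_record(["nc", "-zv", "db.internal.example", "5432"]))
    print(run_and_record(["dig", "+short", "db.internal.example"]))

You'd still write the narrative around cells like that; the difference is the validation steps aren't lost to a Slack scrollback.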
The other thing to consider is how often you get incidents and how quickly you need to get people up to speed. In a small org, devs can keep most things in their head and use docs, but when things get larger, you need to think about how to offload systems and operational duties. If a team starts by iterating on operational tasks in notebooks, you can hand those off to an operations team over time. A quality, small operations team can take on a lot of work and free up dev time for optimizations or feature development. The key is that devs have a good workflow for handing off operational tasks that are often fuzzier than code.
The one gotcha with a hosted service, IMO, is that translating local scripts into hosted ones takes a lot of work. On my laptop, I'm on a VPN and can access things directly, whereas with a hosted service you have to figure out how to let a third party connect to production backend systems. That can be a sticky problem that makes it hard to demonstrate the value.