However, this makes me wonder about the premise of this tool.
Have systems gotten so messy and complicated that we build them without knowing what they're actually doing until after they're built?
In other words, UML diagrams (including sequence diagrams) were intended as ways to clarify and reason about systems at design time. This project seems like it could be a reverse engineering tool, but it isn't presented that way.
I can think of a few use cases that aren't really reverse engineering. Documenting legacy systems that weren't properly documented up front. Documenting/identifying divergence from design-time documentation. Documenting a microservice stack where each service is individually well documented but the totality is not. Generating documentation from a prototype to use as a starting point for a more structured design process.
Generally, I've seen very few projects whose architecture documentation is kept up to date past the first release. So it's not an insane idea given the on-the-ground reality, but yes, it does seem a bit like shutting the barn door after the horse is already out.
> Have systems gotten so messy and complicated that we build them without knowing what they're actually doing until after they're built?
Oh, hey, you've just practically described one popular implicit formulation of Agile. :)
Detailed and coherent design documentation is rare in my experience, even at organizations large enough to have staff ostensibly dedicated to this purpose. I can think of precisely one place I've worked that had a design process that produced documentation effective for specifying the project ahead of time and as a reference from then on. A culture committed to this, plus dedicated technical writers who regularly met with a team of a domain expert, tech lead, and UX/UI folks for the purpose of producing project documentation, helped a lot.
Other organizations may have had individuals producing documentation but it was a lossy process, magnified when they moved on from the project.
Working in secops, we aren't always the ones building the tools, and the products we're required to support never come with this kind of documentation.
Jeez Louise, I worked with the Oracle BI stuff once and it was so poorly documented that using Wireshark was the only way to figure out which pieces did what and how it all worked. I'd imagine other such enterprisey stuff that's been cobbled together over the years through acquisitions may be similar, and these companies are always rather terse or ask you to put in a ticket and wait a month to find out.
In 30 years of doing consulting on 'systems' work, it's astoundingly rare to see documentation of any sort that accurately & completely represents the current running state of the system. Sometimes it's close, but more often than not it's whatever was presented to get budget before any development has started.
Occasionally, the folks involved actually believe their doco does represent reality. They are invariably wrong, sometimes in small details, usually in large ones. Tools like this are an invaluable check on reality.
> Have systems gotten so messy and complicated that we build them without knowing what they're actually doing until after they're built?
Yes, in general, at least in my world (medium size to enterprise). Agile has made this in practice much much worse.
I really like this! I like the notion of automatically generated documentation, because it's always in line with what the source code is doing (or should be). Thanks for making this!
It is cool (compared to Visio or even Rational Rose), but it makes me wonder about the value of the diagram itself. The textual input is just as readable and takes up less space on a screen (or, God forbid, a printed piece of paper) - other than giving managers a warm and fuzzy feeling that there's a pretty picture there, why bother generating the diagram at all?
Depending on how you organize your relationships, the diagram can make shared dependencies more clear.
If you have multiple levels of inputs and outputs, you might define a set of relationships together that are all related to a single output. In this case shared inputs are not easy to spot in the text.
E.g.
@startuml
database DB1
database DB2
... lots of other artifacts ...
artifact ArtifactN
DB1 --> Artifact1
... lots of other input to Artifact1 ...
... other relationships grouped based on output - more than a screenful - only some of which have DB1 as input ...
DB1 --> ArtifactN
@enduml
In the above, it might not be clear how often DB1 is an input to an artifact. Whereas the diagram will show the size of the tree pretty clearly.
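To make the point concrete, here is a minimal sketch (not part of the tool under discussion) that recovers the same information from the text by counting how often each element appears as the source of a relationship. The function name `count_outgoing` and the filled-in example are hypothetical, and the regex only handles simple `A --> B` lines, not the full PlantUML grammar.

```python
import re
from collections import Counter

# Matches only simple relationships such as "DB1 --> Artifact1".
# This is an assumption for illustration; real PlantUML allows far more syntax.
ARROW = re.compile(r"^\s*(\w+)\s*-+>\s*(\w+)\s*$")


def count_outgoing(plantuml_text: str) -> Counter:
    """Count how many relationships start at each element."""
    counts: Counter = Counter()
    for line in plantuml_text.splitlines():
        match = ARROW.match(line)
        if match:
            source, _target = match.groups()
            counts[source] += 1
    return counts


# A small, made-up stand-in for the elided diagram above.
example = """\
@startuml
database DB1
database DB2
artifact Artifact1
artifact Artifact2
artifact ArtifactN
DB1 --> Artifact1
DB2 --> Artifact1
DB2 --> Artifact2
DB1 --> ArtifactN
@enduml
"""

fan_out = count_outgoing(example)
for source, count in fan_out.items():
    print(f"{source} is an input to {count} artifact(s)")
```

Needing a script like this to answer the question is sort of the argument for the rendered diagram: the picture gives you the fan-out at a glance.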
It is unlikely that such a diagram would be very illustrative to its author, but the visual representation may be a much better communication tool to others who are not as intimately familiar with the system in question.
A little off topic, but even though I wrote a UML book a long while ago, I have more or less stopped using hand written UML except for sequence diagrams. Any class diagrams I put in documentation are auto-generated from source code.
This is the greatest idea! I've been writing SoapUI test cases that show the communication flow between microservices and use cases, to give new developers (and product owners) some insight into how the application hangs together. Something like this is the logical next step and exactly what I was hoping I could derive from New Relic/OpenTracing/logs (but these are only as good as the technical details).
I haven’t read the source code, but I assume that it just looks at the timings of the request. After all, it’s just one user, no need to worry about 50 concurrent requests.
It's really cool, but from my understanding you only get the traffic exchanged with your browser, not the traffic between the servers. I wonder if something similar could be done at the server level: aggregating traffic from all the servers, merging it together, and displaying it.
The bottom half of the example diagram shows traffic between the "backend" and "keycloak" that clearly isn't browser to nginx traffic.
You'd need to be running Wireshark on a network interface that could see all that traffic, so yeah, sort of "at the server level", in the sense that you'd need to be sitting on the server's network(s) to sniff that.
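For what it's worth, the merging step itself could be fairly simple. Below is a rough sketch, not taken from the tool's source, that assumes each capture point yields records with a timestamp, a source, a target, and a label (the `Request` type and all the sample data are hypothetical). It merges the captures by timestamp and emits PlantUML sequence diagram text. It also assumes the hosts' clocks are in sync and does not deduplicate a request seen at two capture points, both of which a real implementation would need to deal with.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Request:
    """One observed message. The field layout is an assumption for illustration."""
    timestamp: float  # seconds; assumed comparable across capture points
    source: str
    target: str
    label: str


def merge_captures(captures: Iterable[Iterable[Request]]) -> List[Request]:
    """Flatten several per-host captures and order them by timestamp."""
    merged = [request for capture in captures for request in capture]
    return sorted(merged, key=lambda request: request.timestamp)


def to_sequence_diagram(requests: Iterable[Request]) -> str:
    """Render ordered requests as PlantUML sequence diagram text."""
    lines = ["@startuml"]
    for request in requests:
        lines.append(f"{request.source} -> {request.target}: {request.label}")
    lines.append("@enduml")
    return "\n".join(lines)


# Made-up sample data: one capture near the browser, one on the server network.
browser_side = [
    Request(1.0, "browser", "nginx", "GET /app"),
    Request(4.0, "nginx", "browser", "200 OK"),
]
server_side = [
    Request(1.5, "nginx", "backend", "GET /app"),
    Request(2.0, "backend", "keycloak", "validate token"),
    Request(3.0, "keycloak", "backend", "token valid"),
]

diagram = to_sequence_diagram(merge_captures([browser_side, server_side]))
print(diagram)
```

The hard part in practice is probably not this code but getting captures from every network segment and correlating concurrent requests, which plain timestamp ordering does not attempt.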