I don't have a background in ML (I'm still fairly early in my CS Bachelor's degree, and I'm not focusing on ML anyway), but this is very interesting, even though I had to look up nearly every term you used.
I was initially thinking a directed weighted graph might work well here, but I'm assuming that would scale terribly relative to something like this.