Mykel Kochenderfer (one of the authors) is my uncle! Very friendly and smart guy. He helps run the Stanford Intelligent Systems Lab (SISL) and I've gotten to see some very neat projects like self-flying drones when visiting.
He was one of my favorite professors (had him for CS238/AA238)! I still remember some of his project assignments fondly. I got a SISL shirt for being in the top three of the leaderboard on one of them. Inspired me to interview at Lincoln Labs (which I did not land, probably for the best :P)
This may be a little OT, but I came across a technique called multi-armed bandits (MABs), which deals with finding optimal ways to choose among a sequence of actions while measuring their utility.
It has applications in a wide variety of fields, such as optimal design of clinical trials and public policy decision making.
It apparently has some relation to the work that Abhijit Banerjee and Esther Duflo, the Nobel Prize-winning economist couple, have done on the design of experiments to measure the impact of developmental interventions.
Not OT at all, IMO. As it happens, the linked book actually touches on MAB problems explicitly, albeit briefly. And as the book points out, you can formulate a MAB problem as a specific form of Markov Decision Process, and MDPs are referenced extensively in the book.
I find MABs incredibly interesting, and for pretty much the same reason(s) I find the rest of this stuff interesting.
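For anyone who hasn't seen a bandit in code, here is a minimal sketch of the explore/exploit idea using epsilon-greedy on a Bernoulli bandit. This is not taken from the book; the function name `epsilon_greedy_bandit`, the parameters, and the example payout probabilities are all made up for illustration.

```python
import random


def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit (hypothetical helper).

    true_means: assumed payout probability of each arm (unknown to the agent).
    Returns (estimated mean reward per arm, number of pulls per arm).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Bernoulli reward drawn from the arm's hidden payout probability.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running mean for the pulled arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = epsilon_greedy_bandit([0.1, 0.5, 0.9])
    print("estimated means:", estimates)
    print("pulls per arm:  ", counts)
```

Epsilon-greedy is only one of the simplest strategies. It keeps exploring at a fixed rate forever, which is why approaches like UCB or Thompson sampling are usually preferred when wasted pulls are expensive (clinical trials being the obvious case).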
Yes, it's unfortunate that two somewhat different fields have such similar sounding names. "Decision Theory" (or "Decision Science") vs "Decision Procedures" (or "Decision Problems"). But what can one do?
It jumped out at me as a nice example of the Edward Tufte style (which I'm a big fan of -- I once published a tech report in that style). People have put together style sheets emulating it for LaTeX [1][2], CSS [3], and R Studio Markdown [4], among others.
Oh how I wish I had this document 7 years ago. I was knee-deep in utility curves and felt like I was on the path to madness.
I did find a solution, but could not convince anyone that I wasn’t just making it all up!!
Oh well…
In my opinion the book is horrible. It doesn’t provide any in-depth details. It functions as a high-level survey of the field. It is very superficial and just scratches the surface. It doesn’t describe the presented algorithms in any detail. It’s a very poor version of Daphne Koller's amazing book on graphical models.
These broad survey type classes were the most useful ones in university for me. It's not like I would remember the specifics after the class anyway, but when I run into a problem I now have a huge bag of potential solutions. When I need details, there are often scientific papers or technical reports going in-depth on each of these things.
It's interesting how one algorithm, a 'master algorithm', can presumably subsume all the others in the book, presumably a neural/evolutionary algorithm, that can simply learn/evolve when the other algorithms are useful for decision making/maximizing reward.
The more assumptions you relax, the more general the algorithms become. For example, going from immediate reward to delayed reward means going from supervised to reinforcement learning.
The trade-off is that the more general algorithms often need exponentially more data and compute to come to a similarly good solution.
That's why reinforcement learning has seen so few practical applications relative to supervised learning. There's no free lunch.
That said, as an ML practitioner I would love it if I could just apply a single master algorithm to all problems, but that is likely many years away.
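To make the immediate vs. delayed reward point a bit more concrete, here is a small sketch of how delayed reward is typically credited back to earlier steps via a discounted return. The function name `discounted_returns` and the discount factor `gamma` are my own choices for illustration, not something from the book.

```python
def discounted_returns(rewards, gamma=0.9):
    """Compute the discounted return G_t = r_t + gamma * G_{t+1} for each step.

    rewards: list of per-step rewards from one episode.
    gamma:   assumed discount factor between 0 and 1.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step sees the return of everything after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # Reward only arrives at the final step; earlier steps get discounted credit.
    print(discounted_returns([0.0, 0.0, 0.0, 1.0], gamma=0.5))
    # -> [0.125, 0.25, 0.5, 1.0]
```

In the supervised case every action would come with its own label right away. Here the first three actions see no reward at all and only get credit through the return, which is one way to see where the extra data requirement comes from.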
At the same time, fine-tuning sample efficiency increases with scale, so at some point you can possibly one-shot learn state and get rid of exponential searches, solving NP-Hard problems with heuristics. Sounds like a free lunch to me. At least if you can afford a net large enough.
It sounds like the more general the algorithm, the more stateful it needs to be before it can be useful. On the other hand, specialized algorithms need little to no state but have limited applications.
The extensive use of Julia, including relevant comments, is noteworthy. In particular, using mathematical notation for the variables makes the formulas easier to understand. At the same time, it helps with learning Julia.
This book is a wonderful primer on all things related to MDP-based algorithms. That's a very specific subfield of computer science that can work wonders in some domains.