So, from the perspective I have within the subfield I work in, explainable AI (XAI), we're seeing a bunch of fascinating developments.

First, as you mentioned, Rudin continues to prove that the reason for using AI/ML is that we don't understand the problem well enough; otherwise we wouldn't even think to use it! So, by pushing our focus towards better understanding the problem, and then leveraging ML concepts and techniques (including "classical AI" and statistical learning), we're able to make something that not only outperforms some state-of-the-art approaches on most metrics, but is often much less resource intensive to create and deploy (in compute, data, energy, and human labour), with added benefits from direct interpretability and post-hoc explanations. One example has been the continued primacy of tree ensembles on tabular datasets [0], even for the larger datasets, though they truly shine on the small to medium datasets that actually show up in practice, which from Tigani's observations [1] would include most of those who think they have big data.

Second, we're seeing practical examples of exactly this outside Rudin! In particular, people are using ML more to do live parameter fine-tuning that otherwise would need more exhaustive searches or human labour that are difficult for real-time feedback, or copious human ingenuity to resolve in a closed-form solution. Opus 1.5 is introducing some experimental work here, as are a few approaches in video and image encoding. These are domains where, as in the first, we understand the problem, but also understand well enough that there are search spaces we simply don't know enough about to be able to dramatically reduce. Approaches like this have been bubbling out of other sciences (physics, complexity theory, bioinformatics, etc.) and have led to some interesting work in distillation and extraction of new models from ML, or "physically aware" operators that dramatically improve neural nets, such as Fourier Neural Operators (FNO) [2], which embed FFTs rather than forcing the network to relearn them (as has been found to often happen) for remarkable speed-ups with PDEs such as those in fluid dynamics, and which have already shown promise in climate modelling [3] and material science [4]. There are also many more operators, which all work completely differently, yet bring human insight back to the problem, and sometimes lead to extracting a new model for us to use without the ML! Understanding begets understanding, so the "shifting goalposts" of techniques considered "AI" is a good thing!
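To make the "embed the transform rather than relearn it" idea a bit more concrete, here's a rough sketch of the spectral convolution at the heart of a Fourier layer: transform, keep only the lowest few modes, scale them by per-mode complex weights, and transform back. This is my own toy, not the code from [2]: it uses a naive O(n^2) DFT instead of an FFT to stay dependency-free, the names (`rdft`, `irdft`, `spectral_conv_1d`) are made up for illustration, and the weights are hand-picked where a real FNO would learn them and add a pointwise linear path plus a nonlinearity around this step.

```python
import cmath
import math


def rdft(x):
    """Naive DFT of a real signal, returning frequencies 0..n//2 (illustration only)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def irdft(coeffs, n):
    """Inverse of rdft: rebuild a real signal of length n using conjugate symmetry."""
    out = []
    for t in range(n):
        total = coeffs[0].real
        for k in range(1, len(coeffs)):
            term = (coeffs[k] * cmath.exp(2j * math.pi * k * t / n)).real
            # For even n the Nyquist bin (k == n/2) has no separate conjugate partner.
            is_nyquist = n % 2 == 0 and k == n // 2
            total += term if is_nyquist else 2 * term
        out.append(total / n)
    return out


def spectral_conv_1d(x, weights):
    """Scale the lowest len(weights) Fourier modes by `weights`; zero all higher modes.

    `weights` stands in for the learned complex parameters of a Fourier layer.
    """
    coeffs = rdft(x)
    mixed = [
        coeffs[k] * weights[k] if k < len(weights) else 0j
        for k in range(len(coeffs))
    ]
    return irdft(mixed, len(x))


# A slow wave plus a fast ripple; keeping three modes with unit weights drops the ripple.
n = 16
signal = [
    math.sin(2 * math.pi * t / n) + 0.3 * math.sin(2 * math.pi * 6 * t / n)
    for t in range(n)
]
smoothed = spectral_conv_1d(signal, [1, 1, 1])
```

The point of the design is that the network never has to discover the Fourier basis from data; it only has to learn how to reweight a handful of modes.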

Third, specific to improvements in explainability, we've seen the Neural Tangent Kernel (NTK) [5] rapidly go from strength to strength since its introduction. While rooted in core explainability, vis-à-vis making neural nets more mathematically tractable to analysis, it has inspired not only other approaches [6] and behavioural understanding of neural nets [7, 8], but novel ML itself [9], with ways to transfer the benefits of neural networks to far less resource intensive techniques; [9]'s RFM kernel machine proves competitive with the best tree ensembles from [0], and even has an advantage on numerical data (plus it outperforms prior NTK-based kernel machines). An added benefit is that the approach used to underpin [9] itself leads to new interpretation and explanation techniques, similar to integrated gradients [10, 11] but perhaps more reminiscent of the idea in [6].
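For anyone who hasn't met the NTK before, the object itself is simple: the kernel between two inputs is the inner product of the network's parameter gradients at those inputs. Below is a minimal sketch of the empirical (finite-width, at-initialisation) version for a toy one-hidden-layer tanh network with scalar input. Everything here is an assumption for illustration (the architecture, the 1/sqrt(m) scaling, and the names `forward`, `param_gradient`, `empirical_ntk`); it is not the construction from [5] or the RFM from [9], just the basic definition written out.

```python
import math
import random


def forward(x, w, a):
    """Toy network f(x) = (1/sqrt(m)) * sum_j a_j * tanh(w_j * x) with scalar input x."""
    m = len(w)
    return sum(aj * math.tanh(wj * x) for wj, aj in zip(w, a)) / math.sqrt(m)


def param_gradient(x, w, a):
    """Analytic gradient of f(x) with respect to all parameters, ordered as w then a."""
    scale = 1.0 / math.sqrt(len(w))
    grad_w = [
        scale * aj * (1.0 - math.tanh(wj * x) ** 2) * x
        for wj, aj in zip(w, a)
    ]
    grad_a = [scale * math.tanh(wj * x) for wj in w]
    return grad_w + grad_a


def empirical_ntk(x1, x2, w, a):
    """Empirical NTK entry: inner product of the parameter gradients at x1 and x2."""
    g1 = param_gradient(x1, w, a)
    g2 = param_gradient(x2, w, a)
    return sum(p * q for p, q in zip(g1, g2))


# Build a small Gram matrix at a random initialisation (seeded for repeatability).
rng = random.Random(0)
width = 64
w0 = [rng.gauss(0.0, 1.0) for _ in range(width)]
a0 = [rng.gauss(0.0, 1.0) for _ in range(width)]
inputs = [-1.0, 0.5, 2.0]
gram = [[empirical_ntk(u, v, w0, a0) for v in inputs] for u in inputs]
```

A Gram matrix like `gram` is exactly what you'd hand to a kernel method, which is roughly why this line of work makes neural nets more tractable to analyse.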

Finally, specific to XAI, we're seeing people actually deal with the problem that, well, people aren't really using this stuff! XAI in particular, yes, but also the myriad of interpretable models a la Rudin or the significant improvements found in hybrid approaches and reinforcement learning. Cicero [12], for example, does have an LLM component, but uses it in a radically different way compared to most people's current conception of LLMs (though, again, ironically closer to the "classic" LLMs for semantic markup), much like the AlphaGo series altered the way the deep learning component was utilised by embedding and hybridising it [13] (its successors obviating even the traditional supervised approach through self-play [14], and beyond Go). This is all without even mentioning the neurosymbolic and other approaches to embed "classical AI" in deep learning (such as RETRO [15]). Despite these successes, adoption of these approaches is still very far behind, especially compared to the zeitgeist of ChatGPT style LLMs (and general hype around transformers), and arguably much worse for XAI due to the barrier between adoption and deeper usage [16].

This is still early days, however, and again to hark back to Rudin, we don't understand the problem anywhere near well enough, and that extends to XAI and ML as problem domains themselves. Things we can actually understand seem a far better approach to me, but without getting too Monkey's Paw about it, I'd posit that we should really consider if some GPT-N or whatever is actually what we want, even if it did achieve what we thought we wanted. Constructing ML with useful and efficient inductive bias is a much harder challenge than we ever anticipated, hence the eternal 20 years away problem, so I just think it would perhaps be a better use of our time to make stuff like this, where we know what is actually going on in practice rather than just in theory. Such models will have a part, no doubt, as Cicero showed that there's clear potential, but people seem to be realising "... is all you need" and "scaling laws" were just a myth (or worse, marketing). Plus, all those delays to the 20 years weren't for nothing, and there's a lot of really capable, understandable techniques just waiting to be used, with more being developed and refined every year. After all, look at the other comments! So many different areas, particularly within deep learning (such as NeRFs or NAS [17]), which really show we have so much left to learn. Exciting!

  [0]: Léo Grinsztajn et al. "Why do tree-based models still outperform deep learning on tabular data?" https://arxiv.org/abs/2207.08815
  [1]: Jordan Tigani "Big Data is Dead" https://motherduck.com/blog/big-data-is-dead/
  [2]: Zongyi Li et al. "Fourier Neural Operator for Parametric Partial Differential Equations" https://arxiv.org/abs/2010.08895
  [3]: Jaideep Pathak et al. "FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators" https://arxiv.org/abs/2202.11214
  [4]: Huaiqian You et al. "Learning Deep Implicit Fourier Neural Operators with Applications to Heterogeneous Material Modeling" https://arxiv.org/abs/2203.08205
  [5]: Arthur Jacot et al. "Neural Tangent Kernel: Convergence and Generalization in Neural Networks" https://arxiv.org/abs/1806.07572
  [6]: Pedro Domingos "Every Model Learned by Gradient Descent Is Approximately a Kernel Machine" https://arxiv.org/abs/2012.00152
  [7]: Alexander Atanasov et al. "Neural Networks as Kernel Learners: The Silent Alignment Effect" https://arxiv.org/abs/2111.00034
  [8]: Yilan Chen et al. "On the Equivalence between Neural Network and Support Vector Machine" https://arxiv.org/abs/2111.06063
  [9]: Adityanarayanan Radhakrishnan et al. "Mechanism of feature learning in deep fully connected networks and kernel machines that recursively learn features" https://arxiv.org/abs/2212.13881
  [10]: Mukund Sundararajan et al. "Axiomatic Attribution for Deep Networks" https://arxiv.org/abs/1703.01365
  [11]: Pramod Kaushik Mudrakarta et al. "Did the Model Understand the Question?" https://arxiv.org/abs/1805.05492
  [12]: META FAIR Diplomacy Team et al. "Human-level play in the game of Diplomacy by combining language models with strategic reasoning" https://www.science.org/doi/10.1126/science.ade9097
  [13]: David Silver et al. (DeepMind) "Mastering the game of Go with deep neural networks and tree search" https://www.nature.com/articles/nature16961
  [14]: David Silver et al. (DeepMind) "Mastering the game of Go without human knowledge" https://www.nature.com/articles/nature24270
  [15]: Sebastian Borgeaud et al. "Improving language models by retrieving from trillions of tokens" https://arxiv.org/abs/2112.04426
  [16]: Umang Bhatt et al. "Explainable Machine Learning in Deployment" https://dl.acm.org/doi/10.1145/3351095.3375624
  [17]: M. F. Kasim et al. "Building high accuracy emulators for scientific simulations with deep neural architecture search" https://arxiv.org/abs/2001.08055

Thank you for providing an exhaustive list of references :)

> Finally, specific to XAI, we're seeing people actually deal with the problem that, well, people aren't really using this stuff!

I am very curious to see which practical interpretability/explainability requirements enter into regulations. On one hand it's hard to imagine a one-size-fits-all approach, especially for applications incorporating LLMs, but Bordt et al. [1] demonstrate that you can provoke arbitrary feature attributions for a prediction if you can choose post-hoc explanations and parameters freely, making a case that it can't _just_ be left to the model developers either.

[1] "Post-Hoc Explanations Fail to Achieve their Purpose in Adversarial Contexts", Bordt et al. 2022, https://dl.acm.org/doi/10.1145/3531146.3533153
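A toy way to see how much freedom the explainer has (my own illustration, not an experiment from Bordt et al.): take integrated gradients on a two-feature model with a pure interaction, f(x1, x2) = x1 * x2, and vary only the baseline. Each baseline below gives an attribution that sums to f(x) - f(baseline), so each looks "valid", yet they tell opposite stories about which feature mattered. The function names and the midpoint-rule integration are assumptions for the sketch.

```python
def integrated_gradients(grad_f, x, baseline, steps=200):
    """Approximate integrated gradients along the straight path from baseline to x.

    Uses a midpoint Riemann sum; grad_f(point) must return the gradient as a list.
    """
    totals = [0.0] * len(x)
    for s in range(steps):
        alpha = (s + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = grad_f(point)
        for i, g in enumerate(grad):
            totals[i] += g
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


def model(p):
    """Toy model with a pure interaction between its two features."""
    return p[0] * p[1]


def model_grad(p):
    """Analytic gradient of model: d/dx1 = x2, d/dx2 = x1."""
    return [p[1], p[0]]


x = [1.0, 1.0]
even_split = integrated_gradients(model_grad, x, [0.0, 0.0])   # roughly [0.5, 0.5]
blame_second = integrated_gradients(model_grad, x, [1.0, 0.0])  # roughly [0.0, 1.0]
blame_first = integrated_gradients(model_grad, x, [0.0, 1.0])   # roughly [1.0, 0.0]
```

Same model, same input, same method, and the "most important feature" flips purely on a parameter the explainer chose.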

I think the situation with regulations will be similar to that with interpretability and explanations. There's a popular phrase that gets thrown around, that "there is no silver bullet" (perhaps most poignantly in AIX360's initial paper [0]), as no single explanation suffices (otherwise, would we not simply use that instead?) and no single static selection of them would either. What we need is to have flexible, adaptable approaches that can interactively meet the moment, likely backed by a large selection of well understood, diverse, and disparate approaches that cover for one another in a totality. It needs to interactively adapt, as the issue with the "dashboards" people have put forward to provide such coverage is that there are simply too many options and typical humans cannot process it all in parallel.

So, it's an interesting unsolved area for how to put forward approaches that aren't quite one-size-fits-all, since that doesn't work, but also make tailoring to the domain and moment tractable (otherwise we lose what ground we gain and people don't use it again!)... which is precisely the issue that regulation will have to tackle too! Having spoken with some people involved with the AI HLEG [1] that contributed towards the AI Act currently processing through the EU, there's going to have to be some specific tailoring within regulations that fits the domain, so classically the higher-stakes and time-sensitive domains (like, say, healthcare) will need more stringent requirements to ensure compliance means it delivers as intended/promised, but it's not simply going to be a sliding scale from there, and too much complexity may prevent the very flexibility we actually desire; it's harder to standardise something fully general purpose than something fitted to a specific problem.

But perhaps that's where things go hand in hand. An issue currently is the lack of standardisation in general: it's unreasonable to expect people to re-implement these things on their own given the mathematical nuance, yet many of my colleagues agree it's usually the most reliable way. Things like scikit had an opportunity, sitting as a de facto interface for the basics, but niche competitors then grew and grew, many of which simply ignored it. Especially with things like [0], there are a bunch of wholly different "frameworks" that cannot intercommunicate except by someone knuckling down and fudging some dataframes or ndarrays, and that's just within Python, let alone those in R (and there are many) or C++ (fewer, but notable). I'm simplifying somewhat, but it means that plenty of isolated approaches simply can't work together, meaning model developers may not have much choice but to use whatever batteries are available! Unlike, say, Matplotlib, I don't see much chance for declarative/semi-declarative layers to take over here, as pyplot and seaborn could, which enabled people to empower everything backed by Matplotlib "for free" with downstream benefits such as enabling intervals or live interaction with a lower-level plugin or upgrade. After all, scikit was meant to be exactly this for SciPy! Everything else like that is generally focused on either models (e.g. Keras) or explanations/interpretability (e.g. Captum or Alibi).

So it's going to be a real challenge figuring out how to get regulations that aren't so toothless that people don't bother or are easily satisfied by some token measure, but also don't leave us open to other layers of issues, such as adversarial attacks on explanations or developer malfeasance. Naturally, we don't want something easily gamed that the ones causing the most trouble and harm can just bypass! So I think there's going to have to be a bit of give and take on this one: the regulators must step up while industry must step down, since there's been far too much "oh, you simply must regulate us, here, we'll help draft it" going around lately for my liking. There will be a time for industry to come back to the fore, when we actually need to figure out how to build something that satisfies, and ideally, it's something we could engage in mutually, prototyping and developing both the regulations and the compliant implementations such that there are no moats, just a clearly better way to do things that ultimately would probably be more popular anyway even without any of the regulatory overhead; when has a clean break and freshening up of the air not been of benefit? We've got a lot of cruft in the way that's making everyone's jobs harder, to which we're only adding more and more layers, which is why so many are pursuing clean-ish breaks (bypass, say, PyTorch or JAX, and go straight to new, vectorised, Python-ese dialects). The issue is, of course, the 14 standards problem, and now so many are competing that the number only grows, preventing the very thing all of these were intended to do: refresh things so we can get back to the actual task! So I think a regulatory push can help with that, and that industry then has the once-in-a-lifetime chance to ride that through to the actual thing we need to get this stuff out there to millions, if not billions, of people.

A saying keeps coming back to mind for me: "all models are wrong, some are useful". (Interpretable) AI, explanations, regulations, they're all models, so of course they won't be perfect... if they were, we wouldn't have this problem to begin with. What it all comes back to is usefulness. Clearly, we find these things useful, or we wouldn't have them, necessity being the mother of invention and all, but then we must actually make sure what we do is useful. Spinning wheels inventing one new framework after the next doesn't seem like that to me. Building tools that people can make their own, but know that no matter what, a hammer is still a hammer, and someone else can still use it? That seems a much more meaningful investment, if we're talking the tooling/framework side of things. Regulation will be much the same, and I do think there are some quite positive directions, and things like [1] seem promising, even if only as a stop-gap measure until we solve the hard problems and have no need for it any more -- though they're not solved yet, so I wouldn't hold out for such a thing either. Regulations also have the nice benefit that, unlike much of the software we seem to write these days, they're actually vertically and horizontally composable, and different places and domains at different levels have a fascinating interplay and cross-pollination of ideas: sometimes we see nation-states following in the footsteps of municipalities or towns, other times a federal guideline inspires new institutional or industrial policies, and all such combinations. Plus, at the end of the day, it's still about people, so if a regulation needs fixing, well, it's not like you're trying to change the physics of the universe, are you?

  [0]: Vijay Arya et al. "One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques" https://arxiv.org/abs/1909.03012
  [1]: High-Level Expert Group on AI "Ethics Guidelines for Trustworthy AI"
  Apologies, will have to just cite those, since while there are some papers associated with the others, it's quite late now, so I hope the recognisable names suffice.

Thanks a lot. I love the whole XAI movement, as it often forces you to think about the cliffs, limits, and non-linearity of the methods. It makes you circle back to an engineering process of thinking about specification and qualification of your black box.

Thank you! Especially for the exhaustive reading list!!

