> Down the line, those needs include accessibility, a serious Achilles heel for imgui-flavored designs in particular.
Thank you for drawing attention to this. I spent a good chunk of last weekend pondering how accessibility might be added to IMGUI toolkits, and I do think it would be difficult. In particular, as you said:
> I believe a proper approach to this problem involves stable identity of widgets, about which much more below.
Agreed. In platform accessibility APIs such as UI Automation, each node has an identity ("runtime ID" in UIA) which is expected to be stable. One of the IMGUI toolkits I looked at was Nuklear, and I didn't come up with a way to derive stable identities for its widgets. On the other hand, Gio [1] (an immediate-mode toolkit for Go) looks more tractable, because the application holds a struct for each widget's state. Still, in an accessibility API like UIA, even simple static text nodes are supposed to have stable identity; I don't know how that would be solved with something like Gio.
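To make the Gio point concrete, here's a hedged sketch (not Gio's actual API, which is Go; names invented) of the idea: if the application retains a state struct per widget across frames, an accessibility layer could derive a stable runtime ID from that struct's identity, e.g. its address. This still doesn't answer the static-text question, since static text typically has no retained state struct.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-widget state that the application (not the toolkit)
// owns and keeps alive between frames, as in Gio's model.
struct ButtonState { bool pressed = false; };

// Derive a runtime ID from the identity (address) of the retained
// widget-state struct; it stays stable for as long as the app keeps it.
uintptr_t runtime_id(const void* widget_state) {
    return reinterpret_cast<uintptr_t>(widget_state);
}
```

The ID is stable exactly as long as the application keeps the struct around, which matches the lifetime an accessibility tree would want.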
I don't know much about other IMGUI systems, but Dear ImGui at least does have "stable identity of widgets" across frames. Unlike traditional UI systems, though, those identifiers are not created or stored by the UI system; instead, the client code pushes them into the UI system as strings and/or numeric IDs.
I would go as far as putting immediate mode UIs and reactive UIs into the same "mental model bucket". Both only describe the desired UI state (e.g. no separate "UI creation" and "UI updating" phases), and let an intermediate layer figure out the minimal required state changes to "realize" this desired UI state (whether this is exactly how the UI system works under the surface is mostly irrelevant to the API user).
The only (visible) difference between reactive and immediate-mode UIs seems to be that reactive UIs seem to prefer nested data structures, while immediate mode UIs prefer nested code blocks to describe the desired UI state.
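A minimal sketch of that comparison (all names invented; neither is a real framework API): both styles describe the same desired UI, and an intermediate layer can normalize either description into the same flat sequence to diff against the previous frame.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Reactive style: the desired UI is a nested data structure.
struct Node {
    std::string tag;                // e.g. "column" or "button:OK"
    std::vector<Node> children;
};

// Immediate style: the desired UI is a nested code block, recorded here
// as a flat sequence of calls.
struct Ui {
    std::vector<std::string> calls;
    void begin(const std::string& tag)  { calls.push_back(tag + "/begin"); }
    void end(const std::string& tag)    { calls.push_back(tag + "/end"); }
    void widget(const std::string& tag) { calls.push_back(tag); }
};

// The intermediate layer can flatten the data structure into the same
// sequence the immediate-mode calls produce.
void flatten(const Node& n, std::vector<std::string>& out) {
    if (n.children.empty()) { out.push_back(n.tag); return; }
    out.push_back(n.tag + "/begin");
    for (const Node& c : n.children) flatten(c, out);
    out.push_back(n.tag + "/end");
}
```

Whichever surface syntax you prefer, the layer underneath sees the same description of the desired state.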
> I don't know much about other IMGUI systems, but Dear ImGui at least does have "stable identity of widgets" across frames. Unlike traditional UI systems, though, those identifiers are not created or stored by the UI system; instead, the client code pushes them into the UI system as strings and/or numeric IDs.
Thanks for that info. I hadn't looked closely at Dear ImGui yet.
I'm probably missing something here, but is there a reason these toolkits don't require the dev to specify a stable, unique ID for each node? Feels like there's strong precedent with HTML's id attribute and CSS.
Because IMGUIs are designed to be procedural, making it easy to couple changes in UI with changes in control flow. Requiring an ID for every widget would get really annoying really fast, considering how often IDs would become invalid. The default is no IDs; ID assignment is left to the developer, who knows if and when an ID would be stable.
You can wrap tools like Dear ImGui and Nuklear into a reactive framework that handles state management and IDs and such and presents an API with a declarative interface a la React, but at that point you're pretty much building your own UI engine and just passing rendering on to those tools, which is not simple to architect.
Dear ImGui does have stable widget IDs; it's just not very obvious from the API, and often irrelevant to the API user. Usually the ID is derived from the label string, but you can make labels unique with a special '##' suffix, which is not displayed but only used to compute a unique hash. There's also a push/pop API for identifiers (often useful when creating lists of items whose labels are not guaranteed to be unique).
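For illustration, here's a toy sketch of that scheme (not Dear ImGui's actual implementation; the struct and names are invented): widget IDs come from hashing the label, seeded by the top of an ID stack that the caller pushes/pops around repeated items, so the same code path yields the same ID every frame.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy ID scheme in the spirit of Dear ImGui's: hash the label string
// (FNV-1a style), seeded with the top of an explicit ID stack.
struct IdStack {
    std::vector<uint32_t> stack{2166136261u};  // FNV-1a offset basis as root seed

    // Hash 'label' byte by byte, starting from the current stack top.
    uint32_t hash(const std::string& label) const {
        uint32_t h = stack.back();
        for (unsigned char c : label) { h ^= c; h *= 16777619u; }
        return h;
    }

    // Push/pop a scope seed, e.g. a loop index, so identical labels in a
    // list still get distinct IDs.
    void push(int i) { stack.push_back(hash(std::to_string(i))); }
    void pop()       { stack.pop_back(); }

    // Like "Label##suffix": the whole string is hashed, but a renderer
    // would display only the part before "##".
    uint32_t widget(const std::string& label) const { return hash(label); }
};
```

Since the hash depends only on the label and the pushed scope, the ID is stable across frames as long as the call structure is stable, which is exactly the client-side contract described above.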
Further, you're in no way required (nor would you be expected) to provide IDs for most HTML elements. In modern frontend development you often avoid styling elements by ID entirely, favoring classes instead.
I'm not sure I agree about the priority of accessibility.
First, most immediate-mode UIs tend to be more "vector based", so the UI actually has the possibility of scaling properly, unlike, well, basically everything else (offering 1.0x, 1.5x, and 2x, effectively reducing my 4K monitor back to 1920x1080, does not count as "scaling"). That's actually a really important "accessibility" feature that STILL doesn't work properly for almost everything.
Second, for accessibility features like tabbing and screen reading, every user and every programmer bears that overhead (which the author points out may have significant architectural implications) for what is a user base of zero for the vast majority of programs.
Finally, who died and made "tabbing navigation" a pronouncement from God? Why isn't "spatial navigation" a better choice, for example? Why isn't there something better than that?
In addition, why is it the job of the GUI toolkit to do accessibility? Yes, I understand why "accessibility" wants to attach to the GUI toolkits as the programmers will do the math and never implement it otherwise. However, that doesn't mean the GUI toolkits should necessarily accept that task without at least thinking about the implications of doing so.
I hate it when y’all give made up stats, like “99% of all websites should not be an SPA” or in this case, “Vast majority of websites do not need accessibility features”
What this article pointed out is that accessibility imposes a cost on programmers. At that point, a programmer is entitled to ask whether that cost is justified and whether he wishes to pay it.
This plays out in other areas. UC Berkeley was told that it needed to make public videos "accessible". That was going to be expensive, so they decided to simply remove the videos. This, of course, benefited no one because now nobody else could make the videos accessible either.
Accessibility isn't free and I wish people would quit acting like it is. We, as a society, choose to impose that burden because, on the balance, everyone needs accessibility to some degree as they age or if they get injured.
However, if nobody is willing to pay for it, don't be surprised if some people decide, like Berkeley, that exiting the arena is the better choice.
I disagree with you on pretty much every point and find your positions generally poorly thought out.
• All UIs are based around positioning and drawing things. There is no difference whatsoever between any paradigm in that regard: any UI can be scaled properly to whatever level you desire, it just needs to actually do it.
• Navigating through controls is far more common than you seem to imagine. On desktop platforms, a significant fraction of users (though still certainly a minority) will become extremely frustrated if your app doesn’t conform to platform norms; and it’s not just power users: in line-of-business apps, all the best interfaces use Tab for rapid navigation between fields. Mice are really slow as field navigation devices. On mobile platforms, this is more tightly integrated with the keyboard, so that in forms, the “Enter” key gets changed to “Next”. Again, you’ll frustrate a lot of people if you flout platform norms.
• Exposing an accessibility tree for things like screen readers, now that part is more commonly associated with overhead, especially on the web; I’ve seen one web app not do ARIA stuff by default, but have a checkbox in its settings to enable the accessibility tech, with the warning that it’ll make it slower. Can’t remember what app it was. The ideal might be to not build that until something requests it, but the web platform at least doesn’t provide the means of doing that at this time. I’m not familiar with the underlying protocols and whether native apps can work this way.
• Linear field navigation matches how people think most readily and how almost all apps work, and is a long-standing convention across all platforms, whether done with a Tab key or by other means. It’s what everyone is used to, which means you’d better have a really good reason if you decide to disregard it, and provide an obvious alternative. The most common alternative to linear field navigation is 2D spatial navigation with arrow keys, used in things like spreadsheets (augmenting Tab) or games (typically supplanting Tab). Gaming platforms tend to replace Tab navigation with this 2D navigation with arrow keys or a gamepad. But for most apps, 2D field navigation doesn’t tend to be as convenient as 1D.
• I’m baffled about who you think should provide for accessibility if not the GUI library. If you are saying “let the end developer do it”, I respond: are you serious? That would have the end developer duplicate basically everything from the GUI library, so people would abstract that into an accessibility library that wraps the GUI library clumsily, then hey, let’s merge the two so it’s not a pain, and now we’re back where we started. If that’s not what you’re saying, then I’m baffled, because those are the only two options I can see—unless you would have the screen reader do OCR on what’s on-screen and throw AI at it to guess how the app will work!
> The ideal might be to not build that until something requests it, but the web platform at least doesn’t provide the means of doing that at this time. I’m not familiar with the underlying protocols and whether native apps can work this way.
Native accessibility APIs do allow this. Chrome, for example, doesn't build accessibility trees unless it detects that an AT is actually consuming them. But you're right that the web platform itself doesn't allow this. The concern is that websites could then discriminate against people with disabilities, or offer misguided "alternative" versions. Some would probably also argue that if you're working with the grain of the web platform and not against it (i.e. semantic HTML and not too much JS), you get accessibility at no additional cost. I'm more pragmatic than that.
> unless you would have the screen reader do OCR on what’s on-screen and throw AI at it to guess how the app will work!
This may be the only way that the long tail of applications using custom toolkits will ever be accessible, especially considering the responses I get on threads like this one. And the VoiceOver screen reader in iOS 14 is actually doing this. It kind of sucks though that we have to burn battery power to reconstruct the UI semantics that are already there inside the application.
Basically, your examples are about how to navigate a text-based application--mostly web and CRUD.
What about a PCB layout program? What about Blender and animation? What about video editing? What about a 3D modeler?
If you are using an immediate-mode UI, you probably aren't making a text-based CRUD system--especially as immediate-mode UIs tend to be remarkably bad at text rendering.
A UI toolkit doesn't magically make every application accessible. If the aforementioned applications are to be accessible, that application has to change in major ways.
Not everything is text.
Forcing everything through that narrow lens is an impediment.
Neither Raph nor I were ever saying that the UI toolkit can magically make everything accessible; rather, we were saying that it’s roughly impossible to make an application accessible without support from the UI toolkit. These are very different things. (I’m afraid the terminologies we were all using contained some ambiguities that led to us talking at cross-purposes. I was always talking about framework concerns, since this whole discussion is all about the framework.)
Any fancy components will still need to be able to notify any accessibility tech what they are.
I contemplated saying more about other types of programs like those you mention, but decided that they’re really no different, beyond Tab being far less useful and often justifiably repurposed. The “canvas” parts may easily be just a black box that isn’t exposed to accessibility tech, but all of the rest of the UI around it (menus, configuration panels, property sheets, &c.) will still need to be exposed properly.
Also, a good GUI framework should help the application developer decompose even a highly custom UI for a niche, non-CRUD application into widgets. As part of that, it should help the developer implement accessibility for those widgets. It's not hard to do much better than Win32 here, and probably not hard to do better than Cocoa as well.
[1]: https://gioui.org/