As of a recent release, egui pushes out some events that I'm able to consume and provide TTS feedback for. You can find a simple Bevy example with a bit of event coverage here:
In my game, I'm able to use this for buttons, checkboxes, and single-line text edit fields. Sliders are a bit wonky but I'm hoping to eventually get those working better as well. Much of this development is needs-driven for me, and as of now I only need these few widgets. I certainly wouldn't complain if coverage improved, though. :)
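Roughly, the consuming side looks like this (just a sketch – the exact names have moved around between egui versions, and speak() stands in for whatever TTS layer you're using):

    // Call once per frame after building the UI. Each output event carries a
    // WidgetInfo describing the widget that was activated, focused, or changed.
    fn announce_ui_events(ctx: &egui::Context, speak: &mut impl FnMut(&str)) {
        let output = ctx.output();
        for event in &output.events {
            use egui::output::OutputEvent::*;
            match event {
                Clicked(info) | FocusGained(info) | ValueChanged(info) => {
                    // description() yields text like "Click me: button"; you can
                    // also assemble your own string from info's fields instead.
                    speak(&info.description());
                }
                _ => {}
            }
        }
    }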
Before anyone jumps on me about screen readers not being the accessibility end game, I do know that. But as a totally blind developer myself, I can't easily work on solutions for low vision, high contrast, etc. And for my needs (simple keyboard navigable game UIs for games targeted at blind players) it works well enough, and should hopefully get better soon.
For true cross-platform accessibility with integrations into native APIs, we'll need something like https://accesskit.dev.
> my needs (simple keyboard navigable game UIs for games targeted at blind players)
Huh... I am surprised that, with that target, you are even using a UI library--and then dealing with its resulting level of accessibility--rather than building something more akin to an IVR tree out of direct usage of text-to-speech and keyboard APIs.
> Each node has an integer ID, a role (e.g. button or window), and a variety of optional attributes. The schema also defines actions that can be requested by assistive technologies, such as moving the keyboard focus, invoking a button, or selecting text.
This sounds very similar to what I'm using for my Semantic UI project (which has similar aims).
Accessibility systems require the ability to programmatically interact with the UI, too (install Accerciser if you're on an AT-SPI2-based system to have a play around); I'm not sure how your system supports typing. (Is it all done via Action::ReplaceSelectedText?)
Also, have you thought about latency? AT-SPI2 is really laggy (“bring down your system for several seconds at a time” levels of laggy), and from a cursory inspection AccessKit looks even heavier.
I'd like to know more about the Semantic UI project.
The way text input is implemented depends on the user's platform and input needs. When using a screen reader with a hardware keyboard, the screen reader will often use the accessibility API to programmatically move the keyboard focus, but once the focus is in a text input control, the input itself happens as usual, not through the platform's accessibility API. For users who require alternate input methods such as speech recognition, it depends on the platform. On Windows, for instance, text input isn't even done through the accessibility API; it's done through a separate API called Text Services Framework. But AccessKit will offer the ReplaceSelectedText action for platforms that can expose it.
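To make that concrete, the app-side handling would look something like this (purely illustrative stand-in types, not AccessKit's actual Rust API, since that part hasn't shipped yet):

    // Stand-in types for illustration only. The point is that the only text
    // *content* arriving via the accessibility API comes from alternate input
    // methods; ordinary hardware-keyboard typing never takes this path.
    enum Action { Focus, Default, ReplaceSelectedText }
    struct ActionRequest { action: Action, target: u64, text: Option<String> }

    fn handle_action(req: ActionRequest) {
        match req.action {
            Action::Focus => move_keyboard_focus(req.target),
            Action::Default => activate(req.target), // e.g. press a button
            Action::ReplaceSelectedText => {
                // Used by e.g. speech recognition on platforms that expose it.
                if let Some(text) = req.text {
                    replace_selection(req.target, &text);
                }
            }
        }
    }

    // Stubs so the sketch stands alone.
    fn move_keyboard_focus(_id: u64) {}
    fn activate(_id: u64) {}
    fn replace_selection(_id: u64, _text: &str) {}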
I have certainly thought about latency; as a Windows screen reader developer, it has been a difficult problem for a long time. The relevant factor here is not the amount of information being pushed, but the number of round trips between the assistive technology (e.g. screen reader) and the application. If I'm not mistaken, this is what makes AT-SPI problematic in this area. This has also been a problem for the Windows UI Automation API, and a major focus of my time on the Windows accessibility team at Microsoft was to help solve that problem. As for AccessKit, I'll refer you to the part in the README about how applications will push tree updates to platform adapters. Since a large tree update can be pushed all at once, AccessKit doesn't make the problem of multiple round trips any worse.
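To sketch the difference (again with made-up stand-in types rather than AccessKit's real ones):

    // Illustrative stand-ins, not AccessKit's actual types.
    struct Node {
        id: u64,
        role: &'static str, // e.g. "button", "window"
        label: String,
        children: Vec<u64>,
    }

    struct TreeUpdate {
        nodes: Vec<Node>,   // every node created or changed since the last push
        focus: Option<u64>, // current keyboard focus, if any
    }

    // The application batches all of a frame's changes into a single update and
    // hands it to the platform adapter in one call, rather than the assistive
    // technology issuing one round trip per property per node.
    fn push_to_adapter(send: &mut impl FnMut(TreeUpdate), changed: Vec<Node>, focus: Option<u64>) {
        send(TreeUpdate { nodes: changed, focus });
    }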
> The relevant factor here is not the amount of information being pushed, but the number of round trips between the assistive technology (e.g. screen reader) and the application. If I'm not mistaken, this is what makes AT-SPI problematic in this area.
That explains a lot! AT-SPI2 has, as you say, a lot of round trips – and some applications (e.g. Firefox) seem to use a blocking D-Bus interface that means they drop X events while talking to the accessibility bus.
> I'd like to know more about the Semantic UI project.
I don't think it qualifies for a definite article just yet. :-) I got annoyed with the lack of good, lightweight, cross-platform GUIs in Rust, and I tried to make my own, but then faced the same issue with accessibility APIs… so now I'm trying to solve both problems at once: defining a schema and interaction protocol for the semantics of a user interface, as a first-class citizen – all the information needed to construct a GUI interface would be present in the “accessibility data”, but in principle any kind of UI could be generated just as easily. (Of course, a GUI auto-generated from the SUI data would be like a CSS-free webpage; I'm planning to make a proper GUI library too, later.)
There are three types of thing in the schema I've got so far (there's a rough code sketch after the list):
• “Widget type” – basically a role. Each widget has exactly one widget type, which implies a certain set of features (e.g. section-with-heading has a heading)
• “Feature” – a group of attributes with a semantic meaning (e.g. the heading feature consists of a reference to the heading widget, which must in turn have the feature providing its natural-language representation). I'm not sure how to deal with stuff like “can be scrolled”, because I still haven't finished bikeshedding things like “should there be implied zero-size features, or should widget types just have a lot of semantics, or should there be a load of explicit-but-redundant mandatory features on every button widget saying it can be pressed?”. (I'm leaning towards the last of those now, under the assumption that simplicity is better than trying to reduce bandwidth.)
• “Event”. Every change to the state of widgets is accompanied by an event. There are separate events for semantically different things even if the same thing happened; for instance, when LibreOffice Calc deletes table cell widgets that have gone off-screen, the widgets have been deleted but the actual cells are still there; that's a different thing to what happens when someone deletes a worksheet, so it should have a different event. This makes SUI retained-mode, but it should be usable with immediate-mode UIs in the same situations as AccessKit is.
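In Rust-ish terms, the rough shape I'm circling around is something like this (very much a sketch; every name here is still up for bikeshedding):

    type WidgetId = u64;

    // A role; each widget has exactly one, and it implies which features
    // the widget carries.
    enum WidgetType {
        Button,
        SectionWithHeading,
        TableCell,
        // ...
    }

    // A group of attributes with one semantic meaning.
    enum Feature {
        Heading { widget: WidgetId },  // reference to the widget acting as heading
        Label { text: String },        // natural-language representation
        Pressable,                     // the explicit-but-redundant option
        // ...
    }

    struct Widget {
        id: WidgetId,
        widget_type: WidgetType,
        features: Vec<Feature>,
    }

    // Every state change is accompanied by one of these; semantically different
    // changes get different events even when the effect on the graph is the same.
    enum Event {
        WidgetAdded { widget: Widget },
        WidgetScrolledOut { id: WidgetId }, // e.g. table cell culled off-screen
        WidgetDestroyed { id: WidgetId },   // the underlying thing really is gone
        FeatureChanged { id: WidgetId, feature: Feature },
    }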
I haven't worked out how to represent “alternate interface interacts with program” yet, but I'm leaning towards a second kind of event, with the set of valid user events (and hence what the alternate UI “looks” like) determined by the
Another question is how to represent cursors. Obviously there should be co-ordinate-positional (mouse-like) cursors and cursors over the widget graph, but keyboard-driven GUIs don't behave like either of those things… so do I just let the alternate interface deal with cursors? But then how does the application know what's currently selected by the cursor? (Focus, hover, select… all with different semantics, not all of which I'm aware of.) Maybe SUI should just keep out of that completely, and pass through a cursor ID and various events without trying to fit them to a model?
You can tell I'm not very good at this; if I'd heard of AccessKit earlier than a week into this project, I wouldn't've started it! :-p
Since pretty much every OS supports Unix Domain Sockets, I intended to use that as the communication protocol. Backends for existing accessibility systems (e.g. AT-SPI2, IAccessible2) were planned as daemons, but to be honest I don't know enough about IPC to have planned this out properly, and I haven't really got attached to any one architecture. I don't even know that that would work properly; IAccessible is a COM interface, and afaik Windows cares strongly about which process provides those.
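The socket end of it is simple enough; the bridge daemon skeleton would be roughly this (illustrative only – the socket path and newline-delimited framing are placeholders):

    use std::io::{BufRead, BufReader};
    use std::os::unix::net::UnixListener;

    // Applications connect to a well-known socket and stream serialized updates;
    // the daemon would translate them into AT-SPI2 / IAccessible2 calls.
    fn main() -> std::io::Result<()> {
        let listener = UnixListener::bind("/tmp/sui.sock")?; // placeholder path
        for stream in listener.incoming() {
            let stream = stream?;
            std::thread::spawn(move || {
                let reader = BufReader::new(stream);
                for line in reader.lines().flatten() {
                    // Deserialize the update here and forward it to the
                    // platform accessibility API.
                    eprintln!("got update: {line}");
                }
            });
        }
        Ok(())
    }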
I thought the amount of information was a factor in IPC latency (even though computers can download GBs of data over a network in seconds), so I've been distracting myself with trying to “lazy-load” lots of the data. If you're right about latency – which you probably are – then that's worse than useless, and I should just push everything up front instead.
A final question is: how to deal with reading order? I have no answers to this.
It basically tells you what you clicked on, so for example "Widget Gallery: checked checkbox" or "Click me: button" or "width: slider, 250".
It also reads these things out when you tab through the interface (which mostly works). Obviously a bit rough around the edges, but I've seen far worse.
Yeah, I believe accessibility is not addressed at all. Similarly, when compiled for the web, it would be great if the UI actually behaved as text (an actual web page), could be copied to the clipboard, etc.
Why is everyone concerned with accessibility on a small GUI tool? Someone makes a cool project, and Hacker News always shows up asking how it serves this very small demographic.
I think I get what you mean: the percentage of people who access the web using a screen reader is not huge. That doesn't mean we should discount them; it's just common courtesy, and it's a legal requirement in many circumstances.
But if you take one step back, the demographic is not very small at all! It includes everyone who is a bit older and maybe can't see as well as when they were 23, so they like to increase the contrast a bit. I myself am not very far above 23, and I read Hacker News zoomed in to 130%; it's just more readable. Many people have an easier time using a computer by, for example, making buttons larger so they're quicker to hit with a mouse. There are more examples.
And last but not least: writing functional UI tests for applications is enabled, in good part, by the accessibility capabilities built into the GUI toolkit. My job would be much harder if that weren't baked into most UIs.
So, while I agree with you that people should get to have fun playing with technology and create new things without having to implement everything from the beginning, I don't think it is fair to just dismiss accessibility concerns like that.
A large percentage of users benefit from good accessibility features. These don't just include screen readers, but also things like being able to change colors and font sizes (manually, or to presets designed to help people with deteriorating vision or colorblindness, etc.).
Most people's vision and hearing go to hell at some point. Practically all older people can benefit from a11y features—whether they know they're there, and know how to enable them, is another matter.
Folks ask because everybody really really really wishes there was a GUI tool that made accessibility concerns easy. And so when they ask, they are asking: Is this finally "The One"??
And why do folks care about accessibility so much? Because it instantly makes everybody's life easier, and prevents you from needing to reinvent the wheel for a whole host of features.
I don't have any disabilities, but I love it when accessibility is done right anyway. Examples:
- I browse the web at 175%
- I prefer keyboard navigation whenever possible
- I like knowing what something is about to do before it does it, even if I'm not hovering my mouse (see "keyboard navigation", or "touch screen input")
- My connection is often unreliable, and I like knowing what images are supposed to be and, for that matter, being able to simply read text content (which some websites manage to break)
- tools with good accessibility, whether on the web or native, are almost always easier for me to write automations for, because they "follow the rules" and thus have reliable hooks for automation
Those are just a few of the reasons why, as a person who doesn't need accessibility tools, I am always eager to know how well a GUI tool handles accessibility questions.
Well, it's a somewhat important and key feature for these kinds of projects. It's not much different than asking "does it support checkboxes" or "how does it handle scaling?"
Asking about accessibility is probably a good way to discern whether a GUI is usable for a particular purpose where accessibility is needed. It's an interesting feature to have for many. I don't ever read it as a way to disparage the GUI (GUIs, after all, are quite hard to implement - any experimentation, no matter how playful, is probably more than welcome in this space).
The point of accessibility is not minorities, but taking all users in various circumstances into account. Sometimes you may yourself be on a device where images do not show and you need alt text, or where you need to resize fonts to be able to read text.
Totally. I am dyslexic and a frequent user of my system's text-to-speech vocaliser, as well as a big proponent of immediate-mode graphics. Every single time this topic comes up, people who often neither use accessibility features nor program immediate-mode graphics make this point as a cheap put-down. Sure, this is a downside of the paradigm, just like hovering dropdown menus are hard to always get right in immediate-mode layouts. In both cases there are ways it can be mitigated. This point has been made so many times that I've got my cheap comeback ready: here is a blog post by the web agency of the W3C (giggle) explaining why they had to create their own CMS because existing solutions don't have good support for accessibility (giggle): https://w3c.studio24.net/updates/on-not-choosing-wordpress/
Looking at the custom widget example didn't leave me with the impression that it has that in mind, but I could be mistaken.