My Linux Accessibility Wishlist

Category: Accessibility

All the improvements I want to see on my computer

In my previous article, I wrote about my first experience trying to add screen reader support to my game engine. It got a lot of support, for which I’m very thankful. It also came out the same day as this one from my friend Hari. I think it’s time to follow up on those two banger posts with something else that’s important to me. Screen readers are important for those who need them, but I don’t. What sorts of things do I wish I could do on my computer?

What can I do already?

If you don’t know me, I’m legally blind. I have some vision left - enough to be independent, but I need a lot of help reading things or finding my way around places I’m new to. I recently met Hari Rana in real life, and while he didn’t get to see me code, we both got to experience my blindness where it’s most challenging.

This is all to say: I’m not desperate - but I’m also not fine with the way things are; my challenges are just different from the usual “using a computer without sight” issues most people talk about or even think about.

I can zoom into the screen; that’s how I’m writing this. I can, to an extent, read the text I’m writing. It’s very blurry and some letters look like others, but I can technically read it. I can* play some video games, I can get work done, and I can write software. I can do it so well that KDE hired me and I’m currently working on a shape-preserving image upscaler for KWin’s screen magnifier.

It’s just slow. And I don’t enjoy it.

Where do I struggle?

Here are some of the things I absolutely hate needing to do on my computer, and why.

I hate troubleshooting things.

Whether it’s my code, your code, desktop code, system software, firmware, the kernel, I couldn’t care less. I hate needing to read giant walls of text just to figure out why my computer froze for a minute, randomly rebooted, can’t run Minecraft above 10 FPS despite having the hardware to do it, why my friend can’t hear me in a voice call, why my screen share looks like the resolution of a Commodore PET, or even why KDE won’t compile today (because I broke it).

I do not want logs read to me aloud. It’s just too much information to process. That’s if the issue is obvious in the current batch of logs. What if I don’t know what’s actually causing the problem? Maybe Plasma crashed because my firmware decided “I need to conserve power, this thing needs to sleep” and something didn’t expect that to happen. I don’t know, because I have to read logs to find out.

Assuming I do find the problem, that’s great - now I need to figure out who to blame and where to write up a bug report. I have to figure out what actions I need to take to reproduce the problem. If it’s potentially hardware related, I need to see if other people with similar or different hardware can reproduce the issue. Well, I don’t need to do these things, but I know the information I’d need if I were on the other side of this table - so I try to gather it for my fellow programmers.

But I can barely read without help.

I don’t like trying new things

You made a new desktop, compositor, app, feature, whatnot. Great! I’m proud - but I likely can’t use it.

I’d love to try a tiling compositor, but many of them don’t have zoom - or it doesn’t work the way I need. I’d love to try other distributions (and did try Fedora), but then I need to figure out how to actually install it.

This is such a problem for me that it affects my ability to try new hobbies. I want to try music production, but “visual impairment friendly DAW” is an oxymoron. Learning my way around new interfaces is like trying to get me to walk outside at night: even with bright street lighting, I’ll still trip over the invisible bush.

I’m not saying that everyone is bad at writing accessible software, I’m saying that I need to make sure my needs are met before I even think of trying new things - because if they’re not* (if I can’t zoom into my screen, and have a dark theme with good contrast), then my device is unusable until I can turn certain things on. If I don’t know how to do that, or need to go through menus or accept prompts to do it, then now I need to get someone else to do it for me - assuming one of us knows where to go.

Plasma, a while ago, added a confirmation dialog for double-clicking a theme in Global Themes. If I weren’t already familiar with the layout of Plasma and Breeze Dark weren’t on by default, I couldn’t turn it on without help, and therefore couldn’t use the computer without help.

I don’t want to fill out a checklist just so I can try new things.

I can’t be as secure as you.

I can’t put long passwords on my user accounts on my computers, because I can’t see what I’m typing.

I shouldn’t ever need my phone to log into things, because I can barely read the screen on it.

I need to keep my password wallets unlocked.

I need other people to help me complete CAPTCHAs.

I need software to behave like a keylogger sometimes.

I need sudo to not prompt me for a password every time I need to do something administrative.

I write dodgy software that does terrifying things over USB to devices whose manufacturers refuse to support Linux, because those devices are assistive technology for me and I need access to them. Even if it’s just a gaming headset that has zero-latency sidetone so I can hear myself typing, I need access to that - I can write code that gives me that access, but I need the ability to do it without root.

I can’t see my damned login manager! (SDDM)

I’m tired of trying to be secure, but I try - because I need to be.

I don’t like using the terminal

No, it’s not faster for me. Terminals are text. We’ve established I struggle to read.

Also, as far as I’m aware, TUIs generally don’t have accessibility trees. Not even a screen reader would save me there. I’ve asked other blind people how they do it, and I still don’t understand it.

So no, vim won’t solve my problems. It’ll cause more.

I struggle to read man pages, some of you guys use evil color schemes for error messages, some of you in the server space like to spam my screen with ASCII art, others go even further and start flooding the screen with ANSI art, a lot of you print things weirdly, some of you proprietary software developers don’t say anything useful, and the worst of all - many of you stay completely silent when the task is successful. What’d I just…do? …Oh, no…

I tried to help my friend fix his firewall. I accidentally blocked all traffic in and out of his computer. It was a baremetal server in a datacenter. It was a whole thing. Had to get a guy to run over and hook up a KVM. Not fun.

I’ll reiterate: I don’t like using the terminal.

I don’t like using the GUI either.

Plasma is great.

Where it isn’t, they pay me to make it great.

Libadwaita is not. Buttons aren’t in the same place, visual indicators are different, the save and open dialogs work differently, it doesn’t properly integrate with or respect some of my important settings in KDE, hitboxes are different, window borders are different, modals are hard for me to distinguish from the background window they’re physically bound to. In some cases, but not others, I can’t move them to somewhere with better contrast. Some of these are things GNOME can solve, and that’s great - but their goal isn’t to fit in with KDE, and that’s fine, but it’s a problem for me. I need to have muscle memory. That’s what helps me enjoy my computer.

While I did just dress down Libadwaita, I want to be very clear what I’m actually trying to say here. The actual problem is that different programs use different toolkits that do not fit in with Plasma; it’s just that I typically only use Qt-based or Libadwaita-based applications. There are ways many of these issues can be addressed.

I also see the Web as just another toolkit that doesn’t fit in with the rest of my system’s UI. The problem is even worse there, since no layout or color guidelines are actually enforced from site to site or app to app. The exact same problem exists on mobile, and in game development, as well. If you still want to look at this in a negative light, then my statement is: You’re all bad; none of you work the same on my computer. Let’s fix that.

Where I want to see improvements

I’m done dressin’ down the entire software industry’s accessibility for the visually impaired. Most of you reading this are already well aware there’s a problem that needs to be fixed, and we’re all already deep in the trenches. Awesome. I’m happy to be a part of it too. How would I solve these problems?

Embracing Portals for Usability

Portals are a solution to a security problem on modern devices. How do we let apps have access to things without letting malware have access to them, and how do we make it so the user gets to say “I want this” or “No, please go away”?

If you don’t already understand what a portal is or what it’s trying to do, you use them on your phone without realizing it. Any time you use an app, and it asks if it’s allowed to access your photos and music when you want to upload something, you’re seeing the concept of a portal.

Some of them are orange, some are blue. It really depends on the desktop environment, or whether or not your portal of choice was manufactured by Aperture Science. My favourite portal is the KDE portal, because I can make it red.

Jokes aside, portals are a de facto standard now. If you’re a Flatpak and don’t want to break the sandbox, you pretty much have to use portals. They’re also a necessary part of how Wayland works: portals are how you communicate with the desktop without having to speak that desktop’s language, just like Wayland protocols are how you do the same with the display server.

The intent behind portals is to enforce security. But an interesting side effect comes from them. The best example of this is the file chooser portal. Its intent is to allow apps to open and save files without having full access to the user’s drives. If you want to open and edit a text file, the only thing the app should need access to is that specific file. It doesn’t need to see your password backups. File Chooser Portal makes that possible.

…It also puts the desktop in control of what “Open File” and “Save File” dialogs look like and how they behave. Remember how I said I hate having to use non-KDE apps because they don’t fit in well with my workflow? This is how you solve that! I want more of this! Color picker portals! Message dialog portals! Print and Page Setup portals! More! More! PLEASE!

I want that, because it puts KDE in control of how those UI elements look. If I find them hard to use, oh well - but it’s KDE’s fault, not yours.

I want to see everyone stop reinventing “Are you sure?” dialogs and such. Just let the desktop provide this. That’s how it works on Windows and macOS, and it’s a blessing for muscle memory.

I’m not saying you shouldn’t have the option to reinvent a color picker if your specific app needs to do so. Maybe the color picker needs to have some specific way to enter a color, or maybe you’re…writing a desktop environment. But I am saying that I’d rather use the file choosers in the KDE portal.
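To make the file chooser portal less abstract, here is a minimal sketch in Python of what a request looks like from the app's side. The bus name, object path, and OpenFile method come from the XDG Desktop Portal specification. The helper name build_open_file_call, the choice of the gdbus command-line tool, and the use of json.dumps as a quoting shortcut for simple titles are assumptions made for illustration. The sketch only builds the command; it does not run it.

```python
import json
import shlex

# Well-known names from the XDG Desktop Portal specification.
PORTAL_BUS_NAME = "org.freedesktop.portal.Desktop"
PORTAL_OBJECT_PATH = "/org/freedesktop/portal/desktop"
OPEN_FILE_METHOD = "org.freedesktop.portal.FileChooser.OpenFile"


def build_open_file_call(title, parent_window=""):
    """Build a gdbus command asking the desktop to show its own Open File dialog.

    Hypothetical helper for illustration. json.dumps is used as an
    approximation of GVariant string quoting, which is an assumption
    that should hold for plain titles.
    """
    return [
        "gdbus", "call", "--session",
        "--dest", PORTAL_BUS_NAME,
        "--object-path", PORTAL_OBJECT_PATH,
        "--method", OPEN_FILE_METHOD,
        json.dumps(parent_window),  # s: parent window identifier, empty means none
        json.dumps(title),          # s: dialog title
        "{}",                       # a{sv}: no extra options
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so this works without a session bus.
    print(shlex.join(build_open_file_call("Open a file")))
```

The call itself only returns a request handle. The chosen file arrives later, in a Response signal on that handle, after the desktop has shown its own dialog. The app never draws the dialog, which is exactly the point.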

Accessible compositor, everywhere

Windows lets me zoom into the login screen.

GDM lets me do it as well.

SDDM doesn’t.

Sometimes I need to do that.

I’d like Plasma Display Manager to be able to let me do it.

My only stipulation is it must not require a visual toggle. I don’t want to ask for zoom after logging in the first time, I want to be able to zoom into my screen with a familiar shortcut (<Meta>= on Windows and Plasma, please don’t be esoteric like GNOME or macOS). I may not be logging into the correct session, and I need to be able to zoom in to check that or change it. If it is physically possible to render a user interface with a compositor, even if for just 2 seconds while I log into my PC or 2 minutes while I figure out why I can’t, it’d be a huge help.

A major problem I had with X11 back in the day was that I couldn’t use most window managers or desktop environments because most of the window managers for X11 aren’t also compositors, none of the standalone compositors for X11 support the effects I need, and neither KWin nor Compiz can run alongside another window manager and just provide compositing for it. In practice this meant I could only ever use Plasma with KWin or MATE/XFCE with Compiz.

Wayland solves this in practice by letting the window manager be in control of rendering the entire screen. This means that I can wear my accessibility engineering hat, write a fancy upscaling shader that preserves sharp text at the cost of some slightly weird curves here and there, and your app doesn’t need to concern itself with it.

Since Wayland compositors need to be…compositors, your desktop environment must have one. That’s great. While it means I’m stuck using Plasma if I want KWin, it also means that I’ll always have KWin behaving how I expect, so long as I use Plasma.

I don’t have that luxury with my login manager, any kind of system rescue tools, my UEFI firmware, or my boot manager. In some cases you can’t solve that, but GNOME already lets me zoom into its login screen and I think every desktop environment needs to be like that.

Better text rendering

Linux font rendering is horrible.

You can read it.

It’s still horrible.

Horrible.

Horrible

It’s blurry. It’s darker than it should be. It’s hard for me to easily spot typos. It strains my eyes after a while. All of this is because I’m zoomed into the screen and the desktop is being upscaled from 1080p to roughly 80 times that. Worse is that the only reason I’m zoomed in this far is because zooming out makes it blurrier. (And not because of my vision loss.)

The screenshot above is also the best I can get. Near-white text (you wouldn’t know it) on a near-black background, the blue highlight in Kate is also of adequate darkness. Text isn’t blocky, the kerning is fine, the font is theoretically very legible to me (on other platforms, it is), and the anti-aliasing is grayscale.

The problem is that only KDE apps seem to respect those settings. The other day, I was livestreaming and someone complained of the text in my IDE looking rainbowy. It’s because the JetBrains IDEs have their own font rendering settings and don’t respect what I have set in Plasma. Subpixel AA fundamentally does not work with screen magnifiers: it introduces rainbow artifacts, so you cannot render fonts like that if someone is zoomed into the screen. We should be telling applications how to render fonts. Or, even better, this should be something the Wayland compositor has control over. This would allow me, in the zoom effect, to force you to render text at 100x the font size so I don’t have to waste GPU time on upscaling 10pt JetBrains Mono while also making sure I don’t upscale the AA noise.
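The rainbow effect can be shown with a toy model. The sketch below is a deliberate simplification in Python, not how FreeType, JetBrains, or KWin actually render text: it draws one edge of a white glyph on a black background, once with per-subpixel coverage and once with grayscale coverage, then magnifies both by repeating pixels. All function names are hypothetical.

```python
def render_edge_subpixel(edge, width_px):
    """White-on-black edge where each of a pixel's R, G, B stripes is lit separately.

    `edge` is the glyph boundary measured in subpixels (3 per pixel).
    """
    return [
        tuple(255 if px * 3 + channel < edge else 0 for channel in range(3))
        for px in range(width_px)
    ]


def render_edge_grayscale(edge, width_px):
    """Same edge, but coverage is averaged per whole pixel and shared by all channels."""
    row = []
    for px in range(width_px):
        covered = min(max(edge - px * 3, 0), 3)  # covered subpixels in this pixel
        value = round(255 * covered / 3)
        row.append((value, value, value))
    return row


def magnify_nearest(row, factor):
    """Magnify by repeating every pixel `factor` times, like a naive zoom."""
    return [pixel for pixel in row for _ in range(factor)]


def fringe_width(row):
    """Count pixels whose channels differ, i.e. pixels that look colored."""
    return sum(1 for r, g, b in row if not (r == g == b))


if __name__ == "__main__":
    for name, render in (("subpixel", render_edge_subpixel),
                         ("grayscale", render_edge_grayscale)):
        zoomed = magnify_nearest(render(4, 3), 8)
        print(name, "colored pixels after 8x zoom:", fringe_width(zoomed))
```

At 1x, the single colored pixel lines up with the physical stripes of an LCD and reads as a sharper edge. Once a magnifier repeats it, it becomes a solid colored block, while the grayscale edge only becomes a wider gray step.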

Seriously, I’ve written my own font atlas library, I’m writing that aforementioned upscaler, I do know what I’m doing when it comes to making text legible. I (as a compositor dev) need to be able to dictate whether you’re permitted to use subpixel AA, grayscale AA, hinting, how much hinting, whether you’re allowed to do AA at all, and at what DPI you must render fonts. In fact, Windows (in most cases) does text rendering on its own, allowing Magnifier to do exactly what I just said I want to do. Windows will only fall back to pixel upscaling when it has to, because it makes things blurry.

I don’t want to instruct each toolkit individually, as a user, that I need grayscale AA. As a user, I want to be able to go into my desktop settings and say “I want grayscale AA,” and all my apps listen. Some of the rogue ones, like video games, might not listen - but oh well, that’s what the fallback upscaling is for.

Specifically, I want one of two things:

  1. I want control over text rendering. If GTK wants to render text, it tells KWin to render that text.
  2. I want to be able to tell each app that the user is zoomed in, and by how much, and that the app should render text at higher font sizes accordingly.

I don’t want to break layouts. This isn’t about making UIs appear larger, it’s about giving me higher-resolution surfaces that don’t need to be upscaled.
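A rough sketch of the second option, in Python, shows why layouts survive. The idea is similar in spirit to how Wayland already separates a surface's logical size from its buffer scale for HiDPI, but everything here is hypothetical: no such zoom hint exists today, and the names Surface, ZoomPlan, and plan_zoomed_buffer are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Surface:
    """What the app lays out: sizes in logical units, untouched by zoom."""
    logical_width: int
    logical_height: int
    font_size_pt: float


@dataclass(frozen=True)
class ZoomPlan:
    """What the app would actually rasterize for a given zoom factor."""
    logical_size: tuple
    buffer_size: tuple
    font_raster_pt: float


def plan_zoomed_buffer(surface, zoom):
    """Keep the layout as-is, but rasterize into a buffer `zoom` times denser.

    Hypothetical: assumes the compositor can tell the app its zoom factor.
    """
    return ZoomPlan(
        logical_size=(surface.logical_width, surface.logical_height),
        buffer_size=(round(surface.logical_width * zoom),
                     round(surface.logical_height * zoom)),
        font_raster_pt=surface.font_size_pt * zoom,
    )


if __name__ == "__main__":
    print(plan_zoomed_buffer(Surface(800, 600, 10.0), 4.0))
```

The window is still 800 by 600 as far as layout is concerned. Only the pixels behind it get denser, so the magnifier can show real glyph detail instead of stretching a small bitmap.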

I want more accessible terminals.

I want terminals that let me highlight text and read it aloud.

I’d like to be able to write TUIs that provide accessibility trees to screen readers.

I want to be able to increase the font size arbitrarily in my TTY sessions, should Plasma ever break.

I need my thick white blocky rectangle, not a thin white line or underscore, in my terminal. (Talking about the cursor)

I need visual feedback when entering passwords. I need to see asterisks.

I want sane levels of security

I’m not saying I want keyloggers to be able to run rampant on my system. I absolutely am saying that I want to be able to write a keylogger for personal use.

I’m not writing a keylogger; I’m writing a hacky accessibility tool.

I want to have the ability to do evil things, even if I need to flick an “allow this app to do evil things” toggle.

I want to be able to know - and, if the desktop actually supports it, have control over - which part of which display a window in my app is positioned on. I’m not saying all compositors need to let me control where my game engine’s windows get placed, but I am saying I need to know where I was placed.

I’m writing an editor for my custom game engine. When I ask SDL3’s Wayland backend where the editor’s main window was placed, it shrugs. Wayland does not allow me to discover or control window positions, even though KWin is a floating window manager and definitely lets me place windows arbitrarily. …I can drag them around.

Screen readers also have this really nifty feature that’s absent on Wayland where they will visually indicate what they’re reading. Even if KWin has to render that rectangle, I’d really like it to exist at all. Screen readers need to know where to render that rectangle.

What I’m saying is: while there is a security benefit to not letting apps do things you consider to be evil, there are non-evil reasons to want to do those things. I’m also saying that I don’t see workspace geometry as sensitive information. Can I please just have access to it?

Sometimes I need my phone

I hate it, but I need it.

I’d like the ability to make phone calls from my PC, on Linux, either over Bluetooth or over an actual cellular modem. It’d be awesome, and make my life so much easier outside of doing open-source things.

I can only use your desktop if…

…from the moment I see a graphical UI, I can:

  • zoom in and out of the screen

  • have the ability to invert colors

  • have the ability to turn on a dark theme, or have it on by default

  • have access to the first two things by a standardized global shortcut