The Voice [A to Z a11y]

Title image for the V - The Voice post. It features an image of a microphone in front of 4 seated judges. Also shown is a Lego minifigure in front of a giant soundwave.

“You’re the voice, try and understand it”

If you read the above and started singing then I applaud you. If not, then please enjoy the below video as an education. Everyone should know how to respond with Whoa-oa-oa-oa to “You’re the Voice”.

In the K post of this series, which you can read here, I looked at Keyboard Navigation. That is not the only form of alternative navigation, though; eye-tracking, touch, mouth control, and voice are among the others.

In this AI-assistant world of Copilots, Agents, and ever-listening smart devices, Voice is becoming more prevalent. But what if voice is someone's primary means of controlling their device?

This post applies to many areas but focuses predominantly on Power Apps, Copilot Studio, Microsoft Windows, and Microsoft 365.


Oh, and here’s the video I promised:

OK, computer

There’s a great moment in Star Trek IV: The Voyage Home. Scotty looks at a computer and tries talking to it. When that fails, he picks up the mouse and tries talking into that. Their reality was a voice-controlled world with manual input as an option. We’re still a way off, but the idea that we can interact naturally with technology is steadily becoming real.

Voice control of computers has been around for a while but is still not “natural”. It requires very clear commands, often technical or long-winded, and relies on software that is sympathetic to such methods.

Likewise, dictation and audio transcription have come a long way over time. Tools such as Dragon Dictate go from strength to strength, and built-in dictation tools have also become more natural-feeling.

There is still a way to go. Punctuation, emphasis, formatting, and so on all require specific commands and can interrupt the flow of dictation. The tools can also produce sometimes comical transcription errors. Longer-term users get used to this and become amazingly accurate in their dictation, but the learning curve is full of pitfalls and frustrations. Hopefully AI and Copilots will make this area more reliable and intuitive.

Text to speech has also come a long way. Memories of Dr SBAITSO reading out the latest BOFH have been supplanted by audio narration that is getting ever closer to human. However, I am not going to go into that area too much as it crosses over more with Screen Readers.

I’m sorry, Dave. I’m afraid I can’t do that.

Windows 11 is the most accessible operating system. That’s not me being a fanboi, it’s just a fact. It was designed with accessibility properly baked in from the beginning.

It’s not perfect, but nothing is. It’s not complete but again nothing is, nor can it be. At the core are services and designs that are there to be built upon. These then enable more and more capabilities to be added.

One of my frustrations is that Voice Control in Windows had come a long way, only to be taken backwards when Copilot became a Progressive Web App rather than a native application. This means that Copilot cannot trigger Windows actions and merely tells you where to find them.

Such capabilities on other operating systems and devices are also hit and miss; it’s not just a Microsoft thing. So, if the tech giants can’t get it fully working, what can we do to incorporate voice capabilities?

Clarity and Consistency

“Say what you see” as the old game show Catchphrase would tell you. Voice commands only work when users can give clear instructions and it is obvious what is being asked.

For example, if a screen had two buttons labeled “Next”, how would you communicate which one you want to click?

In many ways this ties in with the “N – what’s in a Name?” post, which you can view here. The key difference is that here the name is what is actually visible to the user. Clear labeling and a consistent layout will enable users rather than frustrate them. Using section headers (with containers) means people can say “Select ‘Personal Details’ section” and then easily interact with the controls within it.
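As a rough illustration, here is a minimal sketch in the style of Power Apps source YAML (the control names, labels, and exact schema are illustrative, not a definitive export). Two “Next” buttons are disambiguated by giving each a distinct AccessibleLabel, and related controls sit inside a named container so a voice user can target the section first:

```yaml
# Hypothetical screen fragment - names and structure are illustrative.
# Each "Next" button gets a distinct AccessibleLabel so a voice user
# can say unambiguously which one they mean.
- conPersonalDetails:
    Control: GroupContainer
    Properties:
      AccessibleLabel: ="Personal Details section"
    Children:
      - btnNextPersonal:
          Control: Button
          Properties:
            Text: ="Next"
            AccessibleLabel: ="Next - Personal Details"
- conPaymentDetails:
    Control: GroupContainer
    Properties:
      AccessibleLabel: ="Payment Details section"
    Children:
      - btnNextPayment:
          Control: Button
          Properties:
            Text: ="Next"
            AccessibleLabel: ="Next - Payment Details"
```

Note that each accessible name still begins with the visible text “Next”, so what a user says matches what they see on screen.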

It’s not a major shift and actually isn’t a shift at all if proper UX/UI has been implemented.

And that is The Voice!

As always, thank you for reading this post. “The Voice” is part of the A to Z Accessibility: Power Platform Edition. To see all the other posts in the series you can head to the category page here. There is also the intro article to the series here which has a table of contents in it.

This series is © Mike Hartley. It’s taken a lot of work to compile all of these pages so please don’t just take the content. Using snippets is welcomed if there is attribution. Any more than just that then please contact me using any of the links on this site.