Microsoft Desperately Wants Users To Talk to Their Windows PCs

The next version of Windows will be stuffed to the gills with AI. You may be asking, “Even more than before?” Yes, and Microsoft hopes to train you to quit using your keyboard and mouse to handle your PC. The company hopes you’ll use your voice to command your PC, like you’re some domineering captain on a ship and your crew is a hapless chatbot who’s desperate to understand your vague whims.
Starting Thursday, Microsoft is pushing more “experimental” features and future apps that will put the company’s Copilot AI directly in front of your Windows experience. Microsoft already ensured that all new PCs would have a Copilot key that opens its chatbot. Now, once you enable it in the Copilot app settings, you can start talking to your computer by screaming “Hey, Copilot” at your screen.
Copilot Vision looks at your screen and tells you what to do
If anybody still misses Cortana, now’s your time to shed a single tear. Microsoft already had its Copilot Vision function available on Edge browsers, but now it’s stretching its legs within the wider Windows software suite. Unlike past voice assistants, the new version of Copilot will have AI image recognition, and it should be able to comprehend what’s happening on your screen. This should mean you may have to issue less detailed prompts to the AI to get it to do what you want. And what does Microsoft expect you to use AI for? Well, it could replace all those how-to articles you see online. If you tell Copilot, “Show me how to get better quality audio in Spotify,” Microsoft said it will highlight the setting you need to hit on your screen.
Microsoft sat me down for a demo with the new Copilot Vision feature, though I’ve yet to try it using my own voice. The voice dialogue was surprisingly fast in responding to queries about a math problem or about buying a dress online. However, when the user tried to get the AI to point out the right controls for changing image resolution on their Shopify account, the AI circled the wrong part of the page. It’s the curse of all live demos that something will likely go awry, but we can expect some idiosyncrasies as Microsoft tries to get us talking to our Windows machines.

Microsoft said this AI vision system can look at an image on your screen and offer descriptions of what it sees. Apparently users would use this to type out a resume based on their own portfolio. In another example, Microsoft showed Copilot humming a mindless tune for a musician to riff off (no, the tune did not sound especially appealing). Copilot can now look at all your browser tabs at once and find products based on what you’re looking at. Google has also promoted AI shopping, though with more virtual “try on” features that create an AI image of yourself to imagine your body in that dress.
You’re going to start seeing new Copilot commercials real soon. These are designed to “teach” you about the fun and pleasure of using Copilot with your voice. But that’s not all. There are more full-blown features supposed to get you using AI. Windows Insiders will be able to access beta features that will put a Copilot function crowding out the other functions on your taskbar, replacing the regular Windows search bar (you can still use it to search for files or settings, just as before).
A Copilot Actions app literally takes over your PC
Microsoft said users already talk to their PCs, though usually for the sake of dictation or notetaking. Plus, speech recognition is already a standardized feature for accessibility purposes. Still, there’s a wide gulf between those use cases and literally talking to your PC without annoying your deskmates trying to work just a few feet away. The ability to type to Copilot Vision instead of speaking won’t be widely available out of the gate; Microsoft is limiting it to Windows Insiders beta testers to start.
Beta testers will also be first to try out the newfangled Copilot Actions app. Microsoft described the application as an “AI agent” that can take actions for you across different apps and files. In AI circles, an “agent” is essentially multiple AI models working together to complete a more complicated task. On Windows, this means it could essentially take over your PC, run programs for you, and fulfill your demands. Anthropic’s Claude AI showed off similar PC-takeover capabilities last year.
Over the last few years, Microsoft has tried promoting hardware-specific AI capabilities, first with “AI PCs” in 2023 and then “Copilot+ PCs” in 2024. Now, according to Microsoft, “every Windows 11 PC” is an AI PC once it’s connected up with Microsoft’s cloud-based AI. Microsoft itself admitted the AI PC “hasn’t really come alive yet.” You could lay some of the blame for that at the Windows maker’s feet. Last year, it tried to push Recall, a feature that would screenshot everything you do on a PC. An AI would scrape those screenshots and let users search through their past activity to find old web pages or documents they were working on.
Security researchers found the feature could screenshot sensitive data like bank info, and anybody with access to the PC could find it. Microsoft pulled Recall and didn’t release it for an entire year. Even after a big push for security, the feature still isn’t foolproof, and several companies have blocked it over fears it could lead to sensitive data being shared. The Copilot Actions app and all the other features are “opt in.” The Actions app is turned off by default, and you need to enable it through settings. Microsoft promised you can take control “at any time,” and the program may ask for permission for specific actions.
No matter what, since these features are cloud-based, your data will need to be processed on a foreign server, not on your device. Microsoft promises it isn’t storing or abusing your prompts or whatever appears on your screen. After the Recall snafu, it’s increasingly difficult to trust the Windows maker. Now that Microsoft wants you to put privacy concerns aside for the sake of its ever-present AI, it’s going to have to offer a more compelling use case than an AI that hums on command.
gizmodo