Luke Larsen / Digital Trends
This story is part of our coverage of Computex, the world’s biggest computing conference.
When Microsoft announced Copilot+ PCs a few weeks back, one question loomed large: Why can't I just run these AI applications on my GPU? At Computex 2024, Nvidia finally provided an answer.

Nvidia and Microsoft are collaborating on an application programming interface (API) that will let developers run their AI-accelerated apps on RTX graphics cards, including the various small language models (SLMs) that power features like Recall and Live Captions. With this toolkit, apps can run locally on your GPU instead of relying on the NPU. That opens the door to more powerful AI applications, since GPUs generally offer far more AI horsepower than NPUs, and it lets those apps run on PCs that don't currently fall under the Copilot+ umbrella.

It's a smart move. Copilot+ PCs currently require a neural processing unit (NPU) capable of at least 40 tera operations per second (TOPS), and right now only the Snapdragon X Elite meets that bar. GPUs, by contrast, deliver much higher AI throughput: even low-end models reach around 100 TOPS, and higher-end options scale far beyond that.

Beyond running on the GPU, the new API also adds retrieval-augmented generation (RAG) capabilities to the Copilot runtime. RAG gives an AI model access to specific local information, letting it offer more useful answers. We saw the full power of RAG with Nvidia's Chat with RTX earlier this year.
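To make the RAG idea concrete, here is a minimal sketch of the pattern: before a query reaches the language model, the most relevant local document is retrieved and prepended as context. Everything here is illustrative — the toy bag-of-words embedding, the document snippets, and the function names are all assumptions, not Nvidia's or Microsoft's actual API.

```python
# Minimal RAG sketch: retrieve the most relevant local document,
# then prepend it to the prompt before it reaches the model.
# The "embedding" is a toy bag-of-words counter; a real pipeline
# would use a learned embedding model and an actual LLM call.
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words term counts."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str]) -> str:
    """Return the local document most similar to the query."""
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

def build_prompt(query: str, docs: list[str]) -> str:
    """Augment the prompt with retrieved local context; the model
    itself would then run locally (e.g., on the GPU)."""
    return f"Context: {retrieve(query, docs)}\n\nQuestion: {query}"

# Hypothetical local knowledge base for illustration.
docs = [
    "The RTX 4090 supports well over 1,000 TOPS of AI compute.",
    "Recall indexes snapshots of what you see on your PC.",
]
print(build_prompt("How many TOPS does the RTX 4090 support?", docs))
```

The key design point is that retrieval happens entirely over local data, which is what lets a RAG-enabled assistant answer questions about your own files without sending them anywhere.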
Aside from the API, Nvidia announced the RTX AI Toolkit at Computex, a developer suite arriving in June that bundles tools and SDKs for fine-tuning AI models for specific applications. Nvidia claims that models built with the RTX AI Toolkit can be four times faster and three times smaller than those built with open-source alternatives. We're seeing a wave of tools that let developers build targeted AI applications for end users. Some of that is already evident in Copilot+ PCs, but I expect to see far more AI applications by this time next year. After all, we have the hardware to run these apps; now we just need the software.