In Windows 10 Version 2004, the development team added an option called “Hardware-accelerated GPU scheduling” (shown in the localized interface as “hardware-accelerated GPU plan”). It is tucked away under Display Settings > Graphics Settings and is offered as an experimental option. Microsoft had not said much about it, and all we had was hearsay that it reduces display latency or something along those lines.
At the end of last month, Microsoft finally gave a full explanation in a post on its official blog. This article draws on that post to cover how the new hardware-accelerated GPU scheduling option works and what it is for, and pulls together test data from several outlets to see how much difference it makes.
“Hardware-accelerated GPU plan”
The official localization renders “Hardware-accelerated GPU scheduling” as “hardware-accelerated GPU plan”. Translating the last word as “scheduling” would let more people understand what the option means; the official wording is a very Microsoft-flavored translation.
Back to the topic: before talking about GPU scheduling, we first need to understand what the WDDM GPU scheduler is.
WDDM GPU Scheduler and Command Buffer Queue
Starting with NT 6, Microsoft introduced a new display driver model to Windows, known as the Windows Display Driver Model, or WDDM. Before WDDM, applications could submit work directly to the GPU. The system had a single global queue, and work was executed strictly on a first-come, first-served basis. Since GPU workloads at the time were mostly full-screen games or professional rendering, this was not a problem, and the approach was used for many years.
Once applications more broadly began to use GPU acceleration, for example when Windows itself started using the GPU to render the entire UI, the single global queue became a problem. If another program's task was submitted ahead of the system's UI rendering work, the GPU would process that task first and only then return to the system's request, which made the whole system UI stutter. Prioritizing GPU work properly required a new task scheduler.
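The difference between the old global queue and a prioritizing scheduler can be sketched with a toy simulation. This is only an illustration, not how WDDM is implemented: the task names, priorities, and GPU times below are hypothetical, and all three tasks are assumed to be pending at the same moment.

```python
import heapq
from collections import deque

# Hypothetical workload: (name, priority, gpu_time_ms).
# A lower priority number means more urgent. All values are made up.
TASKS = [
    ("app_render_a", 5, 30),
    ("app_render_b", 5, 30),
    ("system_ui", 0, 5),
]


def fifo_start_times(tasks):
    """Single global queue: tasks run strictly in submission order."""
    clock = 0
    starts = {}
    queue = deque(tasks)
    while queue:
        name, _priority, cost = queue.popleft()
        starts[name] = clock
        clock += cost
    return starts


def priority_start_times(tasks):
    """Prioritizing scheduler: most urgent pending task first,
    ties broken by submission order."""
    heap = [
        (priority, order, name, cost)
        for order, (name, priority, cost) in enumerate(tasks)
    ]
    heapq.heapify(heap)
    clock = 0
    starts = {}
    while heap:
        _priority, _order, name, cost = heapq.heappop(heap)
        starts[name] = clock
        clock += cost
    return starts


fifo = fifo_start_times(TASKS)
prio = priority_start_times(TASKS)
print("FIFO: system_ui waits", fifo["system_ui"], "ms")
print("Priority: system_ui waits", prio["system_ui"], "ms")
```

In the first-come, first-served case the UI work sits behind both application tasks, while the prioritizing version runs it immediately and pushes the application tasks back by only the few milliseconds the UI work takes.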
WDDM therefore introduced a task scheduler that runs on the CPU as a high-priority thread, coordinating, prioritizing, and scheduling the work submitted by various applications. From WDDM 1.0 in Vista to WDDM 2.7 in Windows 10 Version 2004, Microsoft has kept enhancing this scheduler. The approach has limitations, though, mainly the extra overhead of each submission and the latency before a task reaches the GPU. In practice these limitations are masked by the render buffer queue that traditional graphics applications use.
The buffer holds things such as rendering commands prepared in advance. While the GPU renders the current frame, the CPU is already preparing the next frame, or even several frames after that. This keeps the CPU and GPU working in parallel and lowers overall overhead, and it is now the common way of driving the GPU. To cut the overhead of submitting rendering commands further, a typical application lowers its submission frequency by preparing several frames and sending them to the queue together. The problem is that the more frames are buffered, the higher the latency the user perceives.
Shrinking the buffer queue to cut latency, however, raises submission overhead and hurts performance. Applications face a trade-off: submit fewer frames at a time at a higher frequency to reduce latency, or submit more frames at a time at a lower frequency to reduce scheduling and submission overhead. So Microsoft decided to change the foundations of its display driver model and introduce hardware-accelerated GPU scheduling.
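The trade-off can be made concrete with a rough back-of-the-envelope model. It is a deliberately simplified sketch: it assumes a queued frame waits for every frame ahead of it in the batch, and that each submission has a fixed CPU cost shared by the frames in the batch. The function name and the default figures (a 16.7 ms frame time, a 0.5 ms submission cost) are hypothetical and not measurements from the article.

```python
def buffering_tradeoff(frames_per_submit, frame_time_ms=16.7, submit_cost_ms=0.5):
    """Return (added_latency_ms, overhead_per_frame_ms) for a batch size.

    Simplified model with made-up defaults:
    - added latency grows linearly with the number of frames queued at once
    - the fixed cost of one submission is amortized over the frames in it
    """
    added_latency_ms = frames_per_submit * frame_time_ms
    overhead_per_frame_ms = submit_cost_ms / frames_per_submit
    return added_latency_ms, overhead_per_frame_ms


for depth in (1, 2, 3, 4):
    latency, overhead = buffering_tradeoff(depth)
    print(f"{depth} frame(s) per submit: "
          f"~{latency:.1f} ms queued latency, "
          f"~{overhead:.3f} ms submit overhead per frame")
```

Under these assumptions, deeper batches cut the per-frame submission cost but add a full frame time of latency for every extra buffered frame, which is the tension described above.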
Handing task scheduling to dedicated hardware
The new option in Windows 10 Version 2004 lets the system hand the vast majority of scheduling work to a dedicated hardware scheduler on the GPU. Windows still controls the priority of applications' GPU work, but high-frequency tasks are managed by the GPU's scheduling processor, which handles quantum management and context switching for the various GPU engines.
In NVIDIA's words, the new option lets the GPU manage its own video memory directly, something that was previously managed by the system.
There are two prerequisites for enabling the new scheduling method. The first is hardware support: the GPU itself must have a hardware module that handles scheduling. The second is driver support: the system needs a display driver that meets the WDDM 2.7 standard. The option appears in system settings only when both the driver and the hardware support it. In addition, the new scheduling method is a major and fundamental change to the driver model and may have unpredictable effects in certain scenarios, so Microsoft ships it as an experimental option that is turned off by default. The development team is still comparing the performance of the two schedulers and monitoring the reliability of the new one, and the option may become on by default on supported hardware in the future.
The GPUs currently known to support this feature are NVIDIA's Pascal and Turing GPUs and AMD's RDNA GPUs. Intel's status is unknown.
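For readers who want to check the current state from a script rather than the settings page, the following is a minimal sketch. It relies on an assumption that is not in Microsoft's explanation: that the toggle is stored as the `HwSchMode` DWORD under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`, with 2 meaning on and 1 meaning off, as commonly reported. Treat the key path and the values as unverified. The helper names are made up for this example.

```python
import sys

# Assumed registry location and value name (commonly reported, not taken
# from the article or confirmed by official documentation).
KEY_PATH = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
VALUE_NAME = "HwSchMode"


def describe_hwsch_mode(value):
    """Map the assumed HwSchMode value to a human-readable state."""
    if value is None:
        return "not set (option unavailable or never changed)"
    if value == 2:
        return "enabled"
    if value == 1:
        return "disabled"
    return f"unrecognized value {value}"


def read_hwsch_mode():
    """Read HwSchMode on Windows; return None elsewhere or if it is missing."""
    if sys.platform != "win32":
        return None
    import winreg  # Windows-only standard library module

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except OSError:
        # Key or value not present, or access denied.
        return None


print("Hardware-accelerated GPU scheduling:",
      describe_hwsch_mode(read_hwsch_mode()))
```

The script only reads the value. Changing the setting is best done through the Graphics Settings page, since the option is meant to appear only when both the hardware and the driver support it.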
Actual testing: Not much impact on high-end platforms
With that background, let's look at how the feature performs in practice, using test data from Tom's Hardware and Wccftech (the charts below are from Tom's Hardware and Wccftech).
Tom's Hardware used three test platforms: Core i9-9900K + RTX 2080 Ti, Ryzen 9 3900X + RTX 2080 Ti, and Core i9-9900K + GTX 1050. Judging from the results across five games, there is essentially no difference a user would notice.
Wccftech chose two platforms, a Core i9-9900K paired with either an RTX 2080 Ti or a GTX 1650 SUPER. Enabling hardware scheduling made no noticeable difference on the RTX 2080 Ti, but there was a significant improvement on a mainstream graphics card such as the GTX 1650 SUPER. If NVIDIA's explanation holds, the reason is that letting the GPU manage its video memory directly makes video memory use more efficient. Perhaps this feature will bring a meaningful amount of free performance to many mainstream platforms, with minimal impact on high-end ones.
Summary: Good technology that still needs time to improve
In short, hardware-accelerated GPU scheduling is a new technology with a major impact on the Windows graphics architecture. It requires new hardware and new drivers, and it can bring some performance improvement to a platform. For now it is still in a testing state: GPU vendors' support is at a just-usable stage and needs further optimization and refinement. It is also a change Microsoft is making to the system for the next generation of graphics applications, aiming to minimize latency and keep the system in step with the times. It's a good technology, but there's still a long way to go.