On the whole, Windows 10 is a good system, even if we jokingly call it “Bug 10” every day. It is undeniable that the development team has kept adding new features since the project began, and many of them are quite practical. For example, version 1709 added GPU performance monitoring to Task Manager, so users can see the current GPU utilization at a glance, which is much more convenient than opening a tool such as GPU-Z.
But in practice many users have found that this GPU monitoring does not seem to be very accurate: my graphics card is running compute work at full load, so why is the GPU utilization shown in Task Manager so low?
To find out, we tracked down the developer's blog post from when the feature was introduced. Because the content is graphics-related, it was published on the DirectX Developer Blog.
First, the developer explains how Task Manager obtains GPU utilization. On Windows 10, the GPU is abstracted by the Windows Display Driver Model (WDDM). Its core, the graphics kernel, is responsible for abstracting, managing, and sharing GPU resources among all processes. It contains a GPU scheduler (VidSch), which assigns the various engines of the GPU to the processes that want to use them and arbitrates and prioritizes access, and a video memory manager (VidMm), which manages all memory the GPU can use, including dedicated memory and shared system memory.
Task Manager gets its GPU figures from the data reported by VidSch and the video memory manager (VidMm). As a result, it can accurately collect GPU usage no matter which API a program uses (DX, OpenGL, OpenCL, or even proprietary APIs such as CUDA and Mantle). And because these two components are the ones that actually allocate GPU resources at the driver level, the data they return is more accurate than that of many third-party tools, which gives Task Manager high precision.
Given that precision, why does it still not report my GPU utilization correctly? This brings up another issue: GPU engines.
In addition to the unified compute units used mainly for graphics and general-purpose computing, modern GPUs integrate other circuits, such as dedicated blocks for video encoding and decoding. These units generally work in parallel, so the GPU can run graphics work and video encoding at the same time. At the driver level, these different blocks are abstracted as separate engines. For example, a typical GPU can have the following engines:
Different tasks run on different engines. When I play a game, for example, it uses the 3D engine.
Because some engines share hardware (the 3D engine and the CUDA engine, for example, both use the CUDA cores), simply adding up the per-engine utilization could produce a figure above 100%. The development team also considered using the average utilization, but that is not reliable either. Then what about showing only the 3D engine, since it is the most commonly used? That does not work well either: when the video engine is fully loaded and the 3D engine is idle, it would show 0% utilization, which is also inaccurate. In the end, the development team chose to report the utilization of the currently busiest engine as the overall GPU utilization.
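To make the trade-offs between these options concrete, here is a minimal Python sketch that applies each candidate rule to a set of per-engine readings. The engine names, the sample percentages, and the function names are all made up for illustration; this is not how Task Manager is actually implemented, only a toy model of the reasoning described in the blog post.

```python
# Toy model of the aggregation rules discussed above.
# All engine names and utilization values are hypothetical examples.

def sum_rule(engines):
    """Add up every engine; can exceed 100% when engines share hardware."""
    return sum(engines.values())

def average_rule(engines):
    """Average across engines; one saturated engine gets diluted."""
    return sum(engines.values()) / len(engines) if engines else 0.0

def single_engine_rule(engines, name="3D"):
    """Report only one fixed engine (here the 3D engine)."""
    return engines.get(name, 0.0)

def busiest_engine_rule(engines):
    """Report the busiest engine, the rule the team says it chose."""
    return max(engines.values(), default=0.0)

# Case 1: 3D and CUDA work sharing the same cores (hypothetical numbers).
shared_cores = {"3D": 70.0, "CUDA": 60.0, "Video": 0.0}
# Case 2: video engine fully loaded, everything else idle.
video_only = {"3D": 0.0, "CUDA": 0.0, "Video": 100.0}

if __name__ == "__main__":
    for label, sample in (("shared cores", shared_cores),
                          ("video only", video_only)):
        print(label,
              "sum:", sum_rule(sample),
              "avg:", round(average_rule(sample), 1),
              "3D only:", single_engine_rule(sample),
              "busiest:", busiest_engine_rule(sample))
```

With these made-up readings, the sum goes past 100% in the first case, while the average and the 3D-only rule both understate a fully loaded video engine in the second case. Only the busiest-engine rule gives a sensible figure in both.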
Well… the blog post explains it nicely. The feature has now been available for some time, so how does it actually behave? Look back at the chart at the top: while the GPU's CUDA engine is under load, the overall utilization shown on the left is still very low, which clearly does not match what the development team describes.
We tested a few other workloads. When encoding video with NVENC, you can see that the GPU utilization in the left pane goes back to full load.
And when running a typical 3D application, it also behaves normally.
Finally, we tried an OpenCL load, and this time Task Manager did reflect the utilization of the CUDA engine.
So the “Schrödinger's utilization” of the GPU in Task Manager may be the result of a bug in Windows 10: in most cases it reflects the utilization of the most heavily loaded engine, but in some cases it fails to show the current utilization of the busiest engine correctly.