num_physpages and support for 3.11 and later kernels
The NVIDIA Linux driver tries to guarantee proper GPU behavior by validating that memory allocated to the driver by the kernel can always be addressed by the installed GPU. A failure to do so can cause the GPU to truncate addresses and access the wrong location in system memory. Traditionally, the NVIDIA Linux driver has accomplished this by checking the highest allocatable system memory address via the page frame number identified in the num_physpages variable. If this address surpasses the GPU's capabilities, the driver will fall back to the 32-bit DMA zone to ensure stability.
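
To make the check described above concrete, here is a minimal userspace sketch in C. It is not the driver's actual code: the function names, the 4 KiB page size, and the idea of describing a GPU by a number of physical address bits are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* What a device with only `gpu_addr_bits` address bits would actually
 * access if handed `addr`: the upper bits are silently dropped. */
static uint64_t sketch_truncate_addr(uint64_t addr, unsigned gpu_addr_bits)
{
    if (gpu_addr_bits >= 64)
        return addr;
    return addr & ((UINT64_C(1) << gpu_addr_bits) - 1);
}

/* Highest page frame number that lies entirely within the GPU's
 * addressable range (hypothetical helper). */
static uint64_t sketch_gpu_max_pfn(unsigned gpu_addr_bits)
{
    if (gpu_addr_bits >= 64)
        return UINT64_MAX >> SKETCH_PAGE_SHIFT;
    if (gpu_addr_bits < SKETCH_PAGE_SHIFT)
        return 0;
    return (UINT64_C(1) << (gpu_addr_bits - SKETCH_PAGE_SHIFT)) - 1;
}

/* True if the highest page frame in the system is beyond the GPU's
 * reach, in which case allocations would have to be restricted to the
 * 32-bit DMA zone. */
static bool sketch_needs_dma32(uint64_t system_max_pfn, unsigned gpu_addr_bits)
{
    return system_max_pfn > sketch_gpu_max_pfn(gpu_addr_bits);
}
```

For example, with 8 GiB of contiguous memory the highest page frame number is 2097151. A hypothetical GPU limited to 32 address bits would need the fallback, while one with 40 address bits would not. The truncation helper shows the failure mode being guarded against: address 0x100001000 seen through 32 address bits becomes 0x1000.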

The Linux 3.11 kernel removed access to the maximum page frame number via num_physpages. This was replaced with get_num_physpages(), which reports the total number of physical pages in the system. While the highest page frame number and the amount of memory in a system are related, they are distinct values and cannot be treated equivalently. Physical memory need not be contiguous: when the physical address space contains holes, the highest page frame number can be well above the page count. Knowing the amount of memory is therefore not sufficient to determine whether all memory will be addressable by the GPU.
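
The difference between the two values can be shown with a small userspace sketch in C. The memory layout below is hypothetical (4 GiB of RAM, with 1 GiB of it placed above the 4 GiB boundary), and the struct and function names are invented for illustration; nothing here is a kernel or driver API.

```c
#include <stddef.h>
#include <stdint.h>

/* One contiguous run of physical page frames; end_pfn is exclusive. */
struct sketch_mem_range {
    uint64_t start_pfn;
    uint64_t end_pfn;
};

/* Hypothetical layout with 4 KiB pages: RAM at [0, 3 GiB) and
 * [4 GiB, 5 GiB), with a hole in between. */
static const struct sketch_mem_range sketch_example_map[2] = {
    { 0,       786432  },  /* 0 .. 3 GiB     */
    { 1048576, 1310720 },  /* 4 GiB .. 5 GiB */
};

/* Total number of pages, which is the kind of value a page count reports. */
static uint64_t sketch_total_pages(const struct sketch_mem_range *r, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += r[i].end_pfn - r[i].start_pfn;
    return total;
}

/* Highest page frame number actually backed by memory. */
static uint64_t sketch_highest_pfn(const struct sketch_mem_range *r, size_t n)
{
    uint64_t highest = 0;
    for (size_t i = 0; i < n; i++)
        if (r[i].end_pfn > r[i].start_pfn && r[i].end_pfn - 1 > highest)
            highest = r[i].end_pfn - 1;
    return highest;
}
```

In this layout the page count is 1048576, which is exactly 4 GiB and on its own might suggest that everything sits below the 4 GiB boundary. The highest page frame number is 1310719, though, so the top gigabyte of memory is out of reach for a device limited to 32 address bits.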

It is possible to offer limited support for 3.11 and newer kernels by using get_num_physpages() and falling back to the 32-bit DMA zone slightly more aggressively, and this is the approach that will be taken in the NVIDIA Linux driver for now. This is a temporary measure, which should be sufficient for most users of the NVIDIA Linux driver; however, systems with very large amounts of memory (128 GiB or more) may be adversely affected. Users of such systems are advised to continue using pre-3.11 kernels. A more robust driver solution is still in the works, but will not be available in the immediate future.
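
As a rough illustration of what "falling back more aggressively" could look like, the C sketch below estimates an upper bound for the highest page frame number from the page count plus a safety allowance for holes. The allowance, the function names, and the 4 KiB page size are assumptions made for this example only; this is not the heuristic the driver actually uses.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_WA_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Pessimistic guess at the highest page frame number when only the
 * total page count is known. `hole_allowance_pages` is a hypothetical
 * margin for gaps in the physical address space. */
static uint64_t sketch_estimate_max_pfn(uint64_t total_pages,
                                        uint64_t hole_allowance_pages)
{
    if (total_pages == 0)
        return 0;
    return (total_pages - 1) + hole_allowance_pages;
}

/* True if the estimate exceeds what a GPU with `gpu_addr_bits`
 * physical address bits can reach, meaning the 32-bit DMA zone
 * would be used. */
static bool sketch_conservative_dma32(uint64_t total_pages,
                                      uint64_t hole_allowance_pages,
                                      unsigned gpu_addr_bits)
{
    uint64_t gpu_max_pfn;

    if (gpu_addr_bits >= 64)
        return false;
    if (gpu_addr_bits < SKETCH_WA_PAGE_SHIFT)
        return true;
    gpu_max_pfn = (UINT64_C(1) << (gpu_addr_bits - SKETCH_WA_PAGE_SHIFT)) - 1;
    return sketch_estimate_max_pfn(total_pages, hole_allowance_pages) > gpu_max_pfn;
}
```

With 3.5 GiB of memory (917504 pages) and a device limited to 32 address bits, an allowance of zero would not trigger the fallback, while an allowance of 1 GiB (262144 pages) would. The cost of being pessimistic is that some systems whose memory is in fact fully addressable end up in the 32-bit DMA zone anyway.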

Upcoming driver releases will include this workaround, and the driver documentation will be updated to include a more detailed explanation and enumeration of GPU addressing capabilities; in the meantime, official patches to add support for Linux 3.11 and later are attached to this message. These patches can be applied to the NVIDIA .run installer with the --apply-patch command line option. Please make sure to select the correct patch for your driver version.

Daniel Dadap
NVIDIA Linux Graphics

Posted 10/31/2013 03:01 AM   