Titan X GPU throttling due to capped fan speed?

Hello Community,

I don’t know if this topic fits here - if not, feel free to move my topic.

I am using a Titan X (Pascal) to speed up various CUDA applications in AI / machine learning. I mainly use PyCUDA and TensorFlow (Linux exclusively).

Recently, I found out that all of my applications show performance drops after running for a while. To me it looks like a cooling issue, because under load the GPU reaches 83°C quite fast and stays at this temperature (with the GPU clock being throttled to prevent further heating, I guess).

However, I noticed that the GPU’s fan speed is capped at about 50%. Is this done intentionally?

If not, how can I (preferably permanently) change the fan’s preset (to go for 100% under load)?

Is there any reason why CUDA accelerated applications should not run at maximum speed (with 100% fan speed)?

best wishes

Daniel

Power users regularly complain about the fan-speed cap. I have two hypotheses why it exists:

(1) Noisy fans quickly lead to many complaints from users. NVIDIA has historical experience with that phenomenon, so fan control curves for consumer parts are tuned for silent operation.

(2) The fans used may not actually be rated for 24/7/365 usage at 100% fan speed for the projected useful lifetime of a GPU (say, four years). It would be a pity if a failing $5 fan rendered a $1000 GPU unusable.

As far as I am aware there are third-party apps that override the NVIDIA-blessed fan control curves. I don’t use them and wouldn’t know where to find them, but Google could be your friend. You could also research after-market water cooling solutions.
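
On Linux, one commonly described route is the driver's own nvidia-settings tool rather than a third-party app. The following is a minimal sketch, not a tested recipe: it assumes a running X server, that the Coolbits option in the X configuration has the fan-control bit (value 4) enabled, and that the card exposes the GPUFanControlState and GPUTargetFanSpeed attributes. The helper names are made up for illustration.

```python
import subprocess


def fan_override_command(percent, gpu=0, fan=0):
    """Build the nvidia-settings call that pins one fan to a fixed percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("fan speed must be between 0 and 100 percent")
    return [
        "nvidia-settings",
        # 1 switches the GPU from the automatic fan curve to manual control.
        "-a", f"[gpu:{gpu}]/GPUFanControlState=1",
        # Target speed in percent for the selected fan.
        "-a", f"[fan:{fan}]/GPUTargetFanSpeed={int(percent)}",
    ]


def set_fan(percent, gpu=0, fan=0):
    """Apply the override; assumes Coolbits is enabled (see the lead-in)."""
    subprocess.run(fan_override_command(percent, gpu, fan), check=True)
```

Calling set_fan(100) would request full speed. The setting is typically lost when the X server restarts, so a permanent change would mean reapplying it at login, and hypothesis (2) above still applies to a fan held at 100%.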

Note that fan speed is just one part of the cooling equation. Ambient temperature and air flow to the GPU are other parts, and you may be able to adjust these, especially the latter. Some systems have excessive venting of hot air into the case, which increases the ambient temperature seen by the GPU at air intake. Also, other components, including other GPUs and cabling, frequently obstruct proper air flow, reducing the effectiveness of the fan at a given speed.

Unsurprisingly, whether the machine as a whole is in a temperature-controlled environment will also have an impact on device cooling. For cost reasons, I let the temperature in my home office fluctuate between 60 deg F and 90 deg F (roughly 16°C to 32°C) depending on the time of year, and have noticed a definite effect on the CPU cooling and throttling in my workstation.
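
Before changing anything, it may be worth confirming that the clock drop really coincides with the temperature plateau. Here is a minimal sketch, assuming nvidia-smi is on the PATH and the driver reports the temperature.gpu, fan.speed, clocks.sm, and clocks.max.sm query fields for this card. The function names are made up for illustration.

```python
import subprocess
import time

# Query fields passed to `nvidia-smi --query-gpu`
# (`nvidia-smi --help-query-gpu` lists what a given driver supports).
FIELDS = ["temperature.gpu", "fan.speed", "clocks.sm", "clocks.max.sm"]


def parse_sample(line):
    """Turn one CSV line from nvidia-smi into a dict; unreported values become None."""
    values = [v.strip() for v in line.split(",")]
    sample = {}
    for name, value in zip(FIELDS, values):
        try:
            sample[name] = float(value)
        except ValueError:  # e.g. "[N/A]" when the card does not report a field
            sample[name] = None
    return sample


def read_gpu(index=0):
    """Query one GPU once (temperature in deg C, fan in %, clocks in MHz)."""
    out = subprocess.run(
        ["nvidia-smi", "-i", str(index),
         "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_sample(out.strip().splitlines()[0])


def log_gpu(samples=60, interval_s=5.0, index=0):
    """Print one timestamped reading every interval_s seconds."""
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), read_gpu(index))
        time.sleep(interval_s)
```

Running log_gpu() in a second terminal while a job is active shows whether clocks.sm falls once temperature.gpu settles near 83°C, and what fan.speed does at that point. Repeating the run with the case open or at a different room temperature gives a rough idea of how much air flow and ambient temperature contribute.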