Very(!) slow ramp down from high to low clock speeds leading to significantly increased power consumption

Old description: [s]A simple act of scrolling a web page in Mozilla Firefox 52.0.2 makes the power consumption of my GTX 1060 6GB go from 7W to 35W.

I’m not running any compositing manager - I’ve got a plain X11 desktop with zero effects.

Please fix this bug.[/s]

New description: new NVIDIA drivers may take up to 36 seconds to drop from high clocks to lower clocks even under a very light load (say, a web browser that isn’t running WebGL). This leads to decreased battery life and significantly increased power consumption.

Edit: in newer drivers the situation is even worse, so I’m quoting a message later in the thread.

Affected drivers: 381.xx series, 384.xx series.

Edit 2:

It looks like people find this thread and leave without a solution.

There’s a way to force the maximum power-saving mode at the expense of not being able to run the newest and shiniest games at full speed:

Section "Device"
        Identifier      "Videocard0"
        Driver          "nvidia"
        Option          "Coolbits" "28"
        Option          "metamodes" "nvidia-auto-select +0+0 {ForceCompositionPipeline=On, ForceFullCompositionPipeline=On}"
        Option          "RegistryDwords" "PowerMizerEnable=0x1; PerfLevelSrc=0x2222; PowerMizerLevel=0x3; PowerMizerDefault=0x3; PowerMizerDefaultAC=0x3"
EndSection
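
On a typical setup this snippet goes into /etc/X11/xorg.conf (or a file under /etc/X11/xorg.conf.d/ - the exact path depends on your distribution). After restarting X you can check whether the GPU is actually being held at the lowest performance level with something like:

nvidia-smi --query-gpu=pstate,clocks.gr,power.draw --format=csv
nvidia-settings -q GPUCurrentPerfLevel

Treat this as a rough sketch rather than a definitive recipe.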

Edit 3 2019-07-31:

In drivers 418.56 and newer the situation has improved, and the ramp down now takes around 15 seconds.

Unless you’re using the fbdev or vesa X driver, X11 rendering itself is accelerated. Are you seeing short spikes in power, or do you see sustained 35W power draw after scrolling the web page and then letting it sit idle for a while?

A temporary boost in GPU clocks and correspondingly higher power draw is expected.

Yes, I see spikes in power consumption whenever I’m scrolling a web page or I’m visiting a page where information changes based on a timer or push events. My concern is that the driver spends too much time in P0 mode (with increased power usage).

Out of curiosity I’ve just carried out a test to estimate how long the driver keeps power usage up - it turns out it’s roughly five seconds, which is a tad too much in my opinion. For instance, the CPU driver changes power levels in less than a tenth of a second. It’s understandable that the GPU driver might need to spend more time in high power modes to provide a smooth desktop experience, but five seconds is definitely overkill.
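
For reference, a rough way to reproduce this measurement is to poll the driver once a second and watch how long the performance state stays elevated after you stop scrolling - something along these lines:

while true; do
        date '+%T'
        nvidia-smi --query-gpu=pstate,clocks.gr,power.draw --format=csv,noheader
        sleep 1
done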

Here’s another confirmation of this issue. Whenever my desktop session is completely idle and I don’t do anything, my GPU temperature drops to roughly 34C. If I start using a web browser (for some reason Google Chrome, for example, makes the driver spend even more time in P0), my GPU temperature rises to ~53C and stays there.

That all means we are talking about a difference of 7W vs 35W for prolonged periods of time, which is definitely something you might want to take into consideration.

Also, I’ve been asking for this for years already, but I would like the NVIDIA control panel (nvidia-settings in Linux) to have a “Maximum Power Savings/Minimum Power Consumption” option for the PowerMizer Preferred Mode.
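
For context, the modes that do exist today can already be set from the command line, e.g.:

# 0 = Adaptive, 1 = Prefer Maximum Performance - there is simply no “prefer minimum power” value to pick
nvidia-settings -a "[gpu:0]/GPUPowerMizerMode=0"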

I guess there’s a hard-coded variable (in seconds) in the Linux driver which determines after how much time the GPU cools off, downclocks itself and changes its power mode. I would be forever grateful if you exposed it via a module parameter, in case you don’t want to change the default value (which seems to be 5).
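
Purely to illustrate what I’m asking for - the option name below is made up and does not exist in any shipped driver - it would be set like any other nvidia module option, e.g. in a file under /etc/modprobe.d/:

# hypothetical knob, NOT a real parameter - shown only to illustrate the request
options nvidia NVreg_PowerMizerRampDownDelaySeconds=1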

Bump.

Another bump.

Same here. I have found this to be related to my virtual desktop resolution: whenever I switch to a single 1080p display with my 1080 (and with my older 970), it allows for higher power-saving levels, but whenever I have two 1080p displays turned on (combined resolution 3120x1920 or 3860x1080 depending on tilt), it remains in P0.

Birdie, any chance you use a multi-monitor setup, or can you reproduce this by switching to a lower resolution?

tomas@tpnb ~ % nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 381.22                 Driver Version: 381.22                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:01:00.0      On |                  N/A |
| 30%   54C    P0    51W / 200W |    863MiB /  8105MiB |     34%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0       385    G   firefox                                         58MiB |
|    0     16884    G   mupdf                                           17MiB |
|    0     17142    G   ...el-token=                                    67MiB |
+-----------------------------------------------------------------------------+
tomas@tpnb ~ % xrandr -q
Screen 0: minimum 8 x 8, current 3120 x 1920, maximum 32767 x 32767
DVI-D-0 connected primary 1920x1080+1200+0 (normal left inverted right x axis y axis) 531mm x 298mm
   1920x1080     60.00 + 144.00*  119.98    99.93  
   1440x900     119.85  
   1280x1024    119.96    75.02    60.02  
   1024x768     119.99    75.03    60.00  
   800x600      119.97    75.00    60.32  
   640x480      120.01    75.00    59.94  
HDMI-0 connected 1200x1920+0+0 left (normal left inverted right x axis y axis) 546mm x 352mm
   1920x1200     59.95*+
   1920x1080     60.00  
   1680x1050     59.95  
   1600x1200     60.00  
   1440x900      59.89  
   1280x1024     60.02  
   1280x960      60.00  
   1024x768      60.00  
   800x600       60.32  
   640x480       59.94  
DP-0 disconnected (normal left inverted right x axis y axis)
DP-1 disconnected (normal left inverted right x axis y axis)
DP-2 disconnected (normal left inverted right x axis y axis)
DP-3 disconnected (normal left inverted right x axis y axis)
DP-4 disconnected (normal left inverted right x axis y axis)
DP-5 disconnected (normal left inverted right x axis y axis)

When turning off the monitor:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 381.22                 Driver Version: 381.22                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 0000:01:00.0      On |                  N/A |
| 56%   64C    P8    12W / 200W |    470MiB /  8105MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0       385    G   firefox                                         69MiB |
|    0     16884    G   mupdf                                            7MiB |
|    0     17142    G   ...el-token=                                    36MiB |
+-----------------------------------------------------------------------------+

I’ve got a single monitor ;-)

Is it 4k or something?

Back on my GTX970 I mitigated this problem by flashing a custom BIOS that lowered the TDP base clock to 500 MHz, but that is impossible on my 1080 since it requires signed BIOS images. So it’s pulling 50W extra 24/7 (I only turn off my computer to load new kernels every week or so).

1080p ;-)

I cannot flash a custom BIOS because Pascal BIOSes are signed and you cannot modify them any longer. Flashing a ROM from another GPU might be disastrous since different GPUs have different fan profiles/voltage curves/boost frequencies/etc.

That’s quite literally what I said in my comment (I have both GTX970 and GTX1080).

Darn, I was sleepy and skipped most of your comment ;-)

Bump.

I think I have the same problem.

First, a little history. I don’t use my PC all that much. The last time I played a game on it was about a year ago. I turn it on maybe a couple of times a month to read up on things and update the OS and look for any newer NVidia drivers. (OS of choice is Linux Mint 18.1 but I’ve been booting into Manjaro and Ubuntu as well.)

Last time I played Insurgency was about a year ago with friends. Everything worked fine. However, recently we all went to play Insurgency again, and after a map loads I noticed that my UPS fan kicks into overdrive because of the sudden increase in power load.

After some troubleshooting, I determined that the NVidia driver is the culprit.

When sitting at the desktop, my GTX 670 has a temp of about 30C and the load against my UPS is around 270 watts. As soon as the Insurgency map loads, GPU temp shoots to 50-53C and stays there and the draw on my UPS goes from 270W to about 470-500W. Quit the game (or just minimize it) and everything goes back to normal. I repeated this with Manjaro, Mint and Ubuntu. (No problem in Windows 7)

I switched to the Nouveau Opensource driver and I don’t have this GPU/power consumption problem. I have this problem with the 375.39 driver and the 375.66 version.

Looks like I might need to roll back to the 375.20 or 375.26 driver.

Nvidia devs, could you please address this runaway GPU/power consumption problem?

Do you have sync-to-vblank enabled? If you run a game, the driver will bump the clocks up to the maximums in order to provide the best performance, and then start dropping them back down if it finds that the higher clocks aren’t helping. If you start a game and the clocks and power draw stay high, then it’s because the game is using every bit of processing power it can get.
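
For reference, the current OpenGL sync-to-vblank setting can be queried from the command line, and the driver also honors an environment variable if you want to force it on for a single run (the game command below is just a placeholder):

nvidia-settings -q SyncToVBlank
__GL_SYNC_TO_VBLANK=1 <your-game-command>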

@birdie, I can bring up adding a module parameter to adjust the clock curves, but I’ll be honest: I think the answer from that team is going to be a hard no. Historically, the main thing that kind of knob has been good for is shooting yourself in the foot.

Nouveau likely doesn’t exhibit this behavior because dynamic reclocking is not implemented. You’ll notice a corresponding reduction in performance to go along with the lower power draw.

Well this is odd. So tonight I left the .66 driver on there and fired the game up again. GPU temps still climbed to 50-53C (with or without sync-to-vblank enabled) and even though my wattage use jumped back to 450-500… my UPS fan never kicked on. That fan kicking on is the only reason I noticed this. Maybe when I last played this game, there just wasn’t sufficient load on the UPS and it was able to handle the increased draw by the GPU without needing the fan!!!

Well, everything is fine now. For me at least. Thanks for taking the time to reply!

Oh, and for what it’s worth, please keep the love for Linux coming! I finally ditched Windows 7 for Linux and I’m never ever ever ever using Windows 10 (or any future Microsoft OS for that matter) and I need me some good GFX drivers…for that one or two times a year I game. :)

This sounds like a great idea.

Keeping the GPU at the highest performance level for 5 seconds straight when there’s no discernible load drives up one’s power bill and hurts one’s thermals.

I can understand why there might be resistance to adding a knob which can adversely affect performance if misused, but what changed in the behavior between the new and old drivers? I used to easily get 4 hours on my laptop when I wasn’t using any 3D-accelerated apps; now I’m struggling to get 3 hours using nothing but email, web, and a terminal (and I have hardware acceleration disabled in Firefox and Chrome). I get around 5 hours with the Nouveau drivers, for reference.

The nvidia-settings and [irq/132-nvidia] readings in powertop are consistently higher, and my GPU temperature is 5-10 degrees hotter on average than with the previous drivers. If there were settings in nvidia-settings that allowed for maximum battery/minimum power, I would gladly choose them for desktop work and create profiles for the few applications I use that need 3D accel.

To jump back in here… I was thinking about this at work today… why don’t I see my UPS drawing as much power (and my GPU temp climbing as high) in Windows?

Not complaining… just wondering why the driver under Linux seems to run my GPU hotter than when playing the same game in Windows.

Thanks

Most Linux games are Windows ports using D3D->OpenGL translation, which is far from 100% efficient, and that means lower FPS and higher power consumption. Also, make sure your Vsync settings in Windows and Linux match.

Any updates on this one, Aaron?