Arch Linux, driver 367.27: GT 650M overheats (stays at 950 MHz) on battery

Hello,

Nvidia GT 650M 2GB DDR3 on bumblebee

Since the last nvidia driver update (from nvidia-364.19 to 367.27), I have observed a problem related to the nvidia clock frequencies.

On the previous driver, the card would never surpass 835 MHz when nvidia-settings reports “Power Source: battery”. This decreases my laptop’s temperature a LOT while gaming, and I used it as a workaround to decrease temps (from 90C max to ~82C).

After the recent nvidia driver update, my card reaches 950 MHz whether on battery or not, which makes my laptop overheat. I want to keep my card at 835 MHz max frequency.

What happens with driver 364.19 (expected behaviour):

  • fresh boot to KDE on AC power, nvidia card is OFF (verified through cat /proc/acpi/bbswitch)
  • unplug computer
    or alternatively:
  • fresh boot on battery, nvidia card OFF

then:

  • open nvidia-settings. nvidia-settings reports “Power Source: battery”
  • nvidia card doesn’t surpass 835 MHz while gaming

What happens with 367.27:

  • fresh boot to KDE on AC power, nvidia card is OFF (verified through cat /proc/acpi/bbswitch)
  • unplug computer
    or alternatively:
  • fresh boot on battery, nvidia card OFF

then:

  • open nvidia-settings. nvidia-settings reports “Power Source: battery”
  • nvidia card overclocks to 950 MHz while gaming.
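
For anyone repeating these steps from a script, here is a minimal Python sketch of the "card is OFF" check. It assumes /proc/acpi/bbswitch holds a single line such as `0000:01:00.0 OFF` (PCI address followed by ON or OFF); the function names are illustrative and not part of bbswitch.

```python
def bbswitch_is_on(status_text):
    """Parse the contents of /proc/acpi/bbswitch, e.g. '0000:01:00.0 OFF'.

    Assumption: the last whitespace-separated field is ON or OFF.
    """
    fields = status_text.split()
    if len(fields) < 2 or fields[-1] not in ("ON", "OFF"):
        raise ValueError(f"unexpected bbswitch status: {status_text!r}")
    return fields[-1] == "ON"


def read_bbswitch(path="/proc/acpi/bbswitch"):
    """Read the status file and report whether the discrete card is powered."""
    with open(path) as status_file:
        return bbswitch_is_on(status_file.read())
```

Calling `read_bbswitch()` right after boot should return False if the card is off, matching the first step of both lists.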

Powermizer modes:

0: 135-405 MHz, 810 MHz
1: 135-950 MHz, 1800 MHz

Please let me know if you need any more info. I hope this issue is resolved ASAP.

Thanks for your help in advance.

Here is nvidia-bug-report.log.gz (hosted on Dropbox).

EDIT I should also note that nvidia-340 has the expected behaviour of 835 MHz
EDIT2 clarified reproduction steps
EDIT3 added “sudo dmidecode” output

SAMSUNG NP550P5C-S02GR
Intel Core i5-3210M+Intel HD 4000
Geforce GT 650M 2GB DDR3
8GB ram
1TB 5400rpm disk
nvidia-bug-report.log.gz (172 KB)
dmidecode_output.txt (23.4 KB)
nvidia-bug-report.log.gz (70.8 KB)
nvidia-bug-report.log.gz (195 KB)
nvidia_364_output.txt (3.86 KB)
nvidia_375_output.txt (2.56 KB)
nvidia_settings_output.txt (25.1 KB)

>>nvidia card overclocks at 950Mhz while gaming.

What game are you playing? Does this issue occur as soon as you launch the game, or at some point while playing? Please attach the nvidia bug report to the existing post.

Hello, sorry for the late reply.

It happens immediately with all my games, as soon as GPU usage surpasses ~40-60%. So far I have seen it with PAYDAY2, Shadow of Mordor, Team Fortress 2, Euro Truck Simulator 2, War Thunder, Chivalry Medieval Warfare, World Of Tanks/wine and Blender/CUDA rendering.

It doesn’t happen with glxgears/beyond gravity.

Another issue I forgot to mention is that nvidia-settings only reports 1997 out of 2048 MB VRAM on that driver. Will test with nvidia 367.35 and report back with a new nvidia-bug-report.
I’ll attach the old bug report to the original post.

Does this issue occur without bumblebee?

It has nothing to do with bumblebee, since these problems only started after the nvidia driver update from 364 to 367. 364 still works perfectly after I downgraded to it.
Only the driver changed.

However, I will try to test with PRIME as well just in case, though that has been problematic on my laptop in the past on Ubuntu (couldn’t get to the login screen with any driver after 331) and is hard to set up on Arch Linux. Will report back with the results and a bug report for driver 367.35 ASAP (most probably within 24 hours of this reply).

Please share the output of the dmidecode command. What is the make and model of the system you are running?

OK, finally on to this. I tried nvidia 367.35, but it didn’t work at all. From what I found by googling, it is probably an Arch Linux-specific packaging issue.

Hello again,

Might possibly be related to this: https://devtalk.nvidia.com/default/topic/949433/massive-fps-drop-down-on-gtx660-with-367-drivers/#4954490

The symptoms are different for me though, except for the incorrect memory reporting. All of my symptoms started with 367.27. Will test with 370.23.

Update: still having the same issues with 370.23

Updated the nvidia bug report with output from version 370.23.

Tracking this issue under bug 200233175

Bump. Sorry for the delay. nvidia 370.28 is still exhibiting the same problems. Updated bug report.

Thanks oanonymos0 for this info, but we don’t have the exact notebook to reproduce this issue. We tried to reproduce it on a notebook with a similar GPU: Lenovo ideapad with Geforce GT 650M + Ubuntu 16.04 + steam + driver 364.19 + PAYDAY2.

Sorry, I should have clarified this first. 364.19 is NOT one of the affected drivers, so you can’t repro on it. It is the last driver that works correctly and I have 0 issues with it. Faulty drivers are 367.27, 370.23 and 370.28.

Bump. I discovered that the Lenovo Ideapad you are testing with does not have my version of the card; it has the 2GB GDDR5 version, which has lower clocks.

I think the Lenovo laptop may be fine for reproducing the wrong memory reporting issue. As I mentioned before, dinosaur_ here reported a similar issue.

The other, more pressing (for me) issue of overclocking on battery could also be reproducible on GT 650M 1GB versions, I suppose. However, I did mention my available powermizer modes, and they must match for a correct repro.

You shouldn’t need my exact notebook model.

From my search, the Asus N56VZ, Acer Aspire V3-771G and Alienware M14x R2 also have the same GT 650M as mine.

Ideally you should try on a GT 650M 2GB DDR3 card.

I can make a video demonstration of the issue, if necessary (including the desired behaviour, i.e. what used to happen on previous driver versions, and what happens with the affected drivers).

https://devtalk.nvidia.com/default/topic/949433/linux/massive-fps-drop-down-on-gtx660-with-367-drivers/
This is a different issue, related to a game FPS drop.

We have tested with an Asus G75VW notebook, which supports a max graphics clock of 950 MHz.

Config: Ubuntu 16.04 + driver 367.44 + Steam + PAYDAY2 + GPU GTX 660M

We prepared the setup and installed the OS + Steam + PAYDAY2 + drivers 364.19 and 367.44.

Observations are below. We tested in AC and battery mode, played PAYDAY2 and noted the powermizer value:

Driver 367.44
AC mode → Crosses 950 MHz while playing the game
Battery mode → Crosses 950 MHz while playing the game

Driver 364.19
AC mode → Crosses 950 MHz while playing the game
Battery mode → Stays at 405 MHz while playing the game

Given what you stated about the battery mode test on drivers 367.27 and 364.19, I want to confirm whether the above results are the expected behavior.

I also have a question about “I used this workaround in order to decrease temps. (from 90C max to ~82C)”.

Did you customize any powermizer setting? If yes, we need to know what you customized.

  1. You (almost) reproduced the issue. The only difference with my setup is:

Driver 364.19
Battery mode → stays at 835 MHz while playing the game

To my understanding, battery mode just disables turbo boost (correct me if I’m wrong here) and produces the desired result on <= 364.19.

As for why it is 405 MHz in your case, did you measure with PAYDAY2 fullscreen and minimized? AFAIK PAYDAY2 destroys the GL context upon minimize, which makes the card reduce clocks (this seems to happen faster in battery mode on <= 364.19). Alt+tab in windowed mode may not cause that.

Powermizer settings are all default (adaptive).

  2. Workaround: with drivers <= 364.19 I turn on the GPU on battery, through bbswitch, by opening nvidia-settings. Then, thanks to the lower max clock, my laptop doesn’t overheat while running intensive apps. The card keeps its ‘battery’ state even if plugged into AC afterwards, so I can keep lower temps on AC as well.

In short, if you can make the GTX 660M and similar cards not surpass 835 MHz on battery, then you should have solved the issue. I will keep you informed as you release new drivers.
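
The pass/fail condition above (core clock must not exceed 835 MHz while the power source is battery) can be sketched in Python. This is only a sketch under assumptions: that `nvidia-settings -t -q GPUCurrentClockFreqs` prints `core,memory` in MHz, that `-q GPUPowerSource` prints 0 for AC and 1 for battery, and that under bumblebee the query has to be sent to display `:8`. All three should be verified on the driver under test. The function names and the 835 default are illustrative.

```python
import subprocess

BATTERY = 1  # assumed GPUPowerSource value on battery (0 would be AC)


def query_attribute(name, display=":8"):
    """Ask nvidia-settings for one attribute in terse mode.

    Assumption: the NVIDIA X server is reachable on `display`.
    """
    result = subprocess.run(
        ["nvidia-settings", "-c", display, "-t", "-q", name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def parse_clock_freqs(terse_output):
    """Split a 'core,memory' string such as '950,900' into two ints (MHz)."""
    core, memory = terse_output.strip().split(",")
    return int(core), int(memory)


def exceeds_battery_cap(power_source, core_mhz, cap_mhz=835):
    """True when the GPU is on battery and the core clock is above the cap."""
    return power_source == BATTERY and core_mhz > cap_mhz
```

While a game is running, something like `core, _ = parse_clock_freqs(query_attribute("GPUCurrentClockFreqs"))` followed by `exceeds_battery_cap(int(query_attribute("GPUPowerSource")), core)` would flag the behaviour seen on 367.27 and later.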

However, the problem might not be specific to my card; if I am correct regarding turbo boost, it should be reproducible on more cards (my own speculation).

Please run below command in a terminal when reproducing this issue and send the log file.

nvidia-smi -i 0 --query-gpu=timestamp,pci.bus_id,utilization.gpu,utilization.memory,clocks.sm,clocks.mem,temperature.gpu,power.draw,pstate --format=csv -l 1 -f <file_name>

Problem: nvidia-smi doesn’t seem to be able to detect much. Here is a sample output, while idle:

timestamp, pci.bus_id, utilization.gpu [%], utilization.memory [%], clocks.current.sm [MHz], clocks.current.memory [MHz], temperature.gpu, power.draw [W], pstate
2016/10/21 00:02:29.745, 0000:01:00.0, [Not Supported], [Not Supported], [Not Supported], [Not Supported], 43, [Not Supported], P8
2016/10/21 00:02:30.746, 0000:01:00.0, [Not Supported], [Not Supported], [Not Supported], [Not Supported], 43, [Not Supported], P8
2016/10/21 00:02:31.747, 0000:01:00.0, [Not Supported], [Not Supported], [Not Supported], [Not Supported], 43, [Not Supported], P8
2016/10/21 00:02:32.747, 0000:01:00.0, [Not Supported], [Not Supported], [Not Supported], [Not Supported], 43, [Not Supported], P8
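
Since every clock column comes back as [Not Supported], a log like this cannot show the 835 vs 950 MHz difference. A small Python sketch for summarizing such a log follows. It assumes the header names shown above and that supported clock fields look like `950 MHz`; `summarize_log` is an illustrative name, not an nvidia-smi feature.

```python
import csv
import io

NOT_SUPPORTED = "[Not Supported]"
SM_CLOCK = "clocks.current.sm [MHz]"


def summarize_log(text):
    """Return (max SM clock in MHz or None, columns that never had a value)."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    # Track, per column, whether any row carried a real value.
    seen_value = {name: False for name in (reader.fieldnames or [])}
    max_sm = None
    for row in reader:
        for name in seen_value:
            value = (row.get(name) or "").strip()
            if value and value != NOT_SUPPORTED:
                seen_value[name] = True
        sm = (row.get(SM_CLOCK) or "").strip()
        if sm and sm != NOT_SUPPORTED:
            mhz = int(sm.split()[0])  # accepts "950 MHz" as well as "950"
            max_sm = mhz if max_sm is None else max(max_sm, mhz)
    unsupported = [name for name, ok in seen_value.items() if not ok]
    return max_sm, unsupported
```

On the sample above this reports no SM clock at all and lists the two utilization columns, the two clock columns and power.draw as unsupported, so only temperature and pstate are usable from nvidia-smi on this card.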

Just to clarify: you DID reproduce the issue (950 MHz on battery with 367.44). What you failed to repro, AFAIK, is the expected behaviour on 364.19: the maximum clock while gaming on battery should be 835 MHz instead of 405 MHz. It doesn’t necessarily stay there all the time, depending on the scene of course.