Sluggish Performance/no Reclocking (Ubuntu 17.04, Kernel 4.12RC2, Nvidia Quadro M2200, Driver 381.22...

Hello!

I’ve got a brand-new Lenovo ThinkPad P51 (Kaby Lake Xeon + Quadro M2200) and I want to use ONLY Linux with it.

In “hybrid graphics mode” (UEFI setting), where one can switch between Intel and Nvidia graphics, performance is satisfying only in Intel mode. There is adequate video and 3D acceleration (on Intel level of course…), but I can only use the internal notebook display.
When I switch to Nvidia mode, performance is really sluggish. I can’t even resize my windows in a useful manner, video acceleration doesn’t work, not to mention the nonexistent 3D acceleration. In the nvidia-settings tool one can see that there is no reclocking. Adaptive Clocking is ENABLED and set to AC mode, but the Performance Level seems to be stuck at the lowest (0 out of 2). Maybe that’s the culprit.

There is also a UEFI Setting “discrete graphics mode” which will force usage of the Quadro. The problems are exactly the same as in hybrid mode set to nvidia usage.

In addition to that, only with Nvidia graphics I can use external monitors. So a working nvidia system is really important for me.

Software used:
Ubuntu 17.04 Mate
Kernel 4.12RC2
Nvidia Driver 381.22

I can try different software configurations, but I tested older Kernel/Driver combinations and it didn’t work either.

Help is appreciated, because I can’t make use of this expensive piece of hardware in its current state.
nvidia-bug-report.log (266 KB)

The bug report you attached was generated while you were using the iGPU; please switch to nvidia and run nvidia-bug-report.sh again.
BTW, your dmesg shows a problem with your wifi (lots of kernel oopses due to iwlwifi) and probably a power management problem.
Edit: maybe go back to a 4.4 kernel to have a conservative setup.

I’m sorry for attaching the wrong logfile. Here is the new one…

I will try more (even older kernels like 4.4) in the next few days. But I can tell you, WiFi is working. At least I can connect to an access point and get decent data rates. I didn’t try more sophisticated modes.
nvidia-bug-report.log (584 KB)

I have the same problem with Ubuntu 16.04 / kernel 4.10 / driver 381.22 / Dell Precision 7520.

@repzion_isnogood: the new logs look fine regarding iwlwifi, maybe just a one-time hiccup. On the downside, they don’t give a clue about what is going wrong with the dGPU. Which drivers have you tried so far: 375, 378?
Can you please unmark my answer if that’s possible? Seeing ‘Answer accepted’ anybody would think your problem has been resolved.
Unrelated to this, is it possible to suspend and resume while switched to nvidia?

@kred: please also run nvidia-bug-report.sh and attach output while switched to nvidia.

Since both of you are running quite new Quadros, it seems to me this is a driver bug so it is important to try every driver instance from 375 on to rule out a possible regression.

@repzion_isnogood: I noticed something. nvidia-smi -q gives:
    Clocks Throttle Reasons
        Applications Clocks Setting : Active

    Applications Clocks
        Graphics : 696 MHz
        Memory   : 2754 MHz

Can you reset that with nvidia-smi -rac or nvidia-smi -ac <memclock,gpuclock>?
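For anyone scripting the two commands suggested above, here is a minimal Python sketch that only builds the nvidia-smi argument lists. The helper name `app_clock_command` is made up for illustration, and the clock values are simply the ones from the log above, used to show the argument order (memory first, then graphics). It does not run anything; on a real system these commands usually need root.

```python
def app_clock_command(mem_mhz=None, gfx_mhz=None):
    """Build the nvidia-smi arguments for application clocks.

    With no arguments: reset application clocks (nvidia-smi -rac).
    With both clocks: set them (nvidia-smi -ac <memclock,gpuclock>).
    """
    if mem_mhz is None and gfx_mhz is None:
        return ["nvidia-smi", "-rac"]
    if mem_mhz is None or gfx_mhz is None:
        raise ValueError("give both memory and graphics clocks, or neither")
    # nvidia-smi expects the memory clock first, then the graphics clock.
    return ["nvidia-smi", "-ac", f"{mem_mhz},{gfx_mhz}"]


# Print the command lines instead of executing them.
print(" ".join(app_clock_command()))
print(" ".join(app_clock_command(2754, 696)))
```

The returned list can be passed to `subprocess.run` once you have checked that the values are among the clocks your GPU supports.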

nvidia-smi -q

Clocks Throttle Reasons
Idle : Not Active
Applications Clocks Setting : Not Active
SW Power Cap : Active
HW Slowdown : Not Active
Sync Boost : Not Active
Unknown : Not Active
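To compare throttle reasons between machines without eyeballing the whole report, a small parser can help. This is only a sketch: `active_throttle_reasons` is a hypothetical helper, and it assumes the section layout shown in the output above (a "Clocks Throttle Reasons" header followed by "name : state" lines).

```python
def active_throttle_reasons(report):
    """Return the throttle reasons marked 'Active' in `nvidia-smi -q` text."""
    active = []
    in_section = False
    for line in report.splitlines():
        stripped = line.strip()
        if stripped.startswith("Clocks Throttle Reasons"):
            in_section = True
            continue
        if not in_section:
            continue
        if ":" not in stripped:
            if stripped:
                # A non-empty line without ':' is the next section header.
                break
            continue  # tolerate blank lines inside the section
        name, _, state = stripped.partition(":")
        if state.strip() == "Active":
            active.append(name.strip())
    return active


# Sample taken from the output posted in this thread.
SAMPLE = """\
Clocks Throttle Reasons
    Idle                        : Not Active
    Applications Clocks Setting : Not Active
    SW Power Cap                : Active
    HW Slowdown                 : Not Active
    Sync Boost                  : Not Active
    Unknown                     : Not Active
"""

print(active_throttle_reasons(SAMPLE))
```

On a live system the same function could be fed the text captured from `nvidia-smi -q`.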

Two weeks ago I enabled tlp :-(

# PCI Express Active State Power Management (PCIe ASPM):
#   default, performance, powersave
PCIE_ASPM_ON_AC=performance
PCIE_ASPM_ON_BAT=powersave

# Set CPU performance versus energy savings policy:
#   performance, normal, powersave
# Requires kernel module msr and x86_energy_perf_policy from linux-tools
ENERGY_PERF_POLICY_ON_AC=performance
ENERGY_PERF_POLICY_ON_BAT=powersave

And … I don’t have any setting with “nvidia” in the tlp config file :-( only with radeon.
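Checking a tlp config for GPU-related keys can be scripted. The sketch below parses KEY=value lines and lists the keys that mention a given word. The helper names are made up, and the `RADEON_DPM_STATE_ON_AC` line in the sample is an assumption added for illustration (the post above only says that radeon settings exist, not which ones); the other four settings are the ones quoted above.

```python
def tlp_settings(text):
    """Parse KEY=value lines from tlp-style config text, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def keys_mentioning(settings, word):
    """Return the sorted setting names that contain `word` (case-insensitive)."""
    return sorted(k for k in settings if word.lower() in k.lower())


SAMPLE = """\
# PCI Express Active State Power Management (PCIe ASPM):
PCIE_ASPM_ON_AC=performance
PCIE_ASPM_ON_BAT=powersave
ENERGY_PERF_POLICY_ON_AC=performance
ENERGY_PERF_POLICY_ON_BAT=powersave
# Assumed example of a radeon-only key, not quoted in the thread:
RADEON_DPM_STATE_ON_AC=performance
"""

settings = tlp_settings(SAMPLE)
print("nvidia keys:", keys_mentioning(settings, "nvidia"))
print("radeon keys:", keys_mentioning(settings, "radeon"))
```

An empty list for "nvidia" matches what was observed here: this config has no setting that targets the NVIDIA GPU directly, although the PCIe ASPM policy still applies to the whole bus.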

@kred: you have SW power cap active. What does
nvidia-smi -q -d POWER
tell you? Can you reset it using nvidia-smi -pl?

I didn’t have time yet to try the other things. I’ll try that in the evening.

For nvidia-smi -pl 45, or other power limits, the output is as follows:

Changing power management limit is not supported for GPU: 0000:01:00.0.
Treating as warning and moving on.
All done.

nvidia-smi -q -d POWER

==============NVSMI LOG==============

Timestamp                           : Tue May 30 06:39:05 2017
Driver Version                      : 381.22

Attached GPUs                       : 1
GPU 0000:01:00.0
    Power Readings
        Power Management            : N/A
        Power Draw                  : N/A
        Power Limit                 : N/A
        Default Power Limit         : N/A
        Enforced Power Limit        : N/A
        Min Power Limit             : N/A
        Max Power Limit             : N/A
    Power Samples
        Duration                    : Not Found
        Number of Samples           : Not Found
        Max                         : Not Found
        Min                         : Not Found
        Avg                         : Not Found
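The report above can also be checked programmatically. The following sketch extracts the "Power Readings" block and applies a simple heuristic: if "Power Limit" is reported as N/A, the driver is not exposing an adjustable limit, which fits the "not supported" message from nvidia-smi -pl. Both function names are hypothetical and the heuristic is an assumption, not documented driver behaviour.

```python
def power_readings(report):
    """Return the 'Power Readings' key/value pairs from `nvidia-smi -q -d POWER` text."""
    readings = {}
    in_section = False
    for line in report.splitlines():
        stripped = line.strip()
        if stripped == "Power Readings":
            in_section = True
            continue
        if not in_section:
            continue
        if ":" not in stripped:
            if stripped:
                break  # next section header, e.g. "Power Samples"
            continue
        key, _, value = stripped.partition(":")
        readings[key.strip()] = value.strip()
    return readings


def power_limit_exposed(readings):
    """Heuristic (assumption): a limit is exposed only if it is not N/A."""
    return readings.get("Power Limit", "N/A") not in ("N/A", "Not Found")


# Sample copied from the report in this thread.
SAMPLE = """\
GPU 0000:01:00.0
    Power Readings
        Power Management            : N/A
        Power Draw                  : N/A
        Power Limit                 : N/A
        Default Power Limit         : N/A
        Enforced Power Limit        : N/A
        Min Power Limit             : N/A
        Max Power Limit             : N/A
    Power Samples
        Duration                    : Not Found
"""

readings = power_readings(SAMPLE)
print(readings)
print("power limit exposed:", power_limit_exposed(readings))
```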

So maybe the card has a fixed (very low) limit, as long as the driver doesn’t tell it otherwise? With some of the hints above I am indeed able to reclock the card for short periods, but performance won’t improve. I guess that is because of the still active power limit.
We are getting there…

I had some spare time for testing the combination of Kernel 4.12RC2 and the following additional Nvidia drivers: 378.13 and 375.66.

In both cases the behaviour was the same. It didn’t get better, but it also didn’t get worse.

I will switch back to 381.22 and update the kernel to 4.12RC3.

Tests with the older kernels 4.11 and 4.10 will follow.

Same here.

@Functor, please also run nvidia-bug-report.sh and attach output file.

Hi repzion_isnogood,
It looks like you are running a PRIME setup. Did you try switching between NVIDIA (Performance Mode) and Intel (Power Saving Mode) in nvidia-settings? Any difference? Also, what happens if you use only the NVIDIA GPU with Intel disabled in the SBIOS settings? Can you share a video recording showing the desktop sluggishness? Please also test with older drivers from r375, r370 and r367 to see if there is any improvement in performance.

In PRIME Mode with Intel graphics, performance is as expected, but I cannot use external displays. When I use nvidia in PRIME Mode, performance sucks, but external displays work.

I tried the PRIME setup (“Hybrid” setting in BIOS) and without PRIME (“dedicated” setting in BIOS), but every time the dGPU is being used, performance is bad.

I tested the 375.66, 378.13 and 381.22 drivers, each with kernels 4.10, 4.11 and 4.12, and the result was the same every time.
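For anyone repeating this kind of regression hunt, it helps to keep the driver/kernel combinations in one table so no pairing is skipped. A minimal sketch, with a made-up helper name `build_matrix`; the versions and the "sluggish" result are the ones reported in this post.

```python
from itertools import product

DRIVERS = ["375.66", "378.13", "381.22"]
KERNELS = ["4.10", "4.11", "4.12"]


def build_matrix(drivers, kernels):
    """One row per driver/kernel pairing, result not yet recorded."""
    return [
        {"driver": d, "kernel": k, "sluggish": None}
        for d, k in product(drivers, kernels)
    ]


matrix = build_matrix(DRIVERS, KERNELS)

# Record the outcome reported above: every combination was sluggish.
for row in matrix:
    row["sluggish"] = True

for row in matrix:
    print(f"driver {row['driver']} / kernel {row['kernel']}: sluggish={row['sluggish']}")
```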

In the meantime I also updated the BIOS/UEFI of this particular notebook from 1.07 to 1.08. Nothing in the changelog seems to be affecting graphics. It also didn’t improve my problems, so this is just a sidenote for other users with these problems.

Atm I can’t provide you with a video, but I will create one over the weekend.

Another sidenote:

Nouveau has decent performance when using multiple monitors. But you can only use the miniDP output on the notebook. The connectors on the docking station do not work.

Dell Precision 7520 / Quadro M2200 / Ubuntu 17.04 / NVIDIA 381.22

Same issue as everyone else - desktop super slow on NVIDIA driver using the discrete card; Intel integrated controller is good with the NVIDIA driver enabled and “Intel (Power Saving Mode)” selected; nouveau is bearable.

If in my BIOS I check “Enable switchable graphics” and uncheck “discrete graphics controller direct output mode”, I can drive my 4K monitor over display port with the Intel controller. Performance is good. Same driver but with “NVIDIA (Performance Mode)” selected is very slow. The slowness is especially apparent in compositing operations and scrolling. I’ll post video of it this evening.

I’ve tested r370 and r375 and the performance seems to be the same. Let me know if there’s anything else I can do to help in resolving this.

If in my BIOS I check “Enable switchable graphics” and uncheck “discrete graphics controller direct output mode”, I can drive my 4K monitor over display port with the Intel controller. Performance is good.
What if, in the BIOS, you check “Enable switchable graphics” and also check “discrete graphics controller direct output mode”? Is performance good then?

I think nobody here is running an external display. The issue occurs with only the internal laptop display.

Please share the nvidia bug report as soon as the issue is observed. Get a video of both INTEL and NVIDIA performance. Which desktop environment are you running: Unity, GNOME or something else? Is the issue also observed without an external monitor connected?

Actually I AM running an external display. It is an additional 4k monitor.

Better: I would LIKE to run an external display.

ATM I run the internal display + external display (both 4k) with nouveau driver.
That means I can resize windows and watch Youtube videos, but nothing that’s more demanding…

The sluggish behaviour when running the proprietary nvidia driver doesn’t change whether there’s an external display or not.

Can someone share a video showing the difference in sluggishness between the NVIDIA and INTEL options from nvidia-settings with only the laptop’s internal display? I hope you are all using the same desktop environment.

Here you go: