Vsync issue with NVIDIA PRIME (UX32VD with GT620M)

Hi,

I am using the latest Ubuntu 14.04 LTS release on my Asus UX32VD ultrabook.

In order to get Optimus working, I installed PRIME.

I have noticed that when I switch to my GT620M NVIDIA card (running the 331.38 NVIDIA proprietary driver):

  1. There is no Sync to VBlank option in nvidia-settings, unlike on my desktop computer (a non-Optimus GTX 570).
  2. I suffer from heavy tearing due to the lack of vsync while gaming.

Is there any way to avoid this? Is this a known issue, and is there any chance of an update that corrects it?

Thanks in advance for your reply.

When PRIME is enabled, there is currently no synchronization between the source device producing the pixels and the sink device reading them. I.e., in a typical NVIDIA + Intel configuration, the Intel chip just scans out the shared buffer constantly, without regard to when the pixels are copied into it.

The README mentions this in Chapter 32, “Offloading Graphics Display with RandR 1.4”.
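
For context, the RandR 1.4 display offloading that chapter covers is typically wired up along these lines (a sketch; the provider names modesetting and NVIDIA-0 are common defaults and vary by system):

    # List the render/display providers the X server knows about
    xrandr --listproviders
    # Have the Intel-driven provider scan out images rendered by the NVIDIA GPU
    xrandr --setprovideroutputsource modesetting NVIDIA-0
    # Enable the outputs that just became available
    xrandr --auto

Ubuntu’s nvidia-prime package should run this setup for you at login, so it is only needed for manual configurations. The tearing arises because nothing synchronizes the copies into the shared buffer with the constant scanout of it.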

Thanks for your reply. I hope it can be fixed one day by the X.Org team.

Regards.

Hello aplattner, what kind of limitations does the X.Org X server have?

Thanks in advance for any consideration.

Cheers.

I have an idea. If all rendering is done on the NVIDIA GPU and all display scanout is done on the Intel GPU, would it be possible to force the Intel GPU to wait long enough, buffer the frame, and synchronize before displaying it on screen via Intel’s TearFree option in conjunction with nvidia-prime? Or is DMA-BUF cross-buffer synchronization already being worked on for proper NVIDIA Optimus GPU switching with the official NVIDIA drivers?

From the Intel X driver manual, intel(4):

Option “TearFree” “boolean”

Disable or enable TearFree updates. This option forces X to perform all rendering to a backbuffer prior to updating the actual display. It requires an extra memory allocation the same size as a framebuffer, the occasional extra copy, and requires Damage tracking. Thus enabling TearFree requires more memory and is slower (reduced throughput) and introduces a small amount of output latency, but it should not impact input latency. However, the update to the screen is then performed synchronously with the vertical refresh of the display so that the entire update is completed before the display starts its refresh. That is, only one frame is ever visible, preventing an unsightly tear between two visible and differing frames. Note that this replicates what the compositing manager should be doing, however TearFree will redirect the compositor updates (and those of fullscreen games) directly on to the scanout thus incurring no additional overhead in the composited case. Also note that not all compositing managers prevent tearing, and if the outputs are rotated, there will still be tearing without TearFree enabled.
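
In xorg.conf terms, the suggestion would look something like the following Device section (a sketch, assuming the xf86-video-intel driver is in use; whether TearFree actually helps when the buffer arrives via PRIME is exactly the open question here):

    Section "Device"
        Identifier "Intel Graphics"
        Driver     "intel"
        # Render to a backbuffer and flip it synchronously with vblank
        Option     "TearFree" "true"
    EndSection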

Thank you. And sorry to bother you.

Since X11 has no concept of frames as far as I’m aware, the buffers get torn not because the GPU output isn’t synced to the monitor, but because the Intel GPU displays the buffer while the NVIDIA GPU is still rendering into it, if I read aplattner’s reply correctly.

The hope is that widespread Wayland support will eventually fix this by making X11 obsolete for home users, as Wayland is frame-perfect.

fratti, while it’s true that X doesn’t traditionally have frames (though they were sort of added with the new Present extension), the issue here is a separate lack of synchronization between PRIME devices. Recent kernels added some locking / fencing support that is on my TODO list to look into. From a quick skim, though, it doesn’t look like the Intel kernel driver implements it either so it’ll probably require some more work.
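
For the curious, you can check whether your X server already advertises the Present extension (assuming xdpyinfo from the x11-utils package is installed):

    # Lists all X extensions; Present appears on servers new enough to support it
    xdpyinfo | grep -i present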

Thanks for putting this on your TODO list. It’s probably quite a long list considering the recently unveiled driver changes to make EGL/Wayland work, but I’m glad there’s hope for the future.

Since most laptops with dedicated GPUs also come with an Intel GPU these days, it’s hard to avoid Optimus.

Please hurry up…
It’s the last problem to be solved for a perfect gaming experience on Linux notebooks.

Well, take a look at how nouveau/Bumblebee handle it? They seem to handle it fine, and as far as I’m aware primusrun (Bumblebee) works roughly the same way: it also passes frames on to the Intel GPU.
I’ll be moving back to Bumblebee; I personally can’t live with screen tearing, but I’d be happy to hear once this gets fixed :)

Bumblebee runs two X11 servers and has quite a bit of overhead from compressing and copying buffers around. For example, my laptop cannot reach more than ~40 FPS in any application with Bumblebee, while PRIME happily gives me 6500 FPS in glxgears.
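
To make sure the two setups are compared like for like, you can check which GPU actually renders in each case (assuming glxinfo from the mesa-utils package is installed):

    # Under PRIME with the NVIDIA GPU selected, this should name the NVIDIA card
    glxinfo | grep "OpenGL renderer"
    # Under Bumblebee, the wrapper routes the application to the discrete GPU
    primusrun glxinfo | grep "OpenGL renderer"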

As aplattner said, it’s not really a case of “we don’t know a solution”; it’s that it needs someone to work on it. I don’t know how many people NVIDIA employs to work on Linux device drivers and UNIX desktop/laptop support in general, but from the presentation on Wayland and EGL they gave at XDC, it seems they’re quite busy reworking big parts of the driver at the moment.

As for nouveau, googling reveals this: http://lists.freedesktop.org/archives/nouveau/2014-September/018830.html
That seems to be a patch implementing fencing, submitted to nouveau by an NVIDIA employee, though I’m not sure whether it’s for the same use case we’re talking about here.

EDIT: Dug up some more stuff; this seems somewhat related? Might be of interest to anyone working at NVIDIA. (Though last I checked, NVIDIA didn’t have KMS support yet or something.)

any news?

I hope it is resolved before the release of this Steam Machine.

If the outputs (i.e., monitors) are directly connected to the NVIDIA GPU, this is not an issue, as the Intel GPU is then not involved at all, from what I know.

However, on many laptop devices, the outputs are connected to the Intel GPU, so for those it’s relevant.

in short, Nvidia Optimus sucks

We are working on this issue and tracking it under bug 1629916.

Well, try AMD’s hybrid graphics solution; then you’ll know what sucks.

Weird, the performance difference for me is much smaller.
Are you using optirun or primusrun for glxgears? Also, you need to force vsync off, like this: vblank_mode=0 primusrun glxgears
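
For an apples-to-apples comparison, both Bumblebee backends can be benchmarked with vsync forced off; vblank_mode=0 disables Mesa’s swap throttling so glxgears isn’t capped at the display refresh rate:

    # primus backend (shares frames with the Intel GPU, usually faster)
    vblank_mode=0 primusrun glxgears
    # VirtualGL backend
    vblank_mode=0 optirun glxgears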

I’d appreciate an update on the status of this, since it currently breaks QtQuick in non-obvious ways and thus leads to issues with KDE, on top of the ugly tearing in virtually any application.

For anyone interested in a workaround for the QtQuick animation bugs, do the following:

  1. Create a script file somewhere with the following contents (a fuller sketch follows this list):
    export QSG_RENDER_LOOP=basic
    
  2. Go to System Settings->Startup and Shutdown->Autostart and add the script to run at pre-KDE startup.
  3. Restart your session.
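
A minimal version of such a script, for reference (the filename and location are just examples; any path works):

    #!/bin/sh
    # Tell QtQuick 2 to use its timer-driven "basic" render loop instead of
    # the default vsync-driven one, which spins when vblank syncing is broken.
    export QSG_RENDER_LOOP=basic

Make it executable with chmod +x; registering it as a pre-KDE startup script should get it sourced before the session environment is set up, so the variable reaches all Qt processes in the session.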

Until NVIDIA fixes vsync or Qt adds automatic detection for Optimus, this seems to be the only way to avoid 100% CPU usage while copying files/connecting to networks/…

You’ll still get tearing, of course. This just tells QtQuick not to rely on VSync actually working.

Scratch that; don’t use the workaround described above, because it breaks Plasma 5.
So we’re back to waiting for nvidia and intel to implement PRIME buffer fencing.