396.24 + 1070 Max-Q + External HDMI Monitor = 100% kworker thread

I just received a SAGER NP8851 (CLEVO P950ER) [http://www.xoticpc.com/sager-np8851-clevo-p950er.html].

After installing the latest drivers, everything works great until I plug in my external HDMI monitor.

As soon as I plug in the monitor, one kworker process pegs one core at 100%, and an ‘irq/###-nvidia’ process starts using much more CPU than normal (>6% most of the time). The kworker process stays stuck at 100% even after I unplug the monitor, and it can’t be killed. Other than this one pegged process, everything - including the external monitor - seems to be working fine.

I’ve used this same monitor with other systems without this issue.

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
   83 root      20   0       0      0      0 R  97.0  0.0   1:05.58 kworker/0:1                                              
 2402 root      20   0  385660  93996  49080 S   2.0  0.3   0:09.14 Xorg                                                     
 2656 evil      20   0 3225828 102348  82324 S   2.0  0.3   0:08.00 kwin_x11                                                 
 2407 root     -51   0       0      0      0 S   1.3  0.0   0:04.39 irq/130-nvidia                                           
 2680 evil      20   0 4242472 321128 155512 S   1.3  1.0   0:12.16 plasmashell                                              
 3664 evil      20   0  790744  21756  18412 S   1.0  0.1   0:03.40 conky                                                    
 3993 evil      20   0  607524 162948  89496 S   1.0  0.5   0:05.00 steam

I tried back-leveling to the 390 drivers from the official Ubuntu repos, but they produced the same issue. I haven’t tried anything older yet, as I haven’t researched how long the Linux drivers have supported the 1070.

I’m attaching a bug report. Nothing in the dmesg/logs seems to be jumping out at me to explain the problem.

Also, here’s what I see when I force a backtrace for the maxed CPU:

[ 1447.277376] NMI backtrace for cpu 0
[ 1447.277378] CPU: 0 PID: 83 Comm: kworker/0:1 Tainted: P           OE    4.15.0-20-generic #21-Ubuntu
[ 1447.277378] Hardware name: Notebook                         P95xER                         /P95xER                         , BIOS 1.05.04dRLS2 04/25/2018
[ 1447.277381] Workqueue: kacpid acpi_os_execute_deferred
[ 1447.277382] RIP: 0010:_raw_spin_unlock_irqrestore+0x1b/0x20
[ 1447.277382] RSP: 0018:ffffa8cb034b7ba0 EFLAGS: 00000293
[ 1447.277383] RAX: 0000000000000293 RBX: ffff894d9884f2d0 RCX: 0000000180330029
[ 1447.277384] RDX: 0000000000000001 RSI: 0000000000000293 RDI: 0000000000000293
[ 1447.277384] RBP: ffffa8cb034b7ba8 R08: ffff894d9cd51550 R09: 0000000180330029
[ 1447.277385] R10: ffffa8cb034b7b90 R11: ffff894d9cd88000 R12: 0000000000000002
[ 1447.277385] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
[ 1447.277386] FS:  0000000000000000(0000) GS:ffff894d9d200000(0000) knlGS:0000000000000000
[ 1447.277386] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1447.277387] CR2: 00007f8ee7d9c000 CR3: 000000055540a005 CR4: 00000000003606f0
[ 1447.277387] Call Trace:
[ 1447.277389]  ? acpi_os_release_lock+0xe/0x10
[ 1447.277390]  acpi_ut_update_ref_count.part.1+0x51/0x6e1
[ 1447.277391]  acpi_ut_update_object_reference+0x113/0x20e
[ 1447.277392]  acpi_ut_add_reference+0x64/0x6a
[ 1447.277394]  acpi_ex_resolve_node_to_value+0x23f/0x46c
[ 1447.277395]  acpi_ex_resolve_to_value+0x391/0x43f
[ 1447.277397]  acpi_ds_evaluate_name_path+0xb2/0x168
[ 1447.277398]  ? acpi_db_single_step+0x1f/0x29d
[ 1447.277399]  acpi_ds_exec_end_op+0x120/0x736
[ 1447.277401]  acpi_ps_parse_loop+0x918/0x9c2
[ 1447.277425]  ? acpi_ut_remove_reference+0x72/0x79
[ 1447.277426]  acpi_ps_parse_aml+0x1ac/0x4bd
[ 1447.277427]  acpi_ps_execute_method+0x1fa/0x2bc
[ 1447.277429]  acpi_ns_evaluate+0x2ee/0x435
[ 1447.277430]  acpi_ev_asynch_execute_gpe_method+0xbd/0x159
[ 1447.277431]  acpi_os_execute_deferred+0x1a/0x30
[ 1447.277433]  process_one_work+0x1de/0x410
[ 1447.277434]  worker_thread+0x32/0x410
[ 1447.277435]  kthread+0x121/0x140
[ 1447.277436]  ? process_one_work+0x410/0x410
[ 1447.277437]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 1447.277438]  ret_from_fork+0x35/0x40
[ 1447.277439] Code: 89 d0 5d c3 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 e5 c6 07 00 0f 1f 40 00 48 89 f7 57 9d 0f 1f 44 00 00 5d <c3> 0f 1f 40 00 0f 1f 44 00 00 55 48 89 e5 c6 07 00 48 89 f7 57
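
Since the trace runs through acpi_ev_asynch_execute_gpe_method, one thing that might be worth checking is whether a single ACPI GPE is firing continuously. Below is a rough Python sketch for that, not something I ran for this report. It assumes the kernel exposes per-GPE counters as gpeXX files under /sys/firmware/acpi/interrupts/ and that the first field of each file is the event count. The function names (parse_count, busiest_gpes) are made up for illustration.

```python
from pathlib import Path

# Assumed location of the per-GPE counter files (gpe00 ... gpeFF).
GPE_DIR = Path("/sys/firmware/acpi/interrupts")


def parse_count(text):
    """Return the leading integer of a counter file, or None if there is none."""
    fields = text.split()
    if not fields or not fields[0].isdigit():
        return None
    return int(fields[0])


def busiest_gpes(directory=GPE_DIR, limit=5):
    """Return up to `limit` (name, count) pairs with non-zero counts, highest first."""
    results = []
    # The character class skips summary files such as gpe_all.
    for path in Path(directory).glob("gpe[0-9A-Fa-f]*"):
        try:
            count = parse_count(path.read_text())
        except OSError:
            continue  # unreadable entry: skip it rather than abort
        if count:
            results.append((path.name, count))
    results.sort(key=lambda item: item[1], reverse=True)
    return results[:limit]


if __name__ == "__main__":
    for name, count in busiest_gpes():
        print(f"{name}: {count}")
```

Running it twice a few seconds apart and comparing the numbers should show whether one GPE is still climbing while the kworker is pegged.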

Any help getting to the bottom of this is appreciated.
nvidia-bug-report.log.gz (125 KB)

I found a fix: I upgraded to the mainline 4.17 kernel, and the problem seems to have gone away.