Driver 331.20 crashes with kernels 3.12.x

The 331.20 driver randomly crashes during boot (4 out of 48 boots: 2 with 3.12.1 and 2 with 3.12.2). dmesg shows this:

[ 4.622709] wmi: Mapper loaded
[ 4.629961] [drm] Initialized drm 1.1.0 20060810
[ 4.679571] ACPI: Requesting acpi_cpufreq
[ 4.686859] input: Power Button as /devices/LNXSYSTM:00/device:00/PNP0C0C:00/input/input2
[ 4.686889] ACPI: Power Button [PWRB]
[ 4.686917] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input3
[ 4.686937] ACPI: Power Button [PWRF]
[ 4.713896] mei_me 0000:00:16.0: setting latency timer to 64
[ 4.713926] mei_me 0000:00:16.0: irq 45 for MSI/MSI-X
[ 4.721014] microcode: CPU0 sig=0x306a9, pf=0x2, revision=0xc
[ 5.118242] nvidia: module license 'NVIDIA' taints kernel.
[ 5.118245] Disabling lock debugging due to kernel taint
[ 5.125357] ------------[ cut here ]------------
[ 5.125413] kernel BUG at drivers/cpufreq/cpufreq.c:79!
[ 5.125466] invalid opcode: 0000 [#1] SMP
[ 5.125607] Modules linked in: nvidia(PO+) snd_page_alloc acpi_cpufreq(+) snd_seq_midi snd_seq_midi_event snd_rawmidi microcode(+) mei_me button video processor drm wmi snd_seq mei i2c_i801 i2c_core evdev snd_timer snd_seq_device lpc_ich snd mfd_core soundcore ext4 crc16 jbd2 mbcache ata_generic sg hid_generic sd_mod sr_mod cdrom crc_t10dif crct10dif_common usbhid hid ahci libahci libata scsi_mod firewire_ohci firewire_core crc_itu_t ehci_pci ehci_hcd e1000e xhci_hcd ptp pps_core usbcore usb_common thermal fan thermal_sys
[ 5.128249] CPU: 0 PID: 412 Comm: modprobe Tainted: P O 3.12.2-amd64 #1
[ 5.128305] Hardware name: MSI MS-7751/Z77A-GD65 (MS-7751), BIOS V10.2 02/27/2012
[ 5.128361] task: ffff88003783f140 ti: ffff8800378bc000 task.ti: ffff8800378bc000
[ 5.128441] RIP: 0010:[] [] lock_policy_rwsem_read+0x1b/0x3a
[ 5.128572] RSP: 0018:ffff8800378bdca0 EFLAGS: 00010246
[ 5.128637] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000000bfebfbff
[ 5.128705] RDX: ffff88021ec00000 RSI: 0000000000000001 RDI: 0000000000000000
[ 5.128773] RBP: ffff8800d9232ff0 R08: ffff8800d9232ff0 R09: ffff8800d9232ff4
[ 5.128841] R10: ffff88021e028158 R11: 0000000000000001 R12: ffff880037394008
[ 5.128909] R13: ffff8800d87ae008 R14: 0000000000000000 R15: ffff8800378bdf08
[ 5.128977] FS: 00007fdbcb5dd700(0000) GS:ffff88021ec00000(0000) knlGS:0000000000000000
[ 5.129058] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.129124] CR2: 00007fdbcae13000 CR3: 0000000214e0a000 CR4: 00000000001407f0
[ 5.129192] Stack:
[ 5.129252] ffffffff8128542d ffff8800d9233000 ffff8800d9232ff0 ffffffffa08b1f82
[ 5.129512] ffffffffa08842ab ffff8800d9232ff8 ffffffffa03c04c9 ffff880037394008
[ 5.129771] ffff8800d9230038 0000000000000000 ffffffffa0393b77 ffff8800d9230038
[ 5.130030] Call Trace:
[ 5.130093] [] ? cpufreq_get+0x35/0x63
[ 5.130213] [] ? os_get_cpu_frequency+0x7/0xe [nvidia]
[ 5.130334] [] ? _nv013197rm+0x9/0x21 [nvidia]
[ 5.130455] [] ? _nv000808rm+0x2a9/0xf08 [nvidia]
[ 5.130570] [] ? _nv000811rm+0x89/0xce [nvidia]
[ 5.130688] [] ? rm_init_rm+0x24/0x7e [nvidia]
[ 5.130794] [] ? nvidia_init_module+0x90/0x701 [nvidia]
[ 5.130900] [] ? nvidia_frontend_init_module+0x82/0x8f0 [nvidia]
[ 5.131020] [] ? nv_drm_init+0xf/0xf [nvidia]
[ 5.131087] [] ? do_one_initcall+0x88/0x11a
[ 5.131156] [] ? load_module+0x1293/0x1e15
[ 5.131223] [] ? module_flags+0x6f/0x6f
[ 5.131290] [] ? __get_free_pages+0x5/0x3e
[ 5.131357] [] ? SyS_init_module+0x8e/0x99
[ 5.131425] [] ? system_call_fastpath+0x16/0x1b
[ 5.131491] Code: 44 24 04 e8 ac ff ff ff 8b 44 24 04 41 59 5b c3 48 63 ff 48 c7 c0 a8 f8 00 00 48 8b 14 fd b0 d1 68 81 48 8b 04 10 48 85 c0 75 02 <0f> 0b 8b 80 84 00 00 00 48 c7 c7 70 f8 00 00 48 03 3c c5 b0 d1
[ 5.134512] RIP [] lock_policy_rwsem_read+0x1b/0x3a
[ 5.134624] RSP
[ 5.134694] ---[ end trace 57df780370d795bd ]---
[ 5.189609] iTCO_vendor_support: vendor-support=0
[ 5.192924] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.10
[ 5.193012] iTCO_wdt: Found a Panther Point TCO device (Version=2, TCOBASE=0x0460)
[ 5.193250] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)

That doesn’t appear to be an NVIDIA bug; it looks like a bug in the Linux 3.12 cpufreq driver code. Look at this line:

[ 5.125413] kernel BUG at drivers/cpufreq/cpufreq.c:79!

Then, if you google “kernel BUG at drivers/cpufreq/cpufreq.c:79” and look at the most recent results, you will see a few mailing-list comments about it. Offhand, I don’t have a solution for you, though.

I would probably report it upstream to the Linux developers, since it appears to be a Linux issue, not an NVIDIA one.

No, what you have to look at is the stack trace. The bug is raised from inside the NVIDIA driver binary blob (it prevents the NVIDIA driver from loading, while the kernel otherwise continues to boot perfectly), so it’s either an NVIDIA driver bug or a kernel bug that only NVIDIA developers can report upstream (they are the only ones allowed to see the source code of their driver).
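For anyone who wants to check which side each frame of the trace belongs to, here is a rough Python sketch. It assumes the frame format of the dmesg excerpt quoted in this thread (addresses stripped, an optional trailing "[module]" tag), and the function name summarize_oops is made up for illustration. Frames without a module tag are counted as core kernel code.

```python
import re
from collections import Counter

# Frame lines as they appear in the trace quoted in this thread, e.g.
#   "[ 5.130213] [] ? os_get_cpu_frequency+0x7/0xe [nvidia]"
# A frame with no trailing "[module]" tag is treated as core kernel code.
FRAME_RE = re.compile(
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<module>\w+)\])?\s*$"
)
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")


def summarize_oops(dmesg_text):
    """Return (bug_location, frames); frames is a list of (function, owner)."""
    bug = None
    frames = []
    in_trace = False
    for line in dmesg_text.splitlines():
        m = BUG_RE.search(line)
        if m:
            bug = (m.group("file"), int(m.group("line")))
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace:
            m = FRAME_RE.search(line)
            if not m:
                in_trace = False  # the first non-frame line ends the trace
                continue
            frames.append((m.group("func"), m.group("module") or "kernel"))
    return bug, frames


# A few lines copied from the log at the top of the thread.
sample = """\
[ 5.125413] kernel BUG at drivers/cpufreq/cpufreq.c:79!
[ 5.130030] Call Trace:
[ 5.130093] [] ? cpufreq_get+0x35/0x63
[ 5.130213] [] ? os_get_cpu_frequency+0x7/0xe [nvidia]
[ 5.130794] [] ? nvidia_init_module+0x90/0x701 [nvidia]
[ 5.131087] [] ? do_one_initcall+0x88/0x11a
[ 5.131491] Code: 44 24 04 e8
"""

bug, frames = summarize_oops(sample)
print("BUG at", bug)
print(Counter(owner for _, owner in frames))
```

On the full log, the innermost frame (cpufreq_get) is kernel code and the frames just below it are tagged [nvidia], which is why both readings come up in this thread: the check that fires lives in the kernel, but it is reached through a call made by the driver.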

Note: I’d like to attach my nvidia-bug-report.log.gz to the post, but the forum’s JavaScript file uploader doesn’t work from Linux. :-?

Upload to Dropbox or the like and link it. I could never get the forum uploader to work. (using Firefox on Windows here)

Sorry for digging up an old thread, but I have a similar issue with nvidia-drivers-331.38 and kernel 3.12.6-gentoo. I googled a bit and haven’t found a solution yet, just this thread. Is there a solution for this issue? If not, here’s my nvidia-bug-report.log; it may help a bit.
http://pastebin.com/ZDGYSUsA
The crash happens at boot time, and I can’t get Bumblebee to run correctly.

I tracked down a very similar problem in 3.13.5 and posted a patch. Is there any chance you could test it to see if it solves this problem for you? https://patchwork.kernel.org/patch/3767101/
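Since the crash is intermittent, a couple of clean boots with the patch applied would not say much. As a rough guide, the sketch below estimates how many consecutive clean boots are needed before "no crash" is unlikely to be luck. It assumes every boot crashes independently with the rate from the first post (4 out of 48), which is only an approximation; the function name and the 5% threshold are arbitrary choices for illustration.

```python
import math


def clean_boots_needed(crash_rate, alpha=0.05):
    """Smallest n such that (1 - crash_rate) ** n <= alpha.

    Assumes independent boots with a constant crash probability.
    """
    if not 0 < crash_rate < 1:
        raise ValueError("crash_rate must be strictly between 0 and 1")
    return math.ceil(math.log(alpha) / math.log(1 - crash_rate))


rate = 4 / 48  # crash frequency reported in the first post
print(clean_boots_needed(rate))  # 35
```

In other words, with an unpatched kernel crashing about once in twelve boots, roughly 35 clean boots in a row would happen by chance less than 5% of the time.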

I know it’s late, but I just read your reply, aplattner.

The crashes are gone since kernel 3.13.7, probably due to one of these patches (or both):
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=b94d12cd299763e67a1bf86ba882076cd891b69f

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=1bfb1772bd0368d2220c1bc6314c04527295a3bb
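
If you want to check whether a given kernel is at or past that version, something like the following Python sketch works on release strings such as the ones mentioned in this thread. It assumes 3.13.7 is the first good version, which is only what was observed above, and a plain version comparison cannot tell whether a distribution backported the fixes to an older kernel. The helper names are hypothetical.

```python
import platform
import re

# Assumption: taken from the report above that crashes stopped with 3.13.7.
FIRST_GOOD = (3, 13, 7)


def kernel_version_tuple(release):
    """Turn a release string like '3.12.6-gentoo' into (3, 12, 6)."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not m:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part or 0) for part in m.groups())


def at_least(release, minimum=FIRST_GOOD):
    """True if the release is the same as or newer than `minimum`."""
    return kernel_version_tuple(release) >= minimum


def running_kernel_at_least(minimum=FIRST_GOOD):
    """Same check for the kernel this script is running on (Linux only)."""
    return at_least(platform.release(), minimum)


# Release strings that appear in this thread, plus the reported good version.
for release in ("3.12.2-amd64", "3.12.6-gentoo", "3.13.7"):
    print(release, at_least(release))
```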