nvclock on Tesla K20 issue

I have installed nvclock to modify the GPU and memory frequencies of a Tesla K20 GPU, but the nvclock utility does not seem to work on the K20. nvidia-smi -ac offers only a very limited set of frequencies to choose from. I am looking for a tool that would give me a much wider range of frequencies to select. Is there any tool other than nvclock that can do this?

nvidia-smi -q -d SUPPORTED_CLOCKS

Supported Clocks
    Memory : 2600 MHz
        Graphics : 758 MHz
        Graphics : 705 MHz
        Graphics : 666 MHz
        Graphics : 640 MHz
        Graphics : 614 MHz
    Memory : 324 MHz
        Graphics : 324 MHz
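
For reference, the listing above can be turned into explicit (memory, graphics) pairs. This is only a minimal sketch: it assumes the text layout shown in this post, where each "Graphics" line belongs to the most recent "Memory" line, and the function names are made up for illustration. It parses pasted text and does not call nvidia-smi itself.

```python
import re

# Sample text copied from the nvidia-smi listing in this post.
SAMPLE = """\
Supported Clocks
Memory : 2600 MHz
Graphics : 758 MHz
Graphics : 705 MHz
Graphics : 666 MHz
Graphics : 640 MHz
Graphics : 614 MHz
Memory : 324 MHz
Graphics : 324 MHz
"""

# Matches lines such as "Memory : 2600 MHz", with or without indentation.
LINE_RE = re.compile(r"^\s*(Memory|Graphics)\s*:\s*(\d+)\s*MHz\s*$")


def parse_supported_clocks(text):
    """Return {memory_mhz: [graphics_mhz, ...]} in the order listed."""
    table = {}
    current_mem = None
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip the header and anything unrecognised
        kind, mhz = match.group(1), int(match.group(2))
        if kind == "Memory":
            current_mem = mhz
            table.setdefault(current_mem, [])
        elif current_mem is not None:
            # A Graphics line belongs to the most recent Memory line.
            table[current_mem].append(mhz)
    return table


def clock_pairs(table):
    """Flatten the table into a list of (memory_mhz, graphics_mhz) pairs."""
    return [(mem, gfx) for mem, gfxs in table.items() for gfx in gfxs]


if __name__ == "__main__":
    for mem, gfx in clock_pairs(parse_supported_clocks(SAMPLE)):
        print(f"memory {mem} MHz, graphics {gfx} MHz")
```

Read this way, the card exposes only six combinations: five graphics clocks at 2600 MHz memory and a single 324/324 MHz state.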

You might want to ask the NVIDIA partner who sold you the card, as they should be able to provide support and steer you in the right direction. That assumes you bought the card from an authorized vendor.

I’m not aware of any particular tool that is able to do that in Linux. My only idea (if you don’t have any support) is to back up your BIOS, modify it with the memory/graphics clocks you need, and re-flash it. The tool you’re looking for to modify the BIOS is called Kepler BIOS Tweaker v1.25. I have not tried this on a K20 BIOS though, only on a GTX Titan. I’ll look up a K20 BIOS posted in TechPowerUp’s database, see if Kepler BIOS Tweaker seems to work with it, and report back.

For reading, backing up, and flashing the BIOS you’re going to want NVFlash; there is a pure DOS variant and a Windows variant.

Also, interesting: I had figured that one could set the memory and graphics clocks on a K20 to any value within reason. I didn’t experiment with this feature when I had one available.

Seems like Kepler BIOS Tweaker v1.25 loads a K20 BIOS just fine, so you could edit it with your choice of clocks and re-flash it. Make sure to back up your original BIOS if you choose to do this; it’d be at your own risk.

The proper way is to ask NVIDIA for support, as they should presumably provide you a solution that can work.

vacaloca, I know you had a K20 and upgraded to a Titan. But how confident are you that the BIOS reflash isn’t going to brick a $3500 K20?

I’m all game to give it a go, but that risk is what’s holding me back.

allanmac and I have both had multiple discussions on the limitations of K20, especially since it’s clear there’s HUGE clocking headroom left untapped on the K20. If people are getting huge overclocks on Titan (higher wattage due to graphics!) what can the K20 do? That would be FUN to find out. Just scary.

The risk is there regardless, but I suspect that a small change like a default clock change will be fine for a simple test. The tool I mentioned for modifying the BIOS generates a correct checksum after the modification, so that is not something you have to deal with either. Judging from others who have managed to brick cards, it seems you can recover as long as you have another GPU to act as a temporary video card (the link below is a generic tutorial):
http://www.overclock.net/t/593427/how-to-unbrick-your-bricked-graphics-card-fix-a-failed-bios-flash

Given that, I’d say it would be worth a shot. The specifics for flashing a Titan are here:
http://www.evga.com/forums/tm.aspx?m=1891166&mpage=1&print=true

This is a short discussion on recovery/flashing specific to NVFlash:
http://www.evga.com/forums/tm.aspx?m=896050

Even if the card is completely hosed (flash interrupted halfway, etc.), it can still be recovered by force-flashing and jumpering a specific EEPROM pin to ground, as explained in the thread above, so not all is lost; it’d just be a slight pain.

I have CentOS installed on my machine. It seems that BIOS Tweaker works on Windows. Is there any other tool that I can use on the installed OS?

Not that I’m aware of. Presumably that tool should run under Wine, since it just loads the BIOS file and lets you edit it and save it under another name. If the Wine approach doesn’t work, I’m sure you can find a WinPE environment on CD/USB to run it on your system.

I just noticed that the clock choices nvidia-smi lets you select are the ones listed in the Boost Clocks table of the K20c BIOS on techPowerUp here: http://www.techpowerup.com/vgabios/index.php?architecture=&manufacturer=&model=Tesla+K20c&interface=&memType=&memSize= However, you can only edit the last clock in the table (758) to something else. I’d suggest you peruse the forums that deal with NVIDIA BIOS mods; it might be that the Boost Clock table in this BIOS is further editable if you know what you’re doing. That being said, a simple base clock change would be fine for testing purposes.

I am reluctant to post any sort of modified BIOS here for this card because I don’t have a way to test it myself. If anyone would like one, I could certainly do it, but it would be at your own risk, and you would be expected to exercise diligence in backing up your BIOS and in having a different GPU to boot from in case the modified BIOS bricks your card.

I now have a K20 whose BIOS I wish to modify. Could you please elaborate on the procedure?
I have also sent you a PM about this.

I guess the correct way to go about it would be to use the existing NVIDIA Management Library (NVML).

Here is a snippet about setting clocks:

nvmlReturn_t DECLDIR nvmlDeviceSetApplicationsClocks (nvmlDevice_t device, unsigned int memClockMHz, unsigned int graphicsClockMHz)

Set clocks that applications will lock to.

Sets the clocks that compute and graphics applications will be running at. e.g. CUDA driver requests these clocks during context creation which means this property defines clocks at which CUDA applications will be running unless some overspec event occurs (e.g. over power, over thermal or external HW brake).

Can be used as a setting to request constant performance.

For Tesla™ products, and Quadro® products from the Kepler family. Requires root/admin permissions.
See nvmlDeviceGetSupportedMemoryClocks and nvmlDeviceGetSupportedGraphicsClocks for details on how to list available clocks combinations.

After system reboot or driver reload applications clocks go back to their default value.

[1]

Guess it might work fine for the GTX Titan as well, since it’s a GK110-based card.

[1] https://developer.nvidia.com/sites/default/files/akamai/cuda/files/CUDADownloads/NVML_cuda5/nvml.4.304.55.pdf
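
The application-clocks setting quoted above works from the list of supported clock combinations, so a requested pair has to be checked against that list first. Below is a minimal sketch of such a check, assuming the combinations reported by nvidia-smi earlier in this thread. The helper names (nearest_supported, build_ac_command) are hypothetical, and the script only builds the nvidia-smi -ac command line, which takes the pair as memory,graphics; it does not run it.

```python
# Supported (memory, graphics) combinations copied from the nvidia-smi
# listing earlier in this thread; other cards will report different values.
SUPPORTED = {
    2600: [758, 705, 666, 640, 614],
    324: [324],
}


def nearest_supported(mem_mhz, gfx_mhz, supported=SUPPORTED):
    """Return the supported (memory, graphics) pair closest to the request."""
    if mem_mhz not in supported:
        # Fall back to the closest supported memory clock.
        mem_mhz = min(supported, key=lambda m: abs(m - mem_mhz))
    gfx = min(supported[mem_mhz], key=lambda g: abs(g - gfx_mhz))
    return mem_mhz, gfx


def build_ac_command(mem_mhz, gfx_mhz, gpu_index=0, supported=SUPPORTED):
    """Build (but do not run) an nvidia-smi application-clocks command.

    Raises ValueError if the pair is not in the supported table.
    """
    if gfx_mhz not in supported.get(mem_mhz, []):
        raise ValueError(
            f"{mem_mhz},{gfx_mhz} MHz is not a supported combination; "
            f"nearest is {nearest_supported(mem_mhz, gfx_mhz, supported)}"
        )
    return ["nvidia-smi", "-i", str(gpu_index), "-ac", f"{mem_mhz},{gfx_mhz}"]


if __name__ == "__main__":
    # Example: a 700 MHz graphics clock is not offered, so pick the nearest.
    mem, gfx = nearest_supported(2600, 700)
    print(" ".join(build_ac_command(mem, gfx)))
```

Running the resulting command needs root, and per the documentation quoted above the setting reverts after a reboot or driver reload.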

I have already used NVML to try to change the frequencies. My issue is that I can only set the frequencies to the very few levels that the K20 provides. For my research, I need a wide range of frequency levels, hence the need to change the BIOS.

Gotcha! :)

I guess instead of risking bricking your expensive Tesla card you could contact any of those desperately trying to flash a Titan into a K20; they would happily swap cards with you. ;)

Let’s assume that I do not care much about the cost of the K20; without the necessary frequency scaling support, it will not be very useful for my research.

I’ve already mentioned the tools needed, specifically Kepler BIOS Tweaker, but even with that, it seems you can only edit the last clock on the scaling table, as I mentioned before, plus the core clocks. I am hesitant to give any further advice because the potential to brick an expensive card is there. Have you tried to escalate this issue with NVIDIA itself via the partner that sold you the card? They might give you what you need given you’ve spent $3.5k on a card. Give it a shot.