"Kernel timeout" on Tesla
Hi,
I have a problem with "kernel timeout" under Fedora 16.
I know that it can be caused by the X server timeout, but this machine has a 'normal' graphics card as well as an NVIDIA Tesla C2075.
I thought the Tesla was not even capable of driving a graphical interface, so why am I getting a kernel timeout?

My deviceQuery output:
[code]./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Found 1 CUDA Capable device(s)

Device 0: "Tesla C2075"
CUDA Driver Version / Runtime Version 4.2 / 4.2
CUDA Capability Major/Minor version number: 2.0
Total amount of global memory: 5375 MBytes (5636292608 bytes)
(14) Multiprocessors x ( 32) CUDA Cores/MP: 448 CUDA Cores
GPU Clock rate: 1147 MHz (1.15 GHz)
Memory Clock rate: 1566 Mhz
Memory Bus Width: 384-bit
L2 Cache Size: 786432 bytes
Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)
Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 32768
Warp size: 32
Maximum number of threads per multiprocessor: 1536
Maximum number of threads per block: 1024
Maximum sizes of each dimension of a block: 1024 x 1024 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and execution: Yes with 2 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Concurrent kernel execution: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support enabled: Yes
Device is using TCC driver mode: No
Device supports Unified Addressing (UVA): Yes
Device PCI Bus ID / PCI location ID: 5 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 4.2, CUDA Runtime Version = 4.2, NumDevs = 1, Device = Tesla C2075
[deviceQuery] test results...
PASSED[/code]

and the automatically generated xorg.conf:
[code]# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig: version 295.40 (mockbuild@) Thu Apr 12 13:28:25 CEST 2012

Section "ServerLayout"
Identifier "Default Layout"
Screen "Default Screen" 0 0
InputDevice "Keyboard0" "CoreKeyboard"
InputDevice "Mouse0" "CorePointer"
EndSection

Section "InputDevice"
# generated from data in "/etc/sysconfig/keyboard"
Identifier "Keyboard0"
Driver "keyboard"
Option "XkbLayout" "us"
Option "XkbModel" "pc105"
EndSection

Section "InputDevice"
# generated from default
Identifier "Mouse0"
Driver "mouse"
Option "Protocol" "auto"
Option "Device" "/dev/input/mice"
Option "Emulate3Buttons" "no"
Option "ZAxisMapping" "4 5"
EndSection

Section "Device"
Identifier "Videocard0"
Driver "nvidia"
EndSection

Section "Screen"
Identifier "Default Screen"
Device "Videocard0"
SubSection "Display"
Modes "nvidia-auto-select"
EndSubSection
EndSection[/code]

Is the problem caused by the fact that I specified the "nvidia" driver in xorg.conf?
With my old xorg.conf, shown below, the X server would not start ("no screens detected"):
[code]Section "Device"
Identifier "Videocard0"
Driver "vesa"
EndSection[/code]

#1
Posted 05/02/2012 09:36 AM   
I think the Tesla card is driving the display in this case, since deviceQuery reports "Run time limit on kernels: Yes". Exiting X will let you run CUDA applications for as long as you want.

#2
Posted 05/02/2012 05:16 PM   
[quote name='cudaDMA' date='02 May 2012 - 05:16 PM' timestamp='1335979007' post='1403482']
I think TESLA card is driving the display in this case as I see "Run time limit on Kernels as 'YES'".Exiting X will allow you to run CUDA applications as long as you want.
[/quote]

Yes, but several users can be connected to this server at the same time, and some of them want the X server running. The best setup for me would be to run the X server on the 'non-CUDA' GPU and leave the Tesla card for CUDA computations only.
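
One way this setup is usually expressed in xorg.conf is to bind the "Device" section to the display card by its PCI address, so that X does not pick up the Tesla. The fragment below is only a sketch under assumptions: the BusID value and the driver name are placeholders (the thread does not say which card or PCI slot the 'normal' GPU uses), and the real address would have to be looked up with lspci. The deviceQuery output above suggests the Tesla sits at PCI bus 5, device 0, so the BusID given here must point somewhere else.
[code]Section "Device"
# Placeholder values: replace with the display card's own driver and PCI address.
# BusID uses decimal "PCI:bus:device:function"; lspci prints these numbers in hex.
Identifier "Videocard0"
Driver "vesa"
BusID "PCI:1:0:0"
EndSection[/code]
If X then starts on the display card only, deviceQuery should presumably report "Run time limit on kernels: No" for the Tesla, which would be a quick way to check the result.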

#3
Posted 05/04/2012 06:43 AM   