CUDA Programming and Performance

Tesla Compute Cluster driver released non-display driver for 64-bit Windows Server 08/08 R2

38 Replies

26,756 Views

sWienke

7 years ago

"Display driver stopped responding and has recovered" WDDM Timeout Detection and Recovery

19 Replies

137,359 Views

eglion9517530

6 years ago

Tell me a way to GPU speed up with banal elementwise multiplication.. Thanks in advance!

5 Replies

146 Views

Vitalybn

4 days ago

How to implement stack

0 Replies

7 Views

PabloBot

2 hours ago

Reading memory mapped pointer from 3rd Party PCIe Device via cudaHostRegisterIoMemory without CPU Caching?

2 Replies

63 Views

cheinger

1 day ago

How to read data from matrix

0 Replies

18 Views

giapvn

4 hours ago

I can't realize the kernel concurrent with Hyper-Q

4 Replies

41 Views

hlei

15 hours ago

Cuda Run time library unload

1 Reply

27 Views

jprabhas

8 hours ago

Why the 64bit addAtomic error?

4 Replies

117 Views

sedona

5 days ago

MPI and CUDA on different compute capability GPUs

0 Replies

61 Views

edde

16 hours ago

cudaMalloc calls from multiple threads on same GPU / multiple processes on different GPUs - serializ...

2 Replies

31 Views

HannesF99

24 hours ago

Problem with cudamalloc free(): invalid next size (normal)

3 Replies

796 Views

smoes

6 years ago

Modified a pointer in device

0 Replies

58 Views

narutocool

20 hours ago

Multiple iteration of single Task Kernel

7 Replies

115 Views

LoKKeR

1 week ago

cpu function and gpu kernel overlap

12 Replies

224 Views

YuKuki

4 days ago

why the cuda profiler can not generate the timeline?

3 Replies

126 Views

chickennight

1 day ago

CUDA SEPARABLE COMPILATION using CMake not generating PTX

0 Replies

27 Views

martinrais

2 days ago

Cuda equivalents of OpenCL Built-In functions

0 Replies

27 Views

Eameija

2 days ago

Optimized version of single-precision error function, erff()

14 Replies

672 Views

njuffa

2 years ago

the 1024 threads can work concurrently?

4 Replies

72 Views

chickennight

3 days ago
