It does not select the right GPU
Hi all,

I have a GeForce 285 and a 580. The 285 is my primary since I am using it with the monitor, and the 580 is totally dedicated to CUDA computing.

In my code I specified "compute_20,sm_20" and also added:

[code]int deviceCount, device, i;
cudaDeviceProp prop;

// Number of devices
cudaGetDeviceCount(&deviceCount);
printf("\nDevices found:\n");
for (i = 0; i < deviceCount; i++) {
    cudaGetDeviceProperties(&prop, i);
    printf("%s\n", prop.name);
}

// Device selection
device = 0;
cudaSetDevice(device);

// Current device detection
cudaGetDeviceProperties(&prop, device);
printf("Using device %d: %s\n", device, prop.name);[/code]

to select the 580 (which is device 0; the 285 is device 1).
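(Side note: device numbering is not guaranteed to stay the same across driver versions or slot changes, so instead of hard-coding device 0 you could select the card by compute capability. A rough sketch, untested here, using only the standard CUDA runtime API:

[code]#include <cstdio>
#include <cuda_runtime.h>

int main(void)
{
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);

    // Pick the first Fermi-class (sm_2x) device -- on this machine, the 580.
    int best = 0;
    cudaDeviceProp prop;
    for (int i = 0; i < deviceCount; i++) {
        cudaGetDeviceProperties(&prop, i);
        if (prop.major >= 2) {
            best = i;
            break;
        }
    }
    cudaSetDevice(best);

    // Confirm what we actually got, including the real threads/block limit.
    cudaGetDeviceProperties(&prop, best);
    printf("Using device %d: %s (sm_%d%d, max %d threads/block)\n",
           best, prop.name, prop.major, prop.minor, prop.maxThreadsPerBlock);
    return 0;
}[/code]

Printing prop.maxThreadsPerBlock is a quick way to verify whether the selected device really allows 1024 threads per block.)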

Well, when I run it, everything is fine with up to 512 threads/block, but from 513 upwards I get an error.

I guess it is still using the 285, since that has a 512 threads/block limit, while on the 580 the limit is 1024.

What could I do?

#1
Posted 09/19/2011 11:07 AM   
Update: I took the 285 out of the PC, but I still get the error.

It is related to the "Generate GPU Debug Information" field.

When this is "Yes", I can run my program with up to 768 threads/block (limited only by shared memory).

But if it is "No", then I get errors, and it runs fine again only with <= 512 threads/block.


Any idea?

#2
Posted 09/19/2011 12:37 PM   
Where exactly have you specified "compute_20,sm_20"? It seems you put it somewhere where it is only used for debug builds, not for release builds.

Always check return codes of CUDA calls for errors. Do not use __syncthreads() in conditional code unless the condition is guaranteed to evaluate identically for all threads of each block. Run your program under cuda-memcheck to detect stray memory accesses. If your kernel dies for larger problem sizes, it might exceed the runtime limit and trigger the watchdog timer.
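To make the first point concrete, a typical checking wrapper looks something like this (a sketch; the macro name is mine, the calls are standard CUDA runtime API):

[code]#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Wrap every CUDA runtime call so failures are reported immediately
// with file and line instead of silently propagating.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err = (call);                                     \
        if (err != cudaSuccess) {                                     \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",              \
                    cudaGetErrorString(err), __FILE__, __LINE__);     \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Usage:
//   CUDA_CHECK(cudaSetDevice(device));
//   myKernel<<<grid, block>>>(...);
//   CUDA_CHECK(cudaGetLastError());      // catches launch errors,
//                                        // e.g. too many threads per block
//   CUDA_CHECK(cudaDeviceSynchronize()); // catches errors during execution[/code]

In your case, a launch with more threads per block than the compiled architecture supports should show up as an invalid configuration error from cudaGetLastError(), which would tell you immediately whether the launch itself is being rejected.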

#3
Posted 09/19/2011 01:09 PM   
[quote name='tera' date='19 September 2011 - 01:09 PM' timestamp='1316437787' post='1295193']
Where exactly have you specified "compute_20,sm_20"? It seems you put it somewhere where it is only used for debug builds, not for release builds.
[/quote]

In:

Property -> CUDA C/C++ -> Device -> Code Generation

And I double-checked: the configuration is the Release one.
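For reference, you can check the build output window to see whether the Release nvcc command line actually contains something like:

[code]-gencode=arch=compute_20,code=sm_20[/code]

If that flag only appears in the Debug invocation (where "Generate GPU Debug Information" also adds -G), the Release build falls back to nvcc's default sm_10 target, whose limit is exactly 512 threads per block, which would match the symptoms you describe.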

#4
Posted 09/19/2011 01:34 PM   