Denormals and CUBLAS
I found that when I use the CUBLAS single-precision vector norm function cublasSnrm2() on a vector that contains any denormalized numbers, the result is NaN on Fermi cards. Is this a bug?

In the example below, the second case (value=1e-39) gives me NaN on a C2050. On a C1060, both results are zero.

--- Example program ---

#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main() {

    int vectorSize = 1000;

    cublasHandle_t handle;
    cublasCreate(&handle);

    float* vectorX_cpu;
    float* vectorX_gpu;
    float result;

    /* Pinned host buffer and device buffer of the same size. */
    cudaMallocHost((void**)&vectorX_cpu, vectorSize * sizeof(float));
    cudaMalloc((void**)&vectorX_gpu, vectorSize * sizeof(float));

    /* Case 1: every element is 1e-38f. */
    for (int i = 0; i < vectorSize; i++) {
        vectorX_cpu[i] = 1e-38f;
    }
    cudaMemcpy(vectorX_gpu, vectorX_cpu, vectorSize * sizeof(float),
               cudaMemcpyHostToDevice);
    cublasSnrm2(handle, vectorSize, vectorX_gpu, 1, &result);
    printf("value=%e vectorSize=%i result=%e\n",
           vectorX_cpu[0], vectorSize, result);

    /* Case 2: every element is 1e-39f; this is the case that gives NaN on the C2050. */
    for (int i = 0; i < vectorSize; i++) {
        vectorX_cpu[i] = 1e-39f;
    }
    cudaMemcpy(vectorX_gpu, vectorX_cpu, vectorSize * sizeof(float),
               cudaMemcpyHostToDevice);
    cublasSnrm2(handle, vectorSize, vectorX_gpu, 1, &result);
    printf("value=%e vectorSize=%i result=%e\n",
           vectorX_cpu[0], vectorSize, result);

    /* Release device memory, pinned host memory, and the CUBLAS handle. */
    cudaFree(vectorX_gpu);
    cudaFreeHost(vectorX_cpu);

    cublasDestroy(handle);

    return 0;
}

#1
Posted 07/14/2011 08:53 AM   
Thanks,
We can reproduce the problem and will look into it.

#2
Posted 07/14/2011 10:01 PM   
[quote name='philippev' date='14 July 2011 - 10:01 PM' timestamp='1310680904' post='1265233']
Thanks,
We can reproduce the problem and will look into it.
[/quote]

I am facing the same problem. Any solution?

#3
Posted 05/03/2012 05:40 PM   
Which CUBLAS version are you using?
This should be fixed as of CUBLAS 4.1.

#4
Posted 05/03/2012 05:45 PM   