cuDNN 1x1 filter convolutions

My gradient checks fail (both numerically and visually) for the 1x1 filter convolutions required for GoogLeNet. They stop working as soon as the number of input feature maps is greater than one.
Most values in the derivative with respect to the input images seem to be zero (all except the first map), regardless of the weight and diffData values. I'm using cuDNN v2 RC2 on a Titan Black GPU.

Are 1x1 convolutions supported and tested?
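
For readers who have not run this kind of gradient check before, here is a rough CPU-only sketch of the numerical side, assuming NCHW layout, stride 1 and no padding. It estimates the input gradient of a 1x1 convolution by central differences, so the result can be compared against whatever the library's backward pass returns. The names `conv1x1_forward`, `weighted_loss` and `numeric_in_grad` are made up for illustration and are not cuDNN functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a cuDNN API).
// Forward 1x1 convolution, NCHW, stride 1, no padding:
//   out[n][k][p] = sum over c of filter[k][c] * in[n][c][p]
// where p runs over the H*W pixel positions.
std::vector<double> conv1x1_forward(const std::vector<double>& in,
                                    const std::vector<double>& filter,
                                    int N, int C, int K, int HW)
{
    std::vector<double> out(static_cast<std::size_t>(N) * K * HW, 0.0);
    for (int n = 0; n < N; ++n)
        for (int k = 0; k < K; ++k)
            for (int c = 0; c < C; ++c)
                for (int p = 0; p < HW; ++p)
                    out[(n * K + k) * HW + p] +=
                        filter[k * C + c] * in[(n * C + c) * HW + p];
    return out;
}

// Scalar loss L = sum(out * out_diff), chosen so that dL/d(out) == out_diff.
double weighted_loss(const std::vector<double>& in,
                     const std::vector<double>& filter,
                     const std::vector<double>& out_diff,
                     int N, int C, int K, int HW)
{
    const std::vector<double> out = conv1x1_forward(in, filter, N, C, K, HW);
    double sum = 0.0;
    for (std::size_t i = 0; i < out.size(); ++i)
        sum += out[i] * out_diff[i];
    return sum;
}

// Central-difference estimate of dL/d(in), one input element at a time.
std::vector<double> numeric_in_grad(std::vector<double> in,
                                    const std::vector<double>& filter,
                                    const std::vector<double>& out_diff,
                                    int N, int C, int K, int HW,
                                    double eps = 1e-3)
{
    std::vector<double> grad(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const double saved = in[i];
        in[i] = saved + eps;
        const double up = weighted_loss(in, filter, out_diff, N, C, K, HW);
        in[i] = saved - eps;
        const double down = weighted_loss(in, filter, out_diff, N, C, K, HW);
        in[i] = saved;  // restore before moving to the next element
        grad[i] = (up - down) / (2.0 * eps);
    }
    return grad;
}
```

Because the convolution is linear in its input, the estimate does not depend on the input values themselves, so any non-zero entries missing from the library's gradient point at the backward pass rather than at the check.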

Downgrading from v2 RC2 to v2 RC1 seems to resolve the problem. Maybe a regression?

It is a bug in cuDNN v2; we are investigating the problem. Thanks for reporting the issue.

I ran into the same problem, spent quite some time debugging, and made a repro case before I found this post.
Any ETA on a fix for this?

Below is the repro case:

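#include <cstdlib>
#include <iostream>
#include <cuda_runtime.h>
#include <cudnn.h>

// NOTE: checkCUDA / checkCUDNN were not defined in the original post.
// The two macros below are an assumed minimal version: print and exit on error.
#define checkCUDA(expr)                                                   \
    do {                                                                  \
        cudaError_t status_ = (expr);                                     \
        if (status_ != cudaSuccess) {                                     \
            std::cerr << "CUDA error at line " << __LINE__ << ": "        \
                      << cudaGetErrorString(status_) << std::endl;        \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

#define checkCUDNN(expr)                                                  \
    do {                                                                  \
        cudnnStatus_t status_ = (expr);                                   \
        if (status_ != CUDNN_STATUS_SUCCESS) {                            \
            std::cerr << "cuDNN error at line " << __LINE__ << ": "       \
                      << cudnnGetErrorString(status_) << std::endl;       \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Copy sz floats from host memory into a newly allocated device buffer.
float *make_device_vector(const float *vec, int sz)
{
    float *vec_d;
    checkCUDA( cudaMalloc(&vec_d, sz * sizeof(float)) );
    checkCUDA( cudaMemcpy(vec_d, vec, sz * sizeof(float),
                          cudaMemcpyHostToDevice) );
    return vec_d;
}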

// Copy sz floats back from the device and print them on one line.
void print_device_vector(const float *vec_d, int sz)
{
    float *vec = new float[sz];
    checkCUDA( cudaMemcpy(vec, vec_d, sz * sizeof(float),
                          cudaMemcpyDeviceToHost) );
    for (int i = 0; i < sz; i++)
    {
        std::cout << vec[i] << " ";
    }
    std::cout << std::endl;
    delete[] vec;
}

// ======
// Testing cudnnConvolutionBackwardData()
// 1x2 image with 2 channels in, 1 channel out
//
// Out(top)-gradient: (1 0)
// Filter: (1 2)
//
// Expected in(bottom)-gradient: (1 0 2 0)
//
// Output of this program is:
// 1 0
// 1 2
// 1 0 0 0
//
// I.e., the returned in-gradient is: (1 0 0 0) – Bug!?
//
int main(int argc, char **argv) {
    cudnnDataType_t dataType = CUDNN_DATA_FLOAT;
    cudnnTensorFormat_t tensorFormat = CUDNN_TENSOR_NCHW;
    cudnnHandle_t cudnnHandle;
    checkCUDNN( cudnnCreate(&cudnnHandle) );

    // No padding, stride 1, upscale 1.
    cudnnConvolutionDescriptor_t conv_desc;
    checkCUDNN( cudnnCreateConvolutionDescriptor(&conv_desc) );
    checkCUDNN( cudnnSetConvolution2dDescriptor(conv_desc,
                                                0, 0, 1, 1, 1, 1, CUDNN_CONVOLUTION) );

    // Filter: 1 output map, 2 input maps, 1x1 kernel.
    cudnnFilterDescriptor_t filter_desc;
    checkCUDNN( cudnnCreateFilterDescriptor(&filter_desc) );
    checkCUDNN( cudnnSetFilter4dDescriptor(filter_desc,
                                           CUDNN_DATA_FLOAT, 1, 2, 1, 1) );

    // Out(top)-gradient: N=1, C=1, H=1, W=2.
    cudnnTensorDescriptor_t out_diff_desc;
    checkCUDNN( cudnnCreateTensorDescriptor(&out_diff_desc) );
    checkCUDNN( cudnnSetTensor4dDescriptor(out_diff_desc,
                                           tensorFormat,
                                           dataType,
                                           1, 1, 1, 2) );
    // In(bottom)-gradient: N=1, C=2, H=1, W=2.
    cudnnTensorDescriptor_t in_diff_desc;
    checkCUDNN( cudnnCreateTensorDescriptor(&in_diff_desc) );
    checkCUDNN( cudnnSetTensor4dDescriptor(in_diff_desc,
                                           tensorFormat,
                                           dataType,
                                           1, 2, 1, 2) );
    float out_diff[] = {1, 0};
    float filter[] = {1, 2};
    float *out_diff_d = make_device_vector(out_diff, 2);
    float *filter_d = make_device_vector(filter, 2);
    float *in_diff_d;
    checkCUDA( cudaMalloc(&in_diff_d, 4 * sizeof(float)) );

    print_device_vector(out_diff_d, 2);
    print_device_vector(filter_d, 2);

    float alpha = 1.0f;
    float beta = 0.0f;
    checkCUDNN( cudnnConvolutionBackwardData(cudnnHandle,
                                             &alpha,
                                             filter_desc, filter_d,
                                             out_diff_desc, out_diff_d,
                                             conv_desc,
                                             &beta, in_diff_desc, in_diff_d) );

    print_device_vector(in_diff_d, 4);

    // Release device buffers, descriptors and the handle.
    checkCUDA( cudaFree(in_diff_d) );
    checkCUDA( cudaFree(filter_d) );
    checkCUDA( cudaFree(out_diff_d) );
    checkCUDNN( cudnnDestroyTensorDescriptor(in_diff_desc) );
    checkCUDNN( cudnnDestroyTensorDescriptor(out_diff_desc) );
    checkCUDNN( cudnnDestroyFilterDescriptor(filter_desc) );
    checkCUDNN( cudnnDestroyConvolutionDescriptor(conv_desc) );
    checkCUDNN( cudnnDestroy(cudnnHandle) );
    return 0;
}
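
For anyone who wants to confirm the expected in-gradient independently of cuDNN, the following is a minimal CPU sketch of the backward-data pass for a 1x1 filter, assuming NCHW layout, stride 1 and no padding as in the repro case. The function name `conv1x1_backward_data_ref` is made up for illustration and is not part of cuDNN.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not a cuDNN API).
// Backward-data for a 1x1 convolution, NCHW layout, stride 1, no padding:
//   in_diff[n][c][p] = sum over k of filter[k][c] * out_diff[n][k][p]
// where p runs over the H*W pixel positions, C is the number of input maps
// and K is the number of output maps.
std::vector<float> conv1x1_backward_data_ref(const std::vector<float>& out_diff, // N*K*HW values
                                             const std::vector<float>& filter,   // K*C values
                                             int N, int C, int K, int HW)
{
    assert(out_diff.size() == static_cast<std::size_t>(N) * K * HW);
    assert(filter.size() == static_cast<std::size_t>(K) * C);

    std::vector<float> in_diff(static_cast<std::size_t>(N) * C * HW, 0.0f);
    for (int n = 0; n < N; ++n)
        for (int c = 0; c < C; ++c)
            for (int k = 0; k < K; ++k)
                for (int p = 0; p < HW; ++p)
                    in_diff[(n * C + c) * HW + p] +=
                        filter[k * C + c] * out_diff[(n * K + k) * HW + p];
    return in_diff;
}
```

Calling `conv1x1_backward_data_ref({1, 0}, {1, 2}, 1, 2, 1, 2)` gives (1 0 2 0), which matches the expected in-gradient stated in the repro comments. With a 1x1 kernel there is no spatial flip, so convolution and cross-correlation modes give the same result here.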

Same problem here. With RC2 the gradient check fails on 1x1 filters; reverting to RC1 resolves the problem.

For what it's worth, the OS X build of RC2 works properly while the Linux build fails.

cuDNN v2 RC3 includes a fix for this bug and is now available on the cuDNN downloads page.

To download RC3, please visit https://developer.nvidia.com/cuDNN and click the “Download” button at the bottom of the page.

If you encounter any problems, please file a bug at https://developer.nvidia.com/nvbugs/cuda/add