Getting wrong results when calling cuBLAS through C++/CLI and C#

afshiinzkh · April 26, 2016, 2:38pm

I have written a wrapper in C++11/CLI with Visual Studio to use CUDA’s cuBLAS library. I am using CUDA Toolkit 7.0.

Here is the source code of my wrapper:

#pragma once

#include "stdafx.h"
#include "BLAS.h"
#include "cuBLAS.h"

namespace lab
{
    namespace Mathematics
    {
        namespace CUDA
        {
            void BLAS::DAXPY(int n, double alpha, const array<double> ^x, int incx, array<double> ^y, int incy)
            {
                // Pin the managed arrays so the garbage collector cannot move them during the native call
                pin_ptr<double> xPtr = &(x[0]);
                pin_ptr<double> yPtr = &(y[0]);
                pin_ptr<double> alphaPtr = &alpha;

                cuBLAS::DAXPY(n, alphaPtr, xPtr, incx, yPtr, incy);
            }
        }
    }
}

To test this code, I wrote the following test in C#:

using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using System.Linq;
using lab.Mathematics.CUDA;

namespace lab.Mathematics.CUDA.Test
{
  [TestClass]
  public class TestBLAS
  {
    [TestMethod]
    public void TestDAXPY()
    {
        var count = 10;
        var alpha = 1.0;
        var a = Enumerable.Range(0, count).Select(x => Convert.ToDouble(x)).ToArray();
        var b = Enumerable.Range(0, count).Select(x => Convert.ToDouble(x)).ToArray();

        // Call CUDA
        BLAS.DAXPY(count, alpha, a, 1, b, 1);

        // Validate results
        for (int i = 0; i < count; i++)
        {
            Assert.AreEqual(i + i, b[i]);
        }
    }
  }
}

The program compiles for the x64 architecture without errors, but the results differ every time I run the test. More precisely, the output array b contains different values on each run, and I don’t know why.

I am also adding my CUDA code in case someone can spot a problem there. Note that I don’t get any errors or warnings while compiling. I also wonder whether I need to change some compilation settings, since I left everything at the default options.

void cuBLAS::DAXPY(int n, const double *alpha, const double *x, int incx, double *y, int incy)
{
    // Allocate GPU memory
    double *devX, *devY;
    cudaMalloc((void **)&devX, (size_t)n*sizeof(*devX));
    cudaMalloc((void **)&devY, (size_t)n*sizeof(*devY));

    // Create cuBLAS handle
    cublasHandle_t handle;
    cublasCreate(&handle);

    // Initialize the input matrix and vector
    cublasSetVector(n, sizeof(*devX), x, incx, devX, incx);

    // Call cuBLAS function
    cublasDaxpy(handle, n, alpha, devX, incx, devY, incy);

    // Retrieve resulting vector
    cublasGetVector(n, sizeof(*devY), devY, incy, y, incy);

    // Free GPU resources
    cudaFree(devX);
    cudaFree(devY);
    cublasDestroy(handle);
}
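
A note on the function above: DAXPY computes y = alpha*x + y, so y is an input as well as an output. The code uploads x to devX but never copies y into devY, so cublasDaxpy appears to read uninitialized device memory, which would explain values that change from run to run. A likely fix is a second upload, cublasSetVector(n, sizeof(*devY), y, incy, devY, incy), placed before the cublasDaxpy call. None of the CUDA or cuBLAS return codes are checked either, so a failure in any step would pass silently. The host-only sketch below is not cuBLAS code; it only illustrates how the result depends on the initial contents of y. The names referenceDaxpy and matchesTestExpectation are hypothetical, and positive increments are assumed, as in the test (incx = incy = 1).

```cpp
#include <cstddef>
#include <vector>

// Host-side reference for DAXPY: y[i*incy] = alpha * x[i*incx] + y[i*incy].
// Assumption: incx and incy are positive (the C# test uses 1 for both).
void referenceDaxpy(int n, double alpha, const double *x, int incx,
                    double *y, int incy)
{
    for (int i = 0; i < n; ++i)
    {
        const std::size_t xi = static_cast<std::size_t>(i) * incx;
        const std::size_t yi = static_cast<std::size_t>(i) * incy;
        y[yi] += alpha * x[xi];
    }
}

// Hypothetical check mirroring the C# test data: x = y = {0, 1, ..., n-1}.
// Returns true only if the result is 2*i when y is initialized to i,
// and merely i when y starts out as zeros (i.e. y really is an input).
bool matchesTestExpectation(int n)
{
    std::vector<double> x(n), yInit(n), yZero(n, 0.0);
    for (int i = 0; i < n; ++i)
    {
        x[i] = i;
        yInit[i] = i;
    }

    referenceDaxpy(n, 1.0, x.data(), 1, yInit.data(), 1);
    referenceDaxpy(n, 1.0, x.data(), 1, yZero.data(), 1);

    for (int i = 0; i < n; ++i)
    {
        if (yInit[i] != 2.0 * i) return false; // expected by the C# test
        if (yZero[i] != 1.0 * i) return false; // different starting y, different result
    }
    return true;
}
```

Calling matchesTestExpectation(10) reproduces the expectation of the C# test on the CPU. Whatever happens to be in devY on the GPU plays the role of the starting y, which is why it most likely needs to be uploaded first.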

harryz · April 27, 2016, 2:52am

Hi afshiinzkh,

This is the Nsight Visual Studio forum. For CUDA programming questions, you can ask in the CUDA Programming and Performance forum; for cuBLAS questions, you can ask in the GPU-Accelerated Libraries forum.

Best Regards