I am computing correlation coefficients of a number of different arrays in parallel.
A few of the results are greater than 1.0, so I am tracking down the discrepancy.
When debugging in NSight, some of the intermediate results appear to be dramatically off.
The expression in question is: float dfs2 = 250.37335 - 252 * -0.99676728 * -0.99676728 (all operands are float except the int 252).
Before the statement executes, NSight in VS2010 evaluates the expression as 8.1612998e-06, but after stepping past it in NSight, dfs2 is shown as 1.5258789e-05, which is dramatically different.
Excel computes the value as 7.35939E-06, which is obviously much closer to the pre-execution value NSight displayed.
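For reference, here is a minimal stand-alone sketch of the kind of repro I am working from (the kernel and the names sy, r, and n are made up for illustration; the literals are the values shown above). It evaluates the expression on the device in single precision and on the host in double precision:

```cuda
// Illustrative repro only; sy, r, n are made-up names and the literals
// are the values shown above, not the original kernel's code.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void repro(float *out)
{
    float sy = 250.37335f;
    float r  = -0.99676728f;
    int   n  = 252;
    out[0] = sy - n * r * r;  // same shape as the dfs2 expression
}

int main()
{
    float *d_out = 0, h_out = 0.0f;
    cudaMalloc(&d_out, sizeof(float));
    repro<<<1, 1>>>(d_out);
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    // Host reference in double precision; this lands near Excel's value.
    double ref = 250.37335 - 252 * -0.99676728 * -0.99676728;  // ~7.35939e-06

    printf("device float : %.8e\n", h_out);
    printf("host double  : %.8e\n", ref);

    // Note: the true difference (~7.4e-06) is below one float ulp at 250
    // (2^-16 = 1.5258789e-05), so after rounding, the single-precision
    // result can only land on a small multiple of that ulp.
    return 0;
}
```

If I understand the nvcc options correctly, rebuilding with -fmad=false should disable FMA contraction of the multiply-subtract, which may change which multiple of the ulp comes out on the device.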
Why would the device-computed result be twice as large?
Results are from a 525M with Optimus, running the latest releases of CUDA 5.0 and NSight 3.0, with driver v320.18.
CorrDemo.zip (50.9 KB)