Hello there,
I am running HPL to test a desktop computer that now has 2 Tesla C2050 cards, using hpl-2.0_FERMI_v13.tgz, which is available on NVIDIA's developer zone. I started the HPL benchmark (mpirun -np 2 run_linpack &) and immediately afterwards ran the "nvidia-smi -q -d MEMORY,UTILIZATION" command, which gave the following output:
==============NVSMI LOG==============
Timestamp : Fri Sep 30 23:55:06 2011
Driver Version : 275.09.07
Attached GPUs : 3
GPU 0:A:0 ### TESLA C2050
Memory Usage
Total : 2687 Mb
Used : 2321 Mb
Free : 365 Mb
Utilization
Gpu : 0 %
Memory : 0 %
GPU 0:8:0 ### TESLA C2050
Memory Usage
Total : 2687 Mb
Used : 2321 Mb
Free : 366 Mb
Utilization
Gpu : 0 %
Memory : 0 %
GPU 0:81:0 ### QUADRO 5000
Memory Usage
Total : 2559 Mb
Used : 16 Mb
Free : 2542 Mb
Utilization
Gpu : 0 %
Memory : 3 %
The full nvidia-smi -q command output is attached as nvidia-smi.txt
As you can see, the Tesla cards are using almost all of their memory, but they show 0% for both GPU and Memory utilization (I don't know why).
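Since the output above is a single sample taken right after launch, it may help to pull the utilization figures out of several such logs and compare them. Here is a minimal Python sketch that does this for one log; it assumes the log looks like the text pasted above (a "GPU <id>" header line followed by a "Gpu : N %" line), and the function name gpu_utilization is just something I made up for illustration:

```python
import re


def gpu_utilization(log: str) -> dict:
    """Map each GPU id from a 'GPU <id> ...' header to its 'Gpu : N %' value."""
    result = {}
    current = None
    for raw in log.splitlines():
        line = raw.strip()
        # Header lines look like: GPU 0:A:0 ### TESLA C2050
        header = re.match(r"GPU\s+(\S+)", line)
        if header:
            current = header.group(1)
            continue
        # Utilization lines look like: Gpu : 0 %
        util = re.match(r"Gpu\s*:\s*(\d+)\s*%", line)
        if util and current is not None:
            result[current] = int(util.group(1))
    return result


if __name__ == "__main__":
    sample = """GPU 0:A:0 ### TESLA C2050
Utilization
Gpu : 0 %
Memory : 0 %
GPU 0:81:0 ### QUADRO 5000
Utilization
Gpu : 0 %
Memory : 3 %"""
    print(gpu_utilization(sample))
```

The log text could come from saving the nvidia-smi output to a file a few times during the run and passing each file's contents to the function.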
The benchmark takes a long time and the performance in Gflops is very low (as if it were using only the CPUs). This is my HPL.out file:
...
The following parameter values will be used:
N : 51712
NB : 512
PMAP : Row-major process mapping
P : 1
Q : 2
PFACT : Left
NBMIN : 4
NDIV : 2
RFACT : Left
BCAST : 1ring
DEPTH : 0
SWAP : Mix (threshold = 128)
L1 : no-transposed form
U : no-transposed form
EQUIL : yes
ALIGN : 8 double precision words
--------------------------------------------------------------------------------
- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0
================================================================================
T/V N NB P Q Time Gflops
--------------------------------------------------------------------------------
WR00L2L4 51712 512 1 2 1373.70 6.711e+01
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0040267 ...... PASSED
================================================================================
Finished 1 tests with the following results:
1 tests completed and passed residual checks,
0 tests completed and failed residual checks,
0 tests skipped because of illegal input values.
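As a sanity check on the Gflops figure in the table, here is a small Python sketch that recomputes it from N and the run time. It assumes the usual HPL operation count of 2/3*N^3 + 3/2*N^2 (this formula is my assumption, it is not printed in HPL.out), and hpl_gflops is only an illustrative name:

```python
def hpl_gflops(n: int, seconds: float) -> float:
    """Gflops implied by an HPL run of problem size n that took `seconds`."""
    # Assumed operation count for the LU factorization and solve:
    # 2/3 * N^3 + 3/2 * N^2 floating-point operations.
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / seconds / 1e9


if __name__ == "__main__":
    # N and Time taken from the WR00L2L4 line of the HPL.out above.
    print(f"{hpl_gflops(51712, 1373.70):.2f} Gflops")
```

With N = 51712 and Time = 1373.70 this comes out at roughly 67 Gflops, which agrees with the 6.711e+01 that HPL reports, so the low number is what the run really achieved and not a reporting problem.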
Does anyone know why my HPL runs look as if they aren't using the GPUs?
Thanks for the help! :)
nvidia-smi.txt (7.3 KB)