You should find a way to ensure that the OpenGL context created by your app is created on the Surface Book's dGPU, as it appears that the Surface Book's iGPU does not support OGL.
This is generally recommended for CUDA-OpenGL interop apps anyway (create the OGL context on the same GPU that has the CUDA context). You can google for instructions on how to do it, or this writeup may help you get started:
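On hybrid (Optimus-style) laptops such as the Surface Book, one common way to steer an app onto the NVIDIA dGPU is to export the `NvOptimusEnablement` symbol from the executable. This is a Windows-specific hint documented by NVIDIA; the sketch below assumes MSVC and only illustrates the idea:

```cpp
// Exporting this symbol from the .exe asks the NVIDIA Optimus driver to
// prefer the discrete NVIDIA GPU when the app creates its OpenGL context.
// Note: it must be exported from the executable itself, not from a DLL.
extern "C" {
    __declspec(dllexport) unsigned long NvOptimusEnablement = 0x00000001;
}
```

With this in place, the GL context should land on the dGPU, which is the same GPU you would select for CUDA.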
After I added that call at the beginning of the OpenGL program, everything seems to be OK now. But I am still wondering why I should call this even after I have called cudaSetDevice(deviceID)?
Does this mean that I can use GPU1 for the CUDA calculation while using GPU2 for the OpenGL display?
It is possible to run CUDA/OpenGL interop on 2 different NVIDIA GPUs, where one is hosting the OGL context and one is hosting the CUDA context. This might run a little slower than the single-GPU case, but it is possible. In your case, assuming you only have one NVIDIA GPU in your machine, you would want both the GL and CUDA contexts established on the same GPU.
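One way to guarantee the two contexts end up on the same GPU is to ask the CUDA runtime which CUDA device is driving the current OpenGL context, and then select that device before doing any CUDA work. A minimal sketch using `cudaGLGetDevices` (it assumes a GL context is already current on the calling thread, and requires the CUDA toolkit headers):

```cpp
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>  // cudaGLGetDevices, cudaGLDeviceListAll

// Returns the CUDA device ordinal that corresponds to the current
// OpenGL context, or -1 if no CUDA device is associated with it.
int cudaDeviceForCurrentGLContext()
{
    unsigned int count = 0;
    int device = -1;
    // Query the CUDA device(s) associated with the current GL context.
    cudaError_t err = cudaGLGetDevices(&count, &device, 1, cudaGLDeviceListAll);
    if (err != cudaSuccess || count == 0)
        return -1;
    return device;
}

// Usage, after the GL context has been created and made current:
//   int dev = cudaDeviceForCurrentGLContext();
//   if (dev >= 0)
//       cudaSetDevice(dev);  // CUDA context lands on the GL GPU
```

This is also why hard-coding `cudaSetDevice(deviceID)` is fragile on hybrid laptops: the device ordinal you pick may not be the GPU that ends up hosting the GL context.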