Enabled graphics cards

Hi there,
is there a way to identify which cards OptiX is using via CUDA?

For example:
My machine - 1x 980 + 1x 580

OptiX returns only device number 0 in the array returned by optix::Context::getEnabledDevices().
However, I need to know which card OptiX uses, because CUDA reports 2 cards in cudaGetDeviceCount().
If I use something like cudaGetDeviceProperties(…, getEnabledDevices()[0]), it works for this case, but not if my cards are inserted into the motherboard in a different order:
1x 580 + 1x 980
OptiX again returns device number 0, but cudaGetDeviceProperties(…, getEnabledDevices()[0]) now describes the 580 card, while OptiX actually uses the 980.

Also, is there a way to tell OptiX (from a C++ program) that it should use only the 580 card and not the 980?

Yes, of course. You can use the OptiX rtDevice*() functions to query the installed devices before creating the context, and then set the device(s) you want from that enumeration.

I normally print out what is installed and what my OptiX application picked with the following code, which is loosely based on the functionality demonstrated in the OptiX SDK “sample3” example.
I’m explicitly using the C API in that function because the corresponding C++ wrappers are defined on ContextObj, which I haven’t created at that point.

#define RT_CHECK_ERROR_NO_CONTEXT( func ) \
  do { \
    RTresult code = func; \
    if (code != RT_SUCCESS) \
      std::cerr << "ERROR at " << #func << std::endl; \
  } while (0)

...

void Application::getSystemInformation()
{
  unsigned int optixVersion;
  RT_CHECK_ERROR_NO_CONTEXT(rtGetVersion(&optixVersion));
  std::cout << "OptiX " << optixVersion/1000 << "." << (optixVersion % 1000) / 10 << "." << optixVersion % 10 << std::endl;

  RT_CHECK_ERROR_NO_CONTEXT(rtDeviceGetDeviceCount(&m_numberOfDevices));
  std::cout << "Number of Devices = " << m_numberOfDevices << std::endl << std::endl;

  for (unsigned int i = 0; i < m_numberOfDevices; ++i)
  {
    char name[256];
    RT_CHECK_ERROR_NO_CONTEXT(rtDeviceGetAttribute(i, RT_DEVICE_ATTRIBUTE_NAME, sizeof(name), name));
    std::cout << "Device " << i << ": " << name << std::endl;
  
    int computeCapability[2] = {0, 0};
    RT_CHECK_ERROR_NO_CONTEXT(rtDeviceGetAttribute(i, RT_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY, sizeof(computeCapability), computeCapability));
    std::cout << "  Compute Support: " << computeCapability[0] << "." << computeCapability[1] << std::endl;

    RTsize totalMemory = 0;
    RT_CHECK_ERROR_NO_CONTEXT(rtDeviceGetAttribute(i, RT_DEVICE_ATTRIBUTE_TOTAL_MEMORY, sizeof(totalMemory), &totalMemory));
    std::cout << "  Total Memory: " << (unsigned long long) totalMemory << std::endl;

    int clockRate = 0;
    RT_CHECK_ERROR_NO_CONTEXT(rtDeviceGetAttribute(i, RT_DEVICE_ATTRIBUTE_CLOCK_RATE, sizeof(clockRate), &clockRate));
    std::cout << "  Clock Rate: " << clockRate << " kHz" << std::endl;

    int maxThreadsPerBlock = 0;
    RT_CHECK_ERROR_NO_CONTEXT(rtDeviceGetAttribute(i, RT_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK, sizeof(maxThreadsPerBlock), &maxThreadsPerBlock));
    std::cout << "  Max. Threads per Block: " << maxThreadsPerBlock << std::endl;

    int smCount = 0;
    RT_CHECK_ERROR_NO_CONTEXT(rtDeviceGetAttribute(i, RT_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT, sizeof(smCount), &smCount));
    std::cout << "  Streaming Multiprocessor Count: " << smCount << std::endl;

    int executionTimeoutEnabled = 0;
    RT_CHECK_ERROR_NO_CONTEXT(rtDeviceGetAttribute(i, RT_DEVICE_ATTRIBUTE_EXECUTION_TIMEOUT_ENABLED, sizeof(executionTimeoutEnabled), &executionTimeoutEnabled));
    std::cout << "  Execution Timeout Enabled: " << executionTimeoutEnabled << std::endl;

    int maxHardwareTextureCount = 0 ;
    RT_CHECK_ERROR_NO_CONTEXT(rtDeviceGetAttribute(i, RT_DEVICE_ATTRIBUTE_MAX_HARDWARE_TEXTURE_COUNT, sizeof(maxHardwareTextureCount), &maxHardwareTextureCount));
    std::cout << "  Max. Hardware Texture Count: " << maxHardwareTextureCount << std::endl;
 
    int tccDriver = 0;
    RT_CHECK_ERROR_NO_CONTEXT(rtDeviceGetAttribute(i, RT_DEVICE_ATTRIBUTE_TCC_DRIVER, sizeof(tccDriver), &tccDriver));
    std::cout << "  TCC Driver enabled: " << tccDriver << std::endl;
 
    int cudaDeviceOrdinal = 0;
    RT_CHECK_ERROR_NO_CONTEXT(rtDeviceGetAttribute(i, RT_DEVICE_ATTRIBUTE_CUDA_DEVICE_ORDINAL, sizeof(cudaDeviceOrdinal), &cudaDeviceOrdinal));
    std::cout << "  CUDA Device Ordinal: " << cudaDeviceOrdinal << std::endl << std::endl;
  }
}

The last attribute tells you the matching CUDA ordinal.

As you found out, the device numbering can differ between OptiX and CUDA, depending on what mechanism is used to determine the respective ordering, e.g. by PCI-E slot, by performance, or by SM version.

If you use code like that to query the installed devices before creating the OptiX context, and then call m_context->setDevices() with a vector of the OptiX device IDs you want to use, m_context->getEnabledDevices() will return exactly that vector.
If you let OptiX decide which device(s) to use, you will get what it picked according to the rules listed in the OptiX Programming Guide chapter “3.1 Context”.

void Application::initOptiX()
{
  try
  {
    getSystemInformation(); // Sets m_numberOfDevices.

    m_context = optix::Context::create();

    // Select the GPUs to use with this context.
    m_devices.clear();
    
    // ... TODO: Put code here to fill the vector<int> m_devices with IDs in the range [0, m_numberOfDevices - 1], the OptiX ordinals.
    // The list of devices must adhere to the compatibility rules in OptiX Programming Guide chapter 3.1!

    m_context->setDevices(m_devices.begin(), m_devices.end());
    
    // Print out the current configuration to make sure what's currently running.
    std::vector<int> devices = m_context->getEnabledDevices();
    for (size_t i = 0; i < devices.size(); ++i) 
    {
      std::cout << "m_context is using device " << devices[i] << ": " << m_context->getDeviceName(devices[i]) << std::endl;
    }

    // ... TODO: Handle the rest of the one-time OptiX initialization here or after this function has been called.
  }
  catch(optix::Exception& e)
  {
    std::cerr << e.getErrorString() << std::endl;
  }
}

As you’ve no doubt noticed, the indices used by OptiX and CUDA to identify your GPUs are not always the same. However, if you have a GPU’s index according to OptiX, you can retrieve its CUDA index like so:

int optix_device_index = 0, cuda_device_index;
rtDeviceGetAttribute(optix_device_index, RT_DEVICE_ATTRIBUTE_CUDA_DEVICE_ORDINAL, sizeof(int), &cuda_device_index);

In your case, the GTX 980 is a Maxwell card (compute capability 5.2) and the GTX 580 is a Fermi card (compute capability 2.0). If you check chapter 3.1 of the OptiX Programming Guide, you’ll see that OptiX cannot create a multi-GPU configuration from these two cards, so it will only use the GTX 980. You can look up the compute capabilities of your cards on NVIDIA’s CUDA GPUs page.

You can tell OptiX which devices you want it to use from C/C++ using rtContextSetDevices() with the OptiX device indices, or outside of your code using the CUDA_VISIBLE_DEVICES environment variable with the CUDA device indices.
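For the environment-variable route, a minimal invocation sketch; the CUDA ordinal of the 580 and the application name are assumptions here, so verify the ordinal with cudaGetDeviceProperties() on your own system first:

```shell
# Hypothetical example: suppose the GTX 580 is CUDA device 1 on this system.
# Restricting CUDA_VISIBLE_DEVICES hides the 980 from CUDA, and therefore
# from OptiX as well, so only the 580 can be picked.
CUDA_VISIBLE_DEVICES=1 ./my_optix_app
```

Note that with CUDA_VISIBLE_DEVICES set, the remaining devices are renumbered starting at 0 inside the process.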

Thanks, it is exactly what I need.