Hey, so I’m using CUDA 4.0 with a compute capability 2.1 card, and I want to use objects with virtual functions in my kernel. If I create an object and then call a virtual function on it inside the same kernel, it works exactly as it’s supposed to. However, I’d like to be able to move my data structure, which contains a variety of objects, up to the device and then run a kernel that references it. When I do this, calling a virtual function crashes the kernel. I’ve tried two ways, and this is what I think is going on with each:
- Copy the objects from the host into global device memory using cudaMalloc and cudaMemcpy. I think the problem with this is that I create the objects on the host, so their virtual function tables refer to host memory, which causes the crash when accessed on the device.
- Create the objects on the device and save the pointers so they can be used by a different kernel later. I think the problem here is that the virtual function tables that get created reference the kernels I call to create the objects (which basically just contain a single ‘new’ statement), and the kernel I want to use to do my work has those functions at different addresses, resulting in a crash.
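
For concreteness, here is a minimal sketch of the second approach. Everything in it is a hypothetical placeholder rather than the real code: the `Shape`/`Square` classes, the kernel names `createKernel`, `useKernel`, and `destroyKernel`, and the single-object layout are all assumptions made for illustration. It assumes a device that supports device-side `new` and virtual functions (compute capability 2.x or later).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical class hierarchy, for illustration only.
struct Shape {
    __device__ Shape() {}
    __device__ virtual ~Shape() {}
    __device__ virtual float area() const = 0;
};

struct Square : Shape {
    float side;
    __device__ explicit Square(float s) : side(s) {}
    __device__ virtual float area() const { return side * side; }
};

// Kernel 1: construct the object in device code and store its pointer
// in global memory so a later kernel can find it.
__global__ void createKernel(Shape **slot, float side) {
    if (blockIdx.x == 0 && threadIdx.x == 0)
        *slot = new Square(side);
}

// Kernel 2: a separate kernel that calls the virtual function on the
// object created by kernel 1.
__global__ void useKernel(Shape **slot, float *out) {
    if (blockIdx.x == 0 && threadIdx.x == 0)
        *out = (*slot)->area();
}

// Kernel 3: release the object; memory from device-side new must be
// freed by device-side delete.
__global__ void destroyKernel(Shape **slot) {
    if (blockIdx.x == 0 && threadIdx.x == 0)
        delete *slot;
}

int main() {
    Shape **d_slot = NULL;
    float *d_out = NULL;
    cudaMalloc((void **)&d_slot, sizeof(Shape *));
    cudaMalloc((void **)&d_out, sizeof(float));

    createKernel<<<1, 1>>>(d_slot, 3.0f);
    useKernel<<<1, 1>>>(d_slot, d_out);
    destroyKernel<<<1, 1>>>(d_slot);

    // The blocking copy also surfaces any error from the kernels above.
    float h_out = 0.0f;
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(float),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess)
        printf("CUDA error: %s\n", cudaGetErrorString(err));
    else
        printf("area = %f\n", h_out);

    cudaFree(d_out);
    cudaFree(d_slot);
    return 0;
}
```

The sketch keeps all three kernels and the class definitions in one source file and would be compiled with an architecture flag matching the card (for example `nvcc -arch=sm_21` on a toolchain that accepts it). Checking the return value of the `cudaMemcpy` is one simple way to see whether the virtual call in `useKernel` actually faulted.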
All the data in the objects gets transferred fine and is correct when I run a kernel later, so I’m pretty sure the objects are otherwise getting onto the device correctly.
Is it possible to use a virtual function of an object inside a kernel that the object was not created in? Could I somehow reference my working kernel in my object creation kernels to get the right function addresses? Ideally I would like to be able to do it from the host. Is this even what the problem actually is?
Thanks.