Enabled GPU cloth, but no speedup?

Hi there. I have a problem with GPU cloth.
I use the following statement to enable GPU acceleration, as the API documentation says:
PxCloth::setClothFlag(PxClothFlag::eGPU, true);
But it doesn’t work; the program always prints the message “warning : GPU cloth creation failed. Falling back to CPU implementation”.

Then I used the following code to set up a CudaContextManager during PhysX initialization:
// Create the CUDA context manager; release it if the context is unusable.
PxProfileZoneManager* profileZoneManager = &PxProfileZoneManager::createProfileZoneManager(m_foundation);
PxCudaContextManagerDesc cudaContextManagerDesc;
PxCudaContextManager* mCudaContextManager = PxCreateCudaContextManager(*m_foundation, cudaContextManagerDesc, profileZoneManager);
if (mCudaContextManager && !mCudaContextManager->contextIsValid())
{
    mCudaContextManager->release();
    mCudaContextManager = NULL;
}
// Guard the dereference: without a valid context, leave gpuDispatcher unset (CPU fallback).
if (mCudaContextManager)
    sceneDesc.gpuDispatcher = mCudaContextManager->getGpuDispatcher();

The warning message no longer appears, so I think I have enabled GPU cloth successfully. However, I don’t see any speedup. Did I miss something? Could anyone tell me why? Thanks.

If you are using low-resolution cloth or only a few assets, CPU execution may be faster.
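One way to check which path is faster for a given scene is to time the simulation step with the GPU flag on and then off. The helper below is only a minimal sketch in standard C++; `averageStepMs` is a hypothetical name, not part of PhysX, and the usage comment assumes your application already has a `scene` and a time step `dt`.

```cpp
#include <cassert>
#include <chrono>

// Average wall-clock milliseconds per call of `step`, after `warmup` untimed calls.
// `step` is any callable; this helper is hypothetical and independent of PhysX.
template <typename Step>
double averageStepMs(Step step, int frames, int warmup = 10)
{
    assert(frames > 0 && warmup >= 0);

    // Untimed warm-up calls so one-time setup costs do not skew the average.
    for (int i = 0; i < warmup; ++i)
        step();

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < frames; ++i)
        step();
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / frames;
}

// Hypothetical usage (scene and dt are assumed to exist in your application):
//   double ms = averageStepMs([&] { scene->simulate(dt); scene->fetchResults(true); }, 200);
```

Running it once with PxClothFlag::eGPU set and once without, on the same scene, should show whether the GPU path actually helps at your cloth resolution.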

There is only one piece of clothing, which consists of 8,000 vertices. For cloth-body collision handling, I use my own method instead of the PhysX cloth-body API. Is this the reason? Thanks.

I think a single cloth with 8000 vertices might be right on the edge of the CPU/GPU threshold; what do you mean by ‘I use my own method’ for collision? The collision between the cloth and the connected-sphere shapes, and the contact generation, are performed on the GPU, so if you are doing your own collision in between GPU frames, I wouldn’t be surprised if that impacts performance. Why are you choosing to bypass the native collision?

My cloth-body collision method is based on voxelization. Just as you say, I’m doing this in between GPU frames. For cloth self-collision, I use the PhysX cloth self-collision.
The reason I don’t use the native collision for cloth-body contact is that generating spheres and capsules for a human body is tough, time-consuming work, and currently there is no method to do this automatically.
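For readers following along, a voxelization-based cloth-body test of the kind mentioned above could look roughly like the sketch below. It is a generic illustration and not the actual implementation used here; `Vec3`, `VoxelGrid`, `markSolid` and `isInside` are hypothetical names. It assumes the body has already been voxelized into a dense occupancy grid.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Dense occupancy grid over an axis-aligned box (hypothetical helper, not PhysX API).
class VoxelGrid
{
public:
    VoxelGrid(Vec3 origin, float cellSize, int nx, int ny, int nz)
        : origin_(origin), cell_(cellSize), nx_(nx), ny_(ny), nz_(nz),
          solid_(static_cast<std::size_t>(nx) * ny * nz, false)
    {
        assert(cellSize > 0.0f && nx > 0 && ny > 0 && nz > 0);
    }

    // Mark the voxel at integer coordinates (i, j, k) as occupied by the body.
    void markSolid(int i, int j, int k)
    {
        assert(inRange(i, j, k));
        solid_[index(i, j, k)] = true;
    }

    // True if p falls in a voxel marked solid; points outside the grid count as free.
    bool isInside(const Vec3& p) const
    {
        const int i = static_cast<int>(std::floor((p.x - origin_.x) / cell_));
        const int j = static_cast<int>(std::floor((p.y - origin_.y) / cell_));
        const int k = static_cast<int>(std::floor((p.z - origin_.z) / cell_));
        return inRange(i, j, k) && solid_[index(i, j, k)];
    }

private:
    bool inRange(int i, int j, int k) const
    {
        return i >= 0 && i < nx_ && j >= 0 && j < ny_ && k >= 0 && k < nz_;
    }

    std::size_t index(int i, int j, int k) const
    {
        return (static_cast<std::size_t>(k) * ny_ + j) * nx_ + i;
    }

    Vec3 origin_;
    float cell_;
    int nx_, ny_, nz_;
    std::vector<bool> solid_;
};
```

A test like this runs on the CPU for every cloth vertex each frame, so with GPU cloth enabled the particle positions have to be read back and corrected between GPU steps, which is the kind of overhead raised in the reply above.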