Slow first shader executions and periodic performance spikes in DirectCompute

I am new to DirectCompute and I'm having trouble with startup performance. I'm using an AMD FirePro W7000 graphics card. The compute shader profile is cs_5_0.

In my project the input data is given as frames. The input data size is 1024x1024.

The thread-group size is [numthreads(16,16,1)] and every Dispatch() call is Dispatch(64,64,1). I have also tried other performance optimizations, such as using shared memory and minimizing the use of local variables.
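The dispatch size follows from the frame size and the thread-group size: 1024 / 16 = 64 groups per axis. A minimal sketch of that arithmetic is below; the helper name `groupCount` is hypothetical and only illustrates the calculation, it is not part of the project or of the Direct3D API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Direct3D function): number of thread groups
// needed to cover `size` elements with `threadsPerGroup` threads per group.
// Rounds up so that sizes that are not a multiple of the group size are
// still fully covered.
constexpr std::uint32_t groupCount(std::uint32_t size,
                                   std::uint32_t threadsPerGroup) {
    return (size + threadsPerGroup - 1) / threadsPerGroup;
}

// 1024x1024 frame with [numthreads(16,16,1)] gives Dispatch(64, 64, 1).
static_assert(groupCount(1024, 16) == 64, "1024 / 16 should give 64 groups");
```

With these values the grid covers the frame exactly (64 x 16 = 1024), so no bounds check is needed in the shader for this frame size.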

For each shader execution, only the following steps are performed:

        // Bind shader and resources (arguments omitted)
        pIContext->CSSetShader();
        pIContext->CSSetUnorderedAccessViews();
        pIContext->CSSetConstantBuffers();

        // Execute shader
        pIContext->Dispatch( 64,64, 1 );

        // Reset shader parameters
        pIContext->CSSetShader( NULL, NULL, 0 );
        pIContext->CSSetUnorderedAccessViews( 0, uNumViews, pNullUAV, NULL );

My question is: why do the following behaviours occur?

  1. Even with all these steps, the first few shader executions are noticeably slower. On subsequent executions the same shaders take less time. (The shaders perform log calculation, bit shifts, FFT, Gaussian filtering, etc.)

  2. During continuous processing of frames, there is a spike in execution time about every 6 seconds. This is observed when frames are supplied at less than 10 fps.
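To make the two symptoms concrete, per-dispatch timings could be summarized along the following lines. This is only an illustrative sketch in plain C++: the `TimingSummary` struct, the `summarize` function, the warm-up count and the spike factor are all hypothetical, and how the timings are collected is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary of per-dispatch timings (in milliseconds).
struct TimingSummary {
    double warmupMean = 0.0;                // mean of the first warmupCount samples
    double steadyMedian = 0.0;              // median of the remaining samples
    std::vector<std::size_t> spikeIndices;  // indices of steady-state spikes
};

// Splits the samples into a warm-up part and a steady-state part, then flags
// steady-state samples that exceed spikeFactor times the steady-state median.
TimingSummary summarize(const std::vector<double>& ms,
                        std::size_t warmupCount,
                        double spikeFactor) {
    TimingSummary s;
    warmupCount = std::min(warmupCount, ms.size());

    double sum = 0.0;
    for (std::size_t i = 0; i < warmupCount; ++i) sum += ms[i];
    if (warmupCount > 0) s.warmupMean = sum / static_cast<double>(warmupCount);

    std::vector<double> sorted(ms.begin() + static_cast<std::ptrdiff_t>(warmupCount),
                               ms.end());
    if (sorted.empty()) return s;
    std::sort(sorted.begin(), sorted.end());

    const std::size_t n = sorted.size();
    s.steadyMedian = (n % 2 == 1) ? sorted[n / 2]
                                  : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;

    for (std::size_t i = warmupCount; i < ms.size(); ++i) {
        if (ms[i] > spikeFactor * s.steadyMedian) s.spikeIndices.push_back(i);
    }
    return s;
}
```

Comparing `warmupMean` against `steadyMedian` would quantify the startup cost in item 1, and the spacing between entries in `spikeIndices` would show whether the spikes in item 2 really recur at a fixed interval.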

Please help.

I have also tried an NVIDIA Tesla C2075. The startup issue is still present there, but the spike issue almost disappears.