Order Independent Transparency bug?

I implemented the order-independent transparency algorithm from the paper "Memory-Efficient Order-Independent Transparency with Dynamic Fragment Buffer" (IEEE Xplore),
using CUDA 9.0 (VS 2017) on a GTX 970 with driver version 388.31.
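
For reference, here is a minimal CPU sketch of the pipeline as I understand it from the paper: count fragments per pixel, prefix-sum the counts into offsets, scatter the fragments into one packed buffer, then sort and blend each pixel's range. It is not the GPU code from the solution; the names Fragment, Color and resolveDfb are made up for illustration, and it assumes non-premultiplied colors blended back to front with the standard "over" operator.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fragment record (illustration only, not the paper's layout).
struct Fragment {
    std::uint32_t pixel;        // linear pixel index
    float depth;                // smaller value = closer to the camera
    std::array<float, 4> rgba;  // non-premultiplied color, rgba[3] = opacity
};

using Color = std::array<float, 3>;

// CPU reference: returns one blended color per pixel.
std::vector<Color> resolveDfb(const std::vector<Fragment>& frags,
                              std::size_t pixelCount,
                              Color background) {
    // Pass 1: count the fragments that land on each pixel.
    std::vector<std::uint32_t> counts(pixelCount, 0);
    for (const Fragment& f : frags) ++counts[f.pixel];

    // Pass 2: exclusive prefix sum turns counts into per-pixel base offsets.
    std::vector<std::uint32_t> offsets(pixelCount + 1, 0);
    for (std::size_t p = 0; p < pixelCount; ++p)
        offsets[p + 1] = offsets[p] + counts[p];

    // Pass 3: scatter every fragment into its pixel's slice of the buffer.
    std::vector<Fragment> buffer(frags.size());
    std::vector<std::uint32_t> cursor(offsets.begin(), offsets.end() - 1);
    for (const Fragment& f : frags) buffer[cursor[f.pixel]++] = f;

    // Pass 4: sort each slice far-to-near and blend with "over".
    std::vector<Color> out(pixelCount, background);
    for (std::size_t p = 0; p < pixelCount; ++p) {
        auto first = buffer.begin() + offsets[p];
        auto last = buffer.begin() + offsets[p + 1];
        std::sort(first, last, [](const Fragment& a, const Fragment& b) {
            return a.depth > b.depth;  // farthest fragment first
        });
        for (auto it = first; it != last; ++it) {
            const float a = it->rgba[3];
            for (int c = 0; c < 3; ++c)
                out[p][c] = a * it->rgba[c] + (1.0f - a) * out[p][c];
        }
    }
    return out;
}
```

Feeding the same fragments in a different submission order should give identical output from a reference like this, so comparing a dumped GPU frame against it can help tell a sorting or blending bug from a synchronization problem between the passes.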

The problem is that, as the following screenshot shows, strange, temporally incoherent artifacts appear no matter what I try.

(Left side: fixed-function alpha blending; right side: blending done with the paper's algorithm.) https://i.imgur.com/hNH8zvA.png

(video version) http://a.pomf.cat/knodfl.mp4

The same code runs fine on a GTX 920M with the same driver version.

For a moment I thought it was due to undefined behavior from unsynchronized image loads/stores across different shader passes, but even after adding glFinish, fences, and glMemoryBarrier calls between the passes, nothing changed.

I can provide the entire Visual Studio solution if needed; download link: https://ufile.io/2trjm

Has anyone faced a similar problem before?
Thank you