GPUs are just so much easier to develop for than CPUs. Why are we still torturing ourselves with kernel mode and ring 0 when you can just write a single kernel and be done with it?