(Post author here.) Curious to hear more details about your workload, because a 5+-year-old Fermi would truly be hard-pressed to outperform Maxwell or even a Kepler K40, let alone Pascal.
It's parameter sweeps of delay differential equations, one simulation per thread. This requires a lot of complex array indexing and global memory access, so arithmetic intensity is far from optimal. Still, it's a real-world workload that benefits hugely from GPU acceleration.
Moving from a GTX 480 to a Kepler or Maxwell card, the spec-sheet numbers go up, but the measured performance doesn't. I might have a corner case, but before investing in new hardware I'd want to benchmark first rather than blindly follow the numbers.
People bought 400-series cards for their compute performance long after they were outdated. If your software wanted an Nvidia card, it was either that or step up to a Quadro. People bought the first Titan for the same reason.