It’s not the speed that holds FPGA adoption back, it’s the development process and time. With a GPU you can start on the algorithm immediately; with an FPGA you first have to build the whole PCIe infrastructure and efficient data movers. By the time the GPU version is done, the FPGA developers are only just starting on the algorithm. As long as you don’t need real-time capability, the GPU is the obvious choice. My 200 MHz design outcompetes every CPU and GPU out there when the data-processing window is very narrow, but the development time is about 5x that of regular software.
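To make the “start immediately” point concrete, here is roughly what the entire host-to-device data path looks like on the GPU side (a sketch, not from the original comment; the kernel, sizes, and scale factor are made up). The runtime and driver already provide the PCIe link and the DMA transfers, which is exactly the part an FPGA team has to build or integrate themselves before touching the algorithm.

    // Minimal CUDA sketch: the data movers come for free with the runtime.
    #include <cuda_runtime.h>
    #include <cstdio>

    // Hypothetical kernel standing in for "the algorithm".
    __global__ void scale(float *data, float factor, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] *= factor;
    }

    int main() {
        const int n = 1 << 20;
        float *host = new float[n];
        for (int i = 0; i < n; ++i) host[i] = 1.0f;

        float *dev;
        cudaMalloc(&dev, n * sizeof(float));                              // device buffer
        cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice); // PCIe transfer handled by the driver

        scale<<<(n + 255) / 256, 256>>>(dev, 2.0f, n);                    // launch the algorithm
        cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost); // results back over PCIe

        printf("%f\n", host[0]);
        cudaFree(dev);
        delete[] host;
        return 0;
    }

On the FPGA side, each of those two cudaMemcpy lines corresponds to PCIe endpoint, DMA engine, and driver work you have to get right before any algorithm runs at all.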