Hacker News

Well, I keep seeing all models quantized, and for 2-bit, 4-bit, and 1-bit quantizations I had very good inference performance (either throughput or latency) on CNNs and some RNNs on Alveo boards using FINN (so, mostly high-level synthesis and very little actual FPGA wrangling). No idea about the current status of all these; will read the paper though :-)
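For what low-bit quantization means here, a minimal sketch of symmetric uniform quantization with a per-tensor scale, plus sign-based binarization for the 1-bit case. This is illustrative only, not the FINN/Brevitas API; function names are made up:

```python
def quantize(weights, n_bits):
    """Map floats to signed n-bit integers with one scale per tensor.

    Illustrative sketch (n_bits >= 2); real flows like FINN/Brevitas
    learn the quantization during training instead.
    """
    qmax = 2 ** (n_bits - 1) - 1            # e.g. 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    # Round to the nearest integer level and clamp to the signed range.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from integer levels."""
    return [v * scale for v in q]

def binarize(weights):
    """1-bit case: keep only the sign, as in binarized networks."""
    return [1 if w >= 0 else -1 for w in weights]

q, s = quantize([1.0, -1.0, 0.0], 4)
approx = dequantize(q, s)
```

On an FPGA the point is that the integer levels replace float multiplies with narrow integer (or, for 1-bit, XNOR/popcount) arithmetic, which is where the throughput comes from.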

