Block RAMs are dual-port. You can get 4 effective ports out of one by running it at double the clock rate, as long as the rest of the design is merely crawling along at 100 MHz. Still, each core eats up one port to read its microcode every clock cycle, so you can only have up to 4 cores for each group of 4 BRAMs holding the microcode.
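The port arithmetic can be sketched roughly like this. It is only a back-of-the-envelope model, not anything vendor-specific: the function names are made up for illustration, and it assumes the BRAM clock is an exact integer multiple of the core clock and that each core does exactly one microcode fetch per core cycle.

```python
def effective_ports(physical_ports: int, bram_clock_mhz: int, core_clock_mhz: int) -> int:
    """Ports available per core clock cycle when the BRAM is time-multiplexed.

    Hypothetical helper: assumes the BRAM clock is an integer multiple
    of the core clock, so each physical port serves several accesses
    per core cycle.
    """
    if bram_clock_mhz % core_clock_mhz != 0:
        raise ValueError("BRAM clock must be an integer multiple of the core clock")
    return physical_ports * (bram_clock_mhz // core_clock_mhz)


def max_cores(ports: int, fetches_per_core_per_cycle: int = 1) -> int:
    """Cores that can fetch microcode every cycle from the available ports."""
    return ports // fetches_per_core_per_cycle


# Dual-port BRAM clocked at 200 MHz, cores at 100 MHz.
ports = effective_ports(physical_ports=2, bram_clock_mhz=200, core_clock_mhz=100)
print(ports, max_cores(ports))
```

With the BRAM at the same 100 MHz as the cores, the same calculation gives only 2 ports, so 2 cores per microcode store.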