A trip to main memory often costs over 100x as much as an L1 cache hit. It doesn't take many of those misses to make a big difference if you're CPU bound.
My comments are general; I'm sure Clojure has access to arrays where necessary, for interop at least.
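To make the cache point concrete, here is a rough sketch in plain Java (chosen only because it runs on the same JVM; nothing here is Clojure-specific). It sums the same int array twice, once front to back and once in a shuffled order. The class name CacheDemo, the helper names, and the array size are all made up for illustration, and the size assumes 64 MiB is bigger than your last-level cache. This is not a rigorous benchmark, so treat the timings as indicative only.

```java
import java.util.Random;

// Hypothetical demo class: compares cache-friendly and cache-hostile
// traversal of the same array. Not a rigorous benchmark.
class CacheDemo {
    // Returns a random permutation of 0..n-1 (Fisher-Yates shuffle).
    public static int[] shuffledOrder(int n, long seed) {
        int[] order = new int[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Random rng = new Random(seed);
        for (int i = n - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
        return order;
    }

    // Walks the array front to back; the hardware prefetcher can keep up.
    public static long sumSequential(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) sum += data[i];
        return sum;
    }

    // Visits the same elements in the order given; with a shuffled order,
    // most reads of data[] miss the cache once the array outgrows it.
    public static long sumInOrder(int[] data, int[] order) {
        long sum = 0;
        for (int i = 0; i < order.length; i++) sum += data[order[i]];
        return sum;
    }

    public static void main(String[] args) {
        int n = 1 << 24; // 16M ints = 64 MiB; assumed larger than the caches
        int[] data = new int[n];
        for (int i = 0; i < n; i++) data[i] = i & 0xFF;
        int[] order = shuffledOrder(n, 42L);

        // Warm-up passes so the JIT compiles both loops before timing.
        for (int r = 0; r < 3; r++) {
            sumSequential(data);
            sumInOrder(data, order);
        }

        long t0 = System.nanoTime();
        long a = sumSequential(data);
        long t1 = System.nanoTime();
        long b = sumInOrder(data, order);
        long t2 = System.nanoTime();

        System.out.printf("sequential: %d ms, shuffled: %d ms (sums %d and %d)%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, a, b);
    }
}
```

Both loops do the same arithmetic and return the same sum; only the access pattern differs. Don't expect the full 100x here: the shuffled loop issues independent loads, so the CPU can overlap several misses at once. A dependent chain (following pointers through a linked list, say) would show the raw latency more directly.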