
Yes, it's great. Part of the issue is that PostgreSQL isn't good at parallelism yet, so optimizing the storage cuts compute time considerably to begin with.

Unfortunately that extension has a bunch of limitations and issues that keep it from being production-ready. PostgreSQL could really use a proper columnstore table implementation, and there's a pluggable storage API on the roadmap but it hasn't gotten much traction yet (and is focused on an in-memory engine first).



And it's... really not all that fast compared to mature analytical databases. ClickHouse on identical hardware is ~100x faster than cstore_fdw.

http://tech.marksblogg.com/benchmarks.html

More interesting to me is the reverse: using FDW from the analytical DB to Postgres, e.g., https://aws.amazon.com/blogs/big-data/join-amazon-redshift-a...


This is also an older benchmark. I'd be curious to see it rerun against PostgreSQL 11; there have been massive speed increases in 10 and now 11.

One of the most interesting things is that you can now shard Postgres extremely easily by using FDWs, which can natively push work such as filters and joins down to the remote server.

Also, Citus itself has been improving in that time, and I'm curious to see how that performs.


Well yes, like I said, it's not really a true columnstore. It's a basic storage extension that writes table data as ORC files for good compression and, in the best-case scenario, can manage to filter out segments. It's missing all the fancy processing features that real columnar warehouses use, so it'll never be as fast.
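
To make the "filter out segments" part concrete, here's a rough sketch of min/max segment skipping in Python. This is not cstore_fdw's actual code; the `Stripe` class and the function names are made up for illustration, and it only assumes that each segment keeps the min and max of a column alongside its rows.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stripe:
    """Hypothetical segment: a chunk of one column plus its min/max stats."""
    min_val: int
    max_val: int
    rows: List[int]


def build_stripes(values: List[int], stripe_size: int) -> List[Stripe]:
    # Split the column into fixed-size chunks and record min/max per chunk.
    stripes = []
    for i in range(0, len(values), stripe_size):
        chunk = values[i:i + stripe_size]
        stripes.append(Stripe(min(chunk), max(chunk), chunk))
    return stripes


def scan_greater_than(stripes: List[Stripe], threshold: int) -> Tuple[List[int], int]:
    # Evaluate "value > threshold", skipping stripes whose max rules them out.
    matches: List[int] = []
    scanned = 0
    for s in stripes:
        if s.max_val <= threshold:
            continue  # no row in this stripe can match, so never read it
        scanned += 1
        matches.extend(v for v in s.rows if v > threshold)
    return matches, scanned


stripes = build_stripes(list(range(100)), stripe_size=10)
matches, scanned = scan_greater_than(stripes, threshold=84)
print(f"read {scanned} of {len(stripes)} stripes, {len(matches)} matching rows")
```

Skipping only pays off when the data is roughly ordered on the filtered column. With randomly ordered values nearly every stripe's min/max range overlaps the predicate, which is why I called it the best-case scenario.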


This was for some analysis on my laptop. What other free columnar options are out there for Postgres?

For the specific dataset I needed to take advantage of some of the additional datatypes that PG has available, which is how I found it.



