
I wonder how much of this is driven by market need. PyArrow + pandas is already quite fast.

http://wesmckinney.com/blog/high-perf-arrow-to-pandas/

Also, pandas 2.0 is going to roll in a lot more utilities for parallel computing. Is there really a need for 50-100x speedups today?



I often run into situations where I wish pandas were 50-100x faster.

Dask can help, but introduces quite a bit of additional complexity.

I'm also looking forward to stricter data models than what pandas currently uses, in particular proper null support for all dtypes and less random type conversion.
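
To make the null and type-conversion complaint concrete, here is a minimal sketch assuming a stock pandas install. The variable names are purely illustrative, and the opt-in `"Int64"` nullable dtype shown at the end only exists in later pandas releases, so treat that part as a pointer rather than something available everywhere.

```python
import pandas as pd

# An all-integer column keeps an integer dtype.
ints = pd.Series([1, 2, 3])

# A single missing value silently turns the whole column into floats,
# because the default integer dtype has no way to represent a null.
with_gap = pd.Series([1, 2, None])

# The same conversion happens when an operation introduces a missing row.
reindexed = ints.reindex([0, 1, 2, 3])

# Later pandas releases offer an opt-in nullable integer dtype
# (note the capital "I"); this is an assumption about your version.
nullable = pd.Series([1, 2, None], dtype="Int64")

print(ints.dtype, with_gap.dtype, reindexed.dtype, nullable.dtype)
```

The point is only that the dtype of a column can change depending on whether a null shows up, which is the kind of surprise a stricter data model would avoid.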


RAPIDS is partly powered by Apache Arrow. So we are all collaborating on a common next-generation computation ecosystem.


That blog post is about how to load more data into pandas via Arrow. RAPIDS is about how to then compute on it. It's all the same people working on Arrow and GoAi. So... Yes :)



