
SQL.

It is a joke, but an SQL engine can be massively parallel. You just don't see it; it simply gives you what you asked for. And in many ways the operations resemble what you would do in, for example, CUDA.

A CUDA backend for DuckDB or Trino would be one of my go-to projects if I were laid off.
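To make the comparison concrete, here is a minimal sketch of how a GROUP BY aggregate can be split into independent per-partition work followed by a merge, which is roughly the shape of a parallel reduction. The data, the function names (`partial_sums`, `parallel_group_sum`) and the partition count are made up for illustration; this is not how any particular engine implements it, and Python threads here only show the structure, not real speedup.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rows of (category, amount). The SQL equivalent would be:
#   SELECT category, SUM(amount) FROM rows GROUP BY category
rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5), ("a", 6)]

def partial_sums(chunk):
    """Aggregate one partition on its own, with no shared state."""
    acc = Counter()
    for category, amount in chunk:
        acc[category] += amount
    return acc

def parallel_group_sum(rows, n_partitions=3):
    # Split the input into partitions (round-robin, for simplicity).
    chunks = [rows[i::n_partitions] for i in range(n_partitions)]
    # Each partition is aggregated independently of the others.
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(partial_sums, chunks))
    # Merge step: combine the per-partition partial sums.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)

result = parallel_group_sum(rows)
print(result)
```

The query author only writes the GROUP BY; how the work is partitioned and merged is left to the engine.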



My issue with SQL is lack of composability and difficulty of debugging intermediate results.


Yes, SQL is poor.

What could be good is a relational + array model. I have some ideas on https://tablam.org, and I think building not just the language but also the optimizer in tandem will be very nice.


The programming style reminds me of the old days of the Clipper and xBase family, even ABAP. I like the syntax.


You can use SQL CTEs and/or VIEWs as a composable abstraction over queries and inspect intermediate results. The language features are there.
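As a rough illustration of that point, the sketch below uses Python's built-in `sqlite3` module with an invented `orders` table. The view gives an intermediate step a name that can be queried on its own, and a CTE then builds on top of it. Table, column and view names are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES ('ann', 10), ('bob', 5), ('ann', 20), ('cat', 1);

    -- A view names an intermediate step so it can be queried directly.
    CREATE VIEW customer_totals AS
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer;
""")

# Inspect the intermediate result by itself.
totals = conn.execute(
    "SELECT customer, total FROM customer_totals ORDER BY customer"
).fetchall()

# Compose a further step on top of the view with a CTE.
big_spenders = conn.execute("""
    WITH big AS (
        SELECT customer, total FROM customer_totals WHERE total >= 5
    )
    SELECT customer FROM big ORDER BY total DESC
""").fetchall()

print(totals)
print(big_spenders)
```

Each CTE can also be debugged by temporarily replacing the final SELECT with `SELECT * FROM big`.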


The standard things that someone should always say when this problem comes up are:

• Datalog is much, much better on these axes.

• Tutorial D is also better than SQL.


Check out https://prql-lang.org/

It fixes the warts of SQL while staying true to its declarative execution: trailing commas, the from statement comes first so a query reads as a composable pipeline, temporary variables for expressions, and intuitive grouping.


Is it a language problem, though? It's just a lack of tooling.


The dataframe paradigm (a good example being polars) is another good alternative that's more composable (imo).


It is true. I still hate it. I think it's because it always offers 10 different ways to do the same thing, so there is just too much to remember.


Even in this thread people underestimate how good e.g. DuckDB can be if you swallow its quirks. Yeah, SQL has many problems, but with a slightly extended language, QoL features, and seamless parallelism, DuckDB is extremely productive if you want to crunch a bunch of numbers in jobs that run on the order of minutes or hours (not real time).

Sometimes when I have a problem, I just generate a bunch of "possible solutions" with a constraint solver (e.g. MiniZinc), which produces GBs of CSVs describing those solutions, then let DuckDB analyze which ones are suitable. DuckDB is amazing.
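A small sketch of that workflow, under loud assumptions: the CSV columns (`solution_id`, `cost`, `makespan`) and the suitability rule are invented, and Python's built-in `sqlite3` stands in for DuckDB so the example runs with only the standard library. DuckDB can query CSV files in place, so the explicit load step here would not be needed there.

```python
import csv
import io
import sqlite3

# Hypothetical solver output: one candidate solution per row.
solver_csv = io.StringIO(
    "solution_id,cost,makespan\n"
    "1,120,40\n"
    "2,95,55\n"
    "3,110,38\n"
    "4,90,70\n"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE solutions (solution_id INTEGER, cost INTEGER, makespan INTEGER)"
)

# Load the CSV rows, converting each field to an integer.
reader = csv.DictReader(solver_csv)
conn.executemany(
    "INSERT INTO solutions VALUES (?, ?, ?)",
    ((int(r["solution_id"]), int(r["cost"]), int(r["makespan"])) for r in reader),
)

# Keep only candidates that meet an (assumed) makespan limit, cheapest first.
suitable = conn.execute(
    "SELECT solution_id, cost, makespan FROM solutions "
    "WHERE makespan <= 55 ORDER BY cost"
).fetchall()

print(suitable)
```

The solver only has to enumerate candidates; the ranking and filtering criteria live in a query that is cheap to change and rerun.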


More generally, the key here is that the more magic you want in the execution of your code, the more declarative you want the code to be. And SQL is pretty much the poster child declarative language out there.

Term rewriting languages probably work better at this than I would expect? It is kind of sad how little experience I have built up with that sort of thing. And I think I'm above a large percentage of developers out there.


If you want to work in data engineering for massive datasets (many petabytes) pls hit me up!


Sorry, wrong continent :)



