
Multiple people can be wrong, so that is not a valid argument. I can write 10 nested loops that overwrite a single CPU register with a single value so many times that the iteration count exceeds the capacity of any big data cluster. This is what I meant by "CPUs are fast".

If you study engine internals, especially MergeJoin, HashJoin, and NestedLoopJoin, they all compute the cartesian product while simultaneously applying predicates. Some operations are faster because of sorting and hashing, but they still do a cartesian product.



If they are applying predicates to reduce the number of rows processed, then they are not performing cartesian joins. You don't need to take the word of people here; you merely need to read any source about how database engines process queries. I am of course open to the possibility of being wrong should you find authoritative sources showing that cartesian joins are produced for all queries. However, nearly two decades of working with a variety of engines tells me you are unlikely to find such a source. Your comments betray a fundamental lack of understanding of the topic. Your unwillingness to recognize the exponential resources required for your understanding to be correct also betrays a fundamental misunderstanding. You even contradict yourself, insisting in one comment that such queries would be "SLOW" while stating in another that you obtain speedy results due to a fast CPU, which in itself betrays yet another fundamental misunderstanding of how databases utilize resources and where bottlenecks arise.

Last, you have offered no significant support for your claims, and as the initiator of the discussion making the claim of cartesian joins, the burden of proof is on you to provide evidence for your claims.

In any case, your persistence in digging your hole deeper and deeper on this issue is embarrassing, and I won't be cruel enough to enable that further. Feel free to reply, but I am done, with a final recommendation that you continue your learning and not stop at the point of your current understanding. You at least seem earnest in your interests in this topic, so you should pursue it further.


> if you study engine internals, especially MergeJoin, HashJoin, NestedLoopJoin

It's called a merge join because it employs the merge algorithm (linear time, constant space)[1], which can only be used when the inputs are sorted. Likewise, the hash join entails building a hash table and probing it (linear time, linear space)[2], but the inputs don't have to be sorted. The point of both is to avoid the O(N*M) nested loop.

[1] https://en.wikipedia.org/wiki/Merge_algorithm

[2] https://en.wikipedia.org/wiki/Hash_join
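
To make the difference concrete, here is a rough Python sketch of the three strategies on in-memory lists of (key, value) rows. The function names and the work counters are my own, added purely for illustration; real engines operate on pages, spill to disk, and handle many more cases, so treat this as a toy and not as how any particular database implements its joins.

```python
from collections import defaultdict


def nested_loop_join(left, right):
    """Compare every left row with every right row: N*M comparisons."""
    out, comparisons = [], 0
    for lk, lv in left:
        for rk, rv in right:
            comparisons += 1  # illustrative counter, not part of the algorithm
            if lk == rk:
                out.append((lk, lv, rv))
    return out, comparisons


def hash_join(left, right):
    """Build a hash table on the left input, then probe it once per right row."""
    table = defaultdict(list)
    for lk, lv in left:
        table[lk].append(lv)
    out, probes = [], 0
    for rk, rv in right:
        probes += 1  # one lookup per right row
        for lv in table.get(rk, ()):
            out.append((rk, lv, rv))
    return out, probes


def merge_join(left, right):
    """Both inputs must already be sorted by key; walk them with two cursors."""
    out, steps = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        steps += 1  # each pass advances at least one cursor
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][0] == lk:
                for jj in range(j, j_end):
                    out.append((lk, left[i][1], right[jj][1]))
                i += 1
            j = j_end
    return out, steps
```

On two 1,000-row inputs the nested loop does 1,000,000 comparisons, the hash join does 1,000 probes, and the merge join takes at most 2,000 cursor steps, while all three return the same matching rows.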



