
Since there's nothing limiting the set of rows from project_commits, the planner might as well table-scan it. The primary key on commits will be used for each left join lookup.

Nested loops aren't that different, performance-wise, from sorted merge joins. Sorting takes O(n log n), whereas the nested loop does n lookups, each taking O(log n), for a similar O(n log n) total. Memory allowing, a hash join has more potential for speedup, since each probe is O(1) on average.
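To make the comparison concrete, here's a rough sketch of the three strategies as a left join from project_commits to commits. It's a toy in-memory model, not how any particular database implements them: a sorted Python list stands in for the primary key index, and the sample rows and function names are made up for illustration.

```python
from bisect import bisect_left

def nested_loop_join(outer, inner_keys, inner_rows):
    """Index nested loop: one O(log n) binary search per outer row."""
    out = []
    for fk, payload in outer:
        i = bisect_left(inner_keys, fk)
        hit = i < len(inner_keys) and inner_keys[i] == fk
        out.append((fk, payload, inner_rows[i] if hit else None))
    return out

def sort_merge_join(outer, inner_keys, inner_rows):
    """Sort the outer side by foreign key, then walk both sides in step."""
    out = []
    i = 0
    for fk, payload in sorted(outer, key=lambda r: r[0]):  # O(n log n) sort
        # Advance the inner cursor; it never moves backwards.
        while i < len(inner_keys) and inner_keys[i] < fk:
            i += 1
        hit = i < len(inner_keys) and inner_keys[i] == fk
        out.append((fk, payload, inner_rows[i] if hit else None))
    return out

def hash_join(outer, inner_keys, inner_rows):
    """Build a hash table on the inner side, then probe once per outer row."""
    table = dict(zip(inner_keys, inner_rows))  # needs memory for the inner side
    return [(fk, payload, table.get(fk)) for fk, payload in outer]

# Hypothetical data: commits keyed by a sorted, unique primary key.
commits_keys = [1, 2, 3, 5, 8]
commits_rows = ["c1", "c2", "c3", "c5", "c8"]
# (commit_id, project) pairs in arbitrary scan order; 7 has no matching commit.
project_commits = [(5, "pA"), (1, "pB"), (7, "pC"), (5, "pD"), (2, "pE")]

nl = nested_loop_join(project_commits, commits_keys, commits_rows)
sm = sort_merge_join(project_commits, commits_keys, commits_rows)
hj = hash_join(project_commits, commits_keys, commits_rows)
```

All three produce the same rows; only the order differs, since the merge version emits them in foreign key order while the other two follow scan order.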

There should be a locality win from a sorted merge: depending on the correlation between scan order and foreign key order, the index lookups in the nested loop may be all over the place. Usually this doesn't matter much because you don't normally do 5+ billion row joins.
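A crude way to see the locality effect is to count how often consecutive lookups land on a different index page. This is only a sketch: the page size, key range, and uniformly random keys are assumptions, and real buffer caches soften the picture considerably.

```python
import random

PAGE_SIZE = 100  # assumed number of keys per index leaf page

def page_switches(lookup_keys, page_size=PAGE_SIZE):
    """Count how many times consecutive lookups touch a different page."""
    switches = 0
    last_page = None
    for key in lookup_keys:
        page = key // page_size
        if page != last_page:
            switches += 1
            last_page = page
    return switches

rng = random.Random(0)
# Foreign keys in scan order, uncorrelated with primary key order.
keys = [rng.randrange(100_000) for _ in range(10_000)]

random_order = page_switches(keys)          # nested loop probing in scan order
sorted_order = page_switches(sorted(keys))  # same probes after sorting
```

In sorted order each page is visited at most once, so the count is bounded by the number of pages; in scan order nearly every lookup jumps to a different page.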


