
This guy took one look at some scientists' code and instantly got a 14000x speedup:

http://james.hiebert.name/blog/work/2015/09/14/CS-FTW.html



Yeah. We had an in-house report generator that was epically slow (about 7 minutes to generate a single report).

We frequently have strange issues come in because the fucking vendors can’t stop making changes, then they say “nobody changed anything that would cause that”!

Anyway. I told my boss I’m going to spend a day optimizing this, and got back “we had people try, it’s not really a valuable use of time”.

Long story short, I know that processing 10,000 lines of data should absolutely not take 7 minutes and now it doesn’t. Now we can run our entire day in 30 seconds. No threads. No async. Half a day of optimization.

I think part of the problem with optimization today is that people are completely clueless about how fast computers are, and thus stay happy with being slow just as a matter of ignorance. The FP community pushing that all optimization is premature optimization doesn’t help either.


Urgh, reminds me of a production error that was caused by a job running on a timer, assuming that some other job would have finished by then. That led to two obvious questions:

- Why is it on a timer, instead of being triggered when the first job was done?

- Why was that first job taking over an hour to process XML data?

Turns out the second one was accidentally quadratic: it was parsing the whole document to process its first record, then re-parsing the whole document to process its second record, and so on, which made it very slow once the XML reached several hundred MB.
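Not our actual code, but the shape of the bug was roughly this (a Python sketch using the standard xml.etree module; handle() is just a stand-in for the per-record work):

  import xml.etree.ElementTree as ET

  # Accidentally quadratic: the whole file is re-parsed for every record.
  def process_slow(path, handle):
      root = ET.parse(path).getroot()
      for i in range(len(root)):
          record = ET.parse(path).getroot()[i]  # full re-parse, every iteration
          handle(record)

  # Linear: parse once, then walk the records.
  def process_fast(path, handle):
      for record in ET.parse(path).getroot():
          handle(record)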

Unfortunately, the solution we went with was to set the timer to run 20 minutes later...


I remember when I was in COMPSCI-101 or something like that and was introduced to sorting. After coming up with a basic bubble sort algorithm, I thought there was no way it could be improved, because it looked self-evident that you needed to do all the work the algorithm was doing or you couldn't be sure the list was really sorted. Then, of course, I was introduced to multiple better algorithms like merge sort and quick sort, which showed just how wrong I was...

Perhaps, for the "scientist" who wrote that initial code, their algorithm also seemed like the best that could be done, because they just don't know much about algorithms or the techniques to make them faster?! But once you learn about things like divide-and-conquer, work avoidance, etc., it becomes pretty much self-evident when your algorithm sucks... it's really not that hard - definitely not for people whose job is something like climate science and who have a very good understanding of maths.

Maybe by teaching scientists the basic "smart" algorithms and just a little about algorithm analysis (Big O notation), massive computation improvements could be had.


For you, me and James Hiebert, this sort of optimisation is exhilarating. Getting a half-hour job down to milliseconds? Awesome!

But for the scientists that wrote the original code, maybe not. Maybe they think of this sort of thing as drudge work, something that doesn't really hold their interest. Their fun is in designing the mathematical concept, and turning it into code is just a chore.

So yeah, we could teach scientists. But even better would be if we could provide scientists with tools that are just naturally fast when expressing problems on their own terms.


> Their fun is in designing the mathematical concept, and turning it into code is just a chore.

It's not about fun. Sometimes it can be the difference between something being possible or not. In this case, the author said they ran this algorithm hundreds of times, so changing it from 30 mins to 0.1 seconds makes things that were impossible before, possible. I don't find it fun at all to optimise, but I can see where I may need to in order to make things better, or possible at all... what I am suggesting is that anyone writing code, scientist or not, needs to be aware of this - and know when they just MUST optimise.


As an ex-scientist, I think basic algorithms theory should be incorporated into scientific computing classes. I took a few of these but none of the concepts from this area were covered. I remember well discovering that some code of mine was slowing down with system size and finally realizing it was because “append” was creating a new array each step… I had no clue that would happen. I was enthralled by the online algorithms course when I finally discovered it - hash tables!!!
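For anyone who hasn't hit this: numpy.append copies the whole array on every call, so appending in a loop is quadratic overall. A minimal illustration (assuming NumPy; the usual fix is to build a Python list, or preallocate, and convert once at the end):

  import numpy as np

  # Quadratic: every np.append allocates and copies a brand-new array.
  values = np.array([])
  for i in range(100_000):
      values = np.append(values, i)

  # Linear: Python lists grow with amortised O(1) appends; convert once.
  buffer = []
  for i in range(100_000):
      buffer.append(i)
  values = np.array(buffer)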


Also, spending three hours optimizing away thirty minutes you only run four times is a net negative of one hour.

You have to optimize the system, as a whole.


The irony is that his solution has an unnecessary log(n) term, because it's finding each point independently instead of taking advantage of the fact that the input points are also a dense grid. ;) So it can probably run 2-10x faster than the optimized version, assuming that the function call to binary search is a lot more expensive than filling in the grid, which R can do internally.

(You don't have to binary search each time, you have a known starting point from the previous point. So you binary search once to find the top-left point and the rest is just checking if you moved into the next global grid cell.)
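Roughly the idea, sketched in Python rather than R (this locate sketch is made up, not the code from the post; it assumes the cell edges and the query points are both sorted and that every point falls inside the grid):

  import bisect

  def locate(edges, points):
      """Return the index of the grid cell containing each query point."""
      cells = []
      i = bisect.bisect_right(edges, points[0]) - 1  # the only binary search
      for p in points:
          # No per-point search: just step forward when p crosses the next edge.
          while i + 1 < len(edges) - 1 and edges[i + 1] <= p:
              i += 1
          cells.append(i)
      return cells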


"Someone improved my code by 40,832,277,770%"

https://www.youtube.com/watch?v=c33AZBnRHks


Thanks for the link, this was very entertaining.


With a bit more digging, but very similar: https://nee.lv/2021/02/28/How-I-cut-GTA-Online-loading-times...


Interesting article!

> Furthermore, there is literally no way to tell whether your program will ever actually terminate without actually executing it. And there are many, many, many, ways to write a program which make it “slow” to execute. “Slow” being… like really slow.

Nitpick: there are some languages where you can tell whether your program will terminate without running it. There are even languages where you can tell whether your program will be slow.

And there are even more specialised languages where any program will be 'fast'.

Obviously, these languages aren't Turing complete. It's an interesting exercise to explore the design space to find languages that are both useful and avoid being Turing complete.

Agda is an example of a language that allows you to run approximately any practical computation, but it's not Turing complete. Oversimplified: in Agda a program is only considered valid by the compiler if it comes with a computer-checkable proof that it halts on all inputs.


> Obviously, these languages aren't Turing complete.

Interestingly, a Turing complete language without non-halting programs could exist conceptually (rather, many do); it's just impossible to know it has that property (hard to use in practice if you don't know it has the guarantees you'd like), and it can't be finitely describable (impossible to implement a compiler for it which doesn't sometimes "succeed" at compiling invalid programs).


You can also construct programs in Turing complete languages to always halt (e.g. by bounding the number of iterations of all loops, usually with the tradeoff of limited input size).
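A trivial sketch of that tradeoff in Python: put a hard cap on every loop and the program provably halts, at the cost of rejecting inputs that would need more iterations. MAX_STEPS and iterate_until_stable are made up for illustration.

  MAX_STEPS = 1_000_000  # hard bound chosen for the inputs we expect

  def iterate_until_stable(f, x):
      """Apply f repeatedly until the value stops changing, or give up."""
      for _ in range(MAX_STEPS):  # bounded loop: termination is guaranteed
          nxt = f(x)
          if nxt == x:
              return x
          x = nxt
      raise ValueError("input would need more than MAX_STEPS iterations")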


Yes, you can program in a subset of your language that is not Turing complete.


You'd also need to forbid (even indirect) recursion, i.e. one function cannot appear in a call stack twice.


Isn't that overly restrictive? Some programs are guaranteed to terminate even with mutual recursion.

For example, these (obviously inefficient) functions always terminate for all values of n: each recursive call decreases the argument by one, so both eventually reach 0, which is the minimum possible value of an unsigned int.

  bool even(unsigned int n)
  {
    return (n == 0) ? true : odd(n - 1);
  }

  bool odd(unsigned int n)
  {
    return (n == 0) ? false : even(n - 1);
  }


The point of these restrictions is that you can formally verify that a given program will terminate in a way which can be checked by e.g. a compiler or some other kind of checker.

While statically verifying your snippet is possible, it is also significantly more difficult than enforcing rules like “all loops must have an upper limit”. I also think you'd need dependent types to make this approach viable. Your simple example is IMHO rather an exception: just having "unsigned int" parameters / return types would mostly not suffice to statically verify that a recursive method finishes. (plus things like global state etc.)


A language with your suggested restriction (and a few more) might be possible.

But it's not the only way to get a total language (that always halts), contrary to what your original comment claims.


Agda doesn't need that restriction, as far as I know. And neither does Dhall. See https://dhall-lang.org/

Neither language is Turing complete.


I think you can allow finite recursion, e.g. with a maximum stack depth.


But then reaching it will mean a crash. I could use a similar approach to say that any program will terminate if I set a fixed time (instruction count) limit.


Yes — that’s correct.

Programs with a fixed amount of execution time always terminate. That’s why we add timeouts to executions, because they’re a solution to the halting problem, which guarantees execution flow of our machines always returns at some point.


I once took some matlab code and got a 200x+ improvement for a couple of days' work; the code was atrocious. So it really matters to understand your tool. I think the person I took it from was afraid to change anything that worked after it was prototyped.



