Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
4B If Statements (andreasjhkarlsson.github.io)
1518 points by r4um on Dec 28, 2023 | hide | past | favorite | 469 comments


I wish I still had one of the earliest programs I wrote. It would have been 1996, I was 16, and I had a linear algebra book which had an appendix entry on computer graphics. I had taken a programming class the prior quarter.

I got hooked writing a program to draw rotating wireframes of a few different shapes. I almost failed the class because I was so distracted by this.

I didn't know about arrays yet. Every vertex was its own variable with a hardcoded default. Every entry in the rotation matrices was its own variable. The code to do matrix multiplication therefore didn't have loops, just a long list of calculations. Which had to be duplicated and adapted to every vertex.

I did know about pointers, I had to draw to the screen after all, which was done by writing to memory starting at a known address. So at least I had loops rasterizing the lines between vertices.

Which means I definitely had the concept of arrays and indexing into them, I just didn't know enough to make my own.


Same. When I was 12ish trying to write a Pac-Man game in basic, I dreaded writing the logic for the 4 ghosts. I'd need separate code for the ghosts at (x1,y1)....(x4,y4) and said to my dad it would be cool if I could use xn and yn inside a for loop, but those are just 2 more variables. I wanted n to indicate which one. This seemed kind of weird, but it's what I really really wanted. He grabbed the little basic book and showed me that x(n) was actually a thing!

I think back to this in discussions about teaching. Abstract concepts are best understood when the student has a legitimate need for them. You can explain something all day and have their eyes glaze over wondering why something is important, but if it solves a problem they have it will click in seconds or minutes.


This comes up pretty often in (well-designed) video games. Say you have colored doors that can only be opened by keys of the same color. If you put the blue key on the path to the blue door, then players will never learn that you need the color of the key to match the color of the door. But if you put the blue key on a side path and have the player encounter a blue door relatively early on, then they will encounter the door first and cannot progress until they find the key. The frustration at not being able to progress is much more effective at teaching the player about the "colored keys for colored doors" mechanic than any amount of explaining you could have done.


This answer unlocked a principle in my thinking about a certain class of problems I’ve been struggling with in a different domain. Thank you!


When I started programming at 12-13 I had no idea loops or arrays existed and implemented an absolutely criminal Tic-Tac-Toe game.

The same year me and a friend were studying for the national Informatics Olympiad and we were completely stumped when faced with a problem that required a list: "How do we know how many variables to declare if we don't know the N beforehand?".

(The problem if anyone is interested: https://neps.academy/br/exercise/384. Given a list of up to 50k distinct numbers (with values up to 100k) and another list of 50k numbers that should be removed, print the resulting list. Time Limit: 1 second.)


When I first started coding, I thought loop structures were confusing and redundant when I could accomplish the same thing with GOTO. So that’s how I implemented all my loops for years until I eventually took a course and a teacher told me I was never to use GOTO again.


Dijaksta's “goto considered harmful“ is often interpreted as a much broader opinion than it was, but this is a perfect example of what he was against


Yeah Dijkstra's main objection to goto is that it's really hard to formally define its semantics and even if you do, the formalism is so complex it's hard to reason about.

Break and continue on the other hand are considerably easier to formalize.


> national Informatics Olympiad

That brought back a lot of memories... I wonder if there are participation records. The olympiad website isn't responding. I also took part in the astronomy olympiad.


The OBI website is really great, it's just been down for a couple of days. There's a good chance you'll find records there.


This seems to be a somewhat common occurrence. When I was first learning programming with python I wanted multiple balls in a pong-like game, so after some googling I ended up writing something like `globals()["ball" + i]` until someone mentioned that arrays were a thing I could use for that.


It’s a recurring thing because programing at its fundamental level is extremely simple. If you have variables, conditionals and loop/goto you can do anything. It’s pretty easy to learn the basics and get much more empowered than you had any right to be.

Armed with such power, you will build a slow, unreadable and unmaintainable mess, but you don’t know that yet, because it’s all working after all.

But as your lines multiply you begin to misstep on your own code and wonder if there’s a better way to do this. And you then stumble upon data structures, functions, classes, etc.

The self-taught road is always more or less like that.


I believe this is the way. That is why I sometimes think whether programming can be taught by using shell environment at first. Every command is like a program and you can do a lot without loops.

Piping also helps. But maybe piping is not a good way to start, dunno.


Shell (and its odd cousin the .BAT file) is interesting because it’s got such wildly different sets of “easy things” and “complex things” compared to almost any language.

Write some ASCII to a file - one character in Bash, 2-?? lines of annoying boilerplate in every other language

Pick the second item in an array - in Bash, a series of random-looking punctuation I can never recall to even create a true array. In any other language, roughly myArray[1]

So I worry that in some cases the student forced to only learn using shell might be bitter once they learned other languages.

PS: it’s funny you mentioned loops as a “do without” — I’d say bash is pretty okay with that — as long as you’re looping over a list of files ;)


The obvious solution here is to use the lower part of the screen as working memory while drawing the upper part. By the time you get to the bottom you shouldn't have many calculations left anyway. This has the benefit of using fast gpu ram and is therefore cuda and very ai.


If only GPUs existed back then. This would have been very slow memory mapped video ram.


It reminds me of one of my earliest ventures into freelancing, where all I had was a tiny VPS that supported PHP. I had to process a large spreadsheet which had maybe 5/10k entries in it. Small potatoes these days, compared to that time in 2002/2003.

I wasn't a computer scientist so I read the file in the dumbest way possible, but kept getting errors about memory usage and running out of space since I was looping within loops within loops. I went through it and added `$variable = null` at every opportunity. Lo and behold it worked.


Holy feces I think that I inherited your parser. Was it for a US client, in the field of cash advance loans?


No, not even close :)


Well, then, you can rest knowing that other parser writers came to the same workaround as you did!


I did something similar for my big middle school hit, a TI-83 version of Snake. Every x and y position of every segment of the snake went into a separate variable. The snake could only get so long because TI-83 BASIC only gives you so many variables.


IIRC I think it only gave you a fixed number of arrays, too, though I think they could get pretty long so you could probably have done that.

Also it’s rad that you were programming the TI-83 in middle school. I learned TI BASIC in high school for both fun and academics. (I created programs to solve each kind of trig problem we were assigned complete with shown work, with the full blessing of the teacher who could see that understanding was obviously a prerequisite to being able to code a solver.)


The first part of GWBasic I learned by asking someone for help was "chain", after having taught myself print, input, if, and goto from the documentation.


Do you recall which shapes you created?


I remember there were 3, and that there was a cube and a pyramid. I can't recall the third. Maybe I did both a square pyramid and a triangular pyramid.


Welcome to the world of optimization the hacky way on an old computer, where code is data is code, and the only truth is: does it render in as many frames as possible? ;)

Usually, you cannot achieve this by engaging in academic code with recursion and loops; you have to do the opposite: trade readability and compact code for speed.

Sometimes, code for 6510/68000 is so optimized - usually unrolled - that you get code patterns that resemble data applied at different frames, much like your code.

There is nothing wrong from my point of view. ;)

(Some words of wisdom: May the debug gods be with you on this journey, though.)


It seems over-engineered to me. Why go through all the hustle of code generation? It can be solved with a simple “for loop”:

  func isOdd(n int) bool {
    var odd bool
    for i := 0; i < n; i++ {
      odd = !odd
    }
    return odd
  }
Playground link: https://go.dev/play/p/8TIfzGrdWDF

I did not profile this one yet. But my intuition, and my industry experience, tell me that this is fast.


A true production quality implementation should always use recursion.

  func isOdd(n int) bool {
    switch {
    case n == 0:
      return false
    case n > 0:
      return !isOdd(n-1)
    default:
      return !isOdd(n+1)
    }
  }


You can also use beautiful mutual recursion a la ocaml

  let rec isEven n = 
    if n = 0 then true
    else isOdd (n-1)
  and isOdd n = 
    not (isEven n)


FYI ES2015 added tail-call optimization, allowing JS to finally have an elegant isOdd function.


Only Safari supports it, though


This will cause a stack overflow. You can convert that into a loop and make your own stack on the heap if you need very deep recursion.


If you can keep the local variables down, a simple ulimit -s 60000000 should be able to get that stack limit nice and big!

You may need to call setrlimit with an appropriate RLIMIT_STACK, but I think the FFI compatibility should be able to pull that off, right?


> This will cause a stack overflow. You can convert that into a loop and make your own stack on the heap if you need very deep recursion.

Doesn't Go allocate / extend stacks using dynamic allocation rather than having a static limit (e.g. like C)?


In a language with tail-call optimization, it won't.


It will. The negations are only being applied on the way back up the call stack. The mutual recursion version in a sibling post can be made to work though.


Surprisingly, some compilers can automatically turn that to a loop too! Try it on clang and gcc :)


GCC does eliminate one of the recursive calls but retains the other: https://godbolt.org/z/dof4T4vYv

But it also does some magic that I don't quite understand (what do the two `sub` instructions do before the `call`? Do they prepare the stack?) because x86 assembly is so confusing to me sometimes.


this is truely evil :)


I can confirm that the rust version of this one is fast:

    playground::isOdd:
     testq %rdi, %rdi
     setg %al
     andb %dil, %al
     retq
(Click ... beside build to get assembly) https://play.rust-lang.org/?version=stable&mode=release&edit...

Unfortunately the go playground doesn't seem to support emitting assembly?


Shameless plug to claim Rust superiority!1

No, Go playground can not do that :(


https://godbolt.org/ supports emitting assembly for Go while Google's tools catch up!


Good point! Neither gc nor gccgo -O seem to figure this function out :(

https://godbolt.org/z/eMv41nc6Y

Proof, that you must use rust if you want blazingly fast execution of fearlessly pessimized code!


This commands an issue in the Go repo!


Don't forget the companion isEven function:

    func isEven(n int64) bool {
        return !isOdd(n)
    }


Wow you can call a function from a function ? It’ll revolutionize my productivity.


It will iterate infinitely for n=infinite.


Is infinite even or odd?


Yes.


Ideally it should throw exception I guess.


You can improve this with tail recursion.


Does Go default-initialize booleans by any chance? In C, which author had used, this program would be undefined behavior due to reading non-static, uninitialized bool.


Go has “zero” values. The zero value of Boolean is false.

https://go.dev/tour/basics/12


You, sir, are a raging psychopath. You have found a local maxima of brevity and absurdity and I, for one, salute you.


I think this counts as a bottom-up DP solution


This approach is perfect for the is-even npm package[1], with 196,023 weekly downloads, or the is-odd npm package[2], with 285,501 weekly downloads. Wouldn't it be cool if you type `npm install` and it starts downloading a 40GB is-even and a 40GB is-odd?

[1] https://www.npmjs.com/package/is-even

[2] https://www.npmjs.com/package/is-odd


It's always worth mentioning that those packages stem from the effort of one dedicated npm spammer[1] to weasel his way into as many node_modules directories as possible. There's also packages for ansi-colors (no, not one package for all colors, one package for each color) and god knows what else. These then get jammed into any cli tool or other half-way sensible looking package, all referencing each other obviously - and any "real" project is just a single innocuous dependency away from having dozens of jonschlinkert packages.

[1] https://www.npmjs.com/~jonschlinkert


> Several years ago, just before my 40th birthday, I switched careers from sales, marketing and consulting to learn how to program, with the goal of making the world a better place through code [1]

It checks out, sales/consulting folks are pretty infamous for their tendency to abuse metrics. The metric here is npm downloads and Github stars.

The strategy does mean that he's _technically_ not inaccurate in claiming this on his LinkedIn -

> NASA, Microsoft, Google, AMEX, Target, IBM, Apple, Facebook, Airbus, Mercedes, Salesforce, and hundreds of thousands of other organizations depend on code I wrote to power their developer tools and consumer applications.

I encountered this type a lot in college consulting groups, it's a little funny seeing one make their way to the OSS community.

[1] https://github.com/jonschlinkert


> To date, I've created more than 1,000 open source projects

This dude is counting each one as a project hahaha


It's the truth but not the whole truth


It reminds me of graffiti tagging!


Yeah but this might make him the Patient Zero for modern tech influencing and blogspam.


Anyone know if this has spread to crates.io yet? I see plenty of name squatting, but I haven't run into real crates trying to insert themselves into everything. Namespaces are sorely needed, including some semi-official ones. candi::rand would be reasonable for candidates to enter std. Watching the battles over tokio getting into candi would be fun.


> Watching the battles over tokio getting into candi would be fun.

What would be the argument for this? It's my understanding that the lack of a default runtime is considered a strength


When I first heard of the is-odd and is-even NPM packages I was sure they were a joke, yet there we are: 200K weekly downloads! Publishing the packages may have been the effort of one spammer, but many developers obviously chose to use them - that's the part that boggles my mind.


Well, look at one of the dependents: https://www.npmjs.com/package/handlebars-helpers - certainly a more useful npm package, but by the same author. Seldom do people actually type `npm install is-even` - there's just a jungle of transitive dependencies that can be traced back to one of jonschlinkert's many packages, which then circles back to something inane as `is-even` or `ansi-red`.

I once ran a simple grep in some of my node projects - most of them had a jonschlinkert package in node_modules, certainly not through any (direct) choice of my own.


Check out this very useful utility: https://github.com/mitsuhiko/is-jonschlinkert :)


Possibly related: https://github.com/SukkaW/nolyfill vs. ljharb and nodejs 4


I recently came across this issue (https://github.com/import-js/eslint-plugin-import/issues/213... and https://github.com/import-js/eslint-plugin-import/issues/181...) and man that person has really found him self a hill.

What is a bit worrying though is that he is an active member and contributor to TC-39. Meaning that this kind of community hostility is very much alive among the people who rule JavaScript.


View their obstinance: https://github.com/nvm-sh/nvm/issues/794

8 years later and despite much support for the `.node-version` file.

Someone else started using the .node-version file years ago, and because all open source packages won't form a committee to standardize this file, nvm will not support it.

They have a lot of hills. JS Private Properties was another.


Very often when I’m digging around GitHub Issues because of some bug or quirk or insanity in the JS ecosystem (which is often) I see someone spout the worst possible take - often being kind of a jerk about it - and when I look to see who’s responsible, very very often, ljharb’s name pops up. Often.

Dogpiling on someone deep in an HN comments tree isn’t exactly the classiest thing but…never having interacted with him myself, I’ve been harbouring this low-grade antipathy towards him - nothing unhealthy, just a groan whenever I see his name on GH - for years now, and it’s cathartic and almost gratifying, given his prominence in the community, to feel seen like this. Thank you.


I was doing some snooping around and found this earlier discussion https://news.ycombinator.com/item?id=37602923 from September 2023 (234 points, 110 comments), which I had missed at the time, very related to this current subthread, and includes posts such as these (https://news.ycombinator.com/item?id=37604635).

I think we as a community really need to have a conversation about ljharb and his role in the future of our industry. If he was only a library maintainer, that would be one thing, we could just move on, find workarounds, alternatives, etc. But his involvement in TC-39 makes him one of our rulers in a non-democratic structure. That makes this different.


To be fair, the ESM switch has been botched beyond compare.

It’s like they didn’t want to become Python 2/3, and then did the absolute worst possible alternative.

It is beyond frustrating that it’s up to individual package authors whether or not their package supports ESM or CommonJS.

And yeah, it’s a pita when one of your downstream dependencies decides to go ESM only, and breaks your entire friggin chain of stuff that depends on it being CommonJS.


I agree. Mistakes were made and migration has been unnecessarily hard. This particular issue doesn’t even affect me. I simply turned off `no-unresolved`. TypeScript handles a lot better anyway (even with jsdoc type annotations).

But what is the issue here is the stubbornness of the maintainer and his unwillingness to accommodate a very sizable portion of his user base. The industry is moving on, and as a TC-39 member he should be aware of where the community is moving as well as show some empathy with his users.


I love the release notes:

> It exists.


Would it be considered bad practice or in poor taste to make pull requests in projects with the sole objective of implementing is-even natively?

Thinking of being the change I want to see in the world.


It’s probably contextual but I’d say it’s in poor taste to use them in the first place. Removing them is a net benefit imo.


I see, thanks. I guess that answers one question, but raises another: why have his packages depend on more of his packages? If his goal was to be included in as many node_modules directories as possible, and handlebars-helpers was already included what's the point of pulling in is-odd/is-even, too?


Might be this from his GitHub bio “Several years ago, just before my 40th birthday, I switched careers from sales, marketing and consulting to learn how to program” Good way to get more eyeballs…


Ah yes, that actually explains a lot! Thanks.


Makes is-odd/is-even popular; many downloads; raises their (and his) profile.


Here we are all talking about him now!


He could sell rights to the repos and disavow any knowledge of its maintenance while maintaining the link in his own repos. When those sold rights are used to commit some crime he has plausible deniability as anyone else but got a payday. If you try spinning off the subpackage just prior to a sale then it shows some sort of intent.


Is there any evidence that he has ever done anything like this, or that he plans to? Or is this just pure speculation?


I didn't declare he's done this only that it is a vulnerability of depending on those packages.


I did learn one new thing from browsing the is-odd source code: Number.isSafeInteger(n) checks that n falls within the [Number.MIN_SAFE_INTEGER, Number.MAX_SAFE_INTEGER] interval.

  ...
  if (!Number.isSafeInteger(n)) {
    throw new Error('value exceeds maximum safe integer');
  }
  ...


Shouldn't this simply return true instead of throwing?


Often the is-even package will be imported by another (less ridiculous) package of the same author.


the bigger joke is that programming language have no built in


Huh? Did you miss the modulo operator?


It is not just modulo, it also test for whether the input is a number or integer. It also has one dependency, is-number. And isOdd(“3”) returns true.

Basically it helps dealing with all the typing problems of Javascript, and also fitting into functional programming paradigms.


If you need to know if a number is even or odd then you should convert it to an int natively anyway.


Given the lack of integers and my hesitancy to trust modulo for non-integer variables, I don't know if I would trust it. You would need to add some safety checks, but either you create an is_even/is_odd function that has safety checks, or you have to rely on developers adding in the checks anytime the number might have been in close proximity to a floating point number.

Something as simple as this can end up being neither even or odd.

    (0.1 + 0.2) * 10


It wouldn't have 200k downloads a week unless there was a large number of programmers who didn't know how to do it.

I mean we're talking FizzBuzz secret sauce here. This must be total black magic for a catastrophic percentage of programmers


I have not read the source but I had always assumed that this was the lovingly crafted effort of someone who is intimately familiar with the js standard making sure that some hypothetical expression like ![1] is neither odd nor even. Surely the idea that modulo is beyond developers is too horrifying to contemplate.


Here you go:

    /*!
     * is-odd <https://github.com/jonschlinkert/is-odd>
     *
     * Copyright (c) 2015-2017, Jon Schlinkert.
     * Released under the MIT License.
     */
    
    'use strict';
    
    const isNumber = require('is-number');
    
    module.exports = function isOdd(value) {
      const n = Math.abs(value);
      if (!isNumber(n)) {
        throw new TypeError('expected a number');
      }
      if (!Number.isInteger(n)) {
        throw new Error('expected an integer');
      }
      if (!Number.isSafeInteger(n)) {
        throw new Error('value exceeds maximum safe integer');
      }
      return (n % 2) === 1;
    };
It does some checking the `value` is an integer in the safe range, which doesn't even seem right to me. Why shouldn't you be able to call this on integers outside the save range?


(10e15 + 1) % 2 === 0


All non-safe integers are even, yes?


Sad but true. For JavaScript these kind of functions can actually be useful because of all the quirks. If that was the GPs hint then I can understand.


A more catastrophic number of programmers a) dont know how dependencies work and b) think everyone else are idiots.


Perhaps, I solidly understand how dependencies work and in this case my observation is defensible

Someone made the decision to use that and someone thought using stuff made by a person's who makes those kinds of decisions was a good idea and so on.

You can git blame dependencies all the way down and research the parties involved. I've done it, built tools for it even.

A stack of people who make bad decisions doesn't make good software.


More likely the author made another, more useful, package that uses it.


My favorite is the existence of both https://github.com/jonschlinkert/ansi-gray and https://github.com/jonschlinkert/ansi-grey that do the exact same thing.


Another classic from jon is the venerable https://www.npmjs.com/package/isobject coming in at 30,000,000 weekly downloads for

`function(val) { return val != null && typeof val === 'object' && Array.isArray(val) === false; }`

But honestly, having seen "TypeError: Cannot read properties of null" enough times, I give it a pass.

https://npm-stat.com/charts.html?author=jonschlinkert paints a pretty crazy picture


Seriously fuck this dude leeching on the open source community.

He's not even a technical guy but has a background i marketing and is directly trolling various Github issues :

https://news.ycombinator.com/item?id=28661094

I've always wondered why node-modules and npm required such an insane amount packages so quickly, and now i know why, people like him that use their 10.000 ridiculous packages to boost their career or do whatever self serving community destroying thing they can think off that day.

There really should be a way to ban people doing this shit.


It's more a statement about the community than him imo. Or maybe about what npm allows to be published. Nobody forced you to use is-odd, yet it's downloaded 200k a week, why? Because js' developer community doesn't know any better


Maybe someone needs to do a website in the same spirit of https://youmightnotneedjquery.com except youmightnotneedjonschlinkert.


Notice that most dependents of his tools are: 1) his own project, 2) abandoned beginner projects.

As some people mentioned, most of the usage of is-even is by tools made by this person.

But the other part is that, he made quite a few development tools for beginners that scaffold new projects and pull lots of his packages as dependencies (especially the handlebars helper), which heavily inflates the number of "Dependents" in NPM.

The other issue is that, in the past, he managed to include some of his less-useless packages in some semi-popular tools.


A lot or these downloads may be CI systems running automatic tests.


They clearly are, but this still means a not negligible amount of softwares have those packages in their dependencies.


Exactly right. Its clever way of deflecting blame that whole community should share by piling on this particular person.


This really sounds like victim-blaming. The community is vulnerable to somebody publishing asinine modules, which should be addressed. However, this individual is still the perpetrator.


Isn't the post you linked to about a different person?


I mostly write in languages without package managers so it is possible that my expectations are just wrong, but is there no way of showing your dependency graph when using npm?


Here are some popular ones. Now, in two of these cases, they can be dev-dependencies, but even there, it makes me angry, since all of those deps have to be downloaded to my machine, and can run any random code they want at install time (which is why Node projects should always be run in some kind of sandbox).

https://npm.anvaka.com/#/view/2d/jest

https://npm.anvaka.com/#/view/2d/tailwind

https://npm.anvaka.com/#/view/2d/aws-sdk


> can run any random code they want at install time (which is why Node projects should always be run in some kind of sandbox).

You can install with the --ignore-scripts flag. Or set the option globally in your npm config file.


> There really should be a way to ban people doing this shit.

???

There is. It's called "doing nothing."

It takes work to add a dependency to your project; they don't spring out of nothing.


The author hasn't forced you to import their package. What's there to ban apart from maybe polluting the common namespace?


That's hilarious. The best part is that he didn't even try to make good packages (not that hard) but just took the lazy route.


Well one reason could be this software proverb: Good programers are lazy programers.


It's weird to me that in every other comment thread on Hacker News, people say "good software should do one thing and do it well". And then the topic moves to NPM, and suddenly everyone loses their mind?


A little copying is better than a little dependency.


This is part of the UNIX philosophy, but sometimes this gets taken way too far. There is of course the UNIX command "yes", which literally just prints the character "y" over and over again (to skip interactive prompts).


GNU yes will actually repeat any string—the default is of course "y". So it's very handy for scripting, and not really as single-purpose as you'd think.


Yeah it's definitely useful. Still a bash one liner though:

  function yes { while true; do echo "${1:-y}"; done }


also, it's actually faster than reading /dev/zero if you need some filler data hahahaha


Haha ok do something substantial! It's implied.


Achieving supply-chain security by controlling every link in the chain. This Jon is very noble.


Packages should specify their entitlements.. network access, file access, executing system calls, being able to monkey patch things, accessing packages outside their own.


Deno has a few of these by default an application has no access to the filesystem, env or network.

You can allow only reading or only writing for files and you can also define what domains are allowed.


Sure it's a huge step forward. But that's on an application level, which honestly I can do with sandbox-exec on macOS (see below).

In Deno, all permissions would still propagate down to the dependencies.

  (version 1)
  (deny default)
  (allow file-read*
    (subpath "~/Downloads")
  )
  (allow network-outbound
    (remote tcp "localhost:80")
  )
or

  (version 1)
  (allow default)
  (deny network*)
or

  # start an airgapped shell, and play around in there..
  sandbox-exec -p "(version 1)(allow default)(deny network*)" bash


There have been a few GitHub issues open but from what I can tell it's in the "not technically possible with V8 VM" category


Realistically I think they long ago should have been banned from npm, with all submissions deleted.


So what excuse do the people installing these packages have?


I'm relatively new to the world of node. Is there anything objectively wrong or should I say nefarious with that this Jon person is doing? I guess, what's the issue here, from your perspective, if he's creating package (that albeit are simple) but some some amount of utility?


In my view it is objectively wrong to create trivial npm packages yes. If we look at the npm ecosystem as a commons, that person is polluting it. Of course you could say it's namespaced to one account, so what's the harm? In my view, plenty:

- package searches will show these packages due to the inflated usage from transient deps.

- installs are slower due to the package noise.

- increased attack surface when they are used

- cultural normalization of throwaway packages

Probably more.


Each dependency causes work for any serious use. You got to check license, got to check for updates, risk supply chain attacks (package disappearing, package replaced with bad code, ...) etc. which causes longer term cost.

In addition abstraction of trivial checks, makes it harder to see the limitations of said routine. How well does it work on numeric strings? How well on large numbers where float properties cause issues?


This is the first time I'm hearing about this guy, but not sure why there's so much hate.

It looks like he came from a non-technical background, and is trying to make the language more noob friendly.

Compare the pair:

if isEven(n) { }

if (n % 2) == 0 { }

I guarantee you my wife would have no idea what the second snippet even means.


There's an argument to be made for writing functions like `isEven` and using them instead of n%2. There's an argument that the JS standard library, or other comprehensive util libraries like Lodash or Underscore should include these functions.

The problem here is introducing separate dependencies for each of these tiny functions. Dependencies are code that you haven't written, but are still your responsibility. For a lot of things, that's a good tradeoff: if you don't have the expertise in a specific area, or if you can offload work to a dependency that you trust, that's great. But for micro dependencies like this, it's usually a bad deal - you don't get anything in return (seriously, how hard is it to write your own isEven function?) but you have to rely on a third party to be secure, to not push anything accidentally broken, to not change the API, etc.

(I think it's also worth pointing out that your wife is not a paid programmer. Software development should be accessible, but this isn't the only goal, and I think it's reasonable to assume that most programmers either understand the n%2 idiom, or know enough to be able to find help on the subject.)


Most professions require some sort of degree or license before you get to practice it. If your wife wants to develop software, she should learn the basics. Especially when her code is included in larger projects and ecosystems.

I think the criticism comes from the fact it is hard to avoid driving over badly designed bridges or avoid wasteful dependencies in the Javascript ecosystem due to the way the package management is organised, and it is felt this specific person contributes a lot to that problem.

I have no knowledge of this person, but I often avoid Javascript and NPM for exactly that reason. I'm hopeful of Deno though to fix some of that mess.


The hate is deserved. This person lacks ethics.


if !(n & 1) {} // rightmost bit(LSB) is always `true` for odd integers


For code golf purposes, ~n&1 works fine


// FIXME: This breaks on negative numbers on ones-complement machines


You may not like it, but this is what 10x performance looks like


It seems very uncharitable to describe someone as a “spammer” because their philosophy on the proper size of units of code reuse is different from yours.


It seems overly charitable to me to frame the dispute in those terms. You and I might differ as to the appropriate amount of vermouth in a martini or pineapple on a pizza and I won't worry much about it, within reason.

There's a limit though! I think up to about 10% pineapple by weight is reasonable for a Hawaiian, if you choose 0 or 20% then we'll have no issue. If you went to 30% I don't think I could stop myself writing a libellous remark on hn. Anything about 50% and I would be morally forced to denounce you to the proper authorities. About 80% is where the nightmares begin.


At a certain point, isn't it just grilled pineapple with pizza crumbs?


Even if you think having a dependency load `is_even()` is a good thing, surely you can see how it's pretty hard to defend having a completely separate and nearly identical `is_odd()` rather than just `!is_even()`


The argument is that non-programmers like his wife don't understand e.g. `!`.


I think if you're installing npm packages and writing code you're a programmer whether you want to be or not.



Amazingly, following “don’t repeat yourself” in its purest form, is-even depends on is-odd:

  /*!
   * is-even <https://github.com/jonschlinkert/is-even>
   *
   * Copyright (c) 2015, 2017, Jon Schlinkert.
   * Released under the MIT License.
   */

  'use strict';
  
  var isOdd = require('is-odd');
  
  module.exports = function isEven(i) {
    return !isOdd(i);
  };


If this code runs on hundreds of millions of phones and computers, npm seems horrible for the environment! Not only because of runtime, but because we’re talking so much just to host and serve these modules!


These packages are performance art, and shall not be judged by efficiency metrics.


I bet a dollar you started writing "performance" and then rewrote it as "efficiency" :)


Good guess, but sorry :) You can pay at https://github.com/sponsors/fritzo?frequency=one-time


That’s right. Npm and the entire node ecosystem is just terrible. Specifically because you can get a production ready package from even Google, and you know it’s going to have some random useless package in the chain somewhere.


I haven't heard about him but checking the source tree of our 2 front-end apps and his 'is-number' package (which his is-odd depends on), seems to be imported by quite a few other packages.

Now looking at the source, that package may make sense if figuring out whether something is a number type in JS is really that cumbersome. (Though I'd expect that there is a more generic package that covers the other built-in types as well.)

Also since isNumber treats strings that can be converted to a number, a number, it can yield weird results since adding two strings will naturally just concatenate them. So e.g.:

    const a = '1'; 
    isNumber(a); // returns true
    const b = a + a; // Now you have a string in b: '11'

Of course, it's standard JS stupidity (and 2*a would be 2, and 1+'1' and '1'+1 would both be '11'), but then maybe stating that '1' is a number is not the right response. However, the package was downloaded 46 million times last week and that seems to be so low only because of Christmas. The previous weeks averaged around 70M. And most of these are dependencies, like in our projects, I'm sure.


I did a nullll package [1] whose only purpose is to export a null and takes 400MB in memory but somehow it got flagged in HN [2]. With 41 stars on github and 100% test coverage [3], it was clearly production ready.

[1]: https://github.com/mickael-kerjean/nulll

[2]: https://news.ycombinator.com/item?id=17072675

[3]: https://github.com/mickael-kerjean/nulll/blob/master/test.js


That try-catch block[1] is a thing of beauty.

[1] https://github.com/mickael-kerjean/nulll/blob/master/index.j...


Shame it was flagged. Would've been handy for a project I was working on. Had to work around not having a performant null


Actually it’s not good enough there because JavaScript numbers are f64 rather than u32. Even if you only support the safe integer range (beyond which the answers become somewhat meaningless), that’s 2⁵⁴, more than four million times as large as 2³². Not sure how big the machine code would be, but I’m guessing it’ll just add 4 bytes (40%) to each case, so you’re up to something like 224 exibytes. And that’s when you’re being lazy and skipping the last ten bits’ worth. If you want to do it properly, multiply by another thousand. Or maybe a little less, I haven’t thought too deeply about NaN patterns. Or maybe just infinite if it supports bigints.


But you can create a self-modifying function that, ‘for speed’, adds each case when it is called.

The initial function should start with checking for special cases NaN, infinities, 0 and -0, then do (may not be valid JavaScript)

  if((x++ === x) && (x—- === x)) printf("even\n");
to handle cases over 2^54 or below -2^54, then do

  isOdd = true
  neg = x
  pos = x
  while(true) {
    if(neg++ === 0) break;
    if(pos-- === 0) break;
    isOdd = !isOdd
  }
to determine whether x is odd or even in isOdd. Doing the self-modification is left as an exercise.

This is good defensive programming. It avoids doing a tricky division by 2.0 that might be buggy (I don’t think https://en.wikipedia.org/wiki/Pentium_FDIV_bug was affected, but who knows what other FPUs do?)


But we can probably compress the data very well (e.g. "double delta" is always zero), and then decompress only needed parts.


This still requires reading through it all in memory at run time. Maybe we can optimize it with a JIT -what if the package called the ChatGPT api to get the the python code that generates the machine code just for the number queried?


The funniest (or best part) about those two packages is that is-even has a dependency to is-odd and does just negate the output of is-odd.


It’s true, the great thing about clean, reusable, modular code like this is that you can compose both of these packages to make a is-even-or-odd package.


Well first you need to obviously build an OR package, part of your suite of logical operator packages, all depending on your battle-tested, high performance XOR package.



Groundbreaking!


If would be even funnier if is-odd would simultaneously do the same thing in reverse.


Implementing the comparisons by hand seems difficult and primitive. May I suggest introducing some helpful sub-packages and building the solution on top of those. For instance, is-odd would be implemented by using is-one, is-three, is-five, is-seven, etc.


and is-one should be implemented by negating is-two, is-three, and so forth


I read some text where a cheme was implemented such that numbers were `2 = 1+1`, `3 = 2+ 1`...



>Succ Peano

...


The standard joke about my university's grad-level programming languages course is that its formalization unit does a great job of identifying undergrads by discussing Peano naturals and Hoare logic in the same lecture...


Perhaps Nock:

https://developers.urbit.org/reference/nock/definition

> The reader might wonder how an interpreter whose only arithmetic operation is increment can ever be practical.

> The short answer is that a Nock interpreter doesn't have to use the algorithm above. It just has to get the same result as the algorithm above.


Theoretically only one 40gb file would be needed. AFAICT almost every odd number is not even (I cannot say "all" bc I have not checked every integer yet).


That's what unit tests are for, right?


I did something similar six years ago called gg-flip for npm - https://github.com/avinassh/gg-flip

It was submitted in HN too - https://news.ycombinator.com/item?id=16194932


I was wondering what the code looked like...

    'use strict';
    
    const isNumber = require('is-number');
    
    module.exports = function isOdd(value) {
      const n = Math.abs(value);
      if (!isNumber(n)) {
        throw new TypeError('expected a number');
      }
      if (!Number.isInteger(n)) {
        throw new Error('expected an integer');
      }
      if (!Number.isSafeInteger(n)) {
        throw new Error('value exceeds maximum safe integer');
      }
      return (n % 2) === 1;
    };


I mean... when you look at it like that, at least it's got the error checking integrated as well... tho the fact it pulls in is-number is fucking hysterical lol



I'd fire someone who imported those packages.


Interesting that there are so many more weekly downloads for is-odd. This seems to suggest that there are many more odd numbers than even.

(Alternatively, perhaps odd numbers are actually much rarer, leading to them having a higher market value, and thus more interest in discovering them.)


The isEven package uses the isOdd package…


I wonder what will happen if the author of isOdd optimizes their code by using the isEven package


The implementation of is-even is mind blowing (numbing?).

    'use strict';
    
    var isOdd = require('is-odd');
    
    module.exports = function isEven(i) {
      return !isOdd(i);
    };
good thing it declares its dependencies properly!


Big fan of code reuse. Small packages providing single unique functionality is the way to program modern systems. It has to be right if bright minds in NPM / Rust community follow this approach.


Even better if it built-on-install by executing another script to generate everything (naturally using %) New nerd game: enshitification inverse golf.


Every time I see these packages mentioned, I’m reminded of an interview with Joe Armstrong. In it, he said, (paraphrasing from memory) “Wouldn’t it be interesting if a language had a global, flat registry of functions that anyone could pull from and contribute to? That way, bugs are fixed in one place, there’s less reinvention of the wheel, things get more correct over time, and programs just become a simple composition of standard functions.”

I may be misremembering his meaning, but I remember thinking it was an interesting idea. It wasn’t obviously a terrible idea. I thought it would be like the Clojure standard library on steroids, especially if it was coordinated and vetted by a decent team.

But alas, NPM has proven it otherwise.


I don't think NPM has proven that idea infeasible, just that it may not be a good idea to depend on third-party content that may change under your feet, on such a fine-grained level.

But have a look at the Unison language https://www.unison-lang.org/docs/the-big-idea/ , that has such global registry but addresses each function by a hash of its syntax tree, and thus sidesteps the issue.


How are bugs "fixed in one place" that way?

You either accept updates of third party packages, with little to no vetting of your own, and get fixes "for free" or… you don't.

There's no middle ground: the only way to save effort is to trust others.


NPM is a worse idea than Joe's as Joe was talking about a single community vetted monorepo. Not a free for all of individual repos.

With Joe's idea everything is up front and part of the language and not a bulletin board of packages near the checkout line at the git supermarket. This way simple stuff like isint() can make its way into the languages official standard library. This should eliminate the uncertainty of 3rd party packages maintained by a random number of individuals that can be taken down or tainted at any time.


I wonder if it might be practical to bootstrap such a thing off of Wikifunctions, with a process to vouch for a function as "important enough to merit inclusion" and tooling to ensure that the Wikifunctions implementations actually agree, plus something to synthesize/translate a first-pass implementation in the desired language?


NPM doesn’t capture the full benefit of an open registry of functions because, while anyone can fork and create an alternative @christophilis/is-even or @divbzero/is-even, there is no good way for developers to pick the best version of is-even.

Maybe if NPM required all package maintainers to use a namespace, the global is-even package name could resolve to the most-used fork.

So for an import like:

  import isEven from 'is-even'
You could install a specific fork:

  npm install --save @christophilus/is-even
Or default to the most-used fork:

  npm install --save is-even
In both cases, package.json would only contain namespaced dependencies and npm outdated could flag if a different fork is more commonly used.


This Joe Armstrong quote sums up NPM:

"You wanted a banana but what you got was a gorilla holding the banana and the entire jungle."

(originally on object oriented programming)


To be fair, for all its pitfalls, NPM does provide a limited version of that idea to millions of codebases.


the idea sounds great, but there would have to be something in place to prevent it from degrading into a Wikipedia-style turf war, with people trying to protect "their" code against changes...


But of course WebAssembly is the future. So we should reimplement it in WASM.


why though? this is exactly what databases have been invented for! One could simply store a mapping of numbers to their classifications as 'even' or 'odd' in an SQLite database. This approach has the added benefit of not requiring program updates whenever a number's classification changes from odd to even.


Databases still have to be maintained and updated. You're better off setting up an Ethereum contract with proper economic incentives for others to act as an oracle and return the proper answer at any given time.


Anyone want to try to figure out the gas costs for deploying 30GB worth of smart contracts? xD


Yes, this seems like the kind of thing that should be in Wikidata. Then you also don't have to keep the database locally and can simply do a quick HTTPS request. (The only problem might be if TLS itself needs an even/odd function, but it probably doesn’t.)


I like this option just because then it makes it easy for anyone to update the data when future changes arise.


> The only problem might be if TLS itself needs an even/odd function, but it probably doesn’t.

Doesn't sound like its ready for cloud scale.


    CREATE TABLE even_or_odd (
      is_odd NUMBER(1);
      is_even NUMBER(1);
      is_zero NUMBER(1);
      is_one NUMBER(1);
      is_two NUMBER(1);
      is_three NUMBER(1);
      -- ...
    );
    INSERT INTO even_or_odd (is_odd,is_one) VALUES (1,1);
    INSERT INTO even_or_odd (is_even,is_two) VALUES (1,1);


A better design would look like this:

    CREATE TABLE numbers (
      minus_four_billions_two_hundred_ninety_four_millions_nine_hundred_sixty_seven_thousands_two_hundred_ninety_six_is_even BOOLEAN,
      ...
      zero_is_even BOOLEAN,
      one_is_even BOOLEAN,
      two_is_even BOOLEAN,
      ...
      four_billions_two_hundred_ninety_four_millions_nine_hundred_sixty_seven_thousands_two_hundred_ninety_six_is_even BOOLEAN
    )
That design can store multiple versions of the data, making it more resilient to future changes in the even/odd property of numbers.


Correct, but you'd certainly want to use an XML database.

Not only it helps with portability of the data, it also keeps it in a human-friendly format should the need to inspect it by hand arise.


XML is not web scale though, if you want to serve lots of users it has to be JSON in MongoDB.


AWS's Elastic Cloud Parity already provides this, and is far more scalable.


This is one of the most entertaining articles I've ever read here. He should put the source code online so ChatGPT can "learn" from it.


That would definitely violate his strict license:

    /* Copyright 2023. All unauthorized distribution of this source code 
       will be persecuted to the fullest extent of the law*/
And with such elegant code, who can blame him?


To what extent does the law allow persecution?


OpenAI see.

OpenAI ignore.

OpenAI train.


LOL. "persecuted" -- I just noticed this!


The joke is completely lost on me. I don't even mean the person who did this — why not, after all. 1198 upvotes ATM is what confuses me.

Lookup tables for computable values are not novel, nor are they a joke. This actually is a solutions for the time/memory tradeoff, as author well knows. The problem at hand is absurd, but quite primitive, so there was absolutely no doubt this can be done. No real measurements were performed, apart from the observations that it took about 10 seconds to chew through 40GB program on his computer. So what did we learn? That exe files cannot be more than 4GB? That a program with 2^32 ifs is about 300 GB? Why 1198 people found this interesting?

Maybe I'm spoiled, but unlike "Hexing the technical interview" or some SIGBOVIK stuff, this doesn't seem crazy, just pointless.


The joke is that he actually did it. People have been joking about it for decades. But, the crazy bastard did it for real. It’s so extreme that no compiler could handle it. Not even any known assembler. So, he had to generate his own machine code binary to make it work. But, it works! Nutz!


> Lookup tables for computable values are not novel, nor are they a joke

It's been ages since I've done anything low-level, but I don't think 4 billion if-statements are compiled to a "lookup table" when optimizations are turned off. Each and every if statement would be evaluated in order to determine if it matches the input. This seems to be supported by OP's program outputting in much less time for small numbers than for large numbers--since the small numbers appear earlier in the code.

A 4 billion case switch statement on the other hand, I would expect to compile to some kind of lookup table. Though when the data type is an unsigned int even then I'm not sure what the compiled code would look like without optimization.


Would a compiler be smart enough to generate a big table for this?

I wonder if it’s just so big the optimization pass would give up, or perhaps makes tons of tables of 256 entries each.


Optimization was explicitly disabled by compiler flag for this experiment


Sometimes people do things to be funny.


My understanding was that it was a parody of these blog posts satirising the pointlessness of pushback against conventional wisdom. Rather dry.


This is amazing technology. You should sell it to AWS so they can offer an Enterprise-ready AWS EvenOrOdd API for everyone who does not know how to host the 40GB executable properly. With the power of the cloud this programme would be unstoppable


It’s just begging to be a Lambda function.


I'm surprised no one has chimed in on how the program is "processing" 40GB of instructions with only 800 MB/s * 10s of disk read.

If I had to hazard a guess, there's some kind of smart caching going on at the OS level, but that would entail the benchmark of "n close to 2^32" not being run correctly.

...Or that the CPU is smart enough to jump millions of instructions ahead.


> my beefy gaming rig with a whopping 31.8 GB of memory

So a rerun of the program should only need to load ~8GB if the filesystem caching is somewhat loop/scan resistant.

My first thought was "probably the math is wrong", but looks like it adds up to something reasonable - especially as all numbers are rather vague / rounded (e.g. let it be 12s) and the number was just high, not absolute maximum.


I guess this would be the expected, though boring, answer -- "incorrect" benching (i.e. clear the cache first).


My guess is either compression or stuff lingering in RAM. The CPU can't be smart here since it doesn't know what any of the future ifs will be. It doesn't know they're in order, or unique, or even valid instructions. You could (theoretically; the OS probably wouldn't let you) replace an if with an infinite loop while the program is running.


Could it be the branch predictor of the CPU somehow catching on to this? Perhaps something like 'the branch for input x is somewhere around base+10X.


> Perhaps something like 'the branch for input x is somewhere around base+10X.

That's unlikely. Branch predictors are essentially hash tables that track statistics per branch. Since every branch is unique and only evaluated once, there's no chance for the BP to learn a sophisticated pattern. One thing that could be happening here is BP aliasing. Essentially all slots of the branch predictor are filled with entries saying "not taken".

So it's likely the BP tells the speculative execution engine "never take a branch", and we're jumping as fast as possible to the end of the code. The hardware prefetcher can catch on to the streaming loads, which helps mask load latency. Per-core bandwidth usually bottlenecks around 6-20GB/s, depending on if it's a server/desktop system, DRAM latency, and microarchitecture (because that usually determines the degree of memory parallelism). So assuming most of the file is in the kernel's page cache, those numbers check out.


I doubt it, branch predictors just predict where one instruction branches to, not the result of executing many branches in a row.

Even if they could, it wouldn’t matter, as branch prediction just lets you start speculatively executing the right instruction sooner. The branches need to be fully resolved before the following instructions can actually be retired. (All instructions on x86 are retired in order).


There's also anticipatory paging, the OS can guess which pages will be requested the next time around.


It can't be the CPU since it is actually memory-mapped code, and the branch predictor surely cannot cause page faults and so cannot page in the next page of code.

Truly curious. I guess the linear access pattern helps but 800 MiB/s?


Very likely the majority of the file is paged in, and can be prefetched just fine. The author doesn't clear/disable the page cache.


It mmaps the program, unused pages then only take a page table entry and are not loaded. The only page actually loaded is the one he directly jumps into. Neat trick.


It's not jumping directly, it's executing sequentially, comparing against each consecutive number in sequence.


You are right, my bad. Thanks for clarifying it.


I assume modern branch predictors are capable of picking up a trivial pattern like this.


Visionary genius Ross van der Gussom is my new favorite mythological creature.


Think of Python as a way to script C, and skip most/all compiling. If your Python is slow, you are probably doing it wrong.

Recommend reading this: https://cerfacs.fr/coop/fortran-vs-python


I was a teaching assistant in a "data structures and algorithms" course where students could choose either Java or Python. Most of the labs were the same except that the treemap lab had to become a hashmap lab for the Python version because it was so excruciatingly slow.


Not all Python is just a front-end to fast C code. That only really works with numerical computing of big tensors.

Though I'd still agree - if your Python is slow you're doing it wrong - you shouldn't be using Python!


You can also call eg Rust or OCaml code from Python, doesn't have to be C.


Of course. When people talk about calling C from Python they really mean "a fast language".


If you switch away from Python, you are probably sub optimizing


There are reasons other than speed that makes people move away from Python.

Btw, recently CPython has sped up a lot. Between versions 3.9 to 3.12 many programs run about twice as fast. Much of that improvement is thanks to the 'faster-cpython' project (which Microsoft is generously funding).

I contributed about a 1% speedup during that time, too.


If you want to see C's _true_ scripting language, check out Lua.

It can be embedded in your C application, unlike Python which is the other way around (Lua is like 80KiB, Python is several MiB).


I ran a web search to figure out if “Ross van der Gussom” was some sort of inside joke. The top two search results were OP and the parent comment.


I assume a play on Guido van Rossum the creator of python but I'm unaware of any meaning deeper than that if that's what you're looking for.


It's just a funny misphrasing of his name.


There is some conspiracy here. You claim to not know what the name means, but Google seems to think you are highly relevant to that name ... now.

(I myself, have avoided mentioning the name to avoid whatever sick latent name void you are clearly complicit in creating here!)


I feel like the whole article is an allegory of LLM development (as a critic would write it: devoting a stupendous amount of resources and 'training data' to 'memorize' the solution). I wonder if this was the author's intent?


Considering I expected it to be about LLMs from the title (I thought it would be an announcement of a new 4B model), I’m going with "probably" ;)


Glad I'm not the only one who thought that lol


Reading the headline I totally expected it to be about LLMs.


right so! Like the 40b LLM model doing for loops. It very much seems this allegory is the actual motivation behind the article… is not about engineering but the absurdity that awaits us around the corner.


I’m pleasantly surprised that nobody started the discussion on whether zero is odd or even ;-) https://en.wikipedia.org/wiki/Parity_of_zero


I find it hard to understand how someone can be confused about that. But I dimly remember talking to people with this confusion in the past.


Please don’t take this as defending this thinking. I’m just guessing at how someone could get to this confusion.

I can see how you’d get there by thinking of numbers as a thing for counting. Even numbers give you piles of two without one left over. Zero fails the first condition since it gives you no piles at all. But it doesn’t leave a pile of one, so it’s not really odd, either.

If you ever find yourself confused in this way about definitions consider: is this definition serving me? Could I adopt a different definition that’s used by people really good at this sort of thing?

(Zero is even because any integer n that can be formed by 2k for integer k is even, and that can be formed by 2k+1 is odd. 2x0=0)


I am fairly certain I can split an empty pile into two empty piles, without a remainder.


Hmmm by definition an empty pile is not a pile though, is it? So there's nothing to split?

;-)


And an empty plate is not a plate, also by definition? Okay, sure.


Ah I suppose if you see an "empty pile" as still a vessel with nothing on/in it, your point stands.

But a plate is its own thing, in addition to what goes on it. A "full plate" is full of what? And a full plate is the plate plus whatever is on it, not just the stuff on it.

I think of a "pile" of stuff as being it's own thing, that being the pile itself. An empty pile is the absence of the thing.

That was my interpretation, but I see what you meant. :)


Hmm, I think that's how normal people approach the question (implicitly), and that might be how they think zero is not an even number.

At least this would explain their reasoning in terms I can understand.


well that's odd


Yes, it all boils down to definitions. Definitions that have 0 as an even number work well in math.

For most people's daily life, it doesn't really matter which category zero would fall under (as they never really eg consider dividing zero items evenly between people.) So their (implied) definitions can be all over the place.


>Even numbers give you piles of two without one left over. Zero fails the first condition since it gives you no piles at all. But it doesn’t leave a pile of one, so it’s not really odd, either.

This definition does not naively extend to negative numbers, which are anti-piles of things. You are right, of course, about definitions serving their purpose. In this case maintaining the symmetry of alternation requires 0 to be even, and that argument could even be extended to negative numbers. (Of course other numbers, like the rationals and reals, are pure fiction and can safely be ignored negative or otherwise.)


> (Of course other numbers, like the rationals and reals, are pure fiction and can safely be ignored negative or otherwise.)

I think natural numbers are already pretty fictional.

Btw, you might want to look into p-adic numbers.


Your defense boils down to people having an insufficient understanding of zero as an ordinary number, and still clinging to the concept that zero means "nothing" or is otherwise magic.

This hints at a failure of math education.

As an analogy, many (usually, but not always, weaker) programmers still have magic ideas about booleans and comparison operators, and write nonsensical stuff like if (a == true). When you ask them, it's invariably that in their mind, there's mystic connection between comparison operators and if statements.


> Your defense

I literally… never mind.


Ha ha, I didn't even realize this when I typed the answer. I hope it was clear that I didn't mean to imply this was your personal opinion.


It wasn’t clear then, but I can certainly see how you’d mean it that way. It was “mine” as in “I brought it”.


> When you ask them, it's invariably that in their mind, there's mystic connection between comparison operators and if statements.

It doesn't have to be mystic. It's perfectly fine to design a language that works like this. It wouldn't be a good language, but it would be possible.

Just like PHP didn't use to support constructs like `f(10)[2]`, that used to be a syntax error. So you needed something like `x = f(10);` first, before accessing `x[2]`.

If you saw that kind of construction with the intermediate variable, you might also accuse the programmer of imagining mystic connections.


Matlab still doesn't handle f(10)(2)... (The _(_) syntax is both function calls and indexing there.)


You are giving me flashbacks to memories I thought I had successfully repressed.


It's apparently confusing enough that if you ban odd license number plates then the cops don't want to arrest anyone whose license plate ends in 0 because they're not sure if it is even or not. At least that was the case in Paris in 1977, when they alternated between odd/even license plates every day to limit the number of cars.


Interesting. Do you have a source for that?


It looks like bans for odd/even license plates were used in Paris in 2014 [1] and 1997 [2], but not before then. However, a similar scheme was used to ration gasoline in the US during the 70s [3].

The only source I can find for the claim about police confusion is the one cited by Wikipedia [4], whose reliability I'm inclined to doubt based on the 1997/1977 discrepancy.

[1] https://www.npr.org/sections/thetwo-way/2014/03/17/290849704...

[2] https://www.wired.com/1997/09/paris-smog/

[3] https://www.npr.org/sections/pictureshow/2012/11/10/16479229...

[4] https://en.m.wikipedia.org/wiki/Odd%E2%80%93even_rationing#D...


The wikipedia article matched my vague recollection of the event, but if they got the year wrong then it may as well just be a rumour at this point.


Thanks!


One might even say it's odd.


The thing that gave me pause the first time I thought about it (it was during a test so I couldn’t ask or check) was that if zero is even, that means (despite numbers being infinite) there’s one more even number than odd numbers.

I also knew that 1 is not prime, even if logically it should be. Its definition specifically states a prime number needs to be greater than 1. So that means mathematics sometimes has exceptions for numbers inside a definition.

Given that, it’s not immediately obvious that zero would be even. It wouldn’t be odd either, that was out of the questions, but it could be neither.


> The thing that gave me pause the first time I thought about it (it was during a test so I couldn’t ask or check) was that if zero is even, that means (despite numbers being infinite) there’s one more even number than odd numbers.

Thanks to Hilbert's Hotel you can re-arrange the numbers to have an arbitrarily larger or smaller overhang of even or odd numbers.

https://en.wikipedia.org/wiki/Hilbert%27s_paradox_of_the_Gra...

> I also knew that 1 is not prime, even if logically it should be. Its definition specifically states a prime number needs to be greater than 1. So that means mathematics sometimes has exceptions for numbers inside a definition.

The definition I use is that a prime number needs to have exactly two distinct divisors. No need for any special cases in this definition.

(The motivation for this somewhat strange definition is so that factoring any positive number, including 1, into a multiset of her prime factors is unique.

If you redefine the prime numbers to include 1, prime factoring is not unique. Of course, you could still make everything work out in the end: you just need to declare that your new-prime factoring is unique up to the multiplicity of 1s.

Just like eg standard decimal numbers don't change if you add leading 0s, new-prime factoring would not change if you add factors of 1.

You can do math with almost any arbitrary definitions, if you add enough exceptions and explanations in your theorems to work around the rough edges in your definitions.)


> Thanks to Hilbert's Hotel

That makes no difference to the numbers that exist, it only changes which ones you count for the duration of that thought experiment.

> The definition I use is that a prime number needs to have exactly two distinct divisors. No need for any special cases in this definition.

You’re still making up a special case with the explicit goal of excluding a particular number. It’s a roundabout definition that requires more thought and will be lost on most people which don’t already know the regular one.


I always found it funny that this is printed as a reminder text on Magic the Gathering cards that care about even or odd values: https://scryfall.com/card/iko/88/extinction-event?utm_source...

But thinking about it, I have no doubt that Wizards found out this confusion is somewhat widespread during playtesting, and printing a short reminder was an easy fix.


I particularly liked this bit

> Not only is 0 divisible by 2, it is divisible by every power of 2, which is relevant to the binary numeral system used by computers. In this sense, 0 is the "most even" number of all.

in the arena of even one-up-man-ship, zero wins.


Is that article about the IEEE 754 positive zero, or negative zero, or both?


When I was a junior developer fresh out of college, another junior and I noticed a bank sign with the temperature showing to be -0 degrees. We made a few comments about Two's Complement and felt a bit smug in our superiority over the manufacturer of the sign. Decades later I read an article about how -0 degrees is a standard in civil meteorology for "below freezing but rounded up to zero". Though it took a while to learn, it was a good lesson in making assumptions about standards in other domains.


I asked a few LLMs whether zero is an even or odd number to see how they fared on such a widely misunderstood question:

LLAMA 2 7B: 3x even, 2x neither

LLAMA 2 13B: 1x even, 4x neither

LLAMA 2 70B: 1x even, 4x neither

GPT 3.5: 20x even despite trying to trick it

Mistral 7B: 1x even, 3x neither, 1x even+neither?!

Mixtral 8x7B: 8x even, 2x neither


It seems that it is not a matter of size or processing power, but a fundamental problem of bad training data that bakes these misconceptions into the model. OpenAI is probably using a better filtered dataset and may even be able to remove some of the more common errors through alignment and fine-tuning.

I experimented a bit more and it seems that Mixtral is better than Mistral when it comes to common misconceptions. There has probably been some improvement in the training data.


This is the kind of discussion that matters only for the very theoretical edge cases or for the pop-math inconsequential discussions

0 is even

I mean, it is the first phrase on the linked fine article:

> In mathematics, zero is an even number.


Jokes aside, lookup tables are a common technique to avoid costly operations. I was recently implementing one to avoid integer division. In my case I knew that the nominator and denominator were 8 bit unsigned integers, so I've replaced the division with 2 table lookups and 6 shifts and arithmetic operations [1]. The well known `libdivide` [2] does that for arbitrary 16, 32, and 64 bit integers, and it has precomputed magic numbers and lookup tables for all 16-bit integers in the same repo.

[1]: https://github.com/ashvardanian/StringZilla/blob/9f6ca3c6d3c... [2]: https://github.com/ridiculousfish/libdivide


Clearly they should have optimized using binary search. Use nested if-else statements 31 deep. This would require using a more sophisticated comparison like < less than instead of equals. You'd get a similar amount of code, but it would execute on O(logN) instead of O(N). That would surely be the mark of an expert. ;-)


I'd like to see a fully distributed version. All you need is 4B hosts (IPV6 FTW) named N.domain.com (where N varies from 0 to 4B-1). The driver app sends N to the first host (0.domain.com). Each host compares the incoming N with their N.domain.com name; if they match, return the host's true/false value. If they don't match, forward the request to (N+1).domain.com and return the result.


N+1 may overflow; using N-1 is more robust.

Also, I think this can easily be implemented; you don’t need 4B hosts, just 4B DNS entries, and those, you can create with a single wildcard entry (https://en.wikipedia.org/wiki/Wildcard_DNS_record)


Don't give Suckerpinch any ideas


It's kind of ironic that the python generator script uses modulo. The author could have pushed it to the point of not using one.

I'm tempted to write a followup doing even crazier things...


I'm pretty sure that was an intentional joke. I do wish it'd have been a bitwise AND with 1.


Clearly the solution is to factor each number and to (recursively) test the parity of the sum of the factors. Since all prime numbers other than 2 are known to be odd, the parity test for the case of a single factor is easy. You only need to special-case 4 to avoid an infinite recursion.


Author should have perhaps bootstrapped a script that doesn't use modulo?


Surely any solution that claims to be performant would use a binary search, rather than looping through every number. Would be a little trickier to generate the code though.

Edit: /s of course


Why search? We know exactly where every number is. Use a jump table.

While this example is obviously silly because loading a number into a register and performimg some operations on it is going to be faster than a cache miss from jumping into a massive table, the general technique can absolutely be applied in the real world.


The thing is jump tables re absolutely used for this kind of thing, namely when the function doesn't have an easy closed form. It should be noted half of the joke is the fact he uses subsequent if statements in order to exhaustively check each number rather than a jump table.


I wouldn't touch the generated ifs, but would generate a b-tree database which maps integers to the address of the generated code which has the desired "if" comparison. Then the C program starts by querying the b-tree for the address and just assigns that to "isEven" pointer.

Hmm... Nevermind. A really smart way to solve this would be to store the result of the "isEven" computation directly on the b-tree, so the whole problem can be solved with a simple database query!


Or make an array with 2 billion uint64 entries all containing this constant:

0xC3C033C340C033

Then cast the array as if it was a uint32_t* add the number you want to check, cast it again as a function pointer of a function returns a int and voilà

It should take 16G of space to encode but it should require only a single page fault to load from disk.

    uint64_t a[] = { [ 0 .. INT_MAX ] = 0xC3C033C340C033 } ; 

    int is_even(uint32_t n) {
      return ((int (*)())((uint32_t *)a + n))();
    }
This uses a GCC extension. If you can't use that or if it doesn't work you can easily create an ELF object file that contains a data section with that number repeated 2 billion times.

Explaination:

That magic number is actually two 32-bit numbers juxtaposed:

The first one is 0xC340C033, which encodes the following instructions:

    xor eax, eax ; 33 C0
    inc eax ; 40
    ret ; C3
This effectively returns 1 in the eax register (which is compatible with the C calling convention)

The second number is similar but it doesn't include the "inc" instruction.

When casting as a int array we end up with an array with alternating function bodies. At even positions we have functions that return 1 (true) and at odd positions it returns 0 (false)

The interesting thing is each of those function bodies fit into a 32-bit number. A single jump to an arbitrary address would require larger array.

Of course, it's silly to do this for even/odd but it can be useful to understand how this works.

Go kids, build your own forths!

(Disclaimer: I haven't tried out any of the above and typed this on my phone while walking in the woods, so I'd be surprised if it actually works without some touches, but the gist of the idea should work)

Edit: fixes


    .global a
    .data
    a:
        .fill 0x7FFFFFFF, 8, 0xC3C033C340C033


Or make a lookup table with bits. It'd only be 512MB of 0xaa or 0x55. (but wouldn't use ifs)


Make it a microservice in Kubernetes so that it scales.


I wouldn’t be surprised if there isn’t at least one thing this insane at some FAANG.

“We’ve got to scale the maths engine cluster to handle the load!”



The actual implementation of this package is even better.

https://github.com/i-voted-for-trump/is-even/blob/master/ind...


What is even better is the fact that it has 200k weekly downloads. I thought this was supposed to be a troll package.


It is a troll package. Just a successful one.


But why do people use this? Is this a dependency in some popular library?


Don't fall for the salesman's "it was a joke" play.

Is-thirteen, now THAT is a troll package.


Well worth a quick trip to the source to see how it's implemented. After all, how would the author get around to packaging 4B if statements, like the OP?


He could package it to the size limit and range the numbers it could detect. Add a note to download particular package if the number is out of range of the package and that package is not installed.


One package should probably handle range of one million. Now it's just 4000 packages to install. You wouldn't even notice that in an average js project.


Why stop there. Create an algorithm that distributes the calculation of each unique number to a different instance. Surely that would scale even better!


Compilers will actually do these kind of tricks, I have seen them doing that for switch statements with a large number of cases.


No, a binary search will defeat prefetching. A linear scan is cache-friendly and will be much faster up to a certain point. It's called mechanical sympathy, look it up.


This metaprogramming reminds me of the post where they serialized data to ruby code to get around a slow XML parser: https://news.ycombinator.com/item?id=36311924

I’m curious if anyone has found a serious application for metaprogramming large amounts of data in a high level language like this?

So far the only application I have thought of was meta programming a series of api calls before executing them so the entire transaction can be replayed as a script.


I don't know if I'd go as far as calling it serious, but...

I have a string of 150 individually addressable RGB LEDs on a fake tree in my living room. It's controlled by an ESP32 (running ESPHome). I started generating little animations for different times of year (in the form of ESPHome light effects). I ran into a few problems:

- The edit/compile/flash/debug cycle was painfully slow.

- Expressing the animations in C++ sapped inspiration.

- Some parts of the animations are more computationally intensive than others, so getting consistent "frame rates" was difficult.

That "frame rates" bit made me realize these are just little videos with 150x1 pixel resolution. Now I write the animations in Python on my laptop and that code has two backends. One renders to the screen for quick iteration on the animations. The other plays one cycle through the loop of each animation and emits the results as C++ arrays and some variables (e.g., name, desired frame rate, etc.). Then the only other C++ code needed is the logic for passing the array data to the LED library.

There is no compression or anything else fancy. I can fit about a dozen animations with 100 to 200 frames each on a $4 microcontroller with Wi-Fi. They start almost instantly when applying power, and they play at a very consistent frame rate. The highest I have tried is 60 fps and that is no problem.


I quit iterating on my animations (led strip based bedlight with ESPHome/ESP32) due to the exact same frustrations (and that writing c++ in a yaml file is not really great). Care to share the code for your approach?


A while back I was reverse engineering a mobile game that used a completely custom serialization format for game data, most of it in one giant "config" blob several megabytes in size. I'd written a parser and serializer for it in Python, using custom classes to represent the various nested object types, but I didn't have a convenient way to explore and edit the data.

So, I wrote some code to parse the giant blob and emit a python source file, consisting of a whole bunch of nested class constructors (with named parameters etc. making it quite readable), and a final function call to serialize the whole lot back into a blob. Now I could edit that source file and run it, and have my source-level edits reflected in the output blob (which I could subsequently feed back into the game engine to see the results of my changes).


Note: this is essentially what the AWS CDK does.

Yet another example that configures is code is configuration is code is…


https://en.wikipedia.org/wiki/X_BitMap and https://en.wikipedia.org/wiki/X_PixMap come to mind, but they usually were written by an image editor, not by a program specifically written by the programmer who needed such code in their program.


Somewhat pedantic here, but would the authors use of python to generate C code be considered metaprogramming as opposed to code generation? I thought metaprogramming typically involved writing code that can alter/augment itself, like new syntax.

Personally, I have found a lot of use in code generation, for example using RTKQuery's codegen tool that accepts an OpenAPI schema and outputs a JS client.


I would argue that code generation is a form of metaprogramming, no opposition here.


Protocall buffers are effectively code generation and are central to Google's tech stack. Does that count?


That reminds me of my first program ever as an 11 year old. Not only did I not know about modulus, I didn't know about less than or greater than. The program picked bingo numbers. So there were 5 lines that looked something like `10 IF n=1 OR n=2 OR n=3 OR n=4 OR n=5 OR n=6 OR n=7 OR n=8 OR n=9 OR n=10 OR n=11 OR n=12 OR n=13 OR n=14 OR n=15 THEN l="B"`


At 9, I tried to make a game that combined SimCity 2000 and Transport Tycoon Deluxe - my favs at the time. I used MFC(Windows) because my parents got me a big black book about it. My game was a simple train simulator with four trains. Each train moved turn by turn to its destination from its origin. Each train had to finish its move before the next train had a chance to move.

I wrote it in C++ but didn't know much about managing the game's state or objects. But there was some fancy acceleration code in there.

You "played" the game by modifying the hardcoded x,y coordinates in the code and admiring how the trains moved on screen.

I thought real games used multi-threading to move each train at the same time. When I tried it, the game broke, probably because of UI updates from other threads.

So I never quite built my game. But this project, with all its problems, got me into software development years later.


If your programming language of choice has no arrays (like most of the languages implemented in one of the many, many, many "How to write your own interpreter/simple compiler" tutorials out there) but supports variables, then you can use this technique to kinda implement them anyway, Okasaki-style:

    var arr0, arr1, arr2, arr3, arr4, ..., arr65535 : int;

    func get_arr(index: int): int {
        if index < 0 or index > 65535 { exit(); }
        if index < 32768 {
            if index < 16384 {
                if index < 8192 {
                    ...
                        if index < 2 {
                            if index < 1 {
                                return arr0;
                            } else {
                                return arr1;
                            }
                        } else {
                            if index < 3 {
                                return arr2;
                            } else {
                                return arr3;
                            }
                        }
                    ...
                }
                else {
                    ...
                }
            } else {
                ...
            }
        } else {
            ...
        }
    }

    proc set_arr(index: int, value: int) {
        if index < 0 or index > 65535 { exit(); }
        if index < 32768 {
            if index < 16384 {
                if index < 8192 {
                    ...
                        if index < 2 {
                            if index < 1 {
                                arr0 = value;
                            } else {
                                arr1 = value;
                            }
                        } else {
                            if index < 3 {
                                arr2 = value;
                            } else {
                                arr3 = value;
                            }
                        }
                    ...
                }
                else {
                    ...
                }
            } else {
                ...
            }
        } else {
            ...
        }
    }
It even has O(log n) access time which is actually the same complexity as accessing a RAM by a pointer anyway! Just replicate this code for any array variable you need to simulate and pre-allocate enough of the memory, it's totally free with that .bss trick anyhow.

So technically, languages without arrays are about as Turing-complete as those with limited-size pointers/array indices (funnily enough Brainfuck — where the pointer is invisible to the progammer — actually is Turing-complete).


That's because array, or even if/else statements are already higher-level abstraction. To be Turing complete you simply need: a memory that can be read from or written to, and the ability to "jump" or "move" to some memory location.

Turing complete may sound big, but almost anything that can compute is Turing complete.


You also need an unbounded storage. Technically, if you have unlimited-width integers then two variables is all you need, but most languages have limited-width integers which makes them finite-state automata.


Why two variables? Can't you always encode two arbitrary-width integers into one arbitrary-width integer?

For example, using 0b0001 as a split character, to end up with the pseudocode left, right = memory.split('0001') and then encoding everything else without using any 0b0001 pattern. It might be awfully complicated, but you can always use a fixed but arbitrarily large part of this arbitrary length integer as scratch space to implement this encoding.

Edit: for an easier time, put one integer in all the even-indexed parts of the binary number, and the other in all the odd-indexed parts of the number.


Technically all computers in our universe aren’t, but we don’t make a fuss about it. In theory they aren’t complete, but in practice they are. Wonder if there’s a theorem which links the maths with physics in an intuitive way. Thermodynamics sort of do, but entropy isn’t that intuitive?


You can have e.g. Python (lists and bignums) implementation that would acquire more and more storage as needed: first from the RAM, then from disk, then doing fancy cloud Memory-as-a-Service requests, then commissioning building more datacenters, then branching out to the other planets etc. until it runs into the finiteness of the universe, all the while providing the abstraction of lists and numbers that can grow without hard limit (although with growing access latency). The practical limit is determined entirely by the things extraneous to the computation model itself.

On the other hand, if you have e.g. C, then your program is in principle can not have an array with indices larger than SIZE_MAX, period — which is guaranteed to be a finite number. This limit is built into the C abstract machine. Trying to weasel out of it by using FILE I/O is hindered by the fact that ftell() has to return accurate results that must fit into long (now, if ftell() were unavailable, or were allowed to fail for large enough offsets, then fseek()/read()/write() could actually be used in an obvious way for simulating a tape without a hard limit on its size). So this computation model, while practically Turing-complete, is not theoretically Turing-complete.

So, my original point was that built-in support for arrays, while providing ergonomics and performance, does not really extend the computational capabilities of the language: the languages with finite-sized integers but without arrays can simulate finite-sized arrays just like so. And if your integers are unbounded then you can use e.g. Gödel numbering to simulate unbounded arrays (again, with loss in ergonomics and performance).


Why would normal memory access be O(log n)? Seems to me jumping to a random address should be an O(1) operation. Would be pretty dumb if jumping to higher RAM addresses takes longer than jumping to lower RAM, excluding hardware things like different RAM sticks or page files.


It depends on your definitions and assumptions. Here 'n' is not the address, but the size of memory (or the largest possible/worst-case address). In the universe we inhabit, due to limitations on information density in a given volume[1], the worst-case random access time for a memory of size 'n' would be the cube root of 'n', because you need to transmit a signal at the speed of light between to points inside a sphere.

Real computers tend not to be able to store their information in all three dimensions (I think I once saw an argument for why this is also theoretically not feasible), and in practice appear to have worst-case access times that are logarithmic. It's fundamentally for the same reason: the bigger your memory, the more distant will be the most remote (worst-case) part of it.

Of course, in a single specific desktop or server computer, changing the capacity of your RAM sticks will likely not affect worst case access speeds, because the speed limit is likely based on the maximum capacity of the machine (a constant). So this kind of analysis is useful for pondering the ultimate limits of worst case memory accesses (and does also have some relevance huge computers), but in practice you will of course find that cache/locality effects are far more important for performance on real computers.

[1]: Since information is energy and therefore mass, a sufficiently dense store of information would collapse into a singularity, and you better hope your pointer doesn't point into one of those.


There was an article that mentioned that cbrt(n) is worst-case array access, not constant time. They did some benchmarks across the CPU cache/RAM/SSD boundary to show this practically. I can't seem to find it though, where was it?

Edit: found it, https://www.ilikebigbits.com/2014_04_21_myth_of_ram_1.html


> excluding hardware things like different RAM sticks or page files.

Exactly this! Also as n grows you will need bigger address busses (or have serial communication).


When I was in college, I had to take an assembly language course. It was on MIPS: 32 registers.

One assignment said, roughly:

- Read ten numbers into an array.

- Sort the numbers into the array.

Nothing said I needed to read the array, so I read in the numbers, both to the array and registers, hardcoded a bubble sort on the registers, and wrote the result to the array. End of program.

I was being cute. I got full marks with a note to stop being cute.


Fantastic use of meta-programming, and I like the benchmarking at the end, you have clarified much for the beginner and expert alike.


That work is amazing!

Now before you laugh at the original meme and the use of AI chosing the wrong algorithm, read this:

https://fosstodon.org/@gabrielesvelto/110592904713090347

There is no proof AI was used, but:

> Google's code was allocating 20000 variables in a single frame.


Weirdest bit is this, I think.

> After all, IPv4 is still standing stronger than ever, 60 years after it was deemed deprecated due to so called “address exhaustion”.

IPv4: deprecated in 1963, 19 years before it was introduced.


The whole article is full of jokes.

> the C programming language as it’s by far the fastest language on the planet to this day (thanks to the visionary genius Dennis Richie)

Which would be more or less true, but:

> I decided to use the slowest language on the planet, Python (thanks to the visionary genius of Ross van der Gussom)

> So I let the mighty snake do its work and after getting a cup of coffee and getting back to check on the program 48 hours later I was left with a beautiful c file


> for the large number close to the 2^32 limit the result is still returned in around 10 seconds.

This could be improved by arranging the if/else statements in a binary tree.


Come to think of it, I'm sure there are other optimisations you could do as well.


It could also simply check the most common numbers first. We can spider the Internet and collect all usages of numbers, sort them, and find out what the optimal order is.


True - given probability weights, the optimal decision order would be a Huffman tree https://en.wikipedia.org/wiki/Huffman_coding


This is a great idea! Intuitively, I think it might make the program 3x size though?: first `if` for ==, then an `else if` for <, then `else` for >.


it would be extremely humorous if it actually *reduced* the speed since it required loading so much more code.


I don't think it would help, you'd still need the 4 billion if statements to decide whether to take the left or the right branch from the root.


Yes but you would only need to visit 32 of them, so it would run much faster (around a hundred million times faster).


I think Scarblac's point is that using a > comparison would be cheating, and you'd have to do that as a series of 4 billion if/else statements too.


Fair enough, comparisons are dangerously close to arithmetic and we can't have any of that.


From the article:

“Now, in a world where AI is replacing programmers by the minute, taking their jobs and revolutionizing the way we think about code, maybe we should…”

Excuse me ?!?

I get that job loss due to AI is a common fear. But is there any hard evidence that the job market is losing “a software engineering job a minute” to AI? That’s 525600 jobs annually…


I think this is a joke


Gotcha. I’ll consider myself “swoooshed”.


% is a surprisingly slow way to do so compared to x & 1.


I’d assume any decent compiler would optimize a “% 2” (or any power of 2) into a bitwise op. Could be wrong though!


Tested it! In C: https://godbolt.org/z/7xTrhGasc GCC uses &1 even using -O0. LLVM using -O0 still does the modulo, but at -O1 it does a clever &-2.

And even C#: https://godbolt.org/z/5WPfT8e5G Does use the &-2 strategy.

Dart also uses and: https://godbolt.org/z/1437df4En

So most of languages compiled to native binaries seem to do it. Languages like Ruby and Python don't seem to optimize this, which is hardly surprising. Ruby has a JIT now, and the JIT might do it.

And I'm not sure what the haskell version does, can someone explain: https://godbolt.org/z/xPP4E1a6h ?


If you change the code from "n `mod` n" to "n `mod` 2", and add "-O" as a compiler flag, you'll see an "and" with 1 pop up in the assembly.

Without -O, what's happening is that "mod" is a method from the Integral type class (for the non-haskellers: read "interface"/"abstract class"), and thus the program just reads the corresponding field from the stafically allocated copy of the Integral dictionary for Int, and subsequently calls it. Of course one should just inline this known function call, and that's precisely what -O lets ghc do.


Nice, thanks!


Can someone explain what's the &-2 method, and why it should be better than &1 ?


In two's complement, -2 means all bit are 1, except for the lowest. 1 is all bits are zero, except for the lowest. So it's basically the same, just negated. If you look at the generated code, this is used to the opposite: Keep all bits, except the least significant.

Btw, it's a lot simpler if you use only unsigned ints. Both GCC and clang add a couple more instructions to make it work with negative numbers. With unsigned ints, both gcc and clang generate this simple assembly code:

    is_even:
        mov     eax, edi
        and     eax, 1
        ret


And those additional instructions are there only because C modulo is defined to have the same sign as the dividend (so that e.g. -5 % 2 == -1 instead of 1). Had it been defined as being always positive, &1 would have been sufficient in all cases (that would also have the effect of making integer division round to negative infinity instead of zero which too can be implemented simply by >>1, for both signed and unsigned numbers).


Thank you, today I've learned something new.

So the easiest answer to the original question (whether a number is even or odd) likely involves &1, not %2. Note taken.


But that probably isn't the function you want, because that will return -1 if the input is negative.

What you want to return is a boolean. And in that case the code generated is much simpler https://godbolt.org/z/5dq7MnrEx


Why "even C#"? Nowadays we often can trade blows with LLVM and GCC :)


I'm mostly a C# dev, so I would like this to be true! But, you know, I'm skeptical...


Thanks for trying this out! It's the same with Rust: https://godbolt.org/z/89z9rEz9a


I once saw that the only thing writing &1 instead of %2 does is making your computer yawn


Easy enough to check with even the indecent compilers!

https://godbolt.org/


I'm mildly surprised that the compiler seems to compile the code so literally so that the binary blows up like that with a linear runtime as well. I would've guessed that for such a straight-line execution on one of the simplest mathematical operations to yield a boolean, a modern highly-optimizing compiler would've done something with it.

What would it take to make the compiler optimize this code meaningfully?


Clang is able to convert the code to a jump table, though GCC can't.

https://c.godbolt.org/z/Wq65j4rrM

> What would it take to make the compiler optimize this code meaningfully?

To make GCC compile the code meaningfully, it suffices to add a bunch of else statements:

https://c.godbolt.org/z/z4KvxnbfG

In fact, GCC gets very clever at -O3.

(what matters is whether the compiler can turn the sequence of ifs into a switch statement, then both compilers have special logic to generate a lookup table.)


Interesting. I only know a little assembler, so I can't tell what -03 is producing there, but I know enough to see that it's doing some sort of loop bouncing between even/odd label choice. Is that now a constant-size binary (if still linear time)?


What GCC is doing at -O3 is encoding even (resp. odd) numbers in a bitmap.

43690 = 0xAAAA = 0b1010101010101010

21845 = 0x5555 = 0b0101010101010101

Then

    sal     rax, cl
    test    eax, 43690
    jne     .L3

    // .L3
    mov     edi, OFFSET FLAT:.LC1 // This is a pointer to the string "odd"
    call    puts
    jmp     .L2

is equivalent to C:

    if ((1 << number) & 0xAAAA) {
        puts("odd"); // x64 is little endian
    }

So execution is constant time (it only needs to check against two constants), there's no loop.

It should be advantageous compared to a jump table because it doesn't have to do two indirect jumps (though you now have branches that might mispredict).

Edit: code formatting.


I enjoyed this more than I should have.


First program I ever wrote was to bypass an anti-afk program in a game: it needed to click on a button which was x vs y pixels, and would appear on random positions on the screen every time. I manually inserted one line for each viable position using x-1 and y-1. In v2 I implemented a for loop which made the script go from tens of pages to not even a single full one - computer magic! (EasyUO language)


That looks like a program developed with TDD that has gone on too long without refactoring.

The author probably has a billion individual tests that will all lose their value if it were to be simplified, since they would no longer correspond to special cases in the code.


As a kid I heard that circuits like adders were made from boolean algebra AND, OR and NOT gates.

I played around on a big paper pad, and finally got a 4 bit adder implementation.

But since my unit circuit was [A,B]->[S], instead of the canonical [A,B,carry-in]->[S,carry-out], I had to add a second layer of units to propagate the first carry onward. And a third layer of units in case that also generated a carry. Etc.

So my circuit had gate of order N^2/2 and depth order of 2*N instead of gate and depth orders of N.


I was hoping that this would be about someone finally building an LLM out of a decision tree or decision tree derived technique (which ultimately boils down to nested if statements). Not sure why no one has tried it yet.


My understanding is that they are using increasingly small precision on the floats used by LLMs. So we just have to wait for them to reach 1-bit precision.


> In fact, the above code is a perfect example of a time-memory tradeoff

Not as much since the program will take increasingly more time to run as the input number gets higher. Checking if a 'small' number is even or odd will be pretty fast. Checking a 'large' number will be slow since it will go through all 'IFs' until it reaches the result.

It's like O(M/2) average time complexity where 'M' is a constant, the program memory size (number of 'IFs' stored).


I took it as a joke. The tradeoff is the programmer's time and memory, not the computer's


Not to compete with your brilliant code on optimization, bu I have also written a swap routine that excels especially on numbers close to each other. I know sign function result can be put into a variable, but computers nowadays are so fast...

Anyone feel free to use it.

static void swap(ref int a, ref int b) { int c = (b - a) ; while (c != 0) { a += Math.Sign(c); b -= Math.Sign(c); c -= Math.Sign(c); } }


The whole article looks like a misunderstanding to me. The author writes about a trade-off between memory and processor, although in his example both memory (on disk and in RAM) and processor are wasted (executing all IF instructions in turn).

It is also strange that the author is trying to generate code in assembly language, which he does not know very well, because there is a DIV/IDIV instruction.


Dude it's a joke


I gave similar a shot in rust, using build.rs to dynamically create the file, inspired by the blog post. I'd been meaning to look at build.rs for a while.

It seems to hit a pretty pathological compilation situation. Done as ifs, rust ran out of stack. Done as "match" it has been compiling for about 24 hours.


Interesting. How would this compare to making a while loop or recursive function, I wonder?

And, what about switch statements? In theory it is said they should be faster, but I am not sure that's necessarily the case.

Another option would be to keep a database with the numbers indexed, and then compare with performing a database query.


When I was young I tried to make my own alarm clock.

My masterplan was to print the time to the display (starting from 8 AM), wait a second and print the next value. All manually, obviously.

I remember eventually giving up because it was getting too boring. I think I discovered about loops a year after the event.

Great times :)


better to keep "4 billion" in the title... I was confused with "4 bytes" before open the link


In the blog post "Branch predictor: How many "if"s are too many?", I started with the same premise, but focused on branch predictor :P

https://blog.cloudflare.com/branch-predictor


"Since I'm a great believer in performant code I decided to implement this in the C programming language as it's by far the fastest language on the planet to this day (thanks to the visionary genius Dennis Richie)."

Fastest portable language. Assembler is faster.


This is the kind of insanity I miss from the early internet. Excellent job. Thank you for this.


I'd be interested in seeing how this would have gone with a switch statement instead.


Yep, I was thinking a switch statement may be better.

I mean, if we don't take into account that implementing this is just a joke anyway :D


Can’t you just test the last digit of main’s arg and get the best of both worlds?



Ironically, I have no idea what you’re implying with this reply. Should I drop the usage of the word ‘just’? Isn’t it clear what I’m saying?


So I honestly thought that this was going to be another LLM article, but using the venerable ReLU activation function instead of the usual. A ReLU is exactly an if statement when rendered in a decision tree (if less than 0, emit 0; otherwise, emit the input). Given the relative popularity of the 4B parameter models (any transformer is dominated by the number of parameters in good old fully-connected feedforward layers), you can perhaps describe such models as 4B if statements. I was disappointed that the author didn’t go there as a means of parody.


I'm afraid I take issue with the phrase "amazingly performant". I would struggle to come up with another way of determining whether a number near 2^32 is even that takes anything like 10 seconds


I'd bet this is slower:

    function isEven(n) {
      let arr = [];
      let sign = -1;
      for (let i = 0; i < Math.abs(n); ++i) {
        arr = [...arr, sign];
        sign *= -1;
      }
      return arr.length === 0 || arr.reduce((a, b) => a + b) === 0;
    }
But I'm not going to test it, because I'm doing productive things on my computer right now, and don't want to OOM.

Sadly, the OP missed an opportunity to call his program "blazingly fast" [0].

[0] https://www.youtube.com/results?search_query=primagen+blazin...


What was really hurting me was all the literal strings "even" and "odd." I would have made them into constants. I guess then the next step would be to use constants for all the numbers.


Reminds me of that guy who scrolled to the end of Excel: https://youtu.be/sF8k3zl70os


> we need more if statements!

Waiting for the Will Ferrel skit of this. Though this article was funny enough on its own. The Python program to generate the C code was hilarious.


Those APIs seem like fake pseudocode, until I realized... Windows


I to do lots of "researching" on social media.


> modulus operation

Isn't it normally called the modulo operation?


Maybe it's a modulo operation squished together with the modus operandi, for laughs


The TikTok is fake. Obviously it was created to drive engagement. Everything you see on TikTok is there to manipulate your emotions.


Home run! (First blog post blows up on HN)


This seems dangerously close to spawning a new busy beaver type inefficient resource utilization game


He should've used regex for that.


Not as exciting as PCRE checking for primality, but:

    python -c 'import re;print(re.match(r"^(.*)\1$",abs(int(input("n=")))*"1")and"Even"or"Odd")'


Not gonna lie, i really thought he was gonna end it with... now lets try 64bit, and leave us hanging


Ah, the holidays: when the most ridiculous and interesting technical work gets done. I love it.


I wish people would stop putting open curly braces on new lines. Makes it harder to read, IMHO.


Come on, we live in 2023 and we can achieve better performance. It should be parallelized with checks evenly distributed between GPU cores. Unless you think that efforts by NVIDIA and AMD should be wasted.


I'd say wrap this with a good interface and we can optimize it later.


This feels like a game of utilizing a PC at 100% capacity.


>Apple declined to comment for this article.


The skeleton of this post can make a good interview question


My biggest hope for code GPTs is replacing dependencies.


Seems like a poorly chosen example, since you could simply "lookup" the last bit of the integer to see if it is even or odd, which should be instant for any integer?


Why did I waste my time reading this? Hahaha


Lmao. The copyright headers.


Didn’t notice that, incredible

> Copyright 2023. All unauthorized distribution of this source code will be persecuted to the fullest extent of the law


Well this has been hilarious.


performant != high performance he used it twice so I felt compelled to state this


Although not quite as spectacular as that, here is some funny codegen in POSIX shell I wrote; `sh` doesn't have a `repeat` command, so I wrote a script to implement it without forcing any loops at all. Along the way I uncovered some interesting segfaults for most sh implementations (including, sadly, dash), which is the reason for the somewhat convoluted code. Have fun!

   #!/bin/sh
   #
   # Usage: repeat TIMES COMMAND
   #     Repeat COMMAND a given number of TIMES.
   
   eval_full_quotes='
    s/\\/\\\\/g  ;  s/\"/\\\"/g  ; s/</\\</g    ;  s/>/\\>/g  ;
    s/\$/\\\$/g  ;  s/\[/\\\[/g  ; s/*/\\*/g    ;  s/?/\\?/g  ;
    s/|/\\|/g    ;  s/&/\\&/g    ; s/~/\\~/g    ;  s/=/\\=/g  ;
    s/;/\\;/g    ;  s/ /\\ /g    ; s/%/\\%/g    ;  s/`/\\`/g  ;
    s/(/\\(/g    ;  s/)/\\)/g    ; s/{/\\{/g    ;  s/}/\\}/g  ;
    s/#/\\#/g    ;  s/'"'"'/\\'"'"'/g'
   
   errorx() { printf 'Error: %s\n' "${*:-"unknown error."}" >&2; exit 1; }
   
   _rep() {
    rep_times="$((2 * $1))"
    rep_string="$2"
    output=""
    while [ 0 -lt "${rep_times}" ]; do
     test 1 -eq "$(((rep_times /= 2) % 2))" && output="${output}${rep_string}"
     rep_string="${rep_string}${rep_string}"
    done
    printf '%s' "${output}"
   }
   
   {
   times="$1"
   shift
   
   test 0 -lt "${times}" || errorx "First argument must be a positive integer."
   test 0 -lt "$#"       || errorx "No command given to repeat."
   
   repeat_command="$*"
   command_length="${#repeat_command}"
   max_arg="${ARG_MAX:-"$(getconf ARG_MAX)"}"
   
   if [ "${max_arg}" -le "$((times * (command_length + 2)))" ]; then
    : "$(( block_size =     max_arg / (20 * (command_length + 2))        ))"
    : "$(( blocks     =             times / block_size + 1               ))"
    : "$(( remainder  = (block_size + (times % block_size)) % block_size ))"
   
    eval_string="$(_rep "${block_size}" "${repeat_command}; " | sed -e "${eval_full_quotes}")"
   
    while [ 0 -lt "$((blocks -= 1))" ]; do
     eval eval "${eval_string}"
    done
   
    eval_string="$(_rep "${remainder}" "${repeat_command}; " | sed -e "${eval_full_quotes}")"
    eval eval "${eval_string}"
   else
    eval_string="$(_rep "${times}" "${repeat_command}; " | sed -e "${eval_full_quotes}")"
    eval eval "${eval_string}"
   fi
   }
   exit


its funny but the large language models can be seen as billions of if statements learnt from data


Not really, since they do a mathematical function over blocks and don't need a single if statement. They map learned data + input -> output as a pure function


ReLU is if-like since it only passes half of the input values with the other half becoming 0


Is ML different than if statements? Yes.

But is it reallllly different? No not so much.


I like the engineering effort, but it's tainted by puerile comments like "visionary genius of Ross van der Gussom".


It took me way too long to realize this was satire. It was a stressful read up until that point.


This was hilarious thank you. Many lols


> Now, in a world where AI is replacing programmers by the minute, taking their jobs and revolutionizing the way we think about code

I’m dumb. This is a joke right? At most I can say AI has revolutionized how I interface with documentation.


That is a joke in the context of the post. Not so much a joke in general reality.

For you, AI might have just revolutionized how you "interface with documentation", but a huge share of programmers in any company are already using AI, officially or unofficially as a coding assistant.

And of course MS, IntelliJ and others are all having out products on this.


I’m using it as a coding assistant. The coding assistant is essentially fancy hyper speed documentation.


GPT is a brilliant interface to github search and stack overflow. That's all. Most of the time I use it, like stack overflow, you need to double check if it is the idiomatic way of doing it, if it IS still the right way of doing stuff or even if the code is correct, not full of bugs or even if it is solving the right problem.

It is like a helpful, but over-confident intern helping you with toil and research.


Would you spend a lot of time and money on contractors to do an often disappointing job, or ask a LLM to do a better job immediately?


I would think that the title alone should indicate that this entire bit is satire.


> This is a joke right?

I hope so. If not, it is a load of bullshit.


> Running this gives us a nice 40 GB file which contains all 4.2 billion comparisons needed to determine if any 32 bit number is even or odd!

For some reason I am thinking of npm now ...


> ; add the next 2...2^32-1 comparisons here

Chatgpt is getting so lazy these days.


Why do people always got skill issue and can't learn to use proper modulo arithmetic? It seems like those "TikTok idiots" always get the limelight over this kind of silly thing


Here is a version with 0 if statements

    #include <stdio.h>
    #include <stdlib.h>
    int main(int argc, char** argv) {
        return printf(atoi(argv[1]) & 1 ? "odd" : "even") < 0;
    }


Still branches


  #include <stdio.h>
  #include <stdlib.h>
  int main(int argc, char** argv) {
      return printf(&"even\0odd"[(atoi(argv[1]) & 1) * 5]) < 0;
  }


Not always. Popular architectures like x86_64 and ARM have instructions for conditional moves. These instructions do not cause the processor to branch.


> Considering the computer has to read 40 GB of data from disk, map it to physical memory and then let the CPU has a rip of it

No it doesn't? Maybe it needs to set up page tables for 40GB, but it doesn't need to read the parts you're not using into memory, that's part of the point of memory-mapping like this - it sets up a page fault handler to load the page into memory when accessed, rather than needing to do it all up-front.


Given that the entire linear chain of comparisons (therefore, every instruction) needs to be evaluated for 2^32-1, it would indeed need to read and execute the entire 40GB file.

The fact that it does so by faulting in each page before instructions in it can be executed doesn’t materially avoid having to read every byte.

Clearly the author should have checked the larger numbers first to prevent this (:


Software development needs to be driven by user needs. Clearly the best action in this scenario is to bundle in telemetry code that reports back which numbers are used with this program, assemble a ranking of the most frequently compared numbers in the real world, and use that to order the comparisons in v2 of the program.


It’s worth noting that good compilers can actually do a version of this: it’s called profile-guided optimisation <https://en.wikipedia.org/wiki/Profile-guided_optimization>. Not many projects use it, because it’s generally a bit of a bother (build a special version, run your profiling, then build everything all over again), but Firefox gets good mileage out of it. My vague recollection is a typical sort of speedup of 5%.


Of course the modern "scaleable" solution is to ask ChatGPT if the number is even or odd, and then print ChatGPT's response.


Using selenium and chrome webdriver under the hood to do so, as well.


Don't forget to remove the least used 1% of the numbers to make it leaner!


> Given that the entire linear chain of comparisons (therefore, every instruction) needs to be evaluated for 2^32-1, it would indeed need to read and execute the entire 40GB file.

It does, but not step by step the way the part I quoted described - the reading from memory can be pipelined with the execution. (And it's an ideal access pattern - reading linearly once - so it wouldn't surprise me if other optimisations kicked in)


For the highest possible input, all of the program needs to be read into memory. Just not all of it at the same time.

I guess technically most of the RET instructions are jumped over, and so they don't need to be in memory. But that isn't how cache-lines (let alone pages) work.


Did you ignore this part that came a bit before that?

> I decided to map the file into the address space instead of reading all of it. By doing this, we can just pretend that the entire file is already in memory and let the poor OS deal with fitting a 40 GB blob into virtual memory.

Why take a vaguely rhetorical statement and then complain it contradicts a more concretely accurate statement before it?


> Did you ignore this part that came a bit before that?

No, just the opposite - I read that and that's exactly why I'm saying there was no need to read the whole thing into memory before starting to execute it.

> Why take a vaguely rhetorical statement and then complain it contradicts a more concretely accurate statement before it?

Because it's a contradiction in what they've written?


The article doesn't state the entire program has to be read into memory before it starts executing. Instead the article states that during execution, for the highest inputs, the entire program needs to pass through memory for execution to finish.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: