jacb's comments | Hacker News

More than semi-fallacious - the math doesn't add up, because you run into the Bekenstein bound, which limits how much information you can pack into a volume.


This is why I don't use spaced repetition. The danger of becoming a black hole is just too high. Forgetting things is well known to be highly evolutionarily advantageous, because all the critters that remembered everything and turned into black holes stopped reproducing. Very dangerous stuff to play with.


Ha! Certainly none of us will get anywhere close to this bound, no matter how much time we log in Anki. But the OP asked "Would an infinitely-long-lived, but forgetful person be able to recall an infinite number of facts using this method?", and the answer is a surprising "no, you turn into a black hole".


Sometimes even just one piece of knowledge can turn you into a black hole. It’s why nobody remembers every digit of Graham’s (phone) number.

;)


his phone number is highly compressible though, it’s the decompression/serialization step that turns you into a black hole


I know we're all playing around here, but surely the answer must be that there's a limit rooted in the specific biological/chemical implementation of memory in the brain that our hypothetical "infinitely-long-lived, but forgetful person" would hit before the Bekenstein bound (probably long before) and therefore it's impossible to turn a brain into a black hole via that mechanism.


I guess I’m the extreme case then; I’ve been using SuperMemo every single day for 16 years. Originally it was to memorize foreign-language vocabulary words, but it has since expanded to everything I want to know. You’re right that, left unchecked, you can easily become obsessed with memorizing junk knowledge and turn into the Comic Book Guy from The Simpsons. But over time it has helped me learn how to “compact” complicated ideas into simple statements and more easily internalize new ideas. Since I’ve been using SuperMemo to remember concepts, metaphors and illustrations kind of just “appear” in my brain. I don’t fully understand why; I think it’s like lightning striking a place with lots of metal: you can’t force it to happen, but you can make the conditions conducive to “lightning strikes” of inspiration.


Tragic really how smart people know perfectly well that you shouldn’t exercise if you don’t want to accidentally turn into a grotesque muscle freak, but rarely use the same caution in brain exercise.


The more you learn, the more you gain the ability to encode new facts with fewer "bits," thanks to associative memory. Adults have an advantage in learning in this regard.


That's not how arguments from contradiction work. You've assumed H (the halting problem is decidable), proved B (beauty is decidable), then said "but actually !H, therefore !B". All you've proven is that H => B. An actual argument from contradiction is "assume H; using H we can prove an obviously false statement; therefore not-H". And indeed, you have assumed something false (H) and shown a contradiction (various proofs of !H), so your original assumption was wrong (!H).

A concrete example: suppose Beauty(R,D) is trivially decidable (say it returns (popcount(R)+popcount(D))%2). Your argument "proves" this property is undecidable, but it's definitely decidable. So some step in your argument is wrong.
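To make the "trivially decidable" point concrete, here's a minimal runnable sketch of that popcount-based Beauty (the formula is the one from the comment above; everything else is incidental):

```python
# A trivially decidable "Beauty" predicate: it terminates on every
# input, so it is decidable by inspection, no matter what a halting-
# problem reduction might seem to "prove" about it.

def popcount(n: int) -> int:
    # Count the set bits of a non-negative integer.
    return bin(n).count("1")

def beauty(r: int, d: int) -> int:
    return (popcount(r) + popcount(d)) % 2

print(beauty(0b1011, 0b0110))  # 3 + 2 = 5 set bits -> 5 % 2 = 1
```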

Another concrete example: Assume the halting problem is decidable. I would like a cookie. But the halting problem is not decidable. Therefore I would not like a cookie.


You are just repeating my critique. I am not the one making the mistake.


So you mean your frustration is warranted then?


If that is the kind of work published by academics associated with MIT and La Pontificia Universidad de Madrid, then maybe I should not feel frustrated after all by not being an academic.


But you said in your first comment that you were an academic.

"I confess I am a frustrated academic"


if you're not being facetious,

in English vernacular, being a frustrated X (as in a frustrated academic, frustrated poet, etc.) means that the person is actually not that thing, but has in some way been prevented from becoming it (hence frustrated), and perhaps harbors rancor over that frustration and towards those who actually are the thing (this second part sometimes only implied, depending on who is describing the person as frustrated)

on edit: formatting, clarification


Thanks, I wasn't aware of the 2nd meaning. I also interpreted "frustrated academic" as "an annoyed academic", rather than "an unsuccessful academic".


I'm thinking maybe it is no longer as common an expression as I thought, given the people who've mistaken it here.


I think it didn't help that they used both meanings of the word "frustrated" in the same sentence. By adding "perhaps my frustration is unwarranted", it puts emphasis on the fact they are frustrated, not that they failed to become an academic. It leads the reader to believe they are in fact still an active academic.

I was only dimly aware of the second meaning of the word so I'm not sure how you would interpret it if you had a good understanding of the multiple interpretations.


As an English speaker, my only interpretation was that you were an academic frustrated about academia.


https://www.ldoceonline.com/dictionary/a-frustrated-artist-a... - came up if I search for 'meaning of frustrated poet' which is the kind of thing where the phrasing most often comes up.


That isn't even the same user.


He is proving his point by contradicting himself, a true mathematician.


Ah, I misunderstood you! I apologize. :)


I think maybe you made a typo in your OP? The authors assume beauty is decidable and use that to show the halting problem would be decidable, contradicting its known undecidability. It applies to anything, like you say, and to me it seems like a silly argument, but I think it’s a valid proof given the premises.


But it's not an argument from the paper. The idea is that a superintelligence would have to understand the consequences of any program in order to check it for harm, and this is undecidable due to the halting problem. This argument is correct, but trivial and not worth a paper, in my opinion.


The harming algorithm is just used in the paper's proof as a simple wrapper around the halting algorithm.


I might be missing something - why would this be a breakthrough? It sounds complicated to generate the interfaces, sure, but is there a theoretical problem blocking this, or just practical?


I've looked at this problem quite a bit over the years, and I agree with you completely. There isn't anything fundamental here, just the normal cultural adoption issues, usability, etc. There may be some compilation/complexity issues around formats with variable-length fields and self-description, but certainly less problematic than general-purpose programming.

I really wish, though, that there were more traction here, as I really believe that we should be quite prepared to deal with bits and not protobufs by default. Nothing wrong with protobufs for quite a number of uses; I just don't know why people are so afraid of and/or biased against bit strings.


Through this thread I really feel like I’m missing something.

Are we not talking about writing binary files that conform to the spec?

Like, in the case of a GIF: simply writing valid garbage data should produce a file that presents as a valid GIF with noise for the image. Similarly, reading the file through the parser and writing it out unmodified should create an identical file (assuming no steganography).

Right?


You're right. The person saying this would be a breakthrough doesn't understand what this is doing.

There are already similarly declarative tools which can accomplish this. Haskell has binary parsing libraries which work similarly and give you both reading and writing capabilities.


I honestly don't get it either. The inverse of the read spec is the write spec. My guess, not having dug deeply, is that they don't distinguish between required and optional fields; that said, they should still be able to write what they have based on the read spec, though the result could potentially still be an invalid file.
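The "inverse of the read spec is the write spec" point can be sketched in a few lines of Python (a toy illustration, not any of the tools discussed here; the field names and layout are invented): one declarative field list drives both the reader and the writer, so round-tripping holds by construction.

```python
import struct

# One declarative spec, two directions: read() and write() both walk
# the same field list, so write(read(data)) reproduces data exactly.
SPEC = [("width", "H"), ("height", "H"), ("flags", "B")]

def read(data: bytes) -> dict:
    out, offset = {}, 0
    for name, fmt in SPEC:
        (out[name],) = struct.unpack_from("<" + fmt, data, offset)
        offset += struct.calcsize("<" + fmt)
    return out

def write(fields: dict) -> bytes:
    return b"".join(struct.pack("<" + fmt, fields[name]) for name, fmt in SPEC)

blob = write({"width": 320, "height": 200, "flags": 1})
assert read(blob) == {"width": 320, "height": 200, "flags": 1}
```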


It's a really interesting idea! I'm also surprised there hasn't been much traction here. I've started a Rust library for this: https://github.com/sharksforarms/deku

It's a declarative, bit-level, symmetric reader/writer generator library.


Yeah, I was also confused. I wrote a bidirectional parser/writer layer for YAML in a Haskell program I had at work. The YAML structure mappings were all declarative in the code and even allowed documentation for the structures to be printed out. It's not that hard once you define the primitive bidirectional (higher-order) mapping function to go from a `Configurable a` to a `Configurable b`; the rest kind of unfolds from there.


Binary file formats can be vastly more convoluted than YAML. For example, consider roundtripping ZIP archives or PDF documents, or both at the same time (see https://www.alchemistowl.org/pocorgtfo/).


https://www.microcovid.org/ is a similar tool, with more knobs to adjust risk factors (distance, outside vs inside, the other person's risk profile, etc.). Friends living in shared apartments have used it to budget, estimate and manage their risk from different activities. Seems like a neat tool.


This is really upsetting. I didn't go to TJHS, but the sense I got from the three people I know who did was that it had that critical mass of nerdy science-and-tech-obsessed teenagers.

At least in Canada, it's more common to have smaller programs operating within a larger high school - my program was 30 people, mostly humanities kids with maybe 8-10 STEM kids. There wasn't anything approaching that sort of critical mass. Plus, as a subset of a smaller school, you still had to play politics with everyone else. TJHS always sounded like one of the only places that, somehow, did have that critical mass and managed to maintain it for decades.

Maybe this sort of public school is just politically infeasible now - which is a pity, since locating and nurturing the talents of disadvantaged youth benefits all of us.


Critical mass would be one way to put it. There was also a willingness to let the kids do what they were interested in and provide support for it.

For example, in 10th grade, four of us got together and decided we wanted to make a new high-temp superconductor (87-88, around the time that the first LN2-temperature one was out there). With the support of a couple of the labs, we researched it, ordered chemicals, made a pill press, mixed it, pressed it, baked it in an O2-rich environment, and then tested it with LN2. I think at the time it was the first high school to make one.

None of that was part of the classroom flow, but it was absolutely something that was supported.


> The problem is, increasing taxes on gas will disproportionately affect the poor who can't afford to buy an EV (and again, are unlikely to live somewhere with a charger)

But banning ICE cars is clearly even worse for those unable to afford an EV, right? Unless policy-makers think that precommitting to ban ICE cars by 2035 will lead to a sudden flurry of new EV development _that wouldn't have happened if they had just precommitted to adding large carbon taxes by 2035_.


In some ways it’s worse. Even if all major car manufacturers offer EVs by 2035, how many used $2000 EV Civics will be on the market by then? I am all for going all-in on EVs, but let’s not pretend it’s a simple matter of pressing your thumb on the neck of the manufacturers to suddenly fix the problem. How many places in the US are specifically built to be human-sized and not car-sized? NYC? Maybe a few other very specific larger cities? So if you don’t want the CA economy to tank overnight in 2035 (who is going to show up to work if they can’t drive their cars?), you will need to also subsidize car prices, because even ICE-based vehicles are increasing in price way faster than inflation while wages are stagnant.


> So if you don’t want the CA economy to tank overnight in 2035 (who is going to show up to work if they can’t drive their cars?)

Why would it tank overnight?

Did you misinterpret the requirement? It's not a ban on owning ICE cars; it's a ban on selling new ICE cars. You would still be able to buy, own, and drive a used one.


That’s fair. The above comments were alluding to theoretically banning all ICE cars, so I guess I was thinking of that when I wrote my comment.

Yes, if it’s just on the sale of new vehicles, that could potentially work, though I still worry that the manufacturers will normalize $40k cars and use this as an excuse.


This only bans the sale of new ICE vehicles. Presumably, lower income households will continue to drive their ICE cars after this ban, and then move into an EV sometime down the road once the post-2035 used car market has enough EVs at the right price.


I genuinely wonder if the prices of used EVs with decent range (say, a long-range Model 3 available now) will ever reach the price of an older Honda Civic. I figure the batteries will still be worth quite a bit in the car, essentially making a used $2000 electric car impossible.


Perhaps it's consistent with computation being done in many "parallel universes," but with heavily restricted communication - in this model it's almost trivial to parallelize work (flip a quantum coin, do A if heads else B), what's tough is making sure that you don't get parallelized along with the work (i.e. that you don't become entangled with the computation) and making sure there's a way to accumulate the parallel computation into something that's singly-readable and useful.


Have there been attempts to measure modern compilers' ability to optimize large programs against "perfect" machine code? The few times I've poked through demo scene binaries, I've felt a sense of awe at the complexity described by just a few instructions, but demos aren't starting from a specification and trying to find a compressed representation, they're trying to create small binaries that expand into something complex and interesting. Clearly this problem is NP-hard as hell (and most software doesn't need to be optimized instruction-by-instruction), but it would be incredibly neat to have an estimate of _how much worse_ our compilers might be.

I'm reminded of the old story about someone who tried to evolve an FPGA configuration to perform signal processing, and the final generation exploited imperfections in the chip to use fewer gates.


That would indeed be very neat. I think it's also impossible to do in a reasonable way, because how well a compiler does is probably dependent in a huge way on the particular program. There is a huge class of program transformations that are technically correct but far too specific to implement in a general-purpose compiler, and I think how well a compiler does is mostly dependent on how many of those transformations happen to apply to your program at hand.

For example: suppose you have a program that renders a 3D scene from a hardcoded scene description. A compiler might do a lot of stuff to that program, but what it will certainly not do is notice that the whole hardcoded scene can be replaced with a small representation in the form of a distance estimation function, if the rendering algorithm is also changed from raytracing to distance estimation.

A human in a demo scene situation will certainly perform (or at least try to perform) that transformation, if only because distance estimation is well known to often provide compact representations of certain, a priori very complex, scenes.

Also, nitpick:

> Clearly this problem is NP-complete as hell

I think it's almost certainly not even in NP; that would require a polynomial-time verifier that, given some certificate that you can choose, proves that a particular program really is optimal. I think that's still impossible (but I would gladly be proven wrong). If I'm correct, it's not NP-complete, it's NP-hard.


> how well a compiler does is mostly dependent on how many of those transformations happen to apply to your program at hand

Right, there's an infinite number of distinct useful code optimizations, there's a cost to checking if any given optimization can be applied, and some optimizations have _massive_ costs for very rare and specific savings, so any given compiler is making a practical decision to include some optimizations and omit others. There was some discussion in the Rust community about LLVM having a period where they weren't tracking regressions in compile times - so "the perfect optimizing compiler" isn't really a coherent goal. But I still wonder how much faster Optimal Chromium would be. Just another interesting number I'll never know, I suppose.

> I think it's almost certainly not even in NP

Yeah, that was sloppy of me, I think this is undecidable by Rice's theorem. Nice catch!


The field you're looking for is superoptimization. Last I checked most of the work focused on loop-free snippets so that program length could serve as a proxy for other performance metrics. STOKE shows up frequently in those discussions, but people have tried all kinds of things ranging from genetic algorithms to neural networks to SMT solvers.

To actually answer your question: compilers are pretty good, and so are expert assembly programmers, but both miss things even for short snippets.
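The brute-force end of that spectrum fits in a few lines. Here's a toy sketch (the four-op accumulator "ISA" is invented for the example) that searches for the shortest instruction sequence matching a target function on a set of test inputs, using program length as the cost proxy, as in the loop-free setting described above:

```python
from itertools import product

# Toy superoptimizer: enumerate programs over a tiny accumulator ISA
# in order of increasing length; the first match is a shortest program
# agreeing with the target function on all test inputs.
OPS = {
    "inc": lambda a: a + 1,
    "dec": lambda a: a - 1,
    "dbl": lambda a: a * 2,
    "neg": lambda a: -a,
}

def run(program, x):
    for op in program:
        x = OPS[op](x)
    return x

def superoptimize(target, inputs, max_len=4):
    for length in range(1, max_len + 1):
        for program in product(OPS, repeat=length):
            if all(run(program, x) == target(x) for x in inputs):
                return program
    return None

# Find the shortest sequence computing 2x + 2.
best = superoptimize(lambda x: 2 * x + 2, inputs=range(-3, 4))
print(best)  # ('inc', 'dbl'): (x + 1) * 2 == 2x + 2
```

Real superoptimizers replace the exhaustive enumeration with stochastic search (STOKE) or SMT queries (Souper), and verify equivalence rather than just testing on samples, but the shape of the search is the same.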


And the flip side is that STOKE and friends can find new, faster ways to compile code... only for short snippets :).

That said, compilers and superoptimizers can go together well. One of my favorite examples is using program synthesis to generate peephole optimization rules for a compiler[1]. The synthesis step takes a long time and only works for short snippets, but if you can turn the results of that into rules a compiler's optimization system can use, you only have to pay the synthesis cost once and the compiler can apply those rules to speed up large programs.

[1]: http://www.cse.iitd.ac.in/~sbansal/pubs/asplos06.pdf


I think this is the story, it's good fun: https://www.damninteresting.com/on-the-origin-of-circuits/

The evolved circuit used fewer gates, but it only worked on the device under test, and when they removed gates that were apparently doing nothing, it failed to perform the task. ML cheating at its best!


"Cheating" isn't cheating when it was never even taught the rules in the first place!

I wonder if there's a similar analogy with how a young kid or someone completely unfamiliar with something can come up with an amazing solution that more experienced people would never think of --- because the latter had "hardened" their thinking to accustomed patterns and didn't go beyond them.


It boils down to various kinds of normalization - as in functional programming's normal form. If you can get a normal form of an expression (or a canonical form), you can fuse several seemingly different expressions into one and share their computation in several places.

As an example aside from functional programming, consider instruction set architectures.

There are 3-address RISC instructions, and with 32 registers "x := y - z" can be encoded in 32^3 different ways. It is not practical to consider all of them as separate things and share them.

The situation is less severe in 2-address ISAs, and even less severe in 1-address (accumulator-based) ISAs (the 8086 is a blend of a 1- and 2-address ISA).

For zero-address ISAs it is even better. There's only one way to compute y-z: by issuing "-", which takes two arguments from the stack and pushes the result of the subtraction back. The situations that lead to that subtraction are less diverse than in the 3-address RISC case. And here comes the gist: one can factor identical sequences of opcodes into a subroutine ("word" in Forth parlance) and replace those sequences with a call to that subroutine.

By programming Forth you do LZ compression on programs.

If you can't compress your program more, you have a perfect Forth program, and I am not quite joking here.
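That LZ-compression view of Forth factoring can be sketched in a few lines (a toy illustration; the opcode names are invented): find the most common repeated opcode sequence and pull it out into a named word.

```python
from collections import Counter

# One round of "factoring as compression": find the most frequent
# fixed-length opcode subsequence and replace its non-overlapping
# occurrences with a call to a new word.
program = ["dup", "+", "swap", "dup", "+", "swap", "dup", "+"]

def factor(program, length=3):
    windows = [tuple(program[i:i + length]) for i in range(len(program) - length + 1)]
    best, _count = Counter(windows).most_common(1)[0]
    out, i = [], 0
    while i < len(program):
        if tuple(program[i:i + length]) == best:
            out.append("word1")   # call to the factored-out word
            i += length
        else:
            out.append(program[i])
            i += 1
    return out, {"word1": list(best)}

compressed, dictionary = factor(program)
print(compressed)   # ['word1', 'word1', 'dup', '+']
print(dictionary)   # {'word1': ['dup', '+', 'swap']}
```

Iterating this until no repeated sequence remains is exactly the "can't compress any further" stopping condition the comment describes.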

The same applies, albeit not as dramatically and visibly, to other things in programs. It is possible to determine algorithmic kernels in programs (by matching optimized data and control flow against a pattern) and replace them with different ones - less computationally intensive, parallelized, parametrized, etc. There has been such work in the optimization literature.

Again, in kernel matching, optimization goes first, because optimization makes the program more canonical, so to say. And matching on canonical forms is easier.


"Optimal" compilation is actually harder than NP-complete. Because program equivalence is undecidable, finding a canonical "optimal" representation for a program can't be done in general.


What about superoptimization? For instance, Stoke (http://stoke.stanford.edu/) or Souper (https://github.com/google/souper). They will not find all optimizations of course, and program equivalence is undecidable indeed, but they are a good shot at optimal compilation, I would say.


Just because a problem is undecidable that doesn't mean that you can't in fact solve it for a large set of interesting instances (maybe all instances you care about).


It's decidable provided the problem-space is finite, right?


Every finite problem is decidable in linear time because you can hardcode the solution.


Right, meaning decidability isn't an issue for superoptimisation of many kinds of problems.


What does that mean in practice though?

I can say that every finite computational problem can be solved in O(1), so if the universe is finite, all real problems can be solved in O(1).

This sounds great until you realise that the constant factors involved would overwhelm any practical instance of this strategy.


> What does that mean in practice though?

It means the idea isn't necessarily wholly impractical. That's not nothing.

I imagine superoptimisers scale extremely poorly, but that's a different question. I imagine they're best suited to bit-twiddling problems rather than, say, figuring out the best assembly code to implement a parallel Timsort.

I have to admit ignorance here though, I know almost nothing about superoptimisers.


It will be hard to find agreement on what perfect machine code is. Smallest binary? Fastest? Fewest instructions? Lowest power consumption? Able to run equally well across more CPU generations? Able to be understood and debugged by more people? Using CISC silicon, which has more functionality baked in, versus RISC? Etc., etc.


It seems like a common misconception that humans have some secret power to solve undecidable problems that computers lack. But "this problem is undecidable in general" doesn't mean "computers cannot solve small instances of the problem", it means "there's no single algorithm that will correctly solve all instances of this problem". Typically undecidability results in language theory rely on embedding a Turing machine in the language and going "Turing machines halting is undecidable, therefore this is undecidable in general". But this only tells you that _some_ problem instances are undecidable (at least those that correspond to Turing machine embeddings, and probably many more). Smaller problems (that aren't big enough to be disguised Turing machines) can still be decidable, and I suspect that any problem simple enough to happen in an exam is decidable.

Another example of this: theorem-proving is undecidable, yet automated theorem provers are definitely capable of solving simple problems. The undecidability result means that any given automated theorem prover isn't capable of proving the truth/falsity of all statements. But this isn't some crazy limitation - this is true of humans as well. You can solve some math problems, but if someone showed up with an exabyte-long statement about how a problem-solver-who-is-identical-to-you-in-every-capability solves problems and asked you to prove it, obviously you wouldn't be able to do that.


What I said is compatible with what you say.


If you want random access into an array of enums, you need them all to be the same size (there are also alignment concerns, which is why it's tough to use only 1 bit of overhead to discriminate an enum with two variants). I guess you could add indirection and have an enum with variants A(Box<AType>), B(Box<BType>), etc. This is good practice if you care about space and you have a variant with a type that's a few hundred bytes (not heap-allocated like with Vec). But it's not worth adding the indirection and potential cache miss to save a byte, when most cache lines can fit at least 32 bytes.


But you don't always need random access; often you only need sequential access, in which case you can pack differently-sized values into a single buffer. However, Rust enums don't support that use case. A macro could probably do it without too much loss of ergonomics, but I'm not sure if there's a good crate for it.
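That sequential-access packing can be sketched language-agnostically (here in Python, with tags and record types invented for the example): tagged, variable-size records laid back to back, where a sequential reader only needs each tag to know how far to advance.

```python
import struct

# Variable-size records packed into one buffer: no random access,
# but sequential iteration is trivial and there's no per-element
# padding to the largest variant's size.
def pack(values) -> bytes:
    buf = bytearray()
    for v in values:
        if isinstance(v, int):
            buf += b"\x00" + struct.pack("<i", v)   # tag 0: 4-byte int
        else:
            data = v.encode()                       # tag 1: short string
            buf += b"\x01" + struct.pack("<B", len(data)) + data
    return bytes(buf)

def unpack(buf: bytes):
    i = 0
    while i < len(buf):
        tag = buf[i]
        i += 1
        if tag == 0:
            yield struct.unpack_from("<i", buf, i)[0]
            i += 4
        else:
            n = buf[i]
            i += 1
            yield buf[i:i + n].decode()
            i += n

assert list(unpack(pack([7, "hi", -1]))) == [7, "hi", -1]
```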


That's a good point! I guess you'd still have to make sure that variants with alignment requirements don't get unaligned. One way to do that would be to add leading padding if they follow an unaligned variant. Adds some complexity to the representation, I guess, and I suspect there's other ways of doing it. Neat problem.

