
Moravec (in the linked paper): "In both cases, the evidence for an intelligent mind lies in the machine's performance, not its makeup."

Do you agree?

I'm much less keen to ascribe "intelligence" to large, pretrained language models given that I know how primitive their training regime is, compared to a scenario where I might have been "blinded" by their ability to "chat" (scare quotes because I know ChatGPT and the like do not have a memory, so all prior interactions have to be re-submitted with each turn of a conversation).

Intuitively, I'd be more prone to ascribe intelligence based on a convincing construction that goes along with intelligence-like performance, especially if the model also makes human-like errors.



I agree with Moravec. As he points out a bit later on:

> Only on the outside, where they can be appreciated as a whole, will the impression of intelligence emerge. A human brain, too, does not exhibit the intelligence under a neurobiologist's microscope that it does participating in a lively conversation.

We only have fuzzy definitions of "intelligence", not any essential, unambiguous things we can point to at a minute level, like a specific arrangement of certain atoms.

Put another way, we've used the term "intelligent" to refer to people (or not) because we found it useful to describe a complex bundle of traits in a simple way. But now that we're training LLMs to do things that used to be assumed to be exclusively the capacity of humans, the term is getting stretched and twisted and losing some of its usefulness.

Maybe it would be more useful to subdivide the term a bit by referring to "human intelligence" versus "LLM intelligence". And when some new developments in AI seem like they're different from "LLM intelligence", we can call them by whatever distinguishes them, like "Q* intelligence", for example.


Agreed, kind of similar to the trope of 'virtual intelligences' vs 'artificial intelligences' in sci-fi, where VIs are a lot more similar to LLM intelligence (imitative, often unable to learn, able to make simple inferences and hold a basic conversation, but lacking that instantly recognizable 'spark' of an intelligent being that we can see in humans - especially kids - and some other animals) than to AIs, which are 'true' intelligences comparable to or exceeding humans in every way.


> The intelligence of a system is a measure of its skill-acquisition efficiency over a scope of tasks, concerning priors, experience, and generalization difficulty.

(Chollet, 2019, https://arxiv.org/pdf/1911.01547.pdf)

Priors here means how targeted the model design is to the task. Experience means how large the necessary training set is. Generalization difficulty means how hard the task is.

So intelligence is defined as ability to learn a large number of tasks with as little experience and model selection as possible. If it's a skill only possible because your model already follows the structure of the problem, then it won't generalize. If it requires too much training data, it's not very intelligent. If it's just a set number of skills and can't learn new ones quickly, it's not intelligent.
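For intuition only, here's a toy sketch in Python of how that kind of scoring might look. To be clear, this is a caricature I made up, not Chollet's actual formalism (his is defined in algorithmic-information-theory terms); the TaskResult fields and the scoring function are purely illustrative.

    from dataclasses import dataclass

    @dataclass
    class TaskResult:
        skill: float        # how well the system performs the task after training
        priors: float       # how much task-specific structure was baked into the model
        experience: float   # how much training data / practice was needed
        difficulty: float   # how hard the task is to generalize to

    def toy_intelligence_score(results: list[TaskResult]) -> float:
        # Credit skill on hard tasks, discounted by how much was "given away"
        # through priors and experience -- a loose caricature of Chollet's idea.
        total = 0.0
        for r in results:
            total += (r.skill * r.difficulty) / (1.0 + r.priors + r.experience)
        return total / len(results)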


Your final paragraph is a poor definition of human-level intelligence.

Yes, learning is an important aspect of human cognition. However, the key factor that humans possess that LLMs will never possess, is the ability to reason logically. That facility is necessary in order to make new discoveries based on prior logical frameworks like math, physics, and computer science.

I believe LLMs are more akin to our subconscious processes like image recognition, or forming a sentence. What’s missing is an executive layer that has one or more streams of consciousness, and which can reason logically with full access to its corpus of knowledge. That would also add the ability for the AI to explain how it reached a particular conclusion.

There are likely other nuances required (motivation etc.) for (super) human AI, but some form of conscious executive is a hard requirement.


This is reminding me again of The Bitter Lesson.

http://www.incompleteideas.net/IncIdeas/BitterLesson.html


From that article: "actual contents of minds are tremendously, irredeemably complex".

But they're not. The "bitter lesson" of machine learning is that the primitive operations are really simple. You just need a lot of them, and as you add more, it gets better.

Now we have a better idea of how evolution did it.


I am not a fan of this article. The very foundation of computer science was an attempt to emulate a human mind processing data.

Foundational changes are of course harder, but that does not mean we should drop it altogether.


> The very foundation of computer science was an attempt to emulate a human mind processing data.

The very foundation of computer science was an attempt to emulate a human mind mindlessly processing data. Fixed that for you.

And I'm still not sure I agree.

The foundation of computer science was an attempt to process data so that human minds didn't have to endure the drudgery of such mindless tasks.


Take a look at Turing's words in his formulation of the Turing machine and I think it becomes quite clear that the man spent time thinking about what he was doing when he did computations himself.

The tape is a piece of paper, the head is the human, who is capable of reading data from the tape and writing to it. The symbols are discernible things on the paper, like numbers. The movement of the tape ("scanning") is the eyes going back and forth. At each symbol, the machine decides which rule to apply.
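To make that correspondence concrete, here's a minimal Turing-machine sketch in Python. The rule table is something I made up for illustration (it just flips a run of bits and halts at a blank); it isn't taken from Turing's paper.

    # (state, scanned symbol) -> (symbol to write, head movement, next state)
    rules = {
        ("scan", "0"): ("1", +1, "scan"),
        ("scan", "1"): ("0", +1, "scan"),
        ("scan", "_"): ("_", 0, "halt"),   # blank square: stop
    }

    def run(tape, state="scan", head=0):
        while state != "halt":
            symbol = tape[head]                       # the "scanned symbol"
            write, move, state = rules[(state, symbol)]
            tape[head] = write                        # write to the "paper"
            head += move                              # the "eyes" move along the tape
        return tape

    print(run(list("0110_")))   # ['1', '0', '0', '1', '_']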

It's an inescapable fact that we are trying to get computers to 1. operate as closely as possible to how we think (as we are the ones who operate them) and 2. produce results which resemble how we think.

Abstractions, inheritance, objects, etc. are no doubt all heavily influenced by thinking about how we think. If we still programmed using 1s and 0s, we wouldn't be where we are.

It seems incredibly short-sighted to me to believe that because a few decades of research hasn't panned out, we should altogether forget about it.


Turing's original paper [1] seems to have no such anthropocentric bias. His description is completely mechanical. Out of curiosity, rather than disputativeness, do you remember where you saw that sort of description? Turing's Computing Machinery and Intelligence paper seems to meticulously exclude that sort of language as well.

You have read me backwards. I am firmly of the opinion that the last few decades of research have entirely and completely panned out. The topic was clearly on the minds of theorists in 1950. But I'm pretty sure early computer architects were more interested in creating better calculators than in creating machines that think.

[1] https://www.cs.virginia.edu/~robins/Turing_Paper_1936.pdf


There is a ton. What I said isn't much my own interpretation; it's Turing's own description of the machine.

The first section of the paper literally says "We may compare a man in the process of computing a real number to a machine which is only capable of a finite number of conditions", gives a human analogue for every step/component of the process, continuously refers to the machine as "he"/"him", and continuously gives justifications from human experience.

"We have said that the computable numbers are those whose decimals are calculable by finite means. This requires rather more explicit definition. No real attempt will be made to justify the definitions given until we reach § 9. For the present I shall only say that the justification lies in the fact that the human memory is necessarily limited. We may compare a man in the process of computing a real number to machine which is only capable of a finite number of conditions q1: q2. .... qI; which will be called " m-configurations ". The machine is supplied with a "tape " (the analogue of paper) running through it, and divided into sections (called "squares") each capable of bearing a "symbol". At any moment there is just one square, say the r-th, bearing the symbol <2>(r) which is "in the machine". We may call this square the "scanned square ". The symbol on the scanned square may be called the " scanned symbol". The "scanned symbol" is the only one of which the machine is, so to speak, "directly aware". However, by altering its m-configuration the machine can effectively remember some of the symbols which it has "seen" (scanned) previously."

"Computing is normally done by writing certain symbols on paper. "We may suppose this paper is divided into squares like a child's arithmetic book. In elementary arithmetic the two-dimensional character of the paper is sometimes used. But such a use is always avoidable, and I think that it will be agreed that the two-dimensional character of paper is no essential of computation. I assume then that the computation is carried out on one-dimensional paper, i.e. on a tape divided into squares"

"The behaviour of the computer at any moment is determined by the symbols which he is observing, and his " state of mind " at that moment. We may suppose that there is a bound B to the number of symbols or squares which the computer can observe at one moment. If he wishes to observe more, he must use successive observations. We will also suppose that the number of states of mind which need be taken into account is finite. The reasons for this are of the same character as those which restrict the number of symbols. If we admitted an infinity of states of mind, some of them will be '' arbitrarily close " and will be confused."

"We suppose, as in I, that the computation is carried out on a tape; but we avoid introducing the "state of mind" by considering a more physical and definite counterpart of it. It is always possible for the computer to break off from his work, to go away and forget all about it, and later to come back and go on with it. If he does this he must leave a note of instructions (written in some standard form) explaining how the work is to be continued. This note is the counterpart of the "state of mind". We will suppose that the computer works in such a desultory manner that he never does more than one step at a sitting. The note of instructions must enable him to carry out one step and write the next note."

"The differences from our point of view between the single and compound symbols is that the compound symbols, if they are too lengthy, cannot be observed at one glance. This is in accordance with experience. We cannot tell at a glance whether 9999999999999999 and 999999999999999 are the same"

Taking all this together, I don't think it's far-fetched to think Turing was very much thinking about the individual steps he was taking when doing calculations manually on a piece of graph paper, while trying to figure out how to formalize them. Perhaps you disagree, but saying it's "completely mechanical" is surely false, no?


Above and beyond the call of duty. Point thoroughly made. Thank you.


"ChatGPT and the likes do not have a memory"

Can't we consider the context window to be memory?

And eventually won't we have larger and larger context windows, and perhaps even individual training where part of the context-window 'conversation' is also fed back into the training data?

Seems like this will come to pass eventually.
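As a sketch of that point, the "memory" is just the transcript the client re-sends on every turn. Note that chat_completion below is a hypothetical stand-in for whatever model API you're calling, not a real library function.

    def chat_completion(messages):
        """Hypothetical stand-in for a call to an LLM API."""
        raise NotImplementedError

    history = [{"role": "system", "content": "You are a helpful assistant."}]

    def ask(user_text):
        history.append({"role": "user", "content": user_text})
        reply = chat_completion(history)   # the *entire* transcript is sent every turn
        history.append({"role": "assistant", "content": reply})
        return reply

    # The model itself is stateless; throw away `history` and the "memory" is gone.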


Reducing the capability of the human brain to performance alone is too simplistic, especially when looking at LLMs. Even if we were to assign some intelligence to LLMs, they need a 400 W GPU at inference time, and several orders of magnitude more of those at training time. The human brain runs constantly at ~20 W.

I highly doubt you'd be able to get even close to that kind of performance with current manufacturing processes. We'd need something entirely different from laser lithography for that to happen.


The problem isn't the manufacturing process, but rather the architecture.

At a low level: We take an analog component, then drive it in a way that lets us treat it as digital, then combine loads of them together so we can synthesise a low-resolution approximation of an analog process.

At a higher level: We don't really understand how our brains are architected yet, just that it can make better guesses from fewer examples than our AI.

Also, 400 W of electricity is generally cheaper than 20 W of calories (let alone the 38-100 W rest of body needed to keep the brain alive depending on how much of a couch potato the human is).


I think the power-efficiency constraint is fairly relaxed: you just need to consider the value of a single digital brain whose performance exceeds human level.


> Also, 400 W of electricity is generally cheaper than 20 W of calories

Are you serious? I don't think you have to be an expert to see that the average human can perform more work per energy intake than the average GPU.

> The problem isn't the manufacturing process, but rather the architecture.

It's very much a problem, good luck trying to even emulate the 3D neural structure of the brain with lithography. And there are few other processes that can create structures at the required scale, with the required precision.


> Are you serious? I don't think you have to be an expert to see that the average human can perform more work per energy intake than the average GPU.

You're objecting to something I didn't say, which is extra weird because I'm just running with the same 400 W/20 W you yourself gave. All I'm doing here is pointing out that 400 W of electricity is cheaper than 20 W of calories especially as 20 W is a misleading number until we get brains in jars.

To put numbers to the point: $0.10/kWh * 400 W * 24 h = $0.96/day, while the UN definition of abject poverty is $2.57/day in 2023 dollars.
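For anyone checking the arithmetic, the same assumed numbers spelled out:

    price_per_kwh = 0.10              # USD, assumed electricity price
    gpu_watts = 400
    hours_per_day = 24

    daily_cost = price_per_kwh * (gpu_watts / 1000) * hours_per_day
    print(daily_cost)                 # 0.96 USD/day to run 400 W around the clock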

As for my opinion on which can perform more work per unit of energy, that idea is simply too imprecise to answer without more detail — depending on what exactly you mean by "work", a first generation Pi Zero can beat all humans combined while the world's largest supercomputer can't keep up with one human.

> It's very much a problem, good luck trying to even emulate the 3D neural structure of the brain with lithography.

IIRC, by volume a human brain is mostly communication between neurones; the ridges exist because most of the complexity is in a thin layer on the surface, and ridges get you more surface.

But that doesn't even matter, because it's a question of the connection graph: each cell has about 10,000 synapses, and that connectivity can be instantiated in many different ways, even on a 2D chip.

We don't have a complete example connectivity graph for a human brain. Got it for a rat, I think, but not a human, which is why I previously noted that we don't really understand how our brains are architected.

> And there are few other processes that can create structures at the required scale, with the required precision.

Litho vastly exceeds the required precision. Chemical synapses are 20-30 nm from one cell to the next, and even the more compact electrical synapses are 3.5 nm.


>The human brain runs constantly at ~20 W.

Most likely because any brains that required more energy died off over evolutionary time scales. And while there are some problems with burning massive amounts of energy to achieve a task (see: global warming), this is not likely a significant shortcoming that large-scale AI models have to worry about. Seemingly there are plenty of humans willing to hook them up to power sources at this time.

Also, you might want to consider the 0-16-year training stage for humans, which has become more like 0-21 years, with a minimum of 8 hours of downtime per day. This adjusts the power dynamics considerably: the time spent actually thinking drops to around a third of the day, boosting effective power use to ~60 W (as in, you've wasted two-thirds of the power on eating, sleeping, and pooping). In addition, the model you've spent a lot of power training can be duplicated across thousands or millions of instances in short order, whereas you're praying that the human you've trained doesn't step out in front of a bus.

So yes, reducing the capability of a human brain/body to performance alone is far too simplistic.


Some of those 8 hours of sleep include fine-tuning on recent experiences and simulation (dreams).


> We'd need something entirely different from laser lithography for that to happen.

Like, you know . . . nerve cells.


Care to explain?


I was being facetious in that brains already exist.


LLMs are radically unlike organic minds.

Given their performance, I think it is important to pay attention to their weirdnesses — I can call them "intelligent" or "dumb" without contradiction depending on which specific point is under consideration.

Transistors outpace biological synapses by the same degree to which a marathon runner outpaces continental drift. This speed difference is what allows computers to read the entire text content of the internet on a regular basis, whereas a human can't read all of just the current version of the English language Wikipedia once in their lifetime.

But current AI is very sample-inefficient: if a human were to read as much as an LLM, they would be world experts at everything, not varying between "secondary school" and "fresh graduate" depending on the subject… but even that description is misleading, because humans have the System 1/System 2[0] distinction and limited attention[1], whereas LLMs pay attention to approximately everything in the context window and (seem to) be at a standard between our System 1 and System 2.

If you're asking about an LLM's intelligence because you want to replace an intern, then the AI are intelligent; but if you're asking because you want to know how many examples they need in order to decode North Sentinelese or Linear A, then (from what I understand) these AI are extremely stupid.

It doesn't matter if a submarine swims[2], it still isn't going to fit into a flooded cave.

[0] https://en.wikipedia.org/wiki/Thinking,_Fast_and_Slow

[1] https://youtu.be/vJG698U2Mvo?si=omf3xleqPw5u6Y2k

https://youtu.be/ubNF9QNEQLA?si=Ja-9Ak4iCbcxbWdh

https://youtu.be/v3iPrBrGSJM?si=9cKHXEvEGl764Efa

https://www.americanbar.org/groups/intellectual_property_law...

[2] https://www.goodreads.com/quotes/32629-the-question-of-wheth...


> LLMs are radically unlike organic minds.

I used to think this too, but now I'm not so sure.

There's an influential school of thought arguing that one of the primary tasks of the brain is to predict sensory input, e.g. to take sequences of input and predict the next observation. This perspective explains many phenomena in perception, motor control, and more. In an abstract sense, it's not that different from what LLMs do -- take a sequence and predict the next item in a sequence.

The System 1 and System 2 framework is appealing and quite helpful, but let's unpack it a little. A mode of response is said to be "System 1" if it's habitual, fast, and implemented as a stimulus-response mapping. Similar to LLMs, System 1 is a "lookup table" of actions.

System 2 is said to be slow, simulation-based, etc. But tasks performed via System 2 can transition to System 1 through practice ('automaticity' is the keyword here). Moreover, pre-automatic, slow System 2 actions are compositions of simpler actions. You deliberate about how to compose a photograph, or choose a school for your children, but many of the component actions (changing a camera setting, typing in a web URL) are habitual. It seems to me that what people call System 2 actions are often "stitched-together" System 1 behaviors. Solving a calculus problem may be System 2, but adding 2+3 or writing the derivative of x^2 is System 1. I'm not sure the distinction between Systems 1 and 2 is as clear as people make it out to be -- and the effects of practice make the distinction even fuzzier.

What does System 2 have that LLMs lack? I'd argue: a working memory buffer. If you have working memory, you can then compose System 1 actions. In a way, a Turing machine is System 1 rule manipulation + working memory. Chain-of-thought is a hacky working memory buffer, and it improves results markedly. But I think we could do better with more intentional design.
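A toy sketch of that framing, with everything (the lookup table, the step format) invented for illustration: treat System 1 as fast stimulus-response lookups, and give the system an explicit scratchpad for intermediate results -- which is roughly what chain-of-thought does by having the model emit its scratchpad as text.

    # "System 1": fast, habitual stimulus -> response lookups.
    system1 = {
        ("add", 2, 3): 5,
        ("add", 5, 4): 9,
        ("derivative", "x^2"): "2x",
    }

    def solve_with_scratchpad(steps):
        scratchpad = []                        # the working-memory buffer / chain of thought
        result = None
        for step in steps:
            result = system1[step]             # each individual step is a fast lookup
            scratchpad.append((step, result))  # intermediate results stay available
        return result, scratchpad

    # "System 2"-style problem: (2 + 3) + 4, stitched together from System 1 lookups
    # (the composition -- feeding 5 into the second step -- is done by hand here).
    answer, trace = solve_with_scratchpad([("add", 2, 3), ("add", 5, 4)])
    print(answer)   # 9
    print(trace)    # the externalized "chain of thought"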

[1] https://mitpress.mit.edu/9780262516013/bayesian-brain/

[2] https://pubmed.ncbi.nlm.nih.gov/35012898/


> if a human were to read as much as an LLM, they would be world experts at everything

It would take a human more than one lifetime to read everything most LLMs have read. I often have trouble remembering specifics of something I read an hour ago, never mind a decade ago. I can't imagine a human being an expert in something they read 300 years ago.


Intelligence is what you call it when you don't know how it works yet.


For reference, I believe relatively rudimentary chatbots have been able to pass forms of the Turing Test for a while, while being unable to do almost all the things humans can do. It turns out you can make a chatbot quite convincing for a very short conversation with someone you've never met over the internet[1]. I think there's a general trend that fooling perception is significantly easier than having the underlying capabilities. Maybe with sufficiently advanced testing you could judge, but I don't think this holds in general: our thoughts may be significantly richer than what we can express in conversation. One obvious example is people with disabilities that prevent speech, who can't talk at all while still having thoughts of their own: output capability need not reflect internal capability.

Also, take this more advanced example: if you built a sufficiently large lookup table (of course, you'd need (possibly human) intelligence to build it, and it would be of impractical size), you could build a chatbot completely indistinguishable from a human that nonetheless doesn't seem to really be intelligent (or conscious, for that matter). The only operations it would perform would be decoding the input into an astronomically large number and then using some (possibly rudimentary) seeking apparatus to retrieve the corresponding entry from the lookup table. The input size can be arbitrarily large, enabling arbitrarily long conversations. It seems that to judge intelligence we really need to examine the internals and look at their structure.
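A degenerate sketch of that thought experiment (it's sometimes called the "Blockhead" argument): the table is keyed on the entire conversation so far, and the real version would need an entry for every possible conversation prefix, which is what makes it astronomically large. The few entries here are obviously made up.

    # A vanishingly small fragment of the giant lookup table, keyed on the whole
    # conversation so far. No reasoning, no internal model -- just retrieval.
    table = {
        ("Hello",): "Hi there! How can I help?",
        ("Hello", "Hi there! How can I help?", "What's 2+2?"): "4.",
    }

    def blockhead_reply(conversation):
        return table.get(conversation, "[no entry: this toy table is far too small]")

    print(blockhead_reply(("Hello",)))   # "Hi there! How can I help?"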

I have a hunch that our particular 'feeling of consciousness' stems from the massive interconnectivity of the brain. All sorts of neurons from around the brain show some activity all the time (I believe the brain's energy consumption doesn't vary greatly with activity, so it is always in use), and whatever we perceive passes through an enormous number of interconnected neurons (representing concepts, impressions, ideas, etc.), unlike typical CPU-based algorithms that process a somewhat large number (potentially hundreds of millions) of steps sequentially. My intuition does seem to fit with modern neural architectures (i.e. they could have some consciousness?), but I really don't know how we could make this sort of judgement before we better understand the details of the brain and other properties of cognition. I think this is a very important research area, and I'm not sure we have enough people working on it or using the correct tools (it relies heavily on logic, philosophy and metaphysical arguments, as well as requiring personal insight from being conscious :) ).

[1] From sources I can't remember, and from wiki: https://en.wikipedia.org/wiki/Turing_test#Loebner_Prize



