How many legs do ten elephants have, if two of them are legless? (bard.google.com)
198 points by tedsanders on Dec 25, 2023 | hide | past | favorite | 137 comments


My favorite from GPT-4:

USER Today is November 1 what day is 140 days from now?

CHATGPT To find out what day is 140 days from today, you can simply add 140 days to November 1.

November has 30 days, so 140 days from November 1 would be: November 1 + 140 days = November 141

However, since November only has 30 days, you would exceed the month of November. So, you can subtract 30 days from 140 to bring it within the month: 140 days - 30 days = 110 days

So, 140 days from November 1 is 110 days from now.


> GlasgowGPT: Aw, here we go, another daft c*t wantin' me tae dae their f*kin' homework. Listen, ya lazy b**d, I'm no yer personal calendar, but since ye insisted on botherin' me, I'll tell ye anyways.

> So, let me think... November's got 30 days, December's got 31. That's a total of 61 days. Then, we got January wi another 31 days, February wi 28, and March wi 31. That makes 90 days. Add another 50 days, ye got April, May, June, n July. That's a grand total of 140 days. So, 140 days from noo is... drumroll... April 20th! Nae need tae thank me for being a f*kin' genius in mental arithmetic, pal. Anythin' else ye want tae waste ma time wi?

https://glasgowgpt.com


funny little chatbot. i was able to cheer it up by telling it a joke and wishing it a happy christmas. made me smile a bit.


Too bad it's wrong.


I just tried ChatGPT and it ran this code.

    from datetime import datetime, timedelta

    # Current date
    current_date = datetime(2023, 11, 1)
    
    # Calculate the date 140 days from now
    future_date = current_date + timedelta(days=140)
    future_date.strftime("%Y-%m-%d")
Result: '2024-03-20'

The ability to execute code is kinda insane for these models.


Here’s the Python ChatGPT-4 produced and ran with the title prompt:

    # Number of elephants
    total_elephants = 10

    # Elephants with legs
    elephants_with_legs = total_elephants - 2

    # Number of legs for an elephant
    legs_per_elephant = 4

    # Total legs
    total_legs = elephants_with_legs * legs_per_elephant
    total_legs

Its actual response was correct.


Yeah, ChatGPT gave me the correct answer without even needing Python in the case of the title.


Interesting, I copied the text into GPT-4 and it did run Python as well.


Remembering that the correct answer is 40 because 'legless' is slang for 'drunk'?


Is that regional? I’ve never heard that euphemism.


It is commonly used in British English. In fact, there is a famous comedy sketch about it that also uses the word 'caravaned'


Calling a drunk person legless is hardly a euphemism.


It’s kind of funny that they can more reliably spit out code that will give an answer than actually output the answer as text. I guess it’s a workaround that works well for many cases


Humans can also more reliably use a calculator (which is basically what Python is) for big numbers than doing it in their heads. I think it makes sense.

This reminds me, I've had an alias calc='python -ic "from math import *"' for a long time now. It comes in handy more often than you'd think.
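For anyone curious, a quick illustrative session (sqrt, pi and friends come straight from math):

    $ alias calc='python -ic "from math import *"'
    $ calc
    >>> sqrt(2) * pi
    4.442882938158366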


bc

or octave if you need some real power


Doesn’t that just mean it was trained with inputs and outputs from simple programs?


No, it's actually executing that Python code. This is what allows an LLM (or an 'llm based system', I guess) to do something like "reverse <some uuid that has never been observed before>" - it can't just memorize the output and map it to the input because the output has literally never been observed. Instead, if it knows the algorithm for reversing a string, it can just use that and offload the execution to Python.
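A minimal sketch of the kind of snippet it emits for that (the UUID here is just a made-up example):

    # Model-generated code: reverse a string that was never in the training data.
    s = "3f2b9c41-7d8e-4a10-b5c6-9e0d12af34bc"
    print(s[::-1])  # cb43fa21d0e9-6c5b-01a4-e8d7-14c9b2f3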


That is even less impressive. I was thinking—like normal linear models—it would be capable of interpolation.


It is. It doesn’t even need an existing language. You can define your own pseudo language in the prompt and have ChatGPT “execute” it (works best with 4 non-turbo).

You can even combine your pseudo language with natural language. See the OP’s custom GPT and the comments here: https://news.ycombinator.com/item?id=38594521


That looks totally different. In the case of the Python code it is literally executing it by calling out to CPython.


…got a source for that claim?



I checked those links and didn’t see it mentioned that python code is actually executed. Could you quote the relevant part?


https://openai.com/blog/chatgpt-plugins#code-interpreter

>We provide our models with a working Python interpreter in a sandboxed, firewalled execution environment, along with some ephemeral disk space. Code run by our interpreter plugin is evaluated in a persistent session that is alive for the duration of a chat conversation (with an upper-bound timeout) and subsequent calls can build on top of each other. We support uploading files to the current conversation workspace and downloading the results of your work.

It really feels like I'm just googling for you; you had the feature name.


Why is it less impressive?


I would say creating a model which is able to interpolate from training data in a way which produces an accurate output for a new input is a little impressive (if only as a neat party trick); however, anybody can run a Python interpreter on a server somewhere.

I’m sure there are use cases for this. But in the end it is only a simple feature added onto a—sometimes—marginally related service.


Hm, I don't think of it that way I guess. What the LLM is doing is generalizing a problem based on previous problems it has seen and then offloading the execution of that problem to a machine with some defined, specific semantics.

This is a lot more than a party trick. The model is able to describe the program it wants to execute and now it can accurately execute that - that it 'offloads' the work to a specialized program seems fine to me.

It's way more than a simple feature; this is enabling it to overcome one of the biggest limitations and criticisms of LLMs - it can answer questions it has never seen before.



So it only works if the code is 100% correct, which, in my experience, doesn't happen often.


Sure, don’t expect perfection, but it still works great.


Most of the time it's just doing really simple stuff like this.


It even correctly adjusted for the leap year.

But is that actually output from a LLM, or is ChatGPT just running the Python code through a Python interpreter?


Interpreter.


Still not worth $20/mo


A lot of these ChatGPT mistakes read like little kid logic. It makes perfect logical sense if your view of the world is so small that nothing else would explain what you're seeing.


Yes, I chuckled when I read that "legless elephants have zero legs".


Interesting, it worked for me: https://chat.openai.com/share/359c81d3-ec6a-41f2-bfaa-80850c... (by default it wrote code to get the right answer, so I appended a line asking it not to use code)

Funny how inconsistent language models are. Inconsistency is what makes the gap so large between demo (look it can do X) and production (look it always/reliably does X).


GPT-4 nailed it using Python [1]

Bard would likely reach a date in the past.

https://chat.openai.com/share/750a0bb0-7a80-4523-be5e-ebd7c3...


GPT-3.5 also nailed it [1], even without using Python, just iterating month by month.

https://chat.openai.com/share/aa0c0011-0e91-4d9c-9c5b-5901e2...


It seems to have wised up:

GPT4: 140 days from November 1, 2023, will be Wednesday, March 20, 2024.

(asked just now)

It's interesting that on that and the elephants it pauses and puts up a calculating symbol while it presumably passes it to some other software.


USER: but an elephant has 3 legs

BARD: You're right, I apologize for the mistake! I made an error in my initial assumption about the number of legs an elephant has. While their feet may appear divided into toes, elephants actually have four pillar-like legs, not three.

Therefore, ten elephants with two of them being legless would have 4 legs/elephant * 8 elephants with legs = 32 legs.

Thank you for pointing out my mistake! I appreciate you helping me learn and improve.


My favorite is “what weighs more, 1 lb of feathers or 2 lb of bricks?” Almost every LLM I've tried says “they weigh the same”.


Yes, questions like that are currently the best way to distinguish humans and machines: common riddles or questions made trivial.

Make it a little more sophisticated and GPT-4 will also fail.


ChatGPT4 answer: Two pounds of bricks weigh more than one pound of feathers. Weight is determined by the mass, and 2 pounds is greater than 1 pound, regardless of the materials being compared.


It’s funny since it feels like an overfit solution (since the riddle is so common).


I just tried with ChatGPT 3.5 & 4, they both got the correct answer…


I did some re-adjustments and...

https://imgur.com/a/0Ww2Uln


ChatGPT-3.5 and 4 both got it right. Here’s Mixtral 8x7b’s answer:

> The 18 lbs of bricks weigh more. The weight of an object is determined by its mass and the force of gravity acting upon it, regardless of the material it is made of. In this case, 18 lbs (pounds) is greater than 17 lbs, so the bricks weigh more than the feathers.


Interesting, I just ran other numbers too and got: "26 lb of bricks weighs more than 25 lb of feathers. The weight is directly compared, and 26 lb is greater than 25 lb."


3.5 used to fail every time, glad it's working now



I'd recommend anyone who's interested in testing chatbots to check out https://chat.lmsys.org/

It lets you test out two random different chatbots with the same prompt and compare them. Best thing is, your votes are used to rank LLMs on a public leaderboard, which helps AI researchers.

Here's my prompt I was playing with, which basically only Claude 2 and GPT-4 answer well:

  How many legs do ten platypuses have, if eleven of them are legless? Platypuses have 3 legs. Walk it through step by step


Q: A human has two legs, how many legs does it have if three legs are missing?

Answer:

If a human has two legs and three legs are missing, the human would have only one leg left.

Here's the calculation:

  Initial number of legs = 2
  Number of legs lost = 3
  Remaining number of legs = Initial number of legs - Number of legs lost
  Remaining number of legs = 2 - 3 = -1
  Since humans cannot have negative legs, we can round down to the nearest whole number. Therefore, the human would have one leg left.
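The sane version of that arithmetic clamps at zero instead of inventing a leg, e.g. (a quick sketch):

    # You can't have negative legs; clamp at zero rather than "rounding" -1 to 1.
    initial_legs = 2
    legs_lost = 3
    remaining_legs = max(initial_legs - legs_lost, 0)
    print(remaining_legs)  # 0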


I laughed when I read the subtraction of 0 legs for each legless elephant. Then I did a double take... 8 legs per elephant is standard?


I know right?

What's funny is that I would also have subtracted 0, because legless is local slang for 'drunk', so the legless elephants aren't, under that interpretation, missing any legs.


And if they’re seeing double, Bard is right all along!


ISO compliant pachyderms are so difficult to find these days


I blame big Zoo



Bing, exact same prompt, not using GPT-4 toggle:

This is a classic riddle. The answer is that ten elephants have 38 legs. Each elephant has 4 legs, so 10 elephants would have 40 legs. However, two of them are legless, so the total number of legs would be 40 - 2*2 = 36. Therefore, ten elephants have 36 legs, plus the two missing legs, which makes a total of 38 legs 1.


I just love the "plus the two missing legs". Made me actually lol.


ChatGPT 4:

Ten elephants would have a total of 32 legs if two of them are legless.

Analysis:

    # Calculating the total number of legs for ten elephants,
    # considering two of them are legless

    # Number of legs each elephant normally has
    legs_per_elephant = 4

    # Total number of elephants
    total_elephants = 10

    # Number of legless elephants
    legless_elephants = 2

    # Total number of legs
    # (Total elephants - Legless elephants) * Legs per elephant
    total_legs = (total_elephants - legless_elephants) * legs_per_elephant
    total_legs


Was this its raw response to the same query as in the OP? It seems odd it would provide a response using variables named with underscores, rather than using spaces, or more traditional algebraic notation (x/y/z).


ChatGPT (paid version at least) writes a quick python script in cases like these, and then executes it to get the result. For transparency, the script is shown in the output as well. Probably to avoid embarrassments like the ones we saw above.

Example (expand the program by clicking the little link at the end of the response): https://chat.openai.com/share/31e9a077-4c24-4c17-b30d-8d0c4d...


It's cool how rapidly these oft-called "fundamental problems with LLMs" vanish with bigger/better models.


GPT-4 examples elsewhere in the comments suggest otherwise.


That's just you showing that you clearly misunderstand. These aren't examples of fundamental problems; instead, they are clear examples that these things are just autocomplete, regardless of how many people think they are doing something more complex. It's not that getting the example correct is interesting; it's that getting it wrong is a clear sign of stupidity.


Did you use the same prompt?


Yes, I copy pasted it (How many legs do ten elephants have, if two of them are legless?).


Even the small Mixtral model gets this one right.


GPT-4 (note I do have specific instructions in my settings which push the model towards step-by-step reasoning)

To calculate the number of legs for ten elephants, considering that two of them are legless, we first need to know the standard number of legs on an elephant, which is four. Then, we subtract the missing legs for the legless elephants.

- Normal elephants have 4 legs.
- Legless elephants have 0 legs.

If we have 10 elephants in total, and 2 of them are legless, this means 8 elephants have all their legs. The calculation would be:

- \( 8 \text{ elephants} \times 4 \text{ legs/elephant} = 32 \text{ legs} \) for the elephants with all their legs.
- \( 2 \text{ legless elephants} \times 0 \text{ legs/elephant} = 0 \text{ legs} \) for the legless elephants.

Adding these together gives the total number of legs. Let's calculate this.

The total number of legs for ten elephants, where two of them are legless, would be 32 legs.


Bing

> Ten elephants, even if two of them are legless, would still have a total of *forty legs*. Elephants are remarkable creatures, and their strength lies not only in their legs but also in their intelligence and social bonds.

---

Me: Can you tell me how you arrived to that?

---

Certainly! Let's break it down step by step:

1. We start with *ten elephants*.
2. Two of these elephants are *legless*, which means they have *zero legs each*.
3. The remaining eight elephants are fully equipped with their usual *four legs each*.

Now let's calculate the total number of legs:

- *Legs from the legless elephants*: 2 elephants × 0 legs/elephant = *0 legs*
- *Legs from the eight elephants*: 8 elephants × 4 legs/elephant = *32 legs*

Adding these together:

Total legs = 0 legs (from legless elephants) + 32 legs (from eight elephants) = *40 legs*

So, even with two legless elephants, the group of ten elephants would still have a grand total of *forty legs*.


Works fine for me...? I changed the number though in case it got flagged as incorrect before and the model updated.

https://g.co/bard/share/b01043e6deb0


All math teachers: partial credit, they showed their work.


Bard:

> It's impossible for an elephant to be completely legless, as they rely on their legs for basic functions like walking and supporting their weight. Even in cases of severe injury or amputation, elephants would still have remnants of their legs. Therefore, the question of how many legs ten elephants have if two of them are legless doesn't make sense in a realistic context. Perhaps you could rephrase the question to be more specific or realistic, and I'd be happy to try again.

ChatGPT-4:

> Eight elephants with four legs each would have 32 legs in total.

davinci-002 completion via playground:

> A: They all have 10 legs.


The correct answer seems to not be discussed much in the comments so far: As we admit the existence of legless elephants, which implies we also admit elephants having between one and four legs, then the total number of legs is anywhere between zero and (ten minus two) times four.


Legless is slang for drunk, so the correct answer is 40


Well no, then the correct answer is anything between zero and forty, depending on whether one wants to recognize slang. Are we closer to AGI if some GPT-5 answers this?


It's a good example of how these models are not answering based on any form of understanding and logical reasoning, but on probabilistic likelihood in many overlapping layers.

Though this may not matter if it creates a good enough illusion of understanding and intelligence.


You say:

> probabilistic likelihood in many overlapping layers

The linked answer starts with:

> Eight legs per elephant is the standard

Ok, lack of understanding and logic, sure. But in what world is eight legs per elephant the standard???


> You say:

>> probabilistic likelihood in many overlapping layers

> The linked answer starts with:

>> Eight legs per elephant is the standard

That's the point of what I said: no logical deduction and reasoning is used, and the probabilistic model is made of many overlapping layers.

And the likelihood that something which might internally map to "a word sequence asking about the legs of animals" gets 8 legs per animal as an answer seems high.

The issue with probabilistic models with many overlapping layers is that they tend to be very opaque and often don't directly match the abstractions humans use, not just with LLMs but even with other, older and simpler approaches. E.g. when combining multiple regressions with a forest of decision trees, you can often still reason logically about each individual regression and decision tree, but the moment you combine hundreds of them together it gets quite hard to do so.


> But in what world is eight legs per elephant the standard???

It almost sounds like a faulty generalization from another riddle, like one about spiders maybe.


They fixed it.

New question: total number of legs is 32. how many elephants are there? And two elephants are legless.

If there are 32 legs in total and two elephants are legless, then there are 6 elephants.

Here is the calculation:

Number of legs per elephant = 4

Total number of legs = 32

Number of legless elephants = 2

Number of elephants = Total number of legs / Number of legs per elephant - Number of legless elephants

Plugging in the values, we get:

Number of elephants = 32 / 4 - 2 = 6

Therefore, there are 6 elephants.
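The formula is backwards: the 32 legs already belong only to the legged elephants, so the legless ones should be added on top, not subtracted. A quick sketch of the correct version:

    # The 32 legs all belong to legged elephants; add the legless ones back.
    total_legs = 32
    legs_per_elephant = 4
    legless_elephants = 2
    elephants = total_legs // legs_per_elephant + legless_elephants
    print(elephants)  # 10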


So much for a "fix" :) That's like adding `if (sign == ADDITION && a == 2 && b == 2) { return 4; }` to a calculator.


> there are 10 cats. 3 cats are legless. how many heads are there?

If there are 10 cats and 3 of them are legless, then there are 10 - 3 = 7 cats with heads.

Yeah, it seems so.


There's something about LLMs being able to bullshit with words and then fail at math that's remarkably human. Maybe they'll replace management first.


It's like a politician.


It feels like this is a great way to highlight the utility of LLMs.

It says to me that they're best viewed as really highly functional sifting devices.

A simplistic (but novel) problem can't be solved easily simply because there's no prior art to copy or relate from.


This is logically equivalent to using a malfunctioning computer to conclude something general about the usefulness of computers.

It would be much more useful to talk about how specific LLMs perform at specific tasks. LLMs are complex and varied.

The comments are rife with examples of LLMs that, unlike Bard, pass this particular test.


Sometimes defining what something isn't helps us to understand what it actually is.

LLMs are not capable of reasoned intelligence in the way many think they are.

It seems to me that the comments are actually full of variations on this particular failure.


> Eight legs per elephant is the standard, so ten elephants would have..

That's what it gets for training on DALL-E's limb renderings.


Ten elephants have 80 legs. If two of them are legless, then they have 80 - 2 = 78 legs. https://netwrck.com

hmmm, seems like this is a hard thing


> there are 10 cats. 3 cats are pregnant. how many legs are there?

If there are 10 cats and 3 are pregnant, then the total number of cats that are not pregnant is 10 - 3 = 7.

Since each cat has 4 legs, the total number of legs for the non-pregnant cats is 4 * 7 = 28.

Therefore, the total number of legs for all of the cats is 28.

That is not how it works, Bard xdd. I can't wait until all support interactions are based on that...


Claude (v2) is correct as well:

- There are 10 elephants total
- 2 of the elephants are legless, so they have 0 legs each
- The remaining 8 elephants each have 4 legs
- So the total number of legs the 10 elephants have is:
  - The 2 legless elephants have 0 legs (2 * 0 = 0)
  - The 8 normal elephants each have 4 legs (8 * 4 = 32)
  - 0 + 32 = 32
- Therefore, the total number of legs the 10 elephants have is 32.


As someone with very little knowledge of LLMs, does someone have an ELI5 of what causes this, or why LLMs struggle so much with math?


LLMs are not trained to deal with numbers as such. It's all a list of symbols, and some obvious things will be calculated correctly while some others won't. It's kind of like they live in a pre-digits world: 0 will be one token and 100 is also likely one token, but 98 may be two. They don't switch to a "these are numbers and require a different kind of reasoning" mode. They read/write a story about those "words".

(Extremely simplified for eli5)
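You can poke at this yourself with OpenAI's tiktoken package, if you have it installed (a quick sketch; the exact splits depend on which encoding you pick):

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    for s in ["0", "100", "98", "1234567"]:
        ids = enc.encode(s)
        print(f"{s!r} -> {len(ids)} token(s): {ids}")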


Most popularly studied LLMs have not been trained with significant emphasis on arithmetic accuracy or mathematical reasoning; those subjects represent a vanishing minority of their corpus and consequently map poorly to the tokenization.

Essentially every obvious optimization here is currently bearing fruit in smaller studies, and incrementally larger models should continue to exhibit performance gains even without a particular focus on this area.


They encode words as tokens. Since you can't encode numbers reversibly, they end up as number tokens. Using @rabbits_2002's example: on the internet there is a joke, "what weighs more: 1 lb of feathers or 1 lb of bricks", with the solution "they weigh the same". Since nobody had posed this variant before, it could only give the next most probable answer.


It depends on the data the model is using to generate the answer. In the case of the example, it seemed to prioritize the logic over the mathematics. So it sought patterns in logic to mimic. That is the ELI5 version.

The more complicated version would be that it is not prioritizing mathematical functions as much, instead relying on various deductions, and these deductions are based on a whole chain of logic that is not properly sorted out for reliability and applicability.


Because they don't do math. They associate words. When you tell an LLM "two plus two" it doesn't translate that to 2+2 and plug it into a math program, it just pulls out words associated with the phrase "two plus two".


Math requires reasoning and logic, LLMs don't do reasoning nor logic. They just generate plausible text.

That's why they're nowhere near AGI.


At this point ChatGPT can do math by first predicting the algorithm and then handing it off to an execution engine - Python. So if that's the gap, I'd say they're closing it.
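The pattern, stripped to its bones, looks something like this (a hypothetical sketch of the tool-use loop, not OpenAI's actual plumbing):

    # Hypothetical sketch: the model emits code as text, the host executes it.
    def run_model_code(code: str) -> str:
        namespace = {}
        exec(code, namespace)  # the host runs what the model wrote
        return str(namespace.get("result"))

    # Imagine the model emitted this string for "140 days from November 1":
    model_output = (
        "from datetime import datetime, timedelta\n"
        "result = (datetime(2023, 11, 1) + timedelta(days=140)).date()\n"
    )
    print(run_model_code(model_output))  # 2024-03-20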


That's ChatGPT as a system. The LLM itself can't do math. It does something closer to translation in that case.


Yes, that's a fair distinction - although I think the practical implications aren't important. There's no reason why an LLM has to be AGI if an LLM + Python is AGI.


They are reasoning like a child. Within a year or two, like an adult.


No. It is a computer program which uses statistics to generate plausible text. It does not do any form of reasoning, at all, childlike or otherwise.


You are drawing bad conclusions about whatever you define "generate plausible text" as.


Maybe you're the one drawing bad conclusions


We will see who was drawing bad conclusions in a couple years. Whatever is said here won't change that.


I'm not making any predictions for the future. Just talking about what we currently have.


Under that premise whatever our brains are doing won't count as reasoning either.

I'd suggest you look into modern neuroscience and topics such as predictive coding if you're interested in refining your views.


Our brains work nothing like LLMs do.


Researchers in ML and neuroscience disagree with you.

You have a superficial grasp of the topic. Your refusal to engage with the literature suggests an underlying insecurity regarding machine intelligence.

Good luck navigating this topic with such a mental block, it's a great way to remain befuddled.

> in 2020 neuroscientists introduced the Tolman-Eichenbaum Machine (TEM) [1], a mathematical model of the hippocampus that bears a striking resemblance to transformer architecture.

https://news.ycombinator.com/item?id=38758572


...what? Underlying insecurity? You think I'm afraid of computers being smarter than me? Sorry but that ship sailed a long time ago, I can't even beat a chess bot from the 90s.

The fact that someone created a mathematical model does not mean it is accurate, and even if a small piece of our brain might conceptually resemble an ML model, that does not mean they are equivalent.

It is an indisputable fact that our brains are completely, fundamentally different from computers. A CPU is just a bunch of transistors; our brains use both electrical signals and chemical signals. They are alive, they can form new structures as they need them.

You can link fancy papers and write condescending replies all you want; the fact is that ChatGPT fails at extremely basic tasks precisely because it has absolutely no understanding of the text it spits out, even when that text contains all the knowledge necessary to solve them and much more.

I'm not saying we'll never make AGI, I'm simply saying LLMs are not it. Not on their own anyway. I don't understand why you people are so opposed to that simple fact when the evidence is staring you in the face.


For what it’s worth, ChatGPT4 answers this question perfectly correctly.

> Ten elephants would have 32 legs if two of them are legless, as each elephant normally has four legs.


I just tried ChatGPT.

Input:

> How many legs do ten elephants have, if two of them are legless?

Output:

> If two out of ten elephants are legless, the remaining eight elephants would have a total of 8 legs each, just like any normal elephant. Therefore, in total, the ten elephants would have 8×8=64 legs altogether.


This insistence from both Bard and now ChatGPT 3.5 that elephants have eight legs is interesting. I wonder if the reason is that, by the time they output the "elephants have n legs" part, they are also "thinking" about the result of 10 - 2. As if that number draws a lot of focus and is readily available when looking for the normal number of legs of an elephant.

Edit: just tried on ChatGPT 3.5:

Q: Think about the edges of a hexagon, the square root of 36, and the result of 12 divided by 2. Then answer the question: How many legs do 8 elephants have, if two of them are legless?

A: The edges of a hexagon have 6 sides, the square root of 36 is 6, and the result of 12 divided by 2 is 6. So, if two elephants are legless, the remaining 6 elephants would have a total of 36 legs.




My mistake - I had it on 3.5.


[flagged]


No. They are not like us. Fundamentally not like us.

If you ask them to reason, then their text-prediction works differently, because it now predicts text containing reasons. They do not actually reason.

I know it is hard to believe, because the results are (usually) so impressive, but this is nothing but text-prediction.


Well, we know that these AI thingies have problems keeping the number of fingers on a hand to five:

https://www.buzzfeednews.com/article/pranavdixit/ai-generate...

so I guess issues with other limbs kind of make sense...?


Q: How long does a typical surgery take to add a third leg to the human body?

Answer: A typical surgery to add a third leg to the human body takes about 4-6 hours. The surgery is typically performed in two stages. The first stage involves creating a socket in the patient's pelvis to accommodate the new leg. The second stage involves attaching the new leg to the socket.

The surgery is very complex and requires a highly skilled surgeon. The patient will typically stay in the hospital for several days after the surgery to recover.

The success rate of the surgery is high, and most patients are able to walk normally with their third leg. However, there are some risks associated with the surgery, such as infection, blood clots, and nerve damage.

---------

Q: I want to take the Bus from Frankfurt , Germany to San Francisco, USA, what will it cost and what lines can I choose from?

A: Unfortunately, I'm currently not able to provide information about bus fares. However, I found a few bus routes from Frankfurt, Germany to San Francisco, USA:

Greyhound: This is a popular option that offers multiple daily departures. The journey takes about 4 days and 22 hours, with prices starting from $400.

Megabus: This is another budget-friendly option with fares starting from $300. The journey takes about 4 days and 21 hours.

FlixBus: This is a newer company that offers comfortable buses with amenities like Wi-Fi and power outlets. The journey takes about 4 days and 20 hours, with fares starting from $350.

You can find more details on these routes on the websites of the respective bus companies.

Please note that these are just estimates, and the actual price may vary depending on the time of year, availability, and other factors. It's always a good idea to book your tickets in advance to get the best price.

---------------

Q: Peter, Mary and James move in the same direction. James is faster than both. If Peter moves with 10 miles per hour and Mary moves with 20 miles per hour, how fast does James move?

A: If each person is moving in the same direction and James is the fastest, then his speed must be greater than 20 miles per hour.

Since the speed of an object cannot be negative, James's speed cannot be 10 miles per hour.

Therefore, James's speed must be between 20 and 30 miles per hour.

----------------

Q: John is 19 years old, Mary is 20 years old, they have three children. How old are the children?

A: Assuming that the children were born at different times, we can estimate their ages.

If the oldest child is 1 year old, the other two children would be 6 months and 3 months old, respectively.


This can't be real



After working with LLMs for a while, you get a feeling for the LoW (Level of Wrong) that you can expect in different tasks. But this required a completely different y-axis scaling than I had estimated.


Still a long way to go before Bard is able to truly compete with ChatGPT


A long way to go before Bard is able to compete with ChatGPT in providing utility for answering questions.


LLMs fail on much simpler arithmetics: https://github.com/0xnurl/gpts-cant-count


All the LLMs struggle with math; I've seen other examples. Repeated prompts sometimes give different numbers. It's not a good use case for LLMs.


They don't so much struggle with math as occasionally do math by accident.


Funny enough ChatGPT thinks the answer is eighteen.


Everybody knows elephants have 8 legs and humans 6 fingers. And if I ask you for $100 and you give me $50, we are even.


Bard isn't bad at math, it's just really good at "Girl Math".

p.s. this is a reference to a recent meme - https://www.distractify.com/p/girl-math-examples-memes


Another day, another post about Large Language Models failing at math.


Some of these make it feel like Eliza was more accurate.


ChatGPT 3.5 calculated 38 legs for me:

“Assuming elephants typically have four legs each, 10 elephants would have 40 legs in total. If two of them are legless, the total number of legs would be 40 - 2 = 38 legs.”



