I was curious, so I just tried using Copilot on the first 10. Here are the results:

#1- Got it after the first hint

#2- Didn't get it

#3- Didn't get it

#4- Correct first try

#5- Correct after the first hint

#6- Correct first try

#7- Correct first try

#8- Correct after the first hint

#9- Didn't get it

#10- Didn't get it


From your list, it has solved simple matrix multiplication, LSD radix sort, and pointer padding, all of which appear many, many times in its training set.
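
To make the flavor concrete: the puzzles appear to be small C programs with one planted bug each. Here's a hypothetical sketch in the same spirit (not one of the actual puzzles), a naive matrix multiply where the inner loop indexes the wrong operand:

  /* Hypothetical puzzle in the site's style, not an actual one:
     naive matrix multiply with a planted indexing bug. */
  #include <stddef.h>

  void matmul(const double *a, const double *b, double *c, size_t n) {
      for (size_t i = 0; i < n; i++)
          for (size_t j = 0; j < n; j++) {
              double sum = 0.0;
              for (size_t k = 0; k < n; k++)
                  sum += a[i * n + k] * b[j * n + k]; /* bug: should be b[k * n + j] */
              c[i * n + j] = sum;
          }
  }

That kind of indexing slip shows up in countless tutorials and repositories, which is presumably why a model trained on public code handles it well.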

I'm surprised it can fix the two prediction compressor bugs, even with a hint... That shouldn't be in the training set. But the solutions to those puzzles did appear on the front page of Hacker News a few weeks ago (https://news.ycombinator.com/item?id=33396037), so they may have been uploaded to GitHub.

Can you paste the Correct! message (as evidence of solving it) and do more than just the first 10? Just list the ones it can solve. Thanks, I appreciate it.


Hey, cool website. Thanks.

(It’s fair to throw down the gauntlet like you’re doing. You’re right that it’s a nice challenge, and that AI could solve or assist with at least one of those bugs. The trouble is that very few people have access to the AI, and even fewer have the skills to write custom tooling on top of it. The author is probably the only one who could even attempt your challenge. Hopefully that will change within a couple more years.)


Lots of people here have access to Microsoft Copilot or GPT3.

People with access to these models can demonstrate how the system performs on code THAT WASN'T IN THE TRAINING SET by solving a few of these puzzles.

The reality is that all (?) the amazing demonstrations involve code very similar or identical to what appeared in the training set.


"It only works well on inputs that are similar to what appeared in its training set" seems like a strange criticism to make about an ML project, no?


There are people who believe this is real AI, not just aggregation and interpolation. They really believe the software understands code generally.


I don't think many people here think this is true AI.


Who cares, really?

There are people who believe in god, too.


From my understanding, your website doesn’t actually run the user code to see if it fixes the bug. Doesn’t that mean the user also has to guess how you fixed the bug?


The code you submit is compiled, linked, and executed against a suite of tests.

Any functionally correct solution is accepted.

If the site rejected your code, it's because your code was wrong (sorry).
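
To illustrate what that can look like (the real harness isn't public, so the entry point and tests below are assumptions made for the sake of the sketch), a driver might link against the submission and print the Correct! message only if every test passes:

  /* Hypothetical test driver; the site's real harness isn't shown, so the
     function name and test values here are illustrative assumptions.
     Build: cc driver.c submission.c -lm && ./a.out */
  #include <math.h>
  #include <stdio.h>
  #include <stdlib.h>

  /* Entry point the submission is assumed to provide. */
  void matmul(const double *a, const double *b, double *c, size_t n);

  int main(void) {
      const double a[4]    = {1, 2, 3, 4};
      const double b[4]    = {5, 6, 7, 8};
      const double want[4] = {19, 22, 43, 50}; /* known-good 2x2 product */
      double got[4];

      matmul(a, b, got, 2);
      for (int i = 0; i < 4; i++)
          if (fabs(got[i] - want[i]) > 1e-9) {
              fprintf(stderr, "FAIL at %d: got %g, want %g\n", i, got[i], want[i]);
              return EXIT_FAILURE;
          }
      puts("Correct!");
      return EXIT_SUCCESS;
  }

Because only observable behavior is compared, any functionally correct solution passes, no matter how it differs from the reference fix.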


Interesting, it runs so fast that I thought it was doing a simple string match.


No, it's just a well-written piece of software.

Thanks :)


The question is polite, but the intention behind it isn't. It's a baited rhetorical question meant to throw shade at the whole thing.


I think we should question this, and give it a real test.

Let's see how Copilot does on code that wasn't in the training set.

No problem, right?


Sure, but based on your previous comments and overall stance on the matter, don't be surprised when most people have the opinion I expressed about your question.


Please, if you have an "artificial intelligence" that can write and understand code, I'm sure it can fix some tiny bugs in a little code that wasn't in the training set.

Why is this so contentious?

Surely the "AI" can withstand a little scrutiny.


> Why is this so contentious?

Please follow the discussion revolving around Copilot if you really have no clue.


How did you develop these tests?



