I was curious, so I just tried using Copilot on the first 10. Here are the results:

#1- Got it after the first hint

#2- Didn't get it

#3- Didn't get it

#4- Correct first try

#5- Correct after the first hint

#6- Correct first try

#7- Correct first try

#8- Correct after the first hint

#9- Didn't get it

#10- Didn't get it


From your list, it has solved simple matrix multiplication, LSD radix sort, and pointer padding, all of which appear many, many times in its training set.
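
To make the flavor concrete: the puzzles appear to be small C programs with one planted bug each. Here's a hypothetical sketch in the same spirit (not one of the actual puzzles), a naive matrix multiply where the inner loop indexes the wrong operand:

  /* Hypothetical puzzle in the site's style, not an actual one:
     naive matrix multiply with a planted indexing bug. */
  #include <stddef.h>

  void matmul(const double *a, const double *b, double *c, size_t n) {
      for (size_t i = 0; i < n; i++)
          for (size_t j = 0; j < n; j++) {
              double sum = 0.0;
              for (size_t k = 0; k < n; k++)
                  sum += a[i * n + k] * b[j * n + k]; /* bug: should be b[k * n + j] */
              c[i * n + j] = sum;
          }
  }

That kind of indexing slip shows up in countless tutorials and repositories, which is presumably why a model trained on public code handles it well.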

I'm surprised it can fix the two prediction compressor bugs, even with a hint... That shouldn't be in the training set. But the solutions to those puzzles did appear on the front page of Hacker News a few weeks ago (https://news.ycombinator.com/item?id=33396037), so they may have been uploaded to GitHub.

Can you paste the Correct! message (as evidence of solving it) and do more than just the first 10? Just list the ones it can solve. Thanks, I appreciate it.


Hey, cool website. Thanks.

(It’s fair to throw down the gauntlet like you’re doing. You’re right that it’s a nice challenge, and that AI could solve or assist with at least one of those bugs. The trouble is that very few people have access to the AI, and even fewer have the skills to write custom tooling on top of it. The author is probably the only one who could even attempt your challenge. Hopefully that will change within a couple more years.)


Lots of people here have access to Microsoft Copilot or GPT3.

People with access to these models can demonstrate how the system performs on code THAT WASN'T IN THE TRAINING SET by solving a few of these puzzles.

The reality is that all (?) the amazing demonstrations involve code very similar or identical to what appeared in the training set.


"It only works well on inputs that are similar to what appeared in its training set" seems like a strange criticism to make about an ML project, no?


There are people who believe this is real AI, not just aggregation and interpolation. They really believe the software understands code generally.


I don't think many people here think this is true AI.


Who cares, really?

There are people who believe in god, too.


From my understanding, your website doesn’t actually run the user code to see if it fixes the bug. Doesn’t that mean the user also has to guess how you fixed the bug?


The code you submit is compiled, linked, and executed against a suite of tests.

Any functionally correct solution is accepted.

If the site rejected your code, it's because your code was wrong (sorry).
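
To illustrate what that can look like (the real harness isn't public, so the entry point and tests below are assumptions made for the sake of the sketch), a driver might link against the submission and print the Correct! message only if every test passes:

  /* Hypothetical test driver; the site's real harness isn't shown, so the
     function name and test values here are illustrative assumptions.
     Build: cc driver.c submission.c -lm && ./a.out */
  #include <math.h>
  #include <stdio.h>
  #include <stdlib.h>

  /* Entry point the submission is assumed to provide. */
  void matmul(const double *a, const double *b, double *c, size_t n);

  int main(void) {
      const double a[4]    = {1, 2, 3, 4};
      const double b[4]    = {5, 6, 7, 8};
      const double want[4] = {19, 22, 43, 50}; /* known-good 2x2 product */
      double got[4];

      matmul(a, b, got, 2);
      for (int i = 0; i < 4; i++)
          if (fabs(got[i] - want[i]) > 1e-9) {
              fprintf(stderr, "FAIL at %d: got %g, want %g\n", i, got[i], want[i]);
              return EXIT_FAILURE;
          }
      puts("Correct!");
      return EXIT_SUCCESS;
  }

Because only observable behavior is compared, any functionally correct solution passes, no matter how it differs from the reference fix.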


Interesting, it runs so fast that I thought it was doing a simple string match.


No, it's just a well-written piece of software.

Thanks :)


The question is polite, but the intention behind it isn't. It's a baited rhetorical question meant to throw shade at the whole thing.


I think we should question this, and give it a real test.

Let's see how Copilot does on code that wasn't in the training set.

No problem, right?


Sure, but based on your previous comments and overall stance on the matter, don't be surprised when most people have the opinion I expressed about your question.


Please, if you have an "artificial intelligence" that can write and understand code, I'm sure it can fix some tiny bugs in a little code that wasn't in the training set.

Why is this so contentious?

Surely the "AI" can withstand a little scrutiny.


> Why is this so contentious?

Please follow the discussion revolving around Copilot if you really have no clue.


How did you develop these tests?



