

moffkalast on Dec 30, 2024, on: Performance of LLMs on Advent of Code 2024

At first I was like "What is this jerpint model that's beating the competition so soundly?" then it hit me, lol. Anyhow, this is like night and day compared to last year, and it's impressive that Sonnet is now apparently 50% as good as a professional human at this sort of thing.

zkry on Dec 31, 2024

I don't think comparing star counts is a good measure, though. With AoC, 90% of the effort and difficulty goes into the harder problems towards the end, and the bulk of Sonnet's stars came from the easy problems at the beginning.

moffkalast on Dec 31, 2024

Ah yeah, that's true, the difficulty curve is not very linear.
