Hacker News | blenderob's comments

> I understand what the author means, but I think that in any human-2-human interaction, we are all entitled to at least basic courtesy.

Correct. The article does not disagree with you.


(edit: I totally misunderstood the parent comment and wrote this reply. I've apologized for it in a comment below. I could delete this comment, but I am leaving it here so that others don't get confused when they see the replies below it.)

> or that anyone should have more power just because they are active in the project.

So you are saying that although I create a project to solve my own problems, as soon as I make it open source (so that others can also benefit) my power over the project becomes equal to the power that every random person on earth has over it?

If open sourcing my project reduces my power over it, why would I ever open source anything?

Good thing the open source world does not work like that. When I make my project open source, I still have full power over it: I decide what goes in and what is rejected. I have no reason not to use the powers I have over the project.

If it ever became the way you describe, where my powers as the creator of an open source project were equal to the powers of every random user, I'd stop making anything open source.


> If open sourcing my project reduces my power over it, why would I ever open source anything?

For me, especially when I'm increasing the bus number of something I worked on at work, it comes down to two things. Either I'm hoping that my 'power' will remain the same while the 'power' of the project grows, and the new people will take their share out of the surplus.

Or I want to focus my power elsewhere, and as long as I'm the sole proprietor of this project I will be associated with it to the exclusion of other things. It was having my face pressed against the glass of shiny new things I was iced out of at work that finally taught me the value of sharing. Being indispensable can make you typecast. Deputizing someone has benefits that usually outweigh the costs.


Straw man + slippery slope.

I never said that, or implied it. It would be dumb to say that someone who creates an open source project is at the mercy of the people who use it.

But many people have had the experience of dealing with loud voices in open source communities, and sometimes abusive voices. Or people who are pushing/promoting things that they want but that are actually contrary to the goals and well-being of the project.

As I stated, that power is a potential route to abuse. This is absolutely true whether the person is a maintainer, contributor, or creator.

If you create an open source project, of course you have absolute power over it... to suggest otherwise is foolish.

And we have seen projects that fail or collapse due to lack of leadership, corrosive culture, myopia, or burnout. That is inevitable.

My point is that we need to be realistic about these things. This goes back to the original post that "open source is not about you". Users aren't owed anything by a project or its creator. At the same time, creators/maintainers have a relationship with the community.

How they choose to manage that relationship is their choice... but we should be aware and honest about what that means and how it impacts the project (and the community).


Yes, totally fair. I totally misunderstood your original comment. My bad and my apologies!

Wow - I really appreciate you taking the time to look at it again. My original comment was written quickly, and probably nowhere near as clear as it could have been.

I respect your willingness to modify your original stance upon closer examination. Non-ironic hat tip.


The goalposts have been on wheels basically since the field was born. Look up "AI effect". I've stopped caring what HN comments have to say about whether something is or isn't AI. If it's useful to me, I'm gonna use it.

> I.e. a solution is known, but is guaranteed to not be in the training set for any AI.

I'm not a mathematician and obviously you guys understand this better than I do. One thing I can't understand is how they're going to judge whether a solution was AI-written or human-written. I mean, a human could also potentially solve the problem and pass it off as AI. You might ask why a human would want to do that. Normal mathematicians might not, but mathematicians hired by Anthropic or OpenAI might want to do it to pass it off as an AI achievement.


Well, I think the paper answers that too. These problems are intended as a tool for honest researchers to use for exploring the capabilities of current AI models, in a reasonably fair way. They're specifically not intended as a rigorous benchmark to be treated adversarially.

Of course a math expert could solve the problems themselves and lie by saying that an AI model did it. In the same way, somebody with enough money could secretly film a movie and then claim that it was made by AI. That's outside the scope of what this paper is trying to address.

The point is not to score models based on how many of the problems they can solve. The point is to look at the models' responses and see how good they are at tackling the problem. And that's why the authors say that ideally, people solving these problems with AI would post complete chat transcripts (or the equivalent) so that readers can assess how much of the intellectual contribution actually came from AI.


February 13 seems right to me. I mean, it's not like LLMs need to manually write out a 10-page proof. But a longer deadline would give human mathematicians time to solve the problem and write out a proof. A tight deadline advantages the LLM and disadvantages humans, which should be the goal if we want to see whether LLMs are able to solve these.

Can someone explain how this would work?

> the answers are known to the authors of the questions but will remain encrypted for a short time.

OK. But humans may be able to solve the problems too. What prevents Anthropic or OpenAI from hiring mathematicians, having them write the proof, and passing it off as LLM-written? I'm not saying that's what they'll do. But shouldn't the paper say something about how they're going to validate that this doesn't happen?

Honest question here, not trying to start a flame war. I'm honestly confused about how this is going to test what it wants to test. Or maybe I'm just plain confused. Can someone help me understand this?


This is not a benchmark. They just want to give people the opportunity to try their hand at solving novel questions with AI and see what happens. If an AI company pulls a solution out of their hat that cannot be replicated with the products they make available to ordinary people, that's hardly worth bragging about and in any case it's not the point of the exercise.

Hey, sorry, totally out of context but I've always wanted to ask about the username. I keep reading it as "yoruba" in my mind. What does it mean, if I'm not being indiscreet?

You're not the first to have wondered: https://news.ycombinator.com/item?id=20730027

Well, now that I read that comment I remembered having read it before. My mind is going.

They could solve the problems and train the next models with the answers, so future models could "solve" these.

The authors mention that before publication they tested these questions on Gemini and GPT, so the questions have already been available to the two biggest players; they have a head start.

Looks like very sloppy research.

I don't think it's that serious... it's an interesting experiment that assumes people will take it in good faith. The idea, of course, is also to attach the transcript log and how you prompted the LLM, so that anyone can attempt to reproduce it if they wish.

If you want to do this rigorously, you should run it as a competition like the guys at the AI-MO Prize are doing on Kaggle.

That way you get all the necessary data.

I still think this is bro science.


If this were a competition, some people would try hard to win it. But the goal here is exploration, not exploitation. Once the answers are revealed, it's unlikely a winner will be identified, but a bunch of mathematicians who tried prompting AI with the questions might learn something from the exercise.

But everything has been explored in other datasets already.

If only a bunch of mathematicians learn something, why are so many people talking about this, and why is the NY Times writing about it?

This is the attention economy at its worst.


That was exactly my first thought as well. All those exercises are pointless, and people don't seem to understand it; it's baffling.

Even if it's not Anthropic or OpenAI paying for the solutions, maybe someone will solve them "for fun" because the paper got popular, and post the solutions online.

It's a futile exercise.


Nothing prevents them, and they are already doing that. I work in this field and one can be sure that now, because of the notoriety this preprint got, the questions will be solved soon.

It's possible but unlikely given the short timeline, the diverse questions that would require multiple mathematicians, and the low stakes. Also, they've already run preliminary tests.

> It's possible but unlikely given the short timeline

Yep. "possible but unlikely" was my take too. As another person commented, this isn't really a benchmark, and as long as that's clear, it seems fair. My only fear is that some submissions may be AI-assisted rather than fully AI-generated, with crucial insights coming from experienced mathematicians. That's still a real achievement even if it's human + AI collaboration. But I fear that the nuance would be lost on news media and they'll publish news about the dawn of fully autonomous math reasoning.


Because LLMs are deterministic, they could provide the model files, prompt, and seed used.
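
Taking that claim at face value, a rough sketch of what "model files, prompt, and seed" could mean in practice (purely illustrative; the model name and prompt below are placeholders, and this assumes an open-weights model run locally with Hugging Face transformers):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL = "some-org/open-weights-7b"   # placeholder: the exact released weights
    PROMPT = "..."                       # placeholder: the exact prompt used for the submission

    torch.manual_seed(0)                                  # fixed seed (only matters if sampling)
    tok = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForCausalLM.from_pretrained(MODEL)

    inputs = tok(PROMPT, return_tensors="pt")
    out = model.generate(**inputs,
                         do_sample=False,                 # greedy decoding: no sampling randomness
                         max_new_tokens=2048)
    print(tok.decode(out[0], skip_special_tokens=True))

Of course, hosted frontier models don't release their weights, and batching and kernel differences can still perturb outputs, so this kind of replay only really works for open models run locally.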

Has people's ability to read messages and formulate sensible replies been going down of late? I see this kind of meaningless reply more and more often these days.


Yes, there's a global intelligence crisis, due to TikTok, Instagram, et al.


Meaningless? It's a clear question.


You're accusing him of having a problem with it, which his comment does not imply.


You can just email hn@ycombinator.com to get help. They can reset your password if there's some way for them to verify that you were the owner of the account.


> By contrast, engineers mostly require a laptop and company hoodie.

Alas, gone are the days when engineers too required specialized equipment, like a desktop computer on the desk that you couldn't take with you. Every evening, you left it at the office and went home to live a 100% home life. Alas, gone are those days.


I was a contractor for a FAANG. My immediate cubicle neighbor liked to work out on his lunch break, and to hang his sweaty, smelly gym things on the framework of his desk to air dry when he returned. I would've KILLED for a laptop I could take to the complimentary office café so I could get something done without holding my nose. If I had been a full employee, I could have just asked for one. Alas, as a mere contractor I was something less than a person, so I had to remain tethered to my desktop per company policy.


I mean, if killing was an option, getting rid of the cubicle neighbor would have solved the problem as well.


Maybe for some. I've worked from home for 15 years and a huge thing that I've learned is that I have to have a hard physical boundary. My work laptop stays at my desk unless I'm on call and actively fighting a fire. When I want to use my desk for non-work things the work laptop gets put away.


Crunch time in those days sucked. I remember mandatory nights and weekends and the managers ordering in pizza for everyone.


> I also do not have a robots.txt so google doesnt index.

That doesn't sound right. I don't have a robots.txt either, but Google indexes everything for me.
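
For reference, a robots.txt only matters if you want to keep crawlers out; with no file at all, the default is that everything is crawlable. Keeping Google out would need an explicit rule, something like this (illustrative example):

    # robots.txt at the site root -- tells Googlebot not to crawl anything
    User-agent: Googlebot
    Disallow: /

And even then, robots.txt controls crawling, not indexing: a URL can still show up in the index via external links unless you also use a noindex directive.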


https://news.ycombinator.com/item?id=46681454

I think this is a recent change.


All the comments there seem to suggest that there has been no change and that robots.txt isn't required.

