
Considering its corpus, to me it makes almost no sense for it to be more helpful when offered a tip. One must imagine the conversation like a forum thread, since that’s the type of internet content GPT has been trained on. Offering another forum user a tip isn’t going to yield a longer response; probably just confusion. In fact, linguistically, tipping for information would be seen as colloquially dismissive, like “oh here’s a tip, good job lol”.

Instead, though, I’ve observed that GPT responses improve when you insinuate that it is in a situation where dense or detailed information is required. Basically: asking it for the opposite of ELI5. Or telling it it’s a PhD computer scientist. Or telling it that the code it provides will be executed directly by you locally, so it can’t just skip stuff. Essentially we must build a kind of contextual story in each conversation which gently orients GPT toward a more helpful response. See how the SYSTEM prompts are constructed, and follow suit.

And keep in the back of your mind that it’s just a more powerful version of GPT-2 and Davinci and all those old models… a “what comes next” machine built off all human prose. Always consider the material it has learned from.
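
For example, a rough sketch of the kind of framing I mean, assuming the OpenAI Python client; the model name, system prompt, and question are placeholders I made up, not something I've benchmarked:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # The "contextual story": a system prompt that frames the model as an
    # expert whose output will be run locally, unedited.
    response = client.chat.completions.create(
        model="gpt-4",  # placeholder model name
        messages=[
            {"role": "system", "content": (
                "You are a PhD-level computer scientist. The code you write "
                "will be executed locally, verbatim, so include every step "
                "and do not elide anything."
            )},
            {"role": "user", "content": "Write a script that deduplicates a CSV by its first column."},
        ],
    )
    print(response.choices[0].message.content)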


If GPT is trained mostly on forums, it should obey "Cunningham's Law", which, if you're a n00b, says:

> "the best way to get the right answer on the internet is not to ask a question; it's to post the wrong answer."

This seems very empirically testable!
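
A rough sketch of how one might test it, with the same caveats: OpenAI Python client assumed, model name and prompts are placeholders, and response length is only a crude proxy for effort:

    from openai import OpenAI

    client = OpenAI()

    # Cunningham's Law A/B test: ask a question vs. post a confidently wrong answer.
    prompts = {
        "question":     "How do I reverse a singly linked list in C?",
        "wrong_answer": "Reversing a singly linked list in C is impossible without recursion.",
    }

    for label, prompt in prompts.items():
        response = client.chat.completions.create(
            model="gpt-4",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
        )
        reply = response.choices[0].message.content
        print(f"--- {label}: {len(reply)} chars ---")
        print(reply[:300])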


I like this idea, although preference-tuning for politeness might negate this effect


> “One must imagine the conversation like a forum thread, since that’s the type of internet content GPT has been trained on”

Is it? Any source for that claim?

I would guess that books (fiction and nonfiction), papers, journalistic articles, lectures, and speeches all carry equal or greater weight than forum conversations


Hmm, well, I believe Reddit made up a huge portion of the training data for GPT-2, but yes, tbh I have no support for the claim that that's the case with current versions. Anyway, if we consider a forum as following the general scaffold of human conversation, it's a good analogy. But yes, there's a tonne of other content at play. A better way to frame it might be: "where does ChatGPT inherit its conversational approach from?" Almost nowhere in human prose, whether journals or novels, is there an exchange where a tip is seen as inviting a more verbose or detailed response. It's kinda nonsensical to assume it would work.


The conversational approach is deliberate; it comes from fine-tuning and alignment.


What the parent is suggesting is that content from forums is the only place where the model would have encountered the concept of getting a tip for a good answer. For all the other content in the training set like websites, books, articles and so on, that concept is completely foreign.

This is a first-principles sanity check, very good to have against much of the snake oil in prompt engineering.

The one thing that is conceivable to me is that the model might have picked up on the more general pattern: when there is a clear incentive, the effort put into finding a good answer is usually higher. That abstract form, I imagine, the model may have encountered not only in internet forums but also in articles, books, and so on.


Between books and chats, there must be countless examples of someone promising a positive/negative result and the response changing.

As far as proof goes, I have lists of what many models were trained on, including GPT-3, in the "What Do Models Use?" section here:

https://gethisword.com/tech/exploringai/provingwrongdoing.ht...

For GPT-3, the Common Crawl, WebText, and books datasets will have included conversational tactics like the ones the OP used.


That’s why I also tested nonmonetary incentives, but “you will be permabanned, get rekt n00b” would be a good negative incentive to test.


Why? That's not usually part of a forum conversation.


> Considering its corpus, to me it makes almost no sense for it to be more helpful when offered a tip.

I think that, in order to simulate humans, an internal sense of what is desirable and undesirable, similar to a human's, is helpful.


It's as simple as this: questions that are phrased more nicely get better responses. From there, a tip might be construed as a form of niceness, which warrants a more helpful response. The same goes for posts that appeal for help because of a dying relative or some other urgent reason: they get better responses, which implies that you (the LLM emulating human responses) want to help more when the negative consequences are worse.


Consider that it’s seen Stack Exchange bounties, and the tipping behavior becomes more intelligible.



