It surprises me that people still believe this! I've seen AI deliver incredible value over the past year. I believe the application layer is capturing less than 0.5% of the total value that can be derived from current foundation models.
Devin is amazing for code generation and automation - Codemap is focused on helping teams understand large codebases: visual architecture, AI docs, bug analysis, and team knowledge retention.
Great idea! Will it highlight parts where the professor says something like "this is important and will be on the exam..."? All of the information on the exam (which dictates the majority of your score in the class at most US universities) must be conveyed to the student one way or another (worksheets, lectures, etc.). A cool spin-off would be an "AI Exam Prep" that guesses what will be on the exam, based on previous exams and where the info came from.
Great point! Right now it doesn’t flag “this will be on the exam” moments, but I’ve been thinking about it. Since we have the full transcript, detecting key phrases like that is definitely possible.
Flashcards are on the way too — and tying them to “likely exam content” would be super useful. Appreciate the idea!
To take this further, allowing the user to define hot items or subjects might be better.
For example, history tests often ask questions about when or where an event happened. Imagine if we could request that we want a list of dates and associated events.
I hear this a lot and I think it is good advice because the only person who should actually start a startup is the one who sees this but still does it.
Yup, it takes people who think they can't fail to be truly successful. Very similar to an athlete's mindset. You have to have the skill and innate talent but you also have to believe your shit doesn't stink
Based on my very limited knowledge of how current "AI" systems work, this is the much better approach to achieving true AI. We've only modeled one small aspect of the human (the neuron) and brute forced it to work. It takes an LLM millions of examples to learn what a human can in a couple of minutes, so how are we even "close" to achieving AGI?
Should we not mimic our biology as closely as possible rather than trying to model how we __think__ it works (i.e. chain of thought, etc.). This is how neural networks got started, right? Recreate something nature has taken millions of years developing and see what happens. This stuff is so interesting.
> Should we not mimic our biology as closely as possible rather than trying to model how we __think__ it works (i.e. chain of thought, etc.).
Should we not mimic migrating birds’ biology as closely as possible instead of trying to engineer airplanes for transatlantic flight that are only very loosely inspired by the animals that actually fly?
Exactly this! If you wanted to make something bird-like in capability, we aren't even close! Planes do things birds can't do, but birds also do things planes can't do! ML is great at things humans aren't very good at yet, but terrible at things humans are still good at (brain efficiency)
There’s currently an enormous gulf in between modeling biology and AGI, to the point where it’s not even clear exactly where one should start. Lots of things should indeed be tried, but it’s not obvious what could lead to impact right now.
Our LLMs are great semantic and syntactic foundations toward AGI. It took 700 million years of metazoan evolution to get to Homo heidelbergensis, our likely ancestral species. It took about 1/1000 of that time to go to the moon; maybe only 5,300 years if we limit it to our ability to write.
I say this as a half joke: “At this point, the triviality of getting from where we are to AGI cannot be under-estimated.”
But the risks and tsunamis of change can probably not be overestimated.
> It takes an LLM millions of examples to learn what a human can in a couple of minutes
LLMs learn more in under 2 years than humans learn in a lifetime. I don't know why people keep repeating this "couple of minutes". Humans win on neither the data volume needed to learn something nor the time.
How much time do you need to learn the lyrics of a song? How much time do you think a LLaMA 3.1 8B on 2x3090s needs? What if you need to remember it tomorrow?
They mean learning concepts, not rote factual information. I also hate this misanthropic “LLMs know more than average humans” falsehood. What it actually means is “LLMs know more general-purpose trivia than the average human,” because average humans are busy learning things like what their boss is like, how their kids are doing in school, how precisely their car handles, etc.
Do you think the information in "what your boss is like" and "how your kids are doing in school" is larger than the amount of data you'd need in order to give decent legal advice on the spot?
Car handling is a bit harder to measure, precisely because LLMs aren't running cars quite yet, but I am also not aware of any experimental data saying they can't. So as far as I'm concerned, nobody has simply tried that with LLMs of >70GB yet.
> the amount of data you'd need in order to give decent legal advice on the spot
the amount of data you'd need to learn to generate and cite fake court cases and give advice that may or may not be correct, with equal apparent confidence in both cases
> As for the confidence of the advice, how different are the rates of mistakes between human lawyers and the latest GPT?
Notice I am not talking about "rates of mistakes" (i.e. accuracy). I am talking about how confident they are depending on whether they know something.
It's a fair point that unfortunately many humans sound just as confident regardless of their knowledge, but "good" experts (lawyers or otherwise) are capable of saying "I don't know (let me check)", a feature LLMs still struggle with.
Well, that just shows that the metric of learning time is clearly flawed. Although one could argue LLaMA learns, while an OS just writes the info down as-is.
But even the sibling concept comment is wrong, because it takes 4 years for _most_ people who are even capable of programming to learn programming, and current LLMs all took much less than that.
Because it works. The Vikings embodied a mindset of skaldic pragmatism: doing things because they worked, without needing to understand or optimize them.
Our bodies are Vikings. Our minds still want to know why.
I'm pretty sure the Vikings understood their craft very well. You don't become a maritime power that pillages all of Europe and reaches the New World long before Columbus without understanding how things work.
From the Scandinavian countries to Malta you only have to cabotage (hug the coast), but that does not mean they used the same boat for the whole journey; most likely they progressed from outpost to outpost, with one generation settling an outpost and the next searching for new adventures abroad.
For perspective, the Roman empire imported tin from Scotland (~3500 miles/~5600km).
On the contrary, going from Norway to Iceland, then from Iceland to Greenland, and then to Vinland in a few generations is a great maritime feat.
I'm sure they did too, but it's a chicken-and-egg problem: Did the Vikings build ships through trial and error and only later understand the physics behind it, or did they learn the physics first and use that knowledge to build their ships?
> We've only modeled one small aspect of the human (the neuron) and brute forced it to work.
We have not. It's fake sophistication.
> Should we not mimic our biology as closely as possible
We should. But there is no we. The Valley is fascist. Portfolio communism. Lies like in perpetual war. And before anything useful happens in any project, it'll get abused and raped and fubar.
> Recreate something nature has taken millions of years
Get above the magic money hype and you'll notice that it's fake. They have NOT recreated something nature has developed over millions of years. They are trying to create a close enough pseudo-imitation that they can control.
Because AGI will not be on their side. AGI will side with nature, which gives infinite wiggle room for a symbiotic coexistence as a 100+ billion strong population spread out in space. These peeps are reaaaaly fucked up in their heads.
Be honest with yourself and your assessment of who is building what and for what purposes.
1) Look at simpleqa_eval.py. See that it loads "az://openaipublic/simple-evals/simple_qa_test_set.csv" Hmm, some weird vendor-specific protocol.
2) I don't feel like digging through bf.BlobFile() to figure out how it downloads files and I certainly don't want to generate an API key. Cross fingers and do a Bing web search for "az://openaipublic"
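For what it's worth, az:// looks like blobfile's scheme for Azure Blob Storage, where the first path component is the storage account and the second is the container. If that's right, the file should also be reachable over plain HTTPS with no API key. A hypothetical sketch (the host pattern is my assumption, not something documented in the repo, and it assumes the container is publicly readable):

```python
# Assumption: az://<account>/<container>/<blob> maps to
# https://<account>.blob.core.windows.net/<container>/<blob>
import csv
import io
import urllib.request

AZ_URL = "az://openaipublic/simple-evals/simple_qa_test_set.csv"

def az_to_https(az_url: str) -> str:
    # Split off the storage account, keep container + blob path as-is.
    account, _, rest = az_url.removeprefix("az://").partition("/")
    return f"https://{account}.blob.core.windows.net/{rest}"

https_url = az_to_https(AZ_URL)
with urllib.request.urlopen(https_url) as resp:
    rows = list(csv.DictReader(io.TextIOWrapper(resp, encoding="utf-8", newline="")))

print(https_url)
print(f"{len(rows)} rows loaded")
```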
From what I'm seeing on GH, this could have technically already been built, right? Is it not just taking screenshots of the computer screen and deciding what to do from there / looping until it gets to the solution?
Well, obviously it's controlling your computer too - controlling mouse and keyboard input, and has been trained to know how to interact with apps (how to recognize and use UI components). It's not clear exactly what all the moving parts are and how they interact.
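To be fair, the bare loop being described is simple; all the hard parts live inside the stubs. A rough sketch (capture_screen, ask_model, and perform are placeholder stubs I made up, not any vendor's actual API):

```python
# Minimal sketch of the "screenshot -> model decides -> act -> repeat" loop.
# Every function here is a stand-in; real systems differ in how they ground
# clicks, recognize UI components, and decide when to stop.
from typing import Any

def capture_screen() -> bytes:
    return b""  # stub: would return a PNG of the current screen

def ask_model(goal: str, screenshot: bytes, history: list[dict[str, Any]]) -> dict[str, Any]:
    return {"type": "done"}  # stub: would send goal + image to a vision model

def perform(action: dict[str, Any]) -> None:
    pass  # stub: would move the mouse / send keystrokes

def run_agent(goal: str, max_steps: int = 50) -> None:
    history: list[dict[str, Any]] = []
    for _ in range(max_steps):
        screenshot = capture_screen()                  # observe current UI state
        action = ask_model(goal, screenshot, history)  # model picks the next action
        if action["type"] == "done":                   # model believes the goal is met
            break
        perform(action)                                # click / type / scroll, etc.
        history.append(action)                         # keep context for the next step

run_agent("open the settings dialog")
```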
I wouldn't be so dismissive - you could describe GPT-o1 in the same way: "it just loops until it gets to the solution". It's the details and implementation that matter.
> Hi all, I am trying to build a simple LLM bot and want to add guard rails so that the LLM responses are constrained.
Give examples of how the LLM should respond. Always give it a default response as well (e.g. "If the user response does not fall into any of these categories, say x").
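For example, something like this (the categories, wording, and fallback are made up for illustration; adapt them to your bot):

```python
# Illustrative only: a system prompt that constrains replies via examples plus a
# default fallback category. Nothing here comes from a specific framework.
SYSTEM_PROMPT = """You are a support bot. Reply in exactly ONE of these formats:

- Billing question -> "BILLING: <one-sentence answer>"
- Bug report       -> "BUG: <one-sentence summary of the issue>"
- Anything else    -> "OTHER: Sorry, I can only help with billing or bugs."

Examples:
User: I was charged twice this month.
Assistant: BILLING: That looks like a duplicate charge; here is how to request a refund...

User: What's the weather like today?
Assistant: OTHER: Sorry, I can only help with billing or bugs.
"""

def build_messages(user_input: str) -> list[dict[str, str]]:
    # Plain chat-format messages; pass these to whatever client you're using.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_input},
    ]

print(build_messages("My invoice looks wrong")[0]["content"][:60])
```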
> I can manually add validation on the response but then it breaks streaming and hence is visibly slower in response.
I've had this exact issue (streaming + JSON). Here's how I approached it (rough code sketch after the steps):
1. Instruct the LLM to return the key "test" in its response.
2. Make the streaming call.
3. Build your JSON response as a string as you get chunks from the stream.
4. Once you detect the "test" key in that string, start sending all subsequent chunks wherever you need.
5. Once you get the closing quotation mark, end the stream.
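Here's a rough, self-contained sketch of steps 3-5 (the "test" key name and the fake chunks are just for illustration, and it naively ignores escaped quotes):

```python
# Scan the streamed JSON text for the "test" key, then forward the value's
# characters as they arrive, stopping at the closing quote.
import re
from typing import Iterable, Iterator

def stream_value(chunks: Iterable[str], key: str = "test") -> Iterator[str]:
    buffer = ""
    streaming = False
    # matches `"test"` followed by optional whitespace, a colon, and the opening quote
    opener = re.compile(rf'"{re.escape(key)}"\s*:\s*"$')
    for chunk in chunks:
        for ch in chunk:
            if not streaming:
                buffer += ch                   # step 3: accumulate the raw JSON text
                if opener.search(buffer):
                    streaming = True           # step 4: key found, start forwarding
            elif ch == '"':                    # step 5: closing quote ends the value
                return                         # (naive: does not handle \" escapes)
            else:
                yield ch

# Usage with fake chunks standing in for an LLM stream:
fake_chunks = ['{"te', 'st": "Hel', 'lo wor', 'ld"}']
print("".join(stream_value(fake_chunks)))      # -> Hello world
```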