It surprises me that people still believe this! I've seen AI deliver incredible value over the past year. I believe the application layer is capturing less than 0.5% of the total value that can be derived from current foundation models.
Devin is amazing for code generation and automation - Codemap is focused on helping teams understand large codebases: visual architecture, AI docs, bug analysis, and team knowledge retention.
Great idea! Will it highlight parts where the professor says something like "this is important and will be on the exam..."? All of the information on the exam (which dictates the majority of your score in the class at most US universities) must be conveyed to the student one way or another (worksheets, lectures, etc.). A cool spin-off would be an "AI Exam Prep" that guesses what will be on the exam, based on previous exams and where the info came from.
Great point! Right now it doesn’t flag “this will be on the exam” moments, but I’ve been thinking about it. Since we have the full transcript, detecting key phrases like that is definitely possible.
Flashcards are on the way too — and tying them to “likely exam content” would be super useful. Appreciate the idea!
To take this further, allowing the user to define hot items or subjects might be better.
For example, history tests often ask questions about when or where an event happened. Imagine if we could request that we want a list of dates and associated events.
I hear this a lot and I think it is good advice because the only person who should actually start a startup is the one who sees this but still does it.
Yup, it takes people who think they can't fail to be truly successful. Very similar to an athlete's mindset. You have to have the skill and innate talent but you also have to believe your shit doesn't stink
Based on my very limited knowledge of how current "AI" systems work, this is the much better approach to achieving true AI. We've only modeled one small aspect of the human (the neuron) and brute forced it to work. It takes an LLM millions of examples to learn what a human can in a couple of minutes, so how are we even "close" to achieving AGI?
Should we not mimic our biology as closely as possible rather than trying to model how we __think__ it works (i.e. chain of thought, etc.). This is how neural networks got started, right? Recreate something nature has taken millions of years developing and see what happens. This stuff is so interesting.
> Should we not mimic our biology as closely as possible rather than trying to model how we __think__ it works (i.e. chain of thought, etc.).
Should we not mimic migrating birds’ biology as closely as possible instead of trying to engineer airplanes for transatlantic flight that are only very loosely inspired by the animals that actually fly?
Exactly this! If you wanted to make something bird-like in capability, we aren't even close! Planes do things birds can't do, but birds also do things planes can't do! ML is great at things humans aren't very good at yet, but terrible at things humans are still good at (brain efficiency)
There’s currently an enormous gulf in between modeling biology and AGI, to the point where it’s not even clear exactly where one should start. Lots of things should indeed be tried, but it’s not obvious what could lead to impact right now.
Our LLMs are great semantic and syntactic foundations toward AGI. It took 700 million years of metazoan evolution to get to Homo heidelbergensis, our likely ancestral species. It took about 1/1000 of that time to go to the moon; maybe only 5,300 years if we limit it to our ability to write.
I say this as a half joke: “At this point, the triviality of getting from where we are to AGI cannot be under-estimated.”
But the risks and tsunamis of change can probably not be overestimated.
> It takes an LLM millions of examples to learn what a human can in a couple of minutes
LLMs learn more in under 2 years than humans learn in a lifetime. I don't know why people keep repeating this "couple of minutes". Humans win on neither the data volume needed to learn something nor the time.
How much time do you need to learn the lyrics of a song? How much time do you think a LLaMA 3.1 8B on 2x3090s needs? What if you need to remember it tomorrow?
They mean learning concepts, not rote factual information. I also hate this misanthropic “LLMs know more than average humans” falsehood. What it actually means is “LLMs know more general-purpose trivia than the average human,” because average humans are busy learning things like what their boss is like, how their kids are doing in school, how precisely their car handles, etc.
Do you think the information in "what your boss is like" and "how your kids are doing in school" is larger than the amount of data you'd need in order to give decent legal advice on the spot?
Car handling is a bit harder to measure, precisely because LLMs aren't running cars quite yet, but I am also not aware of any experimental data saying they can't. So as far as I'm concerned, nobody has simply tried that with LLMs of >70GB yet.
> the amount of data you'd need in order to give decent legal advice on the spot
the amount of data you'd need to learn to generate and cite fake court cases and give advice that may or may not be correct, with equal apparent confidence in both cases
> As for the confidence of the advice, how different are the rates of mistakes between human lawyers and the latest GPT?
Notice I am not talking about "rates of mistakes" (i.e. accuracy). I am talking about how confident they are depending on whether they know something.
It's a fair point that unfortunately many humans sound just as confident regardless of their knowledge, but "good" experts (lawyers or otherwise) are capable of saying "I don't know (let me check)", a feature LLMs still struggle with.
Well, that just shows that the metric of learning time is clearly flawed. Although one could argue LLaMA learns, while an OS just writes the info down as-is.
But even the sibling concept comment is wrong, because it takes 4 years for _most_ people who are even capable of programming to learn programming, and current LLMs all took much less than that.
Because it works. The Vikings embodied a mindset of skaldic pragmatism: doing things because they worked, without needing to understand or optimize them.
Our bodies are Vikings. Our minds still want to know why.
I'm pretty sure the Vikings understood their craft very well. You don't become a maritime power that pillages all of Europe and reaches the New World long before Columbus without understanding how things work.
From the Scandinavian countries to Malta you only have to cabotage (hug the coast), but that does not mean they used the same boat for the whole journey; most likely they progressed from outpost to outpost, with one generation settling an outpost and the next searching for new adventures abroad.
For perspective, the Roman empire imported tin from Scotland (~3500 miles/~5600km).
On the contrary, going from Norway to Iceland, then from Iceland to Greenland, and then to Vinland in a few generations is a great maritime feat.
I'm sure they did too, but it's a chicken-and-egg problem: Did the Vikings build ships through trial and error and only later understand the physics behind it, or did they learn the physics first and use that knowledge to build their ships?
> We've only modeled one small aspect of the human (the neuron) and brute forced it to work.
We have not. It's fake sophistication.
> Should we not mimic our biology as closely as possible
We should. But there is no we. The Valley is fascist. Portfolio communism. Lies like in perpetual war. And before anything useful happens in any project, it'll get abused and raped and fubar.
> Recreate something nature has taken millions of years
Get above the magic money hype and you'll notice that it's fake. They have NOT recreated something nature has developed over millions of years. They are trying to create a close enough pseudo-imitation that they can control.
Because AGI will not be on their side. AGI will side with nature, which gives infinite wiggle room for a symbiotic coexistence as a 100+ billion strong population spread out in space. These peeps are reaaaaly fucked up in their heads.
Be honest with yourself and your assessment of who is building what and for what purposes.
1) Look at simpleqa_eval.py. See that it loads "az://openaipublic/simple-evals/simple_qa_test_set.csv" Hmm, some weird vendor-specific protocol.
2) I don't feel like digging through bf.BlobFile() to figure out how it downloads files and I certainly don't want to generate an API key. Cross fingers and do a Bing web search for "az://openaipublic"
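For what it's worth, az:// looks like blobfile's scheme for Azure Blob Storage, where the first path component is the storage account and the second is the container. If that's right, the file should also be reachable over plain HTTPS with no API key. A hypothetical sketch (the host pattern is my assumption, not something documented in the repo, and it assumes the container is publicly readable):

```python
# Assumption: az://<account>/<container>/<blob> maps to
# https://<account>.blob.core.windows.net/<container>/<blob>
import csv
import io
import urllib.request

AZ_URL = "az://openaipublic/simple-evals/simple_qa_test_set.csv"

def az_to_https(az_url: str) -> str:
    # Split off the storage account, keep container + blob path as-is.
    account, _, rest = az_url.removeprefix("az://").partition("/")
    return f"https://{account}.blob.core.windows.net/{rest}"

https_url = az_to_https(AZ_URL)
with urllib.request.urlopen(https_url) as resp:
    rows = list(csv.DictReader(io.TextIOWrapper(resp, encoding="utf-8", newline="")))

print(https_url)
print(f"{len(rows)} rows loaded")
```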
From what I'm seeing on GH, this could have technically already been built, right? Is it not just taking screenshots of the computer screen and deciding what to do from there / looping until it gets to the solution?
Well, obviously it's controlling your computer too - controlling mouse and keyboard input, and has been trained to know how to interact with apps (how to recognize and use UI components). It's not clear exactly what all the moving parts are and how they interact.
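To be fair, the bare loop being described is simple; all the hard parts live inside the stubs. A rough sketch (capture_screen, ask_model, and perform are placeholder stubs I made up, not any vendor's actual API):

```python
# Minimal sketch of the "screenshot -> model decides -> act -> repeat" loop.
# Every function here is a stand-in; real systems differ in how they ground
# clicks, recognize UI components, and decide when to stop.
from typing import Any

def capture_screen() -> bytes:
    return b""  # stub: would return a PNG of the current screen

def ask_model(goal: str, screenshot: bytes, history: list[dict[str, Any]]) -> dict[str, Any]:
    return {"type": "done"}  # stub: would send goal + image to a vision model

def perform(action: dict[str, Any]) -> None:
    pass  # stub: would move the mouse / send keystrokes

def run_agent(goal: str, max_steps: int = 50) -> None:
    history: list[dict[str, Any]] = []
    for _ in range(max_steps):
        screenshot = capture_screen()                  # observe current UI state
        action = ask_model(goal, screenshot, history)  # model picks the next action
        if action["type"] == "done":                   # model believes the goal is met
            break
        perform(action)                                # click / type / scroll, etc.
        history.append(action)                         # keep context for the next step

run_agent("open the settings dialog")
```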
I wouldn't be so dismissive - you could describe GPT-o1 in the same way: "it just loops until it gets to the solution". It's the details and implementation that matter.
> Hi all, I am trying to build a simple LLM bot and want to add guard rails so that the LLM responses are constrained.
Give examples of how the LLM should respond. Always give it a default response as well (e.g. "If the user response does not fall into any of these categories, say x").
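For example, something like this (the categories, wording, and fallback are made up for illustration; adapt them to your bot):

```python
# Illustrative only: a system prompt that constrains replies via examples plus a
# default fallback category. Nothing here comes from a specific framework.
SYSTEM_PROMPT = """You are a support bot. Reply in exactly ONE of these formats:

- Billing question -> "BILLING: <one-sentence answer>"
- Bug report       -> "BUG: <one-sentence summary of the issue>"
- Anything else    -> "OTHER: Sorry, I can only help with billing or bugs."

Examples:
User: I was charged twice this month.
Assistant: BILLING: That looks like a duplicate charge; here is how to request a refund...

User: What's the weather like today?
Assistant: OTHER: Sorry, I can only help with billing or bugs.
"""

def build_messages(user_input: str) -> list[dict[str, str]]:
    # Plain chat-format messages; pass these to whatever client you're using.
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_input},
    ]

print(build_messages("My invoice looks wrong")[0]["content"][:60])
```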
> I can manually add validation on the response but then it breaks streaming and hence is visibly slower in response.
I've had this exact issue (streaming + JSON). Here's how I approached it (rough code sketch after the steps):
1. Instruct the LLM to return the key "test" in its response.
2. Make the streaming call.
3. Build your JSON response as a string as you get chunks from the stream.
4. Once you detect the "test" key in that string, start sending all subsequent chunks wherever you need.
5. Once you get the closing quotation mark, end the stream.
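Here's a rough, self-contained sketch of steps 3-5 (the "test" key name and the fake chunks are just for illustration, and it naively ignores escaped quotes):

```python
# Scan the streamed JSON text for the "test" key, then forward the value's
# characters as they arrive, stopping at the closing quote.
import re
from typing import Iterable, Iterator

def stream_value(chunks: Iterable[str], key: str = "test") -> Iterator[str]:
    buffer = ""
    streaming = False
    # matches `"test"` followed by optional whitespace, a colon, and the opening quote
    opener = re.compile(rf'"{re.escape(key)}"\s*:\s*"$')
    for chunk in chunks:
        for ch in chunk:
            if not streaming:
                buffer += ch                   # step 3: accumulate the raw JSON text
                if opener.search(buffer):
                    streaming = True           # step 4: key found, start forwarding
            elif ch == '"':                    # step 5: closing quote ends the value
                return                         # (naive: does not handle \" escapes)
            else:
                yield ch

# Usage with fake chunks standing in for an LLM stream:
fake_chunks = ['{"te', 'st": "Hel', 'lo wor', 'ld"}']
print("".join(stream_value(fake_chunks)))      # -> Hello world
```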