Hacker News

To be honest, I have gotten 100x more useful answers out of Siri's WolframAlpha integration than I ever have out of ChatGPT. People don't want a "not completely incompetent graduate student" responding to their prompts, they want NLP that reliably processes information. Last-generation voice assistants could at least do their job consistently, ChatGPT couldn't be trusted to flick a light switch on a regular basis.


I use both for different things. WolframAlpha is great for well-defined questions with well-defined answers. LLMs are often great for anything that doesn't fall into that.


I use Home Assistant with the Extended OpenAI integration from HACS. Let me tell you, it's orders of magnitude better than generic voice assistants. It understands my requests fairly flexibly without my having to remember the exact name of every device in the house. I can ask for complex tasks, like turning on every light in the basement even though there's no "basement" zone, and it infers the devices from their names. I have air quality sensors throughout, and I can ask it to turn on the fan in areas with low air quality and it literally does it, without my programming an automation.

Usually Alexa will order 10,000 rolls of toilet paper and ship them to my boss when I ask it to turn on the bathroom fan.
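A toy sketch of the name-inference idea the parent describes (this is not the Extended OpenAI Conversation integration's actual code, and the entity names are made up): the LLM effectively picks every light whose entity name suggests it belongs to the requested area, even when no matching zone exists.

```python
# Hypothetical entity list; entity_ids and names are invented for illustration.
def lights_in_area(entities, area_keyword):
    """Return light entity_ids whose friendly name mentions the area keyword."""
    return [
        e["entity_id"]
        for e in entities
        if e["entity_id"].startswith("light.")
        and area_keyword.lower() in e["name"].lower()
    ]

entities = [
    {"entity_id": "light.bsmt_workbench", "name": "Basement workbench"},
    {"entity_id": "light.basement_stairs", "name": "Basement stairs"},
    {"entity_id": "light.kitchen_main", "name": "Kitchen main"},
    {"entity_id": "sensor.basement_aqi", "name": "Basement air quality"},
]

print(lights_in_area(entities, "basement"))
# -> ['light.bsmt_workbench', 'light.basement_stairs']
```

In the real integration the matching is done by the LLM itself rather than by a substring check, which is why it also handles abbreviations and paraphrases.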

For me, the utility of this level of skill (beginner grad student in many areas) is in areas where I have undergraduate-level questions. While I never ask it questions in my own field, I do for many other fields I don't know well, to help me learn. Over the summer my family traveled and I was home alone, so I fixed and renovated tons of stuff I didn't know how to do. I wore a headset and had the voice mode of ChatGPT on; I just asked it questions as I went and it answered. This enabled me to complete dozens of projects I otherwise wouldn't have known how to even start. If I had had to stop and search the web, sift through forums and SEO hellscapes, and read loosely related instructions and try to synthesize my own answers, I would have gotten two projects done rather than thirty.


How does this square with what Terence Tao (TFA) writes about o1? Is this meant to say there's a class of problems that o1 is still really bad at (or worse than intuition says it should be, at least)? Or is this "he says, she says" time for hot topics again on HN?


o1-preview is still quite a specialized model, and you can come up with very easy questions that it fails embarrassingly, despite its success on seemingly much more difficult tests like olympiad programming/maths questions.

You certainly shouldn't think of it like having access to a graduate student whenever you want, although hopefully that's coming.


Wait til you generate WolframAlpha queries from natural language using Claude 3.5 and use it to interpret results as well.
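A minimal sketch of the pipeline described above: an LLM (e.g. Claude 3.5) rewrites a natural-language question into a tight WolframAlpha query string, which is then sent to WolframAlpha's Short Answers API. The Claude call is stubbed out here as a comment, and `DEMO-APPID` is a placeholder for a real Wolfram app ID.

```python
from urllib.parse import urlencode

WA_SHORT_ANSWERS = "https://api.wolframalpha.com/v1/result"

def wolfram_url(query: str, app_id: str) -> str:
    """Build the Short Answers API request URL for a cleaned-up query."""
    return f"{WA_SHORT_ANSWERS}?{urlencode({'appid': app_id, 'i': query})}"

# In the full pipeline, `query` would come from a Claude completion prompted
# along the lines of "rewrite this question as a WolframAlpha query: ...".
url = wolfram_url("integrate x^2 sin(x)", "DEMO-APPID")
print(url)
```

Fetching that URL returns a plain-text answer, which can then be fed back to the LLM for interpretation in context.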


I've tried the ChatGPT integration and it was kinda just useless. On smaller datasets it told me nothing that wasn't obviously apparent from the charts and tables; on larger datasets it couldn't do much besides basic key/value retrieval. Asking it to analyze a large time-series table was an exercise in futility; I remain pretty unimpressed with current offerings.


Then you have a skill issue. 10 million people are paying for GPT monthly because a large share of them are getting useful value out of it. WolframAlpha has been out for a while and didn't take off, for a reason. "GPT couldn't be trusted to flick a light switch on a regular basis" pretty much implies you are not serious, or that your knowledge of LLM capabilities is dated or derived from things you have read.


WolframAlpha is a free service, really kind of an ad for all the (curated, accurate) datasets built into Wolfram Language.

Wolfram Research is a profitable company btw


FACT: The technology is inherently unreliable in its current form. And the weakness is built in; it's not going to go away anytime soon.


The same is true of search engines, yet they are still incredibly useful.


Not the same technology at all, until recently at least.

EDIT: Looks like I hurt someone's feelings by killing their unicorn. It was going to happen sooner or later, and pretending isn't very constructive. In fact, pretending this technology is reliable is a very risky thing to do.



