Hacker News

To be honest, I have gotten 100x more useful answers out of Siri's WolframAlpha integration than I ever have out of ChatGPT. People don't want a "not completely incompetent graduate student" responding to their prompts, they want NLP that reliably processes information. Last-generation voice assistants could at least do their job consistently, ChatGPT couldn't be trusted to flick a light switch on a regular basis.


I use both for different things. WolframAlpha is great for well-defined questions with well-defined answers. LLMs are often great for anything that doesn't fall into that.


I use Home Assistant with the Extended OpenAI integration from HACS. Let me tell you, it's orders of magnitude better than generic voice assistants. It understands my requests fairly flexibly without my having to remember the exact name of every device in the house. I can ask for complex tasks, like turning on every light in the basement even though there's no "basement" zone, and it infers the devices from their names. I have air quality sensors throughout, and I can ask it to turn on the fan in areas with low air quality and it literally does it, without my programming an automation.

Usually Alexa will order 10,000 rolls of toilet paper and ship them to my boss when I ask it to turn on the bathroom fan.
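A toy sketch of the name-inference idea the parent describes (this is not the Extended OpenAI Conversation integration's actual code, and the entity names are made up): the LLM effectively picks every light whose entity name suggests it belongs to the requested area, even when no matching zone exists.

```python
# Hypothetical entity list; entity_ids and names are invented for illustration.
def lights_in_area(entities, area_keyword):
    """Return light entity_ids whose friendly name mentions the area keyword."""
    return [
        e["entity_id"]
        for e in entities
        if e["entity_id"].startswith("light.")
        and area_keyword.lower() in e["name"].lower()
    ]

entities = [
    {"entity_id": "light.bsmt_workbench", "name": "Basement workbench"},
    {"entity_id": "light.basement_stairs", "name": "Basement stairs"},
    {"entity_id": "light.kitchen_main", "name": "Kitchen main"},
    {"entity_id": "sensor.basement_aqi", "name": "Basement air quality"},
]

print(lights_in_area(entities, "basement"))
# -> ['light.bsmt_workbench', 'light.basement_stairs']
```

In the real integration the matching is done by the LLM itself rather than by a substring check, which is why it also handles abbreviations and paraphrases.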

For me, the utility of this level of skill (beginner grad student in many areas) is in areas where I have undergraduate-level questions. While I never ask it questions in my own field, I do for many other fields I don't know well, to help me learn. Over the summer my family traveled and I was home alone, so I fixed and renovated tons of stuff I didn't know how to do. I wore a headset and had the voice mode of ChatGPT on; I just asked it questions as I went and it answered. This enabled me to complete dozens of projects I otherwise wouldn't have known how to even start. If I had had to stop and search the web, sift through forums and SEO hellscapes, and read loosely related instructions and try to synthesize my own answers, I would have gotten two projects done rather than thirty.


How does this square with what Terence Tao (TFA) writes about o1? Is this meant to say there's a class of problems that o1 is still really bad at (or worse than intuition says it should be, at least)? Or is this "he says, she says" time for hot topics again on HN?


o1-preview is still quite a specialized model, and you can come up with very easy questions that it fails embarrassingly, despite its success on seemingly much more difficult tests like olympiad programming/maths questions.

You certainly shouldn't think of it like having access to a graduate student whenever you want, although hopefully that's coming.


Wait til you generate WolframAlpha queries from natural language using Claude 3.5 and use it to interpret results as well.
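A minimal sketch of the pipeline described above: an LLM (e.g. Claude 3.5) rewrites a natural-language question into a tight WolframAlpha query string, which is then sent to WolframAlpha's Short Answers API. The Claude call is stubbed out here as a comment, and `DEMO-APPID` is a placeholder for a real Wolfram app ID.

```python
from urllib.parse import urlencode

WA_SHORT_ANSWERS = "https://api.wolframalpha.com/v1/result"

def wolfram_url(query: str, app_id: str) -> str:
    """Build the Short Answers API request URL for a cleaned-up query."""
    return f"{WA_SHORT_ANSWERS}?{urlencode({'appid': app_id, 'i': query})}"

# In the full pipeline, `query` would come from a Claude completion prompted
# along the lines of "rewrite this question as a WolframAlpha query: ...".
url = wolfram_url("integrate x^2 sin(x)", "DEMO-APPID")
print(url)
```

Fetching that URL returns a plain-text answer, which can then be fed back to the LLM for interpretation in context.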


I've tried the ChatGPT integration and it was kinda just useless. On smaller datasets it told me nothing that wasn't obviously apparent from the charts and tables; on larger datasets it couldn't do much besides basic key/value retrieval. Asking it to analyze a large time-series table was an exercise in futility; I remain pretty unimpressed with current offerings.


Then you have a skill issue. 10 million people are paying for GPT monthly because a large share of them are getting useful value out of it. WolframAlpha has been out for a while and didn't take off, for a reason. "GPT couldn't be trusted to flick a light switch on a regular basis" pretty much implies you are not serious, or that your knowledge of LLM capabilities is dated or derived from things you have read.


WolframAlpha is a free service, really kind of an ad for all the (curated, accurate) datasets built into Wolfram Language.

Wolfram Research is a profitable company btw


FACT: The technology is inherently unreliable in its current form. And the weakness is built in; it's not going to go away anytime soon.


The same is true of search engines, yet they are still incredibly useful.


Not the same technology at all, until recently at least.

EDIT: Looks like I hurt someone's feelings by killing their unicorn. It was going to happen sooner or later, and pretending isn't very constructive. In fact, pretending this technology is reliable is a very risky thing to do.



