But you could trust a $3 drugstore checkout-aisle slim wallet calculator (with solar panels for recharging) to perform those calculations, and you'd get more nines of reliability from that calculator than from the LLM. And it didn't cost hundreds of billions of dollars to develop the calculator, so you won't end up paying thousands of dollars for it.
LLMs cannot beat a $3 calculator because they're not fit for purpose. Everything they offer is a hallucination. Just because that fabrication matches reality some of the time does not make it good. Reliability matters, and these things just don't have it.
Nobody is trying to raise 7 TRILLION dollars of investment on the claim that calculators can do any task, but OpenAI swears to god that any day now their language model will become a general intelligence capable of doing all things, including math.