Hacker News

Better question: why does a simple search for “What color is a labrador retriever” require any compute time when the answer can be cached? This is a simple example, but 90% of my searches don’t require an LLM to process a simple question.


One time I came across a git repo that let me download a gigabyte of prime numbers and I thought to myself, is that more or less efficient than me running a program locally to generate a gigabyte of prime numbers?
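For small ranges, generating primes locally is cheap; a minimal sketch using the classic sieve of Eratosthenes (not the repo's code, just an illustration of the "compute it yourself" side of the trade-off):

```python
def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: all primes <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            # Cross off every multiple of p starting at p*p.
            sieve[p * p :: p] = bytearray(len(sieve[p * p :: p]))
    return [i for i, is_prime in enumerate(sieve) if is_prime]

print(primes_up_to(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

A gigabyte of primes, by contrast, mostly costs bandwidth and disk; which side wins depends on how often you need the data and how fast your machine is.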

The compute for a direct answer like that is fractions of a penny; it might be better to create answers on the fly than to store an index of every question anyone has ever asked (well, that's essentially what the weights are, after all).
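The "fractions of a penny" claim is easy to sanity-check with back-of-envelope numbers. Both figures below are assumptions for illustration, not actual vendor pricing:

```python
# Assumed numbers, purely illustrative:
price_per_million_tokens = 0.50  # USD per 1M output tokens (assumption)
tokens_per_answer = 100          # a short factual answer (assumption)

cost_usd = tokens_per_answer * price_per_million_tokens / 1_000_000
print(f"${cost_usd:.6f} per answer")  # $0.000050 per answer
```

At those assumed rates a direct answer costs five thousandths of a cent, so the real question is whether the aggregate of billions of such queries still beats a cache hit.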


It’s an interesting question. I assume they’re using accelerators, and the alternative is a disk or memory hit. It still seems expensive to me.

https://www.linkedin.com/pulse/rising-cost-llm-based-search-...



