> With web search, Claude has access to the latest events and information, boosting its accuracy on tasks that benefit from the most recent data.
I'm surprised that they only expect performance to improve for tasks involving recent information. I thought it was widely accepted that using an LLM to extract information from a document is much more reliable than asking it to recall information it was trained on. In particular, it is supposed to lead to fewer instances of inventing facts out of thin air. Is my understanding out of date?
I have found that in RAG use cases, whether the source is a document or web data, hallucinations can still occur. How often depends largely on the prompt and on how well it aligns with the data that is actually retrieved and re-ranked.
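To make that concrete, here's a minimal sketch of the grounded-prompting step being described. Everything in it is illustrative: the `Passage` type, the lexical-overlap "re-ranker" (a stand-in for a learned re-ranker), and the prompt wording are placeholders, not any particular vendor's pipeline.

```python
# Sketch of a RAG prompt-construction step: re-rank retrieved passages,
# then instruct the model to answer only from the supplied context.
# The retriever, re-ranker, and prompt text here are all simplified stand-ins.

from dataclasses import dataclass


@dataclass
class Passage:
    source: str
    text: str


def rerank(query: str, passages: list[Passage], top_k: int = 3) -> list[Passage]:
    """Order passages by naive lexical overlap with the query (stand-in for a real re-ranker)."""
    query_terms = set(query.lower().split())

    def overlap(p: Passage) -> int:
        return len(query_terms & set(p.text.lower().split()))

    return sorted(passages, key=overlap, reverse=True)[:top_k]


def build_grounded_prompt(query: str, passages: list[Passage]) -> str:
    """Build a prompt that asks the model to answer only from the retrieved context."""
    context = "\n\n".join(f"[{p.source}]\n{p.text}" for p in passages)
    return (
        "Answer the question using ONLY the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


if __name__ == "__main__":
    docs = [
        Passage("release-notes", "Web search support was added to the API."),
        Passage("faq", "Web search lets the model cite current sources."),
    ]
    query = "When was web search added?"
    print(build_grounded_prompt(query, rerank(query, docs)))
```

The point of the "say you don't know" instruction is exactly the failure mode under discussion: if retrieval or re-ranking surfaces passages that don't actually answer the question, a loosely worded prompt invites the model to fill the gap from its parametric memory, which is where the hallucinations come back in.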