The whole business of superforecasting is nice but doesn't seem to be the revolution some have promoted it as. The questions are kinda boring and the predictions seem very middle-of-the-road conventional wisdom, what you'd expect after a few well-chosen Google searches for background information.
For example, their GDP growth prediction is centered around "the recent average, maybe a little less", which makes sense given headwinds from inflation and central bank policy. "Will Putin use nukes" is answered "very unlikely, but maybe?" Etc.
If nothing else, though, it has illuminated for me that asking interesting questions about the future is not easy. Could we do better than a bunch of questions that can be answered with some variation of "past trends will probably continue"?
Considering that since 2020 we've had Covid, an attempted insurrection, and an unexpected(ish) war, what are the odds that something dramatic and completely unexpected will happen next year?
I'm not expecting a number. Just pointing out that to date "past trends will probably continue" has been absolutely wrong for this decade.
I agree. Stuff like this was coming, only a matter of time (not that I predicted the timing, for sure).
The Ukraine war is the only thing that kinda surprised me. But given how many people expected Russia to trounce Ukraine in an invasion, maybe it makes sense that Putin saw an opening.
He has a book about it. The reason the questions tend to be less exciting is that they need to be well defined, so there's no ambiguity about whether the event in question happened or not.
So, "what could surprise us next year" would need to be reformulated to something more specific. "What are the chances GDP will grow 5% over the next 12 months?". "What are the chances the US launches a missile intended to hit mainland china?".
Prediction markets' liquidity is tiny, so their predictions have very little "skin in the game" behind them.
This was made very clear during a couple of big world events (major elections and the like), when I was simultaneously watching financial markets, dedicated event-betting markets, and prediction markets. My conclusion was that financial markets are where the real super-forecasters work.
Sure, but often there isn't an obvious financial market to check. Prediction markets still update faster and more accurately than, say, the news or pundits.
Hopefully they stop being hindered and become more popular so liquidity increases though. It'd be much more useful for everyone if a portion of sports and other gambling spending can be redirected towards them.
> It'd be much more useful for everyone if a portion of sports and other gambling spending can be redirected towards them.
The problem is that prediction markets are typically about very rare events that happen once or twice a year.
Your typical sports punter doesn't have the patience for that kind of long bet.
> Hopefully they stop being hindered and become more popular so liquidity increases though
That's just a US problem (as always). In the rest of the world there are no major obstacles to creating prediction markets, and in some form they already exist - BetFair has election betting, ...
It's been a while since I've read it, but the book (Superforecasting) also has an additional section elaborating on a comparison against prediction markets.
From memory, the core thesis of the GJP is that some individuals are good at making forecasts, and that this accuracy is not domain-specific and doesn't require insider information. Once that skill is measured, more weight is put on those who make better forecasts. As an analogy, consider asking 100 chess players for the next move in a game - those with a higher Elo rating are more likely to find a better move.
A conventional prediction market doesn't have this kind of "long-term weighting"; instead it relies on individuals to bet according to their confidence (which may not always correspond to their accuracy).
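To make the contrast concrete, here's a minimal sketch of track-record weighting (a toy scheme of my own, not the GJP's published aggregation algorithm): forecasters with a better (lower) historical Brier score get more say in the pooled probability.

    def weighted_pool(probs, past_briers, eps=1e-6):
        # probs: each forecaster's probability for the same binary question
        # past_briers: each forecaster's historical Brier score (0 = perfect)
        weights = [1.0 / (b + eps) for b in past_briers]
        return sum(w * p for w, p in zip(weights, probs)) / sum(weights)

    # Two forecasters with good track records vs. one with a poor one:
    print(weighted_pool([0.8, 0.75, 0.3], [0.05, 0.08, 0.4]))  # ~0.75, pulled toward the better forecasters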
Of interest is this article (https://mikesaintantoine.substack.com/p/scoring-midterm-elec...), which compared PredictIt, FiveThirtyEight and Manifold Markets (a prediction market with play money, so in theory no "proper" incentives). Even with the "proper" incentives, PredictIt did no better than Manifold Markets and a decent bit worse than 538.
> Good Judgment maintains a global network of elite Superforecasters who collaborate to tackle our clients’ forecasting questions with unparalleled accuracy. We continue to grow this network by identifying and recruiting fresh talent from our public forecasting platform, Good Judgment Open. And, we train others to apply the methods that make our Superforecasters so accurate.
Instead of "laughing your ass off", could you offer some constructive commentary about your reason for dismissing this?
I don't know anything about that organization and I come to HN to learn about things like this. So if you do know something about it, please share your knowledge.
I enrolled with the Good Judgment Project for a while. Most of these super high-level assessments are useless, and may even be put out as a bit of disinformation. What they really get is a lot of text from the participants, which is essentially free amalgamation of OSINT that they turn over to the sponsors.
How do they avoid attracting people who would see it as gambling for fun?
A lot of people are attracted to gambling - possibly everyone, to some degree, and for a wide enough description of "gambling" - even when there's no money involved. Put even a small monetary reward in and you'll get loads of people taking part just for fun, in most cases.
Why would they want to avoid them? They just add liquidity to the market, incentivising better forecasters and improving the overall accuracy and robustness. More useful for them to gamble on that than something else.
In some sense they are 'suckers' that give an incentive for the good predictors to play. It's much nicer to participate if most of the other bets are poorly made.
I am a little disappointed to see that the results of the "superforecasters" in The Economist and on the underlying Good Judgment Open website don't come with a 95% credible interval, or even a good old-fashioned confidence interval.
Would love to see results presented with the uncertainty quantified. Especially given that the yes/no questions are aggregated binarized predictions from what is almost certainly a collection of continuous models. A lot of information is lost between the people performing the analysis and either of these pages.
They are giving probabilities for discrete events, which already captures their level of uncertainty. Probabilities of probabilities (i.e., a probability distribution of a probability) are not very useful concepts.
It looks like they are simply providing the summary data from their multiple-choice survey questions. Still, a CI or SEM would not apply (there is no SEM for "80% of forecasters said yes"). These graphs are literally just telling you the percentage of respondents who picked a given answer to the corresponding question.
edit- the longer I browse their website for the exact methodology, the less impressed I am with this group. The "Introducing the Superforecasters" section is so cringe.
This would imply that the confidence interval around the coefficient in a logistic regression is not a very useful concept, which I don't think is true.
That is a little different. There you are estimating a continuous parameter (which happens to be interpretable as a probability), and it makes sense to have a probability distribution over that.
But if you are talking about whether a single discrete event will happen or not, a single number (the probability) already fully captures the uncertainty about it.
There is a lot of missing information in their probability of binary events presented. Presumably they polled N forecasters and are presenting an x/N prediction. The fact that each forecaster is estimating in a continuous space and then binarizing their result means that a lot of information has been lost.
To look at an extreme example… were all the “yes” votes 95%+ certain and the “no” votes just under the line 49%? Or was it more like a bunch of no votes at 49% and a bunch of yes votes at 51%?
Binarizing forecasts necessarily discards information. Aggregating a bunch of binary predictions into a percentage does not recapture that information, unfortunately.
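Here's a toy illustration of that loss (hypothetical numbers, not the GJP's data): two panels binarize to the same "50% said yes", but their underlying probabilities tell very different stories.

    confident_panel = [0.95, 0.97, 0.49, 0.48]  # the yes-votes are near-certain
    fence_sitters   = [0.51, 0.52, 0.49, 0.48]  # everyone is near the coin-flip line

    for panel in (confident_panel, fence_sitters):
        share_yes = sum(p > 0.5 for p in panel) / len(panel)
        mean_prob = sum(panel) / len(panel)
        print(f"binarized: {share_yes:.0%} yes, mean probability: {mean_prob:.2f}")
    # Both panels report "50% yes", but their mean probabilities are 0.72 vs 0.50.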
It's definitely an odd omission, since the original research project used such a calculation. Metaculus, which uses a similar technique, provides such a confidence interval, along with a nice history graph.
As some wild speculation, I suspect that since the GJP only employs a handful of Superforecasters, the initial confidence intervals for these broad questions may be quite large. That's to be expected when predicting a year in advance, but publicly admitting to such a broad confidence interval is probably not very good for marketing.
These events are binary. They will happen or not. A percentage prediction already incorporates the uncertainty. In what way would a 45-55% prediction be any different from a 50% prediction? The domain of outcomes is a set of size 2: {0, 1}. There are no intermediate outcomes. Both predictions are functionally identical: {0: 50%, 1: 50%}.
(Where confidence intervals over percentages makes more sense is estimating a parameter rather than a single event. E.g., if we flip this coin a bunch of times in series, I predict that the percentage of flips landing heads will be between 45% and 55%. Or next year I predict GDP will be between 1%-3%. Or I estimate the effect size of this ad campaign was -$2-$5 earned per dollar spent.)
((One sense in which a 45%-55% prediction on a binary event might have some semantic meaning is that it could signal a higher willingness to adjust if new evidence or information is brought to light. But that's quite different than a confidence interval.))
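A small sketch of that distinction, using the coin example above and a standard normal-approximation interval (the specific numbers are made up): an interval is informative when you're estimating a parameter from repeated trials, while a single binary event is fully described by one probability.

    import math

    def proportion_ci(successes, trials, z=1.96):
        # 95% normal-approximation confidence interval for a binomial proportion
        p = successes / trials
        se = math.sqrt(p * (1 - p) / trials)
        return p - z * se, p + z * se

    # Parameter: share of heads over many flips -> the interval is meaningful.
    print(proportion_ci(480, 1000))  # roughly (0.45, 0.51)

    # Single event: "will this one flip land heads?" -> just p = 0.48;
    # an interval around that number adds nothing the number doesn't already say.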
I read Tetlock's Expert Political Judgment many years ago, and though I can't guarantee my memory of it, I think one upshot of some pretty detailed empirical work was that no one is much good at predicting political and economic futures. Foxes (in Isaiah Berlin's sense, i.e., those who approach problems without an overarching conceptual framework) were marginally better than Hedgehogs (who have a central big idea), but no one was up to much.
I think the trick, if there is one beyond luck, involves the ability to draw conspicuous attention to the occasions one is "right", while distracting from all the other off-target pronouncements.
The incorrect 3 were related to the Omicron variant -- not bad for armchair* analysis!
* My take from reading Tetlock's book is that superforecasting is essentially painstaking analysis by laypersons, based on common rationality and followed through diligently. If mutations are among the only things this process fails to predict, that's actually very encouraging.
I don’t generally place much stock in forecasts, but… I'm unaware of what the 8 questions were last year, but this year's include a few non-binary outcomes. If that was the case last year too, then their performance was a fair bit better than random.
When I was an undergraduate at UCLA in the late 1960s, there was a tiny classified ad in the Daily Bruin Classifieds looking for people to participate in a study of decision making; it paid quite well, by the hour.
I called the phone number given and after a few questions over the phone was told to come to an address on Ocean Avenue in Santa Monica at 1 pm Saturday afternoon.
The address was that of the RAND Corporation, which was running the Delphi Project, an inquiry into whether a group of unspecialized individuals could, as a group, make better predictions of future events than would be expected.
There were 10-15 of us in the room, all UCLA undergraduates and graduate students, and we'd be given a scenario and asked to predict what we thought would happen next, after an hour or so spent batting it back and forth between us.
I'm certain we were recorded, though I don't recall cameras; in the late '60s that wasn't really something average people even thought about.
This went on every Saturday afternoon for a couple months and then it ended.
We were paid in cash at the end of each session.
I later read about the Delphi Project and learned it was funded by the CIA.
>This report deals with one aspect of RAND's continuing study of methods for improving decision making. It describes the results of an extensive set of experiments conducted at RAND during the spring and summer of 1968. The experiments were concerned with evaluating the effectiveness of the Delphi procedures for formulating group judgments.
It is astonishing how little information these predictions convey.
So GDP will probably grow about 2% again. First -- you can "forecast" that by looking at a fifty-year chart and picking the median. Second -- who would benefit from knowing that information a year early?
All of the predictions are like that. There is little variance from an uneducated guess, and no actionable suggestions.
Maybe if this was framed as "a survey of the status-quo" rather than "superforecasting" I wouldn't be so negative about it. But there is nothing "super" about this.
Crystal balls are back in fashion, along with smoke, mirrors and ectoplasm. Centennial recurrence perhaps.
Note the disclaimer of all practitioners who dabble in the dark arts: this is for entertainment purposes only.
An artist friend recently wrote an essay [1] associating AI art with "soft propaganda for the ideology of prediction". An interesting phrase, I thought. Is prediction an ideology? Is blind faith in "AI" ushering in secular denominations of crystal botherers?
It's a feature of the interregnum, similar perhaps to that of the 1920s, that we grow ever more desperate to peer around the corner of time, and so ever more credulous of techno-spiritualists, mechanical mediums and silicon psychics.
Not quite sure what most of what you said means, which is probably my own fault. But it does sound flowery, vaguely profound and poetic in a graduate studies essay kind of way.
Meanwhile, if you're interested in how this particular prediction market actually functions (spoiler: neither smoke, nor mirrors, nor ectoplasm makes an appearance, and thanks for making me look up that last one): [0] [1].
That's actually a really good question about the nature of progress.
I suspect there's something more to finding yourself in the tent of Madame Mystic Meg than a simple wish for foresight. Machines that are eminently successful at foretelling might only amplify that pathology (minus the incense, elegant dress, mood lighting and arabesque panache).
Even if 2023 is not far away, it's important that we as a culture look forward towards the future and not get bogged down in the drama of the day (be it Twitter, Covid, or inflation); getting bogged down robs us of time to plan for the future and prepare for coming challenges.
A lot of the comments are dismissive. I read the book on this superforecasters project / people / studies.
Turns out (some) people can learn enough about the world to build probability-weighted trees over the different outcomes.
Their predictions are benchmarked using a scoring rule called the Brier score.
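For reference, here's a minimal sketch of the Brier score in its common binary-question form (the example numbers are made up): the mean squared distance between the forecast probability and what actually happened, so lower is better and 0 is a perfect record.

    def brier_score(forecasts, outcomes):
        # forecasts: probabilities assigned to "yes"; outcomes: 1 if it happened, else 0
        return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

    print(brier_score([0.9, 0.2, 0.6], [1, 0, 0]))  # ~0.14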
Yes, commenters should read the book (Superforecasting). The forecasters in the Good Judgment Project are assessed on their self-assigned confidence in their predictions and how soon they can make them. Some people seem to be provably better at this than others; they are the Superforecasters.
What was most useful to me in the book was the notion of being more specific and testable when thinking about the future. I think the better you are at predicting the future (a delta between you and everyone else, not 100% accuracy, which is impossible) the better you are at understanding the world and the better you can plan ahead.
Good choice! It's one of those "one idea" books, but it's a good idea. You CAN do a decent job of forecasting as a human, and the book explains, to some extent:
1) how to benchmark predictions.
2) how these people approach making a prediction.
Thanks! I was also planning to be dismissive and say something snarky before reading your comment, so it is appreciated. I sincerely hope the forecasting process is legitimately something like this: