Forecasting Newsletter - Draft

Nuno Sempere 2020-06-29 12:34:09 +02:00
parent 5edb0a651a
commit e96b47f3b4


@ -1,13 +1,33 @@
Whatever happened to forecasting? June 2020 # Forecasting Newsletter. June 2020.
===========================================
## Highlights
1. Facebook launches [Forecast](https://www.forecastapp.net/), a community for crowdsourced predictions.
2. Foretell, a forecasting tournament by the Center for Security and Emerging Technology, is now [open](https://www.cset-foretell.com).
3. [A Preliminary Look at Metaculus and Expert Forecasts](https://www.metaculus.com/news/2020/06/02/LRT/): Metaculus forecasters do better.
## Index ## Index
- Prediction Markets & Forecasting platforms. - Highlights.
- In the News. - In the News.
- Grab bag. - Prediction Markets & Forecasting Platforms.
- Negative examples. - Negative Examples.
- Hard to Categorize.
- Long Content. - Long Content.
## In the News.
- Facebook releases a forecasting app ([link to the app](https://www.forecastapp.net/), [press release](https://npe.fb.com/2020/06/23/forecast-a-community-for-crowdsourced-predictions-and-collective-insights/), [TechCrunch take](https://techcrunch.com/2020/06/23/facebook-tests-forecast-an-app-for-making-predictions-about-world-events-like-covid-19/), [hot-takes](https://cointelegraph.com/news/crypto-prediction-markets-face-competition-from-facebook-forecasts)). The release comes before Augur v2 launches, and it is easy to speculate that it might end up being combined with Facebook's stablecoin, Libra.
- The Economist has a new electoral model out ([article](https://www.economist.com/united-states/2020/06/11/meet-our-us-2020-election-forecasting-model), [model](https://projects.economist.com/us-2020-forecast/president)) which gives Trump an 11% chance of winning reelection. Given that Andrew Gelman was involved, I'm hesitant to criticize it, but it seems a tad overconfident.
- [COVID-19 vaccine before US election](https://www.aljazeera.com/ajimpact/wall-street-banking-covid-19-vaccine-election-200619204859320.html). Analysts see White House pushing through vaccine approval to bolster Trump's chances of reelection before voters head to polls. "All the datapoints we've collected make me think we're going to get a vaccine prior to the election," Jared Holz, a health-care strategist with Jefferies, said in a phone interview. The current administration is "incredibly incentivized to approve at least one of these vaccines before Nov. 3."
- ["Israeli Central Bank Forecasting Gets Real During Pandemic"](https://www.nytimes.com/reuters/2020/06/23/world/middleeast/23reuters-health-coronavirus-israel-cenbank.html). Israeli Central Bank is using data to which it has real time access, like credit-card spending, instead of lagging indicators.
- [Google](https://www.forbes.com/sites/jeffmcmahon/2020/05/31/thanks-to-renewables-and-machine-learning-google-now-forecasts-the-wind/) produces wind schedules for windfarms. "The result has been a 20 percent increase in revenue for wind farms". See [here](https://www.pv-magazine-australia.com/2020/06/01/solar-forecasting-evolves/) for essentially the same thing on solar forecasting.
- Survey of macroeconomic researchers predicts economic recovery will take years, reports [538](https://fivethirtyeight.com/features/dont-expect-a-quick-recovery-our-survey-of-economists-says-it-will-likely-take-years/).
## Prediction Markets & Forecasting platforms. ## Prediction Markets & Forecasting platforms.
Ordered in subjective order of importance: Ordered in subjective order of importance:
@ -27,7 +47,9 @@ Ordered in subjective order of importance:
- Good Judgement Analytics continues to provide their [covid dashboard](https://goodjudgment.com/covidrecovery/). - Good Judgement Analytics continues to provide their [covid dashboard](https://goodjudgment.com/covidrecovery/).
- [PredictIt](https://www.predictit.org/) & [Election Betting Odds](http://electionbettingodds.com/). I stumbled upon an old [538 piece](https://fivethirtyeight.com/features/fake-polls-are-a-real-problem/) on fake polls: some may have been conducted by PredictIt traders in order to mislead or troll other PredictIt traders. - [PredictIt](https://www.predictit.org/) & [Election Betting Odds](http://electionbettingodds.com/). I stumbled upon an old 538 piece on fake polls: [Fake Polls are a Real Problem](https://fivethirtyeight.com/features/fake-polls-are-a-real-problem/). Some polls may have been conducted by PredictIt traders in order to mislead or troll other PredictIt traders; all in all, an amusing example of how prediction markets could encourage worse information.
- [An online prediction market with reputation points](https://www.lesswrong.com/posts/sLbS93Fe4MTewFme3/an-online-prediction-market-with-reputation-points), implementing an [idea](https://sideways-view.com/2019/10/27/prediction-markets-for-internet-points/) by Paul Christiano. As of yet slow to load.
- Augur: - Augur:
- [An overview of the platform and of v2 modifications](https://bravenewcoin.com/insights/augur-price-analysis-v2-release-scheuled-for-june-12th). - [An overview of the platform and of v2 modifications](https://bravenewcoin.com/insights/augur-price-analysis-v2-release-scheuled-for-june-12th).
@ -36,32 +58,6 @@ Ordered in subjective order of importance:
- [Coronavirus Information Markets](https://coronainformationmarkets.com/) is down to ca. $12000 in trading volume; it seems like they didn't take off. - [Coronavirus Information Markets](https://coronainformationmarkets.com/) is down to ca. $12000 in trading volume; it seems like they didn't take off.
## In the News.
- Facebook releases a forecasting app ([link to the app](https://www.forecastapp.net/), [press release](https://npe.fb.com/2020/06/23/forecast-a-community-for-crowdsourced-predictions-and-collective-insights/), [TechCrunch take](https://techcrunch.com/2020/06/23/facebook-tests-forecast-an-app-for-making-predictions-about-world-events-like-covid-19/), [hot-takes](https://cointelegraph.com/news/crypto-prediction-markets-face-competition-from-facebook-forecasts)). The release comes before Augur v2 launches, and it is easy to speculate that it might end up being combined with Facebook's stablecoin, Libra.
- Survey of macroeconomic researchers predicts economic recovery will take years, reports [538](https://fivethirtyeight.com/features/dont-expect-a-quick-recovery-our-survey-of-economists-says-it-will-likely-take-years/).
- The Economist has a new electoral model out ([article](https://www.economist.com/united-states/2020/06/11/meet-our-us-2020-election-forecasting-model), [model](https://projects.economist.com/us-2020-forecast/president)) which gives Trump an 11% chance of winning reelection. Given that Andrew Gelman was involved, I'm hesitant to criticize it, but it seems a tad overconfident.
- [Google](https://www.forbes.com/sites/jeffmcmahon/2020/05/31/thanks-to-renewables-and-machine-learning-google-now-forecasts-the-wind/) produces wind schedules for windfarms. "The result has been a 20 percent increase in revenue for wind farms". See [here](https://www.pv-magazine-australia.com/2020/06/01/solar-forecasting-evolves/) for essentially the same thing on solar forecasting.
- ["Israeli Central Bank Forecasting Gets Real During Pandemic"](https://www.nytimes.com/reuters/2020/06/23/world/middleeast/23reuters-health-coronavirus-israel-cenbank.html). Israeli Central Bank is using data to which it has real time access, like credit-card spending, instead of lagging indicators.
## Grab bag.
- [An online prediction market with reputation points](https://www.lesswrong.com/posts/sLbS93Fe4MTewFme3/an-online-prediction-market-with-reputation-points), implementing an [idea](https://sideways-view.com/2019/10/27/prediction-markets-for-internet-points/) by Paul Christiano.
- [Box Office Pro](https://www.boxofficepro.com/the-art-and-science-of-box-office-forecasting/) looks at some factors around box-office forecasting.
- [How to improve space weather forecasting](https://eos.org/research-spotlights/how-to-improve-space-weather-forecasting) (see [here](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2018SW002108#) for the original paper):
> For instance, the National Oceanic and Atmospheric Administration's Deep Space Climate Observatory (DSCOVR) satellite sits at the location in space called L1, where the gravitational pulls of Earth and the Sun cancel out. At this point, which is roughly 1.5 million kilometers from Earth, or barely 1% of the way to the Sun, detectors can provide warnings with only short lead times: about 30 minutes before a storm hits Earth in most cases or as little as 17 minutes in advance of extremely fast solar storms.
- [A Personal COVID-19 Postmortem](https://www.lesswrong.com/posts/B7sHnk8P8EXmpfyCZ/a-personal-interim-covid-19-postmortem), by FHI researcher [David Manheim](https://twitter.com/davidmanheim).
> I think it's important to clearly and publicly admit when we were wrong. It's even better to diagnose why, and take steps to prevent doing so again. COVID-19 is far from over, but given my early stance on a number of questions regarding COVID-19, this is my attempt at a public personal review to see where I was wrong.
- [FantasyScotus](https://fantasyscotus.net/user-predictions/case/altitude-express-inc-v-zarda/) beat [GoodJudgementOpen](https://www.gjopen.com/questions/1300-in-zarda-v-altitude-express-inc-will-the-supreme-court-rule-that-the-civil-rights-act-of-1964-prohibition-against-employment-discrimination-because-of-sex-encompasses-discrimination-based-on-an-individual-s-sexual-orientation) on legal decisions. I'm still waiting to see whether [Hollywood Stock Exchange](https://www.hsx.com/search/?action=submit_nav&keyword=Mulan&Submit.x=0&Submit.y=0) will also beat GJOpen on [film predictions](https://www.gjopen.com/questions/1608-what-will-be-the-total-domestic-box-office-gross-for-disney-s-mulan-as-of-8-september-2020-according-to-box-office-mojo).
## Negative examples. ## Negative examples.
- World powers to converge on strategies for presenting COVID-19 information to make forecasters' jobs more interesting: - World powers to converge on strategies for presenting COVID-19 information to make forecasters' jobs more interesting:
@ -75,9 +71,25 @@ Ordered in subjective order of importance:
- [India has the fourth-highest number of COVID-19 cases, but the Government denies community transmission](https://www.abc.net.au/news/2020-06-21/india-coronavirus-fourth-highest-covid19-community-transmission/12365738) - [India has the fourth-highest number of COVID-19 cases, but the Government denies community transmission](https://www.abc.net.au/news/2020-06-21/india-coronavirus-fourth-highest-covid19-community-transmission/12365738)
- One suspects that this denial is political, because India is otherwise [being](https://www.maritime-executive.com/editorials/advanced-cyclone-forecasting-is-saving-thousands-of-lives) [extremely](https://economictimes.indiatimes.com/news/politics-and-nation/world-meteorological-organization-appreciates-indias-highly-accurate-cyclone-forecasting-system/articleshow/76280763.cms) [competent](https://economictimes.indiatimes.com/news/politics-and-nation/mumbai-to-get-hyperlocal-rain-outlooks-flood-forecasting-launched/articleshow/76343558.cms) in weather forecasting. - One suspects that this denial is political, because India is otherwise [being](https://www.maritime-executive.com/editorials/advanced-cyclone-forecasting-is-saving-thousands-of-lives) [extremely](https://economictimes.indiatimes.com/news/politics-and-nation/world-meteorological-organization-appreciates-indias-highly-accurate-cyclone-forecasting-system/articleshow/76280763.cms) [competent](https://economictimes.indiatimes.com/news/politics-and-nation/mumbai-to-get-hyperlocal-rain-outlooks-flood-forecasting-launched/articleshow/76343558.cms) in weather forecasting.
- Youyang Gu's model, widely acclaimed as one of the best coronavirus models for the US, produces 95% confidence intervals which are [too narrow](https://twitter.com/LinchZhang/status/1270443040860106753) when extended to [Pakistan](https://covid19-projections.com/pakistan). - Youyang Gu's model, widely acclaimed as one of the best coronavirus models for the US, produces 95% confidence intervals which [seem too narrow](https://twitter.com/LinchZhang/status/1270443040860106753) when extended to [Pakistan](https://covid19-projections.com/pakistan).
- [COVID-19 vaccine before US election](https://www.aljazeera.com/ajimpact/wall-street-banking-covid-19-vaccine-election-200619204859320.html). Analysts see White House pushing through vaccine approval to bolster Trump's chances of reelection before voters head to polls. "All the datapoints we've collected make me think we're going to get a vaccine prior to the election," Jared Holz, a health-care strategist with Jefferies, said in a phone interview. The current administration is "incredibly incentivized to approve at least one of these vaccines before Nov. 3."
## Hard to categorize.
- [A Personal COVID-19 Postmortem](https://www.lesswrong.com/posts/B7sHnk8P8EXmpfyCZ/a-personal-interim-covid-19-postmortem), by FHI researcher [David Manheim](https://twitter.com/davidmanheim).
> I think it's important to clearly and publicly admit when we were wrong. It's even better to diagnose why, and take steps to prevent doing so again. COVID-19 is far from over, but given my early stance on a number of questions regarding COVID-19, this is my attempt at a public personal review to see where I was wrong.
- [FantasyScotus](https://fantasyscotus.net/user-predictions/case/altitude-express-inc-v-zarda/) beat [GoodJudgementOpen](https://www.gjopen.com/questions/1300-in-zarda-v-altitude-express-inc-will-the-supreme-court-rule-that-the-civil-rights-act-of-1964-prohibition-against-employment-discrimination-because-of-sex-encompasses-discrimination-based-on-an-individual-s-sexual-orientation) on legal decisions. I'm still waiting to see whether [Hollywood Stock Exchange](https://www.hsx.com/search/?action=submit_nav&keyword=Mulan&Submit.x=0&Submit.y=0) will also beat GJOpen on [film predictions](https://www.gjopen.com/questions/1608-what-will-be-the-total-domestic-box-office-gross-for-disney-s-mulan-as-of-8-september-2020-according-to-box-office-mojo).
- [How to improve space weather forecasting](https://eos.org/research-spotlights/how-to-improve-space-weather-forecasting) (see [here](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2018SW002108#) for the original paper):
> For instance, the National Oceanic and Atmospheric Administration's Deep Space Climate Observatory (DSCOVR) satellite sits at the location in space called L1, where the gravitational pulls of Earth and the Sun cancel out. At this point, which is roughly 1.5 million kilometers from Earth, or barely 1% of the way to the Sun, detectors can provide warnings with only short lead times: about 30 minutes before a storm hits Earth in most cases or as little as 17 minutes in advance of extremely fast solar storms.
- [Coup cast](https://oefresearch.org/activities/coup-cast): A site which estimates the yearly probability of a coup. The color coding is misleading; click on the countries instead.
- [Prediction = Compression](https://www.lesswrong.com/posts/hAvGi9YAPZAnnjZNY/prediction-compression-transcript-1). "Whenever you have a prediction algorithm, you can also get a correspondingly good compression algorithm for data you already have, and vice versa."
- Other LessWrong posts which caught my attention were [Betting with Mandatory Post-Mortem](https://www.lesswrong.com/posts/AM5JiWfmbAytmBq82/betting-with-mandatory-post-mortem) and [Radical Probabilism](https://www.lesswrong.com/posts/ZM63n353vh2ag7z4p/radical-probabilism-transcript).
- [Box Office Pro](https://www.boxofficepro.com/the-art-and-science-of-box-office-forecasting/) looks at some factors around box-office forecasting.
## Long Content. ## Long Content.
@ -90,12 +102,6 @@ Ordered in subjective order of importance:
- [A review of Tetlock's Superforecasting (2015)](https://dominiccummings.com/2016/11/24/a-review-of-tetlocks-superforecasting-2015/), by Dominic Cummings. Cummings then went on to hire one such superforecaster, who then resigned over a [culture war](https://www.bbc.com/news/uk-politics-51545541) scandal, characterized by adversarial selection of quotes which indeed are outside the British Overton Window. Notably, Dominic Cummings then told reporters to "Read Philip Tetlock's *Superforecasters*, instead of political pundits who don't know what they're talking about." - [A review of Tetlock's Superforecasting (2015)](https://dominiccummings.com/2016/11/24/a-review-of-tetlocks-superforecasting-2015/), by Dominic Cummings. Cummings then went on to hire one such superforecaster, who then resigned over a [culture war](https://www.bbc.com/news/uk-politics-51545541) scandal, characterized by adversarial selection of quotes which indeed are outside the British Overton Window. Notably, Dominic Cummings then told reporters to "Read Philip Tetlock's *Superforecasters*, instead of political pundits who don't know what they're talking about."
- [Coup cast](https://oefresearch.org/activities/coup-cast): A site which estimates the yearly probability of coup. The color coding is misleading; click on the countries instead.
- [A list of prediction markets](https://docs.google.com/spreadsheets/d/1XB1GHfizNtVYTOAD_uOyBLEyl_EV7hVtDYDXLQwgT7k/edit#gid=0), and their fates, maintained by Jacob Lagerros. Like most startups, most prediction markets fail.
- [Prediction = Compression](https://www.lesswrong.com/posts/hAvGi9YAPZAnnjZNY/prediction-compression-transcript-1). "Whenever you have a prediction algorithm, you can also get a correspondingly good compression algorithm for data you already have, and vice versa."
- [Assessing the Performance of Real-Time Epidemic Forecasts: A Case Study of *Ebola* in the Western Area Region of Sierra Leone, 2014-15](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6386417/). The one caveat is that their data is much better than coronavirus data, because Ebola symptoms are more evident; otherwise, pretty interesting: - [Assessing the Performance of Real-Time Epidemic Forecasts: A Case Study of *Ebola* in the Western Area Region of Sierra Leone, 2014-15](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6386417/). The one caveat is that their data is much better than coronavirus data, because Ebola symptoms are more evident; otherwise, pretty interesting:
> Real-time forecasts based on mathematical models can inform critical decision-making during infectious disease outbreaks. Yet, epidemic forecasts are rarely evaluated during or after the event, and there is little guidance on the best metrics for assessment. > Real-time forecasts based on mathematical models can inform critical decision-making during infectious disease outbreaks. Yet, epidemic forecasts are rarely evaluated during or after the event, and there is little guidance on the best metrics for assessment.
@ -105,7 +111,7 @@ Ordered in subjective order of importance:
> Comparing different versions of our model to simpler models, we further found that it would have been possible to determine the model that was most reliable at making forecasts from early on in the epidemic. This suggests that there is value in assessing forecasts, and that it should be possible to improve forecasts by checking how good they are during an ongoing epidemic. > Comparing different versions of our model to simpler models, we further found that it would have been possible to determine the model that was most reliable at making forecasts from early on in the epidemic. This suggests that there is value in assessing forecasts, and that it should be possible to improve forecasts by checking how good they are during an ongoing epidemic.
> One forecast that gained particular attention during the epidemic was published in the summer of 2014, projecting that by early 2015 there might be 1.4 million cases. This number was based on unmitigated growth in the absence of further intervention and proved a gross overestimate, yet it was later highlighted as a “call to arms” that served to trigger the international response that helped avoid the worst-case scenario > One forecast that gained particular attention during the epidemic was published in the summer of 2014, projecting that by early 2015 there might be 1.4 million cases. This number was based on unmitigated growth in the absence of further intervention and proved a gross overestimate, yet it was later highlighted as a “call to arms” that served to trigger the international response that helped avoid the worst-case scenario.
> Methods to assess probabilistic forecasts are now being used in other fields, but are not commonly applied in infectious disease epidemiology > Methods to assess probabilistic forecasts are now being used in other fields, but are not commonly applied in infectious disease epidemiology
@ -113,11 +119,26 @@ Ordered in subjective order of importance:
> On the other hand, a well-calibrated mechanistic model that accounts for all relevant dynamic factors and external influences could, in principle, have been used to predict the behaviour of the epidemic reliably and precisely. Yet, lack of detailed data on transmission routes and risk factors precluded the parameterisation of such a model and are likely to do so again in future epidemics in resource-poor settings. > On the other hand, a well-calibrated mechanistic model that accounts for all relevant dynamic factors and external influences could, in principle, have been used to predict the behaviour of the epidemic reliably and precisely. Yet, lack of detailed data on transmission routes and risk factors precluded the parameterisation of such a model and are likely to do so again in future epidemics in resource-poor settings.
- In the selection of quotes above, we gave an example of a forecast which ended up overestimating the incidence, yet might have "served as a call to arms". It's maybe a real life example of a forecast changing the true result, leading to a fixed point problem, like the ones hypothesized in the parable of the [Predict-O-Matic](https://www.lesswrong.com/posts/SwcyMEgLyd4C3Dern/the-parable-of-predict-o-matic).
- It would be a fixed point problem if \[forecast above the alarm threshold\] → epidemic being contained, but \[forecast below the alarm threshold\] → epidemic not being contained.
- Maybe the fixed-point solution, i.e., the most self-fulfilling (and thus, accurate) forecast, would have been a forecast on the edge of the alarm threshold, which would have ended up leading to mediocre containment.
- The [troll polls](https://fivethirtyeight.com/features/fake-polls-are-a-real-problem/) created by PredictIt traders are perhaps a more clear-cut example of Predict-O-Matic problems.
- [Calibration Scoring Rules for Practical Prediction Training](https://arxiv.org/abs/1808.07501). I found it most interesting when considering how Brier and log rules didn't have all the pedagogic desiderata. - [Calibration Scoring Rules for Practical Prediction Training](https://arxiv.org/abs/1808.07501). I found it most interesting when considering how Brier and log rules didn't have all the pedagogic desiderata.
- I also found the following derivation of the logarithmic scoring rule interesting. Consider: If you assign a probability to n events, then the combined probability of these events is p1 x p2 x p3 x ... x pn. Taking logarithms, this is log(p1 x p2 x p3 x ... x pn) = Σ log(pi), i.e., the logarithmic scoring rule. - I also found the following derivation of the logarithmic scoring rule interesting. Consider: If you assign a probability to n events, then the combined probability of these events is p1 x p2 x p3 x ... x pn. Taking logarithms, this is log(p1 x p2 x p3 x ... x pn) = Σ log(pi), i.e., the logarithmic scoring rule.
- [Binary Scoring Rules that Incentivize Precision](https://arxiv.org/abs/2002.10669). The results (the closed-form of scoring rules which minimize a given forecasting error) are interesting, but the journey to get there is kind of a drag, and ultimately the logarithmic scoring rule ends up being pretty decent according to their measure of error. - [Binary Scoring Rules that Incentivize Precision](https://arxiv.org/abs/2002.10669). The results (the closed-form of scoring rules which minimize a given forecasting error) are interesting, but the journey to get there is kind of a drag, and ultimately the logarithmic scoring rule ends up being pretty decent according to their measure of error.
- Opinion: I'm not sure whether their results are going to be useful for things I'm interested in (like human forecasting tournaments, rather than kaggle data analysis competitions). In practice, what I might do if I wanted to incentivize precision is to ask myself if this is a question where the answer is going to be closer to 50%, or closer to either of 0% or 100%, and then use either the Brier or the logarithmic scoring rules. That is, I don't want to minimize an l-norm of the error over [0,1], I want to minimize an l-norm over the region I think the answer is going to be in, and the paper falls short of addressing that. - Opinion: I'm not sure whether their results are going to be useful for things I'm interested in (like human forecasting tournaments, rather than kaggle data analysis competitions). In practice, what I might do if I wanted to incentivize precision is to ask myself if this is a question where the answer is going to be closer to 50%, or closer to either of 0% or 100%, and then use either the Brier or the logarithmic scoring rules. That is, I don't want to minimize an l-norm of the error over [0,1], I want to minimize an l-norm over the region I think the answer is going to be in, and the paper falls short of addressing that.
- [A list of prediction markets](https://docs.google.com/spreadsheets/d/1XB1GHfizNtVYTOAD_uOyBLEyl_EV7hVtDYDXLQwgT7k/edit#gid=0), and their fates, maintained by Jacob Lagerros. Like most startups, most prediction markets fail.
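
As a rough illustration of the two scoring rules discussed in the entries above, here is a minimal Python sketch. The forecasts and outcomes are made-up numbers, the function names are hypothetical, and the product step assumes the events are independent; it is only meant to check numerically that the log of the combined probability equals the sum of the individual logs, and to show a Brier score next to it.

```python
import math

def log_score(realized_probs):
    """Sum of log-probabilities assigned to the outcomes that actually happened."""
    return sum(math.log(p) for p in realized_probs)

def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical data: probability assigned to "yes" on three binary questions,
# and what actually happened (1 = yes, 0 = no).
forecasts = [0.9, 0.7, 0.2]
outcomes = [1, 1, 0]

# Probability the forecaster assigned to the outcome that occurred.
realized = [f if o == 1 else 1 - f for f, o in zip(forecasts, outcomes)]

# Combined probability of the realized outcomes, assuming independence.
joint = math.prod(realized)

# log(p1 x p2 x ... x pn) equals the sum of log(pi).
assert math.isclose(math.log(joint), log_score(realized))

print("log score:  ", log_score(realized))
print("Brier score:", brier_score(forecasts, outcomes))
```

A higher (less negative) log score is better, while a lower Brier score is better; the log rule punishes confident misses much more harshly, since log(p) goes to minus infinity as p approaches 0.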
Note to the future: All links are added automatically to the Internet Archive. In case of link rot, go [here](https://archive.org/).
***
> "I beseech you, in the bowels of Christ, think it possible that you may be mistaken."
> [Oliver Cromwell](https://en.wikipedia.org/wiki/Cromwell%27s_rule)
***