Forecasting Newsletter for May draft

Nuno Sempere 2020-05-23 20:09:01 +02:00
parent dd5559d099
commit 111b1157ad

@ -1,4 +1,4 @@
Whatever happened to forecasting? April 2020 Whatever happened to forecasting? May 2020
============================================ ============================================
A forecasting digest with a focus on experimental forecasting. You can sign up [here](https://mailchi.mp/18fccca46f83/forecastingnewsletter). The newsletter itself is experimental, but there will be at least five more iterations. A forecasting digest with a focus on experimental forecasting. You can sign up [here](https://mailchi.mp/18fccca46f83/forecastingnewsletter). The newsletter itself is experimental, but there will be at least five more iterations.
@ -13,6 +13,7 @@ A forecasting digest with a focus on experimental forecasting. You can sign up [
- Metaculus. - Metaculus.
- Good Judgement Open. - Good Judgement Open.
- In the News. - In the News.
- Grab bag.
- Long Content. - Long Content.
## Prediction Markets & Forecasting platforms. ## Prediction Markets & Forecasting platforms.
@ -39,7 +40,7 @@ Some of the interesting and wrong ones are:
- [Will Michelle Obama run for president in 2020?](https://www.predictit.org/markets/detail/4632/Will-Michelle-Obama-run-for-president-in-2020) - [Will Michelle Obama run for president in 2020?](https://www.predictit.org/markets/detail/4632/Will-Michelle-Obama-run-for-president-in-2020)
- [Will Hillary Clinton run for president in 2020?](https://www.predictit.org/markets/detail/4614/Will-Hillary-Clinton-run-for-president-in-2020) - [Will Hillary Clinton run for president in 2020?](https://www.predictit.org/markets/detail/4614/Will-Hillary-Clinton-run-for-president-in-2020)
Answers are: 80%, 15%, 69%, 79%, 8%, 2%, 7%, 11%. Market odds are: 80%, 15%, 69%, 79%, 8%, 2%, 7%, 11%.
Further, the following two markets are plain inconsistent: Further, the following two markets are plain inconsistent:
- [Will the 2020 Democratic nominee for president be a woman?](https://www.predictit.org/markets/detail/2902/Will-the-2020-Democratic-nominee-for-president-be-a-woman): 11% - [Will the 2020 Democratic nominee for president be a woman?](https://www.predictit.org/markets/detail/2902/Will-the-2020-Democratic-nominee-for-president-be-a-woman): 11%
@ -89,9 +90,11 @@ Good Judgement Inc. also organizes the Good Judgement Open [gjopen.com](https://
- [Before 1 January 2021, will the People's Liberation Army (PLA) and/or Peoples Armed Police (PAP) be mobilized in Hong Kong?](https://www.gjopen.com/questions/1499-before-1-january-2021-will-the-people-s-liberation-army-pla-and-or-people-s-armed-police-pap-be-mobilized-in-hong-kong) - [Before 1 January 2021, will the People's Liberation Army (PLA) and/or Peoples Armed Police (PAP) be mobilized in Hong Kong?](https://www.gjopen.com/questions/1499-before-1-january-2021-will-the-people-s-liberation-army-pla-and-or-people-s-armed-police-pap-be-mobilized-in-hong-kong)
- [Will the winner of the popular vote in the 2020 United States presidential election also win the electoral college?](https://www.gjopen.com/questions/1495-will-the-winner-of-the-popular-vote-in-the-2020-united-states-presidential-election-also-win-the-electoral-college)- This one is interesting, because it has infrequently gone the other way historically, but 2/5 of the last USA elections were split. - [Will the winner of the popular vote in the 2020 United States presidential election also win the electoral college?](https://www.gjopen.com/questions/1495-will-the-winner-of-the-popular-vote-in-the-2020-united-states-presidential-election-also-win-the-electoral-college)- This one is interesting, because it has infrequently gone the other way historically, but 2/5 of the last USA elections were split.
- [Will Benjamin Netanyahu cease to be the prime minister of Israel before 1 January 2021?](https://www.gjopen.com/questions/1498-will-benjamin-netanyahu-cease-to-be-the-prime-minister-of-israel-before-1-january-2021). Just when I thought he was out, he pulls himself back in. - [Will Benjamin Netanyahu cease to be the prime minister of Israel before 1 January 2021?](https://www.gjopen.com/questions/1498-will-benjamin-netanyahu-cease-to-be-the-prime-minister-of-israel-before-1-january-2021). Just when I thought he was out, he pulls himself back in.
- [Before 28 July 2020, will Saudi Arabia announce the cancellation or suspension of the Hajj pilgrimage, scheduled for 28 July 2020 to 2 August 2020?] (https://www.gjopen.com/questions/1621-before-28-july-2020-will-saudi-arabia-announce-the-cancellation-or-suspension-of-the-hajj-pilgrimage-scheduled-for-28-july-2020-to-2-august-2020) - [Before 28 July 2020, will Saudi Arabia announce the cancellation or suspension of the Hajj pilgrimage, scheduled for 28 July 2020 to 2 August 2020?](https://www.gjopen.com/questions/1621-before-28-july-2020-will-saudi-arabia-announce-the-cancellation-or-suspension-of-the-hajj-pilgrimage-scheduled-for-28-july-2020-to-2-august-2020)
- [Will formal negotiations between Russia and the United States on an extension, modification, or replacement for the New START treaty begin before 1 October 2020?](https://www.gjopen.com/questions/1551-will-formal-negotiations-between-russia-and-the-united-states-on-an-extension-modification-or-replacement-for-the-new-start-treaty-begin-before-1-october-2020)s - [Will formal negotiations between Russia and the United States on an extension, modification, or replacement for the New START treaty begin before 1 October 2020?](https://www.gjopen.com/questions/1551-will-formal-negotiations-between-russia-and-the-united-states-on-an-extension-modification-or-replacement-for-the-new-start-treaty-begin-before-1-october-2020)s
Odds: 20%, 75%, 44%, 86%, 19%
On the Good Judgement Inc. side, [here](https://goodjudgment.com/covidrecovery/) is a dashboard presenting forecasts related to covid. The ones I found most worthy are: On the Good Judgement Inc. side, [here](https://goodjudgment.com/covidrecovery/) is a dashboard presenting forecasts related to covid. The ones I found most worthy are:
- [When will the FDA approve a drug or biological product for the treatment of COVID-19?](https://goodjudgment.io/covid-recovery/#1384) - [When will the FDA approve a drug or biological product for the treatment of COVID-19?](https://goodjudgment.io/covid-recovery/#1384)
- [Will the US economy bounce back by Q2 2021?](https://goodjudgment.io/covid-recovery/#1373) - [Will the US economy bounce back by Q2 2021?](https://goodjudgment.io/covid-recovery/#1373)
@ -113,7 +116,7 @@ The Center for Security and Emerging Technology is looking for forecasters to pr
- [BMW Cuts Profit Forecast Again, And Warns About Uncertainty](https://www.forbes.com/sites/neilwinton/2020/05/06/bmw-cuts-profit-forecast-again-and-warns-about-uncertainty/#2ac2be64468c), Forbes reports. - [BMW Cuts Profit Forecast Again, And Warns About Uncertainty](https://www.forbes.com/sites/neilwinton/2020/05/06/bmw-cuts-profit-forecast-again-and-warns-about-uncertainty/#2ac2be64468c), Forbes reports.
- [Central Bankers Adopt Scenario Forecasting for Post-Virus World](https://www.bloomberg.com/news/articles/2020-05-11/central-bankers-adopt-scenario-forecasting-for-post-virus-world). I find it cute that China, seeing as how they're not going to be able to meet their GDP targets, is "considering dropping its traditional numerical GDP target". Otherwise, central banks are coming to terms with the depths of their uncertainty. - [Central Bankers Adopt Scenario Forecasting for Post-Virus World](https://www.bloomberg.com/news/articles/2020-05-11/central-bankers-adopt-scenario-forecasting-for-post-virus-world). I find it cute that China, seeing as how they're not going to be able to meet their GDP targets, is "considering dropping its traditional numerical GDP target". Otherwise, central banks are coming to terms with the depths of their uncertainty.
- [Locust-tracking application for the UN](https://www.research.noaa.gov/article/ArtMID/587/ArticleID/2620/NOAA-teams-with-the-United-Nations-to-create-locust-tracking-application). (and [here](https://www.washingtonpost.com/weather/2020/05/13/east-africa-locust-forecast-tool/) is a take by the Washington Post), using software originally intended to track the movements of air polution. NOAA also sounds like a really cool organization: "NOAA Research enables better forecasts, earlier warnings for natural disasters, and a greater understanding of the Earth. Our role is to provide unbiased science to better manage the environment, nationally, and globally." - [Locust-tracking application for the UN](https://www.research.noaa.gov/article/ArtMID/587/ArticleID/2620/NOAA-teams-with-the-United-Nations-to-create-locust-tracking-application). (and [here](https://www.washingtonpost.com/weather/2020/05/13/east-africa-locust-forecast-tool/) is a take by the Washington Post), using software originally intended to track the movements of air polution. NOAA also sounds like a really cool organization: "NOAA Research enables better forecasts, earlier warnings for natural disasters, and a greater understanding of the Earth. Our role is to provide unbiased science to better manage the environment, nationally, and globally."
- [United Nations: World Economic Situation and Prospects as of mid-2020](https://www.un.org/development/desa/dpad/publication/world-economic-situation-and-prospects-as-of-mid-2020/). A recent report is out, which predicts a 3.2% contraction of the global economy. Between 34 and 160 million people are expected to fall below the extreme poverty line this year. - [United Nations: World Economic Situation and Prospects as of mid-2020](https://www.un.org/development/desa/dpad/publication/world-economic-situation-and-prospects-as-of-mid-2020/). A recent report is out, which predicts a 3.2% contraction of the global economy. Between 34 and 160 million people are expected to fall below the extreme poverty line this year.
- [Kelsey Piper of Vox disses on the IHME model](https://www.vox.com/future-perfect/2020/5/2/21241261/coronavirus-modeling-us-deaths-ihme-pandemic). "Some of the factors that make the IHME model unreliable at predicting the virus may have gotten people to pay attention to it;"or "Other researchers found the true deaths were outside of the 95 percent confidence interval given by the model 70 percent of the time." - [Kelsey Piper of Vox disses on the IHME model](https://www.vox.com/future-perfect/2020/5/2/21241261/coronavirus-modeling-us-deaths-ihme-pandemic). "Some of the factors that make the IHME model unreliable at predicting the virus may have gotten people to pay attention to it;"or "Other researchers found the true deaths were outside of the 95 percent confidence interval given by the model 70 percent of the time."
- [Fox News](https://www.fox10phoenix.com/news/cdc-says-all-models-forecast-increase-in-covid-19-deaths-in-coming-weeks-exceeding-100k-by-june-1) and [Business Insider](https://www.businessinsider.com/cdc-forecasts-100000-coronavirus-deaths-by-june-1-2020-5?r=KINDLYSTOPTRACKINGUS) report over the CDC forecasting 100k deaths by June the 1st, differently. - [Fox News](https://www.fox10phoenix.com/news/cdc-says-all-models-forecast-increase-in-covid-19-deaths-in-coming-weeks-exceeding-100k-by-june-1) and [Business Insider](https://www.businessinsider.com/cdc-forecasts-100000-coronavirus-deaths-by-june-1-2020-5?r=KINDLYSTOPTRACKINGUS) report over the CDC forecasting 100k deaths by June the 1st, differently.
- Yahoo has automated finance forecast reporting. It took me a while (three months) to notice that the low quality finance articles that were popping up in my google alerts were machine generated. See [Synovus Financial Corp. Earnings Missed Analyst Estimates: Here's What Analysts Are Forecasting Now](https://finance.yahoo.com/news/synovus-financial-corp-earnings-missed-152645825.html), [Wienerberger AG Earnings Missed Analyst Estimates: Here's What Analysts Are Forecasting Now](https://finance.yahoo.com/news/wienerberger-ag-earnings-missed-analyst-070545629.html), [Park Lawn Corporation Earnings Missed Analyst Estimates: Here's What Analysts Are Forecasting Now](https://news.yahoo.com/park-lawn-corporation-earnings-missed-120314826.html); they have a similar structure, paragraph per paragraph, and seem to have been generated from a template which changes a little bit depending on the data (they seem to have different templates for very positive, positive, neutral and negative change). To be clear, I could program something like this given a good finance api and a spare week/month, and in fact did so a couple of years ago for an automatic poetry generator. *But I didn't notice because I wasn't paying attention*. - Yahoo has automated finance forecast reporting. It took me a while (three months) to notice that the low quality finance articles that were popping up in my google alerts were machine generated. See [Synovus Financial Corp. Earnings Missed Analyst Estimates: Here's What Analysts Are Forecasting Now](https://finance.yahoo.com/news/synovus-financial-corp-earnings-missed-152645825.html), [Wienerberger AG Earnings Missed Analyst Estimates: Here's What Analysts Are Forecasting Now](https://finance.yahoo.com/news/wienerberger-ag-earnings-missed-analyst-070545629.html), [Park Lawn Corporation Earnings Missed Analyst Estimates: Here's What Analysts Are Forecasting Now](https://news.yahoo.com/park-lawn-corporation-earnings-missed-120314826.html); they have a similar structure, paragraph per paragraph, and seem to have been generated from a template which changes a little bit depending on the data (they seem to have different templates for very positive, positive, neutral and negative change). To be clear, I could program something like this given a good finance api and a spare week/month, and in fact did so a couple of years ago for an automatic poetry generator. *But I didn't notice because I wasn't paying attention*.
@ -129,19 +132,19 @@ The Center for Security and Emerging Technology is looking for forecasters to pr
## Grab bag ## Grab bag
- [SlateStarCodex](https://slatestarcodex.com/2020/04/29/predictions-for-2020/) brings us a hundred more predictions for 2020. Some analysis by Zvi Mowshowitz [here](https://www.lesswrong.com/posts/gSdZjyFSky3d34ySh/slatestarcodex-2020-predictions-buy-sell-hold) and by [Bucky](https://www.lesswrong.com/posts/orSNNCm77LiSEBovx/2020-predictions). - [SlateStarCodex](https://slatestarcodex.com/2020/04/29/predictions-for-2020/) brings us a hundred more predictions for 2020. Some analysis by Zvi Mowshowitz [here](https://www.lesswrong.com/posts/gSdZjyFSky3d34ySh/slatestarcodex-2020-predictions-buy-sell-hold) and by [Bucky](https://www.lesswrong.com/posts/orSNNCm77LiSEBovx/2020-predictions).
- [FLI Podcast: On Superforecasting with Robert de Neufville](https://futureoflife.org/2020/04/30/on-superforecasting-with-robert-de-neufville/). Leaning towards introductory, broad and superficial; I would have liked to see a more intense drilling on some of the points. It still gives pointers to interesting stuff, though, chiefly [The NonProphets Podcast](https://nonprophetspod.wordpress.com/), which looks like it has some more in-depth stuff. Some quotes: - [FLI Podcast: On Superforecasting with Robert de Neufville](https://futureoflife.org/2020/04/30/on-superforecasting-with-robert-de-neufville/). Leaning towards introductory, broad and superficial; I would have liked to see a more intense drilling on some of the points. It still gives pointers to interesting stuff, though, chiefly [The NonProphets Podcast](https://nonprophetspod.wordpress.com/), which looks like it has some more in-depth stuff. Some quotes:
> So its not clear to me that our forecasts are necessarily affecting policy. Although its the kind of thing that gets written up in the news and who knows how much that affects peoples opinions, or they talk about it at Davos and maybe those people go back and they change what theyre doing. > So its not clear to me that our forecasts are necessarily affecting policy. Although its the kind of thing that gets written up in the news and who knows how much that affects peoples opinions, or they talk about it at Davos and maybe those people go back and they change what theyre doing.
> I wish it were used better. If I were the advisor to a president, I would say you should create a predictive intelligence unit using superforecasters. Maybe give them access to some classified information, but even using open source information, have them predict probabilities of certain kinds of things and then develop a system for using that in your decision making. But I think were a fair ways away from that. I dont know any interest in that in the current administration. > I wish it were used better. If I were the advisor to a president, I would say you should create a predictive intelligence unit using superforecasters. Maybe give them access to some classified information, but even using open source information, have them predict probabilities of certain kinds of things and then develop a system for using that in your decision making. But I think were a fair ways away from that. I dont know any interest in that in the current administration.
> Now one thing I think is interesting is that often people, theyre not interested in my saying, “Theres a 78% chance of something happening.” What they want to know is, how did I get there? What is my arguments? Thats not unreasonable. I really like thinking in terms of probabilities, but I think it often helps people understand what the mechanism is because it tells them something about the world that might help them make a decision. So I think one thing that maybe can be done is not to treat it as a black box probability, but to have some kind of algorithmic transparency about our thinking because that actually helps people, might be more useful in terms of making decisions than just a number. > Now one thing I think is interesting is that often people, theyre not interested in my saying, “Theres a 78% chance of something happening.” What they want to know is, how did I get there? What is my arguments? Thats not unreasonable. I really like thinking in terms of probabilities, but I think it often helps people understand what the mechanism is because it tells them something about the world that might help them make a decision. So I think one thing that maybe can be done is not to treat it as a black box probability, but to have some kind of algorithmic transparency about our thinking because that actually helps people, might be more useful in terms of making decisions than just a number.
- [Forecasting s-curves is hard](https://constancecrozier.com/2020/04/16/forecasting-s-curves-is-hard/): Some sweet visualizations of what it says on the title. - [Forecasting s-curves is hard](https://constancecrozier.com/2020/04/16/forecasting-s-curves-is-hard/): Some sweet visualizations of what it says on the title.
- [Fashion Trend Forecasting](https://arxiv.org/pdf/2005.03297.pdf) using Instagram and baking preexisting knowledge into NNs. - [Fashion Trend Forecasting](https://arxiv.org/pdf/2005.03297.pdf) using Instagram and baking preexisting knowledge into NNs.
- [Space Weather Challenge and Forecasting Implications of Rossby Waves](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2018SW002109). Recent advances may help predict solar flares better. I don't know how bad the worst solar flare could be, and how much a two year warning could buy us, but I view developments like this very positively. - [Space Weather Challenge and Forecasting Implications of Rossby Waves](https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2018SW002109). Recent advances may help predict solar flares better. I don't know how bad the worst solar flare could be, and how much a two year warning could buy us, but I tend to view developments like this very positively.
- [The advantages and limitations of forecasting](https://rwer.wordpress.com/2020/05/12/the-advantages-and-limitations-of-forecasting/). A short and sweet blog post, with a couple of forecasting anecdotes and zingers. - [The advantages and limitations of forecasting](https://rwer.wordpress.com/2020/05/12/the-advantages-and-limitations-of-forecasting/). A short and sweet blog post, with a couple of forecasting anecdotes and zingers.
- The [University of Washington Medicine](https://patch.com/washington/seattle/uw-medicine-forecasting-losses-500-million-summers-end) might be pretending they need more money to try to bait donors. Of course, America being America, they might actually not have enough money. During a pandemic. "UW Medicine has been at the forefront of the national response to COVID-19 in treating critically ill patients". - The [University of Washington Medicine](https://patch.com/washington/seattle/uw-medicine-forecasting-losses-500-million-summers-end) might be pretending they need more money to try to bait donors. Of course, America being America, they might actually not have enough money. During a pandemic. "UW Medicine has been at the forefront of the national response to COVID-19 in treating critically ill patients".
- [Forecasting drug utilization and expenditure: ten years of experience in Stockholm](https://bmchealthservres.biomedcentral.com/articles/10.1186/s12913-020-05170-0). A normally pretty good forecasting model had the bad luck of not foreseeing a Black Swan, and sending a study to a journal just before a pandemic, so that it's being published now. They write: "According to the forecasts, the total pharmaceutical expenditure was estimated to increase between 2 and 8% annually. Our analyses showed that the accuracy of these forecasts varied over the years with a mean absolute error of 1.9 percentage points." They further conclude: "Based on the analyses of all forecasting reports produced since the model was established in Stockholm in the late 2000s, we demonstrated that it is feasible to forecast pharmaceutical expenditure with a reasonable accuracy." Presumably, this has increased further because of covid, sending the mean absolute error through the roof. - [Forecasting drug utilization and expenditure: ten years of experience in Stockholm](https://bmchealthservres.biomedcentral.com/articles/10.1186/s12913-020-05170-0). A normally pretty good forecasting model had the bad luck of not foreseeing a Black Swan, and sending a study to a journal just before a pandemic, so that it's being published now. They write: "According to the forecasts, the total pharmaceutical expenditure was estimated to increase between 2 and 8% annually. Our analyses showed that the accuracy of these forecasts varied over the years with a mean absolute error of 1.9 percentage points." They further conclude: "Based on the analyses of all forecasting reports produced since the model was established in Stockholm in the late 2000s, we demonstrated that it is feasible to forecast pharmaceutical expenditure with a reasonable accuracy." Presumably, this has increased further because of covid, sending the mean absolute error through the roof.
- In this time of need, where global cooperation might prove to be immensely valuable, Italy has lessons to share about how to forecast the coronavirus. The article [Forecasting in the Time Of The Coronavirus](https://www.bancaditalia.it/media/notizie/2020/en_Previsioni_al_tempo_del_coronavirus_Locarno_Zizza.pdf), by the Central Bank of Italy, is only available in Italian. Mysteriously, the press release, however, is in [English](https://www.bancaditalia.it/media/notizia/forecasting-in-the-time-of-coronavirus/). - In this time of need, where global cooperation might prove to be immensely valuable, Italy has lessons to share about how to forecast the coronavirus. The article [Forecasting in the Time Of The Coronavirus](https://www.bancaditalia.it/media/notizie/2020/en_Previsioni_al_tempo_del_coronavirus_Locarno_Zizza.pdf), by the Central Bank of Italy, is only available in Italian. Mysteriously, the press release, however, is in [English](https://www.bancaditalia.it/media/notizia/forecasting-in-the-time-of-coronavirus/).
- [An analogy-based method for strong convection forecasts in China using GFS forecast data](https://www.tandfonline.com/doi/full/10.1080/16742834.2020.1717329). "Times in the past when the forecast parameters are most similar to those forecast at the current time are identified by searching a large historical numerical dataset", and this is used to better predict one particular class of meteorological phenomena. See [here](https://www.eurekalert.org/pub_releases/2020-05/ioap-ata051520.php) for a press release. - [An analogy-based method for strong convection forecasts in China using GFS forecast data](https://www.tandfonline.com/doi/full/10.1080/16742834.2020.1717329). "Times in the past when the forecast parameters are most similar to those forecast at the current time are identified by searching a large historical numerical dataset", and this is used to better predict one particular class of meteorological phenomena. See [here](https://www.eurekalert.org/pub_releases/2020-05/ioap-ata051520.php) for a press release.
- Some interesting discussion about forecasting over at Twitter, in [David Manheim](https://twitter.com/davidmanheim)'s, [Philip Tetlock](https://twitter.com/PTetlock)'s accounts, some of which have been incorporated into this newsletter. [This twitter thread](https://twitter.com/lukeprog/status/1262492767869009920) contains some discussion about how Good Judgement Open, Metaculus and expert forecasters fare against each other. In particular, note the caveats by @LinchZhang: "For Survey 10, Metaculus said that question resolution was on 4pm ET Sunday, a lot of predictors (correctly) gauged that the data update on Sunday will be delayed and answered the letter rather than the spirit of the question (Metaculus ended up resolving it ambiguous). [This thread](https://twitter.com/mlipsitch/status/1257857079756365824) by Marc Lipsitch has become popular, and I personally also enjoyed [these](https://twitter.com/LinchZhang/status/1262127601176334336) [two](https://twitter.com/LinchZhang/status/1261427045977874432) twitter threads by Linchuan Zhang, on forecasting mistakes. - Some interesting discussion about forecasting over at Twitter, in [David Manheim](https://twitter.com/davidmanheim)'s, [Philip Tetlock](https://twitter.com/PTetlock)'s accounts, some of which have been incorporated into this newsletter. [This twitter thread](https://twitter.com/lukeprog/status/1262492767869009920) contains some discussion about how Good Judgement Open, Metaculus and expert forecasters fare against each other. In particular, note the caveats by @LinchZhang: "For Survey 10, Metaculus said that question resolution was on 4pm ET Sunday, a lot of predictors (correctly) gauged that the data update on Sunday will be delayed and answered the letter rather than the spirit of the question (Metaculus ended up resolving it ambiguous). [This thread](https://twitter.com/mlipsitch/status/1257857079756365824) by Marc Lipsitch has become popular, and I personally also enjoyed [these](https://twitter.com/LinchZhang/status/1262127601176334336) [two](https://twitter.com/LinchZhang/status/1261427045977874432) twitter threads by Linchuan Zhang, on forecasting mistakes.
- The Cato Institute releases [12 New Immigration Ideas for the 21st Century](https://www.cato.org/publications/white-paper/12-new-immigration-ideas-21st-century), including two from Robin Hanson: Choosing Immigrants through Prediction Markets & Transferable Citizenship. - The Cato Institute releases [12 New Immigration Ideas for the 21st Century](https://www.cato.org/publications/white-paper/12-new-immigration-ideas-21st-century), including two from Robin Hanson: Choosing Immigrants through Prediction Markets & Transferable Citizenship.
- [Forecasting the Weather in 1946](https://www.smh.com.au/environment/weather/from-the-archives-1946-forecasting-the-world-s-weather-20200515-p54tfd.html) - [Forecasting the Weather in 1946](https://www.smh.com.au/environment/weather/from-the-archives-1946-forecasting-the-world-s-weather-20200515-p54tfd.html)
- Some films are so bad it's funny. [This article fills the same niche](https://www.moneyweb.co.za/investing/yes-it-is-possible-to-predict-the-market/) for forecasting. It has it all: Pythagorean laws of vibration, epicycles, an old and legendary master with mystical abilities, 90 year predictions which come true. Further, from the [Wikipedia entry](https://en.wikipedia.org/wiki/William_Delbert_Gann#Controversy): "He told me that his famous father could not support his family by trading but earned his living by writing and selling instructional courses." - Some films are so bad it's funny. [This article fills the same niche](https://www.moneyweb.co.za/investing/yes-it-is-possible-to-predict-the-market/) for forecasting. It has it all: Pythagorean laws of vibration, epicycles, an old and legendary master with mystical abilities, 90 year predictions which come true. Further, from the [Wikipedia entry](https://en.wikipedia.org/wiki/William_Delbert_Gann#Controversy): "He told me that his famous father could not support his family by trading but earned his living by writing and selling instructional courses."
- [A General Approach for Predicting the Behavior of the Supreme Court of the United States](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2463244). What seems to be a pretty simple algorithm (a random forest!) seems to do pretty well (70% accuracy). Their feature set is rich doesn't seem to include ideology. It was written in 2017; today, I'd expect that a random bright highschooler could do much beter. - [A General Approach for Predicting the Behavior of the Supreme Court of the United States](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2463244). What seems to be a pretty simple algorithm (a random forest!) seems to do pretty well (70% accuracy). Their feature set is rich doesn't seem to include ideology. It was written in 2017; today, I'd expect that a random bright highschooler could do much beter.
@ -156,36 +159,34 @@ The Center for Security and Emerging Technology is looking for forecasters to pr
- [Pan-African Heatwave Health Hazard Forecasting](http://www.walker.ac.uk/research/projects/pan-african-heatwave-health-hazard-forecasting/). "The main aim, is to raise the profile of heatwaves as a hazard on a global scale. Hopefully, the project will add evidence to this sparse research area. It could also provide the basis for a heat early warning system." The project looks to be in its early stages, yet nonetheless interesting. - [Pan-African Heatwave Health Hazard Forecasting](http://www.walker.ac.uk/research/projects/pan-african-heatwave-health-hazard-forecasting/). "The main aim, is to raise the profile of heatwaves as a hazard on a global scale. Hopefully, the project will add evidence to this sparse research area. It could also provide the basis for a heat early warning system." The project looks to be in its early stages, yet nonetheless interesting.
- [How to evaluate 50% predictions](https://www.lesswrong.com/posts/DAc4iuy4D3EiNBt9B/how-to-evaluate-50-predictions). "I commonly hear (sometimes from very smart people) that 50% predictions are meaningless. I think that this is wrong." - [How to evaluate 50% predictions](https://www.lesswrong.com/posts/DAc4iuy4D3EiNBt9B/how-to-evaluate-50-predictions). "I commonly hear (sometimes from very smart people) that 50% predictions are meaningless. I think that this is wrong."
- [Named Distributions as Artifacts](https://blog.cerebralab.com/Named%20Distributions%20as%20Artifacts). On how the named distributions we use (the normal distribution, etc.), were selected for being easy to use in pre-computer eras, rather than on being a good ur-prior on distributions for phenomena in this universe. - [Named Distributions as Artifacts](https://blog.cerebralab.com/Named%20Distributions%20as%20Artifacts). On how the named distributions we use (the normal distribution, etc.), were selected for being easy to use in pre-computer eras, rather than on being a good ur-prior on distributions for phenomena in this universe.
- [The fallacy of placing confidence in confidence intervals](https://link.springer.com/article/10.3758/s13423-015-0947-8). On how the folk interpretation of confidence intervals can be misguided, as it conflates: a. the long-run probability, before seeing some data, that a procedure will produce an interval which contains the true value, and b. the probability that a particular interval contains the true value, after seeing the data. This is in contrast to Bayesian theory, which can use the information in the data to determine what is reasonable to believe, in light of the model assumptions and prior information. I found their example where different confidence procedures produce 50% confidence intervals which are nested inside each other particularly funny. Some quotes: - [The fallacy of placing confidence in confidence intervals](https://link.springer.com/article/10.3758/s13423-015-0947-8). On how the folk interpretation of confidence intervals can be misguided, as it conflates: a. the long-run probability, before seeing some data, that a procedure will produce an interval which contains the true value, and b. the probability that a particular interval contains the true value, after seeing the data. This is in contrast to Bayesian theory, which can use the information in the data to determine what is reasonable to believe, in light of the model assumptions and prior information. I found their example where different confidence procedures produce 50% confidence intervals which are nested inside each other particularly funny. Some quotes:
> Using the theory of confidence intervals and the support of two examples, we have shown that CIs do not have the properties that are often claimed on their behalf. Confidence interval theory was developed to solve a very constrained problem: how can one construct a procedure that produces intervals containing the true parameter a fixed proportion of the time? Claims that confidence intervals yield an index of precision, that the values within them are plausible, and that the confidence coefficient can be read as a measure of certainty that the interval contains the true value, are all fallacies and unjustified by confidence interval theory. > Using the theory of confidence intervals and the support of two examples, we have shown that CIs do not have the properties that are often claimed on their behalf. Confidence interval theory was developed to solve a very constrained problem: how can one construct a procedure that produces intervals containing the true parameter a fixed proportion of the time? Claims that confidence intervals yield an index of precision, that the values within them are plausible, and that the confidence coefficient can be read as a measure of certainty that the interval contains the true value, are all fallacies and unjustified by confidence interval theory.
> “I am not at all sure that the confidence is not a confidence trick. Does it really lead us towards what we need, the chance that in the universe which we are sampling the parameter is within these certain limits? I think it does not. I think we are in the position of knowing that either an improbable event has occurred or the parameter in the population is within the limits. To balance these things we must make an estimate and form a judgment as to the likelihood of the parameter in the universe, that is, a prior probability, the very thing that is supposed to be eliminated.” > “I am not at all sure that the confidence is not a confidence trick. Does it really lead us towards what we need, the chance that in the universe which we are sampling the parameter is within these certain limits? I think it does not. I think we are in the position of knowing that either an improbable event has occurred or the parameter in the population is within the limits. To balance these things we must make an estimate and form a judgment as to the likelihood of the parameter in the universe, that is, a prior probability, the very thing that is supposed to be eliminated.”
> The existence of multiple, contradictory long-run probabilities brings back into focus the confusion between what we know before the experiment with what we know after the experiment. For any of these confidence procedures, we know before the experiment that 50 % of future CIs will contain the true value. After observing the results, conditioning on a known property of the data — such as, in this case, the variance of the bubbles — can radically alter our assessment of the probability. > The existence of multiple, contradictory long-run probabilities brings back into focus the confusion between what we know before the experiment with what we know after the experiment. For any of these confidence procedures, we know before the experiment that 50 % of future CIs will contain the true value. After observing the results, conditioning on a known property of the data — such as, in this case, the variance of the bubbles — can radically alter our assessment of the probability.
> “You keep using that word. I do not think it means what you think it means.” Íñigo Montoya, The Princess Bride (1987) > “You keep using that word. I do not think it means what you think it means.” Íñigo Montoya, The Princess Bride (1987)
- [Psychology of Intelligence Analysis](https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/psychology-of-intelligence-analysis/), courtesy of the American Central Intelligence Agency, seemed interesting, and I read chapters 4, 5 and 14. Sometimes forecasting looks like reinventing intelligence analysis; from that perspective, I've found this reference work useful. Thanks to EA Discord user @Willow for bringing this work to my attention. - [Psychology of Intelligence Analysis](https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/books-and-monographs/psychology-of-intelligence-analysis/), courtesy of the American Central Intelligence Agency, seemed interesting, and I read chapters 4, 5 and 14. Sometimes forecasting looks like reinventing intelligence analysis; from that perspective, I've found this reference work useful. Thanks to EA Discord user @Willow for bringing this work to my attention.
- Chapter 4: Strategies for Analytical Judgement. Discusses and compares the strengths and weaknesses of four tactics: situational analysis (inside view), applying theory, comparison with historical situations, and immersing oneself in the data. It then brings up several suboptimal tactics for choosing among hypotheses. - Chapter 4: Strategies for Analytical Judgement. Discusses and compares the strengths and weaknesses of four tactics: situational analysis (inside view), applying theory, comparison with historical situations, and immersing oneself in the data. It then brings up several suboptimal tactics for choosing among hypotheses.
- Chapter 5: When does one need more information, and in what shapes does new information come? - Chapter 5: When does one need more information, and in what shapes does new information come?
> Once an experienced analyst has the minimum information necessary to make an informed judgment, obtaining additional information generally does not improve the accuracy of his or her estimates. Additional information does, however, lead the analyst to become more confident in the judgment, to the point of overconfidence. > Once an experienced analyst has the minimum information necessary to make an informed judgment, obtaining additional information generally does not improve the accuracy of his or her estimates. Additional information does, however, lead the analyst to become more confident in the judgment, to the point of overconfidence.
> Experienced analysts have an imperfect understanding of what information they actually use in making judgments. They are unaware of the extent to which their judgments are determined by a few dominant factors, rather than by the systematic integration of all available information. Analysts actually use much less of the available information than they think they do. > Experienced analysts have an imperfect understanding of what information they actually use in making judgments. They are unaware of the extent to which their judgments are determined by a few dominant factors, rather than by the systematic integration of all available information. Analysts actually use much less of the available information than they think they do.
> There is strong experimental evidence, however, that such self-insight is usually faulty. The expert perceives his or her own judgmental process, including the number of different kinds of information taken into account, as being considerably more complex than is in fact the case. Experts overestimate the importance of factors that have only a minor impact on their judgment and underestimate the extent to which their decisions are based on a few major variables. In short, people's mental models are simpler than they think, and the analyst is typically unaware not only of which variables should have the greatest influence, but also which variables actually are having the greatest influence. > There is strong experimental evidence, however, that such self-insight is usually faulty. The expert perceives his or her own judgmental process, including the number of different kinds of information taken into account, as being considerably more complex than is in fact the case. Experts overestimate the importance of factors that have only a minor impact on their judgment and underestimate the extent to which their decisions are based on a few major variables. In short, people's mental models are simpler than they think, and the analyst is typically unaware not only of which variables should have the greatest influence, but also which variables actually are having the greatest influence.
- Chapter 14: A Checklist for Analysts. "Traditionally, analysts at all levels devote little attention to improving how they think. To penetrate the heart and soul of the problem of improving analysis, it is necessary to better understand, influence, and guide the mental processes of analysts themselves." The Chapter also contains an Intelligence Analysis reading list. - Chapter 14: A Checklist for Analysts. "Traditionally, analysts at all levels devote little attention to improving how they think. To penetrate the heart and soul of the problem of improving analysis, it is necessary to better understand, influence, and guide the mental processes of analysts themselves." The Chapter also contains an Intelligence Analysis reading list.
- [The Limits of Prediction: An Analyst's Reflections on Forecasting](https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/csi-studies/studies/vol-63-no-4/Limits-of-Prediction.html), also courtesy of the American Central Intelligence Agency. On how intelligence analysts should inform their users of what they are and aren't capable of. It has some interesting tidbits and references on predicting discontinuities. It also suggests some guiding questions that the analyst may try to answer for the policymaker. - [The Limits of Prediction: An Analyst's Reflections on Forecasting](https://www.cia.gov/library/center-for-the-study-of-intelligence/csi-publications/csi-studies/studies/vol-63-no-4/Limits-of-Prediction.html), also courtesy of the American Central Intelligence Agency. On how intelligence analysts should inform their users of what they are and aren't capable of. It has some interesting tidbits and references on predicting discontinuities. It also suggests some guiding questions that the analyst may try to answer for the policymaker.
- What is the context and reality of the problem I am facing? - What is the context and reality of the problem I am facing?
- How does including information on new developments affect my problem/issue? - How does including information on new developments affect my problem/issue?
- What are the ways this situation could play out? - What are the ways this situation could play out?
- How do we get from here to there? and/or What should I be looking out for? - How do we get from here to there? and/or What should I be looking out for?
> "We do not claim our assessments are infallible. Instead, we assert that we offer our most deeply and objectively based and carefully considered estimates." > "We do not claim our assessments are infallible. Instead, we assert that we offer our most deeply and objectively based and carefully considered estimates."
- [How to Measure Anything](https://www.lesswrong.com/posts/ybYBCK9D7MZCcdArB/how-to-measure-anything), a review. - [How to Measure Anything](https://www.lesswrong.com/posts/ybYBCK9D7MZCcdArB/how-to-measure-anything), a review.
- The World Meteorological Organization, on their mandate to guarantee that [no one is surprised by a flood](https://public.wmo.int/en/our-mandate/water/no-one-is-surprised-by-a-flood). Browsing the webpage, it seems that the organization is either a Key Organization Safeguarding the Vital Interests of the World or Just Another of the Many Bureaucracies Already in Existence, but it's unclear how to differentiate between the two. - The World Meteorological Organization, on their mandate to guarantee that [no one is surprised by a flood](https://public.wmo.int/en/our-mandate/water/no-one-is-surprised-by-a-flood). Browsing the webpage, it seems that the organization is either a Key Organization Safeguarding the Vital Interests of the World or Just Another of the Many Bureaucracies Already in Existence, but it's unclear how to differentiate between the two.
- [95%-ile isn't that good](https://danluu.com/p95-skill/): "Reaching 95%-ile isn't very impressive because it's not that hard to do." - [95%-ile isn't that good](https://danluu.com/p95-skill/): "Reaching 95%-ile isn't very impressive because it's not that hard to do."
- [The Backwards Arrow of Time of the Coherently Bayesian Statistical Mechanic](https://arxiv.org/abs/cond-mat/0410063): - [The Backwards Arrow of Time of the Coherently Bayesian Statistical Mechanic](https://arxiv.org/abs/cond-mat/0410063): Identifying thermodynamic entropy with the Bayesian uncertainty of an ideal observer leads to a contradiction, because as the observer observes more about the system, they update on this information, which reduces uncertainty, and thus entropy.
> "Many physicists think that the maximum entropy formalism is a straightforward application of Bayesian statistical ideas to statistical mechanics. Some even say that statistical mechanics is just the general Bayesian logic of inductive inference applied to large mechanical systems. This approach identifies thermodynamic entropy with the information-theoretic uncertainty of an (ideal) observer's subjective distribution over a system's microstates. In this brief note, I show that this postulate, plus the standard Bayesian procedure for updating probabilities, implies that the entropy of a classical system is monotonically non-increasing on the average -- the Bayesian statistical mechanic's arrow of time points backwards. Avoiding this unphysical conclusion requires rejecting the ordinary equations of motion, or practicing an incoherent form of statistical inference, or rejecting the identification of uncertainty and thermodynamic entropy." + This might be interesting to students in the tradition of E.T. Jaynes: for example, the paper directly conflicts with this LessWrong post: [The Second Law of Thermodynamics, and Engines of Cognition](https://www.lesswrong.com/posts/QkX2bAkwG2EpGvNug/the-second-law-of-thermodynamics-and-engines-of-cognition), part of *Rationality, From AI to Zombies*. The way out might be to postulate that actually, the Bayesian updating process itself would increase entropy, in the form of e.g., the work needed to update bits on a computer. Any applications to Christian lore are left as an exercise for the reader. Otherwise, seeing two bright people being cogently convinced of different perspectives does something funny to my probabilities: it pushes them towards 50%, but also increases the expected time I'd have to spend on the topic to move them away from 50%.
This might be interesting to students in the tradition of E.T. Jaynes: for example, the paper directly conflicts with this LessWrong post: [The Second Law of Thermodynamics, and Engines of Cognition](https://www.lesswrong.com/posts/QkX2bAkwG2EpGvNug/the-second-law-of-thermodynamics-and-engines-of-cognition), part of *Rationality, From AI to Zombies*. The way out might be to postulate that actually, the Bayesian updating process itself would increase entropy, in the form of e.g., the work needed to update bits on a computer. Any applications to Christian lore are left as an exercise for the reader. Otherwise, seeing two bright people being cogently convinced of different perspectives does something funny to my probabilities: it pushes them towards 50%, but also increases the expected time I'd have to spend on the topic to move them away from 50%.
- [Behavioral Problems of Adhering to a Decision Policy](https://pdfs.semanticscholar.org/7a79/28d5f133e4a274dcaec4d0a207daecde8068.pdf) - [Behavioral Problems of Adhering to a Decision Policy](https://pdfs.semanticscholar.org/7a79/28d5f133e4a274dcaec4d0a207daecde8068.pdf)
> Our judges in this study were eight individuals, carefully selected for their expertise as > Our judges in this study were eight individuals, carefully selected for their expertise as
handicappers. Each judge was presented with a list of 88 variables culled from the past performance charts. He was asked to indicate which five variables out of the 88 he would wish to use when handicapping a race, if all he could have was five variables. He was then asked to indicate which 10, which 20, and which 40 he would use if 10, 20, or 40 were available to him. handicappers. Each judge was presented with a list of 88 variables culled from the past performance charts. He was asked to indicate which five variables out of the 88 he would wish to use when handicapping a race, if all he could have was five variables. He was then asked to indicate which 10, which 20, and which 40 he would use if 10, 20, or 40 were available to him.
> We see that accuracy was as good with five variables as it was with 10, 20, or 40. The flat curve is an average over eight subjects and is somewhat misleading. Three of the eight actually showed a decrease in accuracy with more information, two improved, and three stayed about the same. All of the handicappers became more confident in their judgments as information increased. > We see that accuracy was as good with five variables as it was with 10, 20, or 40. The flat curve is an average over eight subjects and is somewhat misleading. Three of the eight actually showed a decrease in accuracy with more information, two improved, and three stayed about the same. All of the handicappers became more confident in their judgments as information increased.
The study contains other nuggets, such as: + The study contains other nuggets, such as:
- An experiment on trying to predict the outcome of a given equation. When the feedback has a margin of error, this confuses respondents. - An experiment on trying to predict the outcome of a given equation. When the feedback has a margin of error, this confuses respondents.
- "However, the results indicated that subjects often chose one gamble, yet stated a higher selling price for the other gamble" - "However, the results indicated that subjects often chose one gamble, yet stated a higher selling price for the other gamble"
- "We figured that a comparison between two students along the same dimension should be easier, cognitively, than a comparison between different dimensions, and this ease of use should lead to greater reliance on the common dimension. The data strongly confirmed this hypothesis. Dimensions were weighted more heavily when common than when they were unique attributes. Interrogation of the subjects after the experiment indicated that most did not wish to change their policies by giving more weight to common dimensions and they were unaware that they had done so." - "We figured that a comparison between two students along the same dimension should be easier, cognitively, than a comparison between different dimensions, and this ease of use should lead to greater reliance on the common dimension. The data strongly confirmed this hypothesis. Dimensions were weighted more heavily when common than when they were unique attributes. Interrogation of the subjects after the experiment indicated that most did not wish to change their policies by giving more weight to common dimensions and they were unaware that they had done so."
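
Going back to the confidence intervals item above: the distinction between the long-run coverage of a procedure and the probability that one particular interval contains the true value can be made concrete with a small simulation. What follows is only a sketch under my own assumptions, loosely modeled on the bubbles example the quote alludes to: two observations are drawn uniformly within 5 units of a true value `theta`, and the "confidence procedure" is simply the interval between the two observations. The function name, the parameter values and the thresholds for "wide" and "narrow" are all made up for illustration and are not taken from the paper.

```python
import random


def simulate(n_trials=100_000, theta=0.0, half_width=5.0, seed=0):
    """Estimate how often the interval [min(y1, y2), max(y1, y2)] contains theta.

    Assumption (illustrative): y1 and y2 are independent draws from
    Uniform(theta - half_width, theta + half_width).
    """
    rng = random.Random(seed)
    counts = {"all": [0, 0], "wide": [0, 0], "narrow": [0, 0]}  # [hits, total]

    for _ in range(n_trials):
        y1 = rng.uniform(theta - half_width, theta + half_width)
        y2 = rng.uniform(theta - half_width, theta + half_width)
        lo, hi = min(y1, y2), max(y1, y2)
        hit = lo <= theta <= hi

        counts["all"][0] += hit
        counts["all"][1] += 1

        spread = hi - lo
        if spread > half_width:
            # The two points are too far apart to lie on the same side of theta.
            counts["wide"][0] += hit
            counts["wide"][1] += 1
        elif spread < 1.0:
            # The two points are close together, so they say little about theta.
            counts["narrow"][0] += hit
            counts["narrow"][1] += 1

    # Convert [hits, total] into coverage frequencies.
    return {name: hits / total for name, (hits, total) in counts.items()}


if __name__ == "__main__":
    for name, coverage in simulate().items():
        print(f"{name:>6}: {coverage:.3f}")
```

Before seeing any data, the procedure covers `theta` about half of the time, so it is a legitimate 50% confidence procedure. After seeing the data, intervals wider than 5 units always contain `theta`, while very narrow ones rarely do, which is the sense in which conditioning on a known property of the data can radically alter the assessment.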