diff --git a/blog/2022/09/12/an-experiment-eliciting-relative-estimates-for-open/index.md b/blog/2022/09/12/an-experiment-eliciting-relative-estimates-for-open/index.md index 6bce0fd..e65630b 100644 --- a/blog/2022/09/12/an-experiment-eliciting-relative-estimates-for-open/index.md +++ b/blog/2022/09/12/an-experiment-eliciting-relative-estimates-for-open/index.md @@ -121,7 +121,7 @@ So for example, researcher #4 is saying that the first grant, to research on the ### Elicitation method #4: Discussion and new individual estimates -After holding a discussion round for an hour, participants’ estimates shifted to the following[\[1\]](#fnpmfo0q7i4di): +After holding a discussion round for an hour, participants’ estimates shifted to the following[^1]: ![](https://i.imgur.com/xleSkdf.png) @@ -146,11 +146,11 @@ In the table above, for example, the first light red “FALSE” square under ### Estimates between participants after holding a discussion round were mostly in agreement -The final estimates made by the participants after discussion were fairly concordant[\[2\]](#fnqbjzronh3oi): +The final estimates made by the participants after discussion were fairly concordant[^2]: ![](https://i.imgur.com/xleSkdf.png) -For instance, if we look at the first row, the 90% confidence intervals[\[3\]](#fnacizl98aof) of the normalized estimates are 0.1 to 1000, 48 to 90, -16 to 54, 41 to 124, 23 to 233, and 20 to 180. These all overlap! If we visualize these 90% confidence intervals as lognormals or loguniforms, they would look as follows[\[4\]](#fnclvpudp11e):  +For instance, if we look at the first row, the 90% confidence intervals[^3] of the normalized estimates are 0.1 to 1000, 48 to 90, -16 to 54, 41 to 124, 23 to 233, and 20 to 180. These all overlap! If we visualize these 90% confidence intervals as lognormals or loguniforms, they would look as follows[^4]:  ![](https://i.imgur.com/LNqcXxv.png) @@ -191,7 +191,7 @@ participant_hours + organizer_hours So for 9 grants, this is 2.6 to 4.9 hours per grant. Perhaps continued investment could bring this down to one hour per grant. I also think that time might scale roughly linearly with the number of grants, because grants can be divided into buckets, and then we can apply the relative value method to each bucket. Then we can compare buckets at a small additional cost—e.g., by comparing the best grants from each bucket. -I’m not actually sure how many grants the EA ecosystem has, but I’m guessing something like 300 to 1000 grants per year[\[5\]](#fn6fjejaxnj27). Given this, it would take half to two FTEs (full-time equivalents) to evaluate all grants, which was lower than I suspected: +I’m not actually sure how many grants the EA ecosystem has, but I’m guessing something like 300 to 1000 grants per year[^5]. Given this, it would take half to two FTEs (full-time equivalents) to evaluate all grants, which was lower than I suspected: ``` hours_per_participant = 2 to 5 @@ -314,24 +314,12 @@ Note that there are various methodological inelegancies: In part because the initial estimates were not congruent, I procrastinated in hosting the discussion session, which was held around a month after the initial experiment, if I recall correctly. If I were redoing the experiment, I would hold the different parts of this experiment closer together. -1. **[^](#fnrefpmfo0q7i4di)** +[^1]: Note that in the first case, I am displaying the mean, and in the other, the medians. 
This is because a) means of very wide distributions are fairly counterintuitive, and on various occasions, I don't think that participants thought much about this, and b) because of a methodological accident, participants provided means in the first case and medians in the second. Note also that medians are a pretty terrible aggregation method. + +[^2]: Note that the distributions aren't necessarily lognormally distributed, which is why the medians may look off. See [this spreadsheet](https://docs.google.com/spreadsheets/d/13inKETvESvcOu8UX2uyM7nlUvUNbECEugt3ec_YqnoY/edit?usp=sharing) for details. - Note that in the first case, I am displaying the mean, and in the other, the medians. This is because a) means of very wide distributions are fairly counterintuitive, and in various occasions, I don't think that participants thought much about this, and b) because of a methodological accident, participants provided means in the first case and medians in the second. - - Note also that medians are a pretty terrible aggregation method. +[^3]: 80% for researcher #5, because of idiosyncratic reasons. -2. **[^](#fnrefqbjzronh3oi)** +[^4]: Squiggle model [here](https://www.squiggle-language.com/playground/#code=eNqdkMFOwzAQRH9l5VMiBZQ4BRVLHPmCHDGKAnWTFYkNa5sWRfl34gJqi5Dcdk6r8WqfZ0ZmO7Op%2FDA09MmEI6%2BynfWwQmfo10GNDpu%2BevfYtr2qHKFumWArtPP47B0abWteO1Nb3MI9jFLDrKN3AY9Sf%2FtB434M0s2gBEhGyqqGXjpF4DZGsgyO9w5PClgswRm4y%2Fc7U3YWoiOlYpCr4jZQbhaXUtbGUzRJERgFvxgyFx9j8HzHWP6p66wo%2BBHti5cBw8vy%2Fyinw4yOstT2LfEa14aGpDdtkl8XaQZhKvLXNE0PvvBz50nqKWRm0xfkbtQi). - Note that the distributions aren't necessarily lognormally distributed, hence why the medians may look off. See [this spreadsheet](https://docs.google.com/spreadsheets/d/13inKETvESvcOu8UX2uyM7nlUvUNbECEugt3ec_YqnoY/edit?usp=sharing) for details. - -3. **[^](#fnrefacizl98aof)** - - 80% for researcher #5, because of idiosyncratic reasons. -4. **[^](#fnrefclvpudp11e)** - - Squiggle model [here](https://www.squiggle-language.com/playground/#code=eNqdkMFOwzAQRH9l5VMiBZQ4BRVLHPmCHDGKAnWTFYkNa5sWRfl34gJqi5Dcdk6r8WqfZ0ZmO7Op%2FDA09MmEI6%2BynfWwQmfo10GNDpu%2BevfYtr2qHKFumWArtPP47B0abWteO1Nb3MI9jFLDrKN3AY9Sf%2FtB434M0s2gBEhGyqqGXjpF4DZGsgyO9w5PClgswRm4y%2Fc7U3YWoiOlYpCr4jZQbhaXUtbGUzRJERgFvxgyFx9j8HzHWP6p66wo%2BBHti5cBw8vy%2Fyinw4yOstT2LfEa14aGpDdtkl8XaQZhKvLXNE0PvvBz50nqKWRm0xfkbtQi). -5. **[^](#fnref6fjejaxnj27)** - - Open Philanthropy grants for 2021: 216, Long-term future fund grants for 2021: 46, FTX Future fund public grants and regrants: 113 so far, so an expected ~170 by the end of the year. In total this is 375 grants, and I'd wager it will be growing year by year. +[^5]: Open Philanthropy grants for 2021: 216, Long-term future fund grants for 2021: 46, FTX Future fund public grants and regrants: 113 so far, so an expected ~170 by the end of the year. In total this is 375 grants, and I'd wager it will be growing year by year. diff --git a/blog/2022/09/28/granular-AMF/index.md index 6767c66..5fc15bc 100644 --- a/blog/2022/09/28/granular-AMF/index.md +++ b/blog/2022/09/28/granular-AMF/index.md @@ -76,3 +76,5 @@ Looking again at the mortality rates: that consideration probably roughly ~halves the potential adjustment. The above post was written in response to [GiveWell's Change Our Mind Contest](https://www.givewell.org/research/change-our-mind-contest). But if you are reading this on my blog, you may want to: [Donate to GiveWell](https://secure.givewell.org/). 
+ +PS: I've continued working on this issue [here](https://forum.effectivealtruism.org/posts/BDXnNdBm6jwj6o5nc/five-slightly-more-hardcore-squiggle-models#A_sketch_of_a_more_parsimonious_estimate_of_AMF_s_impact), where I give a template Squiggle model. diff --git a/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/385b65003d24f30e9f90423852de859685c6631d.png b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/385b65003d24f30e9f90423852de859685c6631d.png new file mode 100644 index 0000000..a1da11c Binary files /dev/null and b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/385b65003d24f30e9f90423852de859685c6631d.png differ diff --git a/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/4f539bd4101b466abc387fbcb6b2efc7c405d346.png b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/4f539bd4101b466abc387fbcb6b2efc7c405d346.png new file mode 100644 index 0000000..f2fba9e Binary files /dev/null and b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/4f539bd4101b466abc387fbcb6b2efc7c405d346.png differ diff --git a/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/662e6858fe744995f328679afad4670a6dd7d31d.png b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/662e6858fe744995f328679afad4670a6dd7d31d.png new file mode 100644 index 0000000..c8fb6aa Binary files /dev/null and b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/662e6858fe744995f328679afad4670a6dd7d31d.png differ diff --git a/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/74b85b0113afe3c1f575cc34d6087cfd19b28f5c.png b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/74b85b0113afe3c1f575cc34d6087cfd19b28f5c.png new file mode 100644 index 0000000..f619e69 Binary files /dev/null and b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/74b85b0113afe3c1f575cc34d6087cfd19b28f5c.png differ diff --git a/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/8aefbfe8fd2b29874a5546c0f2a699f97feb10b0.png b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/8aefbfe8fd2b29874a5546c0f2a699f97feb10b0.png new file mode 100644 index 0000000..9c426ed Binary files /dev/null and b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/8aefbfe8fd2b29874a5546c0f2a699f97feb10b0.png differ diff --git a/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/a8bec5f8b22ad2540b14639af4b9248dd227eaac.png b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/a8bec5f8b22ad2540b14639af4b9248dd227eaac.png new file mode 100644 index 0000000..1d5c219 Binary files /dev/null and b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/a8bec5f8b22ad2540b14639af4b9248dd227eaac.png differ diff --git a/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/b16ac46d86c4352e1dfb53f753fe5b391844a230.png b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/b16ac46d86c4352e1dfb53f753fe5b391844a230.png new file mode 100644 index 0000000..d60337c Binary files /dev/null and b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/.images/b16ac46d86c4352e1dfb53f753fe5b391844a230.png differ diff --git a/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/index.md b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/index.md new file mode 100644 index 0000000..2250aa3 --- /dev/null +++ 
b/blog/2022/10/03/samotsvety-nuclear-risk-update-october-2022/index.md @@ -0,0 +1,326 @@ +Samotsvety Nuclear Risk update October 2022 +============== + + After recent events in Ukraine, [Samotsvety](https://samotsvety.org/) convened to update our probabilities of nuclear war. In March 2022, at the beginning of the Ukraine war, we were at ~[0.01%](https://forum.effectivealtruism.org/posts/KRFXjCqqfGQAYirm5/samotsvety-nuclear-risk-forecasts-march-2022) that London would be hit with a nuclear weapon in the next month. Now, we are at ~0.02% for the next 1-3 months, and at 16% that Russia uses any type of nuclear weapon in Ukraine in the next year.  + +Expected values are more finicky and more person-dependent than probabilities, and readers are encouraged to enter their own estimates, for which we provide a [template](https://www.squiggle-language.com/playground/#code=eNqVU9tu00AQ%2FZVRnqBqklZAhSz6ADSCiKitlIQKyS8be2yPupk1e2mwqv4740tayFV9sbyzZ86ZMzP72HOFWU3Dcqls1Yu8DXjahEYpeWPXEWLypPT0d6A81zj1ljjvRb3hEK5DolFZCPdWEWPMNjhSc4euu7lDVRp2Y563CLgESfvUh4y07hPHXLMYj%2BAL5eWDUKIlk4LJwNMSgRxozDwEdiUmlBGmcZfWiV9%2Fnt3EjC5RWnkyfBO8oxQ7xW%2F0gGv1OyV13WO6XcVL9szUfHNODEsXJKC0wI%2F6OokZ4DU17DAB3gB354nIGx4KupBueyPoBeXzH%2FPpV%2FJVw9CwjvZIblv8J33T3WH3J1B7e5V6Z268LFXigRi0cR4KE6yTHcGlYLTYmlCGoz8yVq84qcb8S5w7qef9Wd2Ki7POQlIozoUzVZVrQZfw7uJDzKU1aUi8VPK9ph7zlaok%2FaLOPv%2B4ka0WGmemrrjEL5gZu6NH2BSD6UTKbSgFsrdvzcjfnEMftqnfdl075rXhePHVZe3yVcebpt5asxDBCnIjz0ScKu2MjFBei5J3RCkZV3FiZSwJZNJ%2FIzouJAUo1zL0YSz3qJy8BGMhxfXhWVe81lNTkMjf4Rx6nvF%2F6J9KBxRFRzljU6YL9kGYOW%2BxK2N1CirzaOW0XvsVsQQ6jsFgUC9Sc7oiV%2Bq6C491I%2BDog4yOIk5booO7Hx2%2B7ij2bUi092atvblu0XaoRcb81Hv6C9nOCyo%3D). We’d guess that readers would lose 2 to 300 hours in expectation by staying in London in the next 1–3 months, but this estimate is at the end of a garden of forking paths, and more pessimistic or optimistic readers might make different methodological choices. We would [recommend leaving if Russia uses a tactical nuclear weapon in Ukraine](https://forum.effectivealtruism.org/posts/2nDTrDPZJBEerZGrk/samotsvety-nuclear-risk-update-october-2022#Estimating_the_value_of_leaving_London_or_other_major_cities). + +Since March, we have also added our track record to [samotsvety.org/track-record](https://samotsvety.org/track-record/), which might be of use to readers when considering how much weight to give to our predictions.  + +_Update 2022-10-04: Changed our estimates as a result of finding an aggregation error. You can see the previous version of our post_ [_here_](https://web.archive.org/web/20221003195959/https://forum.effectivealtruism.org/posts/2nDTrDPZJBEerZGrk/samotsvety-nuclear-risk-update-october-2022)_. We also noticed that because of the relatively low number of estimates, they are fairly sensitive to each individual forecast, so we are working on incorporating more forecasts._ + +_Update 2022-10-19: These estimates seem a bit out of date now; see_ [_this comment_](https://forum.effectivealtruism.org/posts/2nDTrDPZJBEerZGrk/samotsvety-nuclear-risk-update-october-2022?commentId=fYGxRsRCfzM4nWvN9#comments) _and_ [_these forecasts from the Swift Centre_](https://www.swiftcentre.org/will-russia-use-a-nuclear-weapon/)_._ + +## Question decomposition + +We have updated our decomposition to the following: + +1. What is the probability that Russia will use a nuclear weapon in Ukraine in the next MONTH? +2. 
Conditional on Russia using a nuclear weapon in Ukraine, what is the probability that nuclear conflict will scale beyond Ukraine in the next MONTH after the initial nuclear weapon use? +3. Conditional on the nuclear conflict expanding to NATO, what is the chance that London would get hit, one MONTH after the first non-Ukraine nuclear bomb is used? + +For each of those questions, we also asked forecasters for their yearly probabilities. Following up on previous feedback, we also asked forecasters for their core reasons behind their forecasts, and we’ll present those alongside their probabilities. + +We also asked a range of questions about counterfactuals: + +* Conditional on Russia NOT using a nuclear weapon in Ukraine, what is the probability of a nuclear conflict outside Ukraine in the next MONTH? +* Conditional on Russia NOT using a nuclear weapon in Ukraine, what is the probability that nuclear conflict will scale beyond Ukraine in the next YEAR? +* Conditional on Russia NOT dropping a nuclear weapon in Ukraine in October, what is the probability that London will be hit with a nuclear weapon in October? + +As well as a sanity check: + +* What is the unconditional probability of London being hit with a nuclear weapon in October? + +## Summaries + +### Summary tables + +_For ≤ 1 month of staggering time between each step_ + +| Event | Conditional on previous step | Unconditional probability | |-----------------------------------------------------------------------------------------------|------------------------------|---------------------------| | Russia uses a nuclear weapon in Ukraine in the next month | — | 5.3% | | Nuclear conflict scales beyond Ukraine in the next month after the initial nuclear weapon use | 2.5% | 0.13% | | London gets hit, one month after the first non-Ukraine nuclear bomb is used | 14% | 0.02% | + +_For ≤ 1 year of staggering time between each step_ + +| Event | Conditional on previous step | Unconditional probability | |----------------------------------------------------------------------------------------------|------------------------------|---------------------------| | Russia uses a nuclear weapon in Ukraine in the next year | — | 16% | | Nuclear conflict scales beyond Ukraine in the next year after the initial nuclear weapon use | 9.6% | 1.6% | | London gets hit, one year after the first non-Ukraine nuclear bomb is used | 23% | 0.36% | + +### Visualizations + +This time, we are also experimenting with providing a few visualizations. Their advantage is that they may be more intuitive; the disadvantage is that they may gloss over the shape of our uncertainty, and thus mislead. Reader beware. + +For the forecast with one month between each escalation step, we have: + + + + + + + + + +### A forecaster’s perspective + +To give a sense of the level at which we are forecasting, we are providing forecasters’ comments. One forecaster provided his comments in a more self-contained form—rather than question by question—so I’m presenting those comments here, lightly edited: + +> _In general, nuclear rhetoric has been used extensively before and it seems that it was fairly successful at achieving its intended goals without having to use the weapons (e.g., Germany was hesitant to send weapons to Ukraine). I think such bluffing might be wearing off but Moscow is very good at maintaining ambiguity._ +> +> * _Nonetheless, previously stated “red lines” have already been crossed in this war without nuclear escalation. 
E.g., cross-border raids into Belgorod and strikes against Crimea._ +> +> _Being ambiguous about one’s willingness to use these weapons is what we have seen in the past and is what we see now. E.g., Zvi_ [_previously summarizes_](https://thezvi.substack.com/p/ukraine-post-12)_, when discussing a recent_ [_speech by Putin_](https://en.kremlin.ru/events/president/transcripts/69390)_:_ +> +> \> What I heard were several instances of drawing a distinction between Russia and its territorial integrity, and the territories under occupation. He said that the call-ups would be ‘sufficient for the operation.’ He declared his intention to keep the territory, if he can maintain physical control. Then, he went back to saying that Ukraine was getting weapons that could ‘threaten Russia,’ explicitly including Crimea as part of Russia but not Donbass, whereas Ukraine’s normal forces can obviously already threaten Donbass or Kherson. He framed his threats of nuclear use in response to claimed Western nuclear blackmail and what he says are Western attempts to get Ukraine to invade clearly Russian territories. +> +> _Using nukes doesn’t feel like a good choice._ +> +> * _Using one on a battlefield can’t be all that helpful. The frontline is ~1,000 km; troops are not concentrated. I guess the main benefits can come from “scaring troops,” “being credibly nuclear,” and maybe destroying key infrastructure._ +> * _Breaking the nuclear taboo is likely to alienate parties that are ~neutral right now — most of all India. This effect is greater the more damage is done with nukes (e.g., “just testing” vs. using a very small nuke on a battlefield vs. attacking key infrastructure vs. endangering civilians)._ +> * _Using nuclear weapons would also alienate various parties in Russia:_ +> * _IIRC, most people disapprove of the use of nuclear weapons._ +> * _Likewise, elites might be legitimately more scared: it’s one thing to be cut off from EU/US: you can still live lavishly in Russia. It’s another thing to endanger yourself and your loved ones with the salient possibility of nuclear war._ +> * _Even military planners, I think, would not be happy about stretching the nuclear doctrine that far._ +> +> _Consider what will happen if the Ukrainian offensive continues. Russia is losing cities in Lugansk. I feel that Ukrainians are_ [_calling Putin’s nuclear bluff_](https://thezvi.substack.com/p/ukraine-post-12)_. And this gives Putin few good options to work with._ +> +> * _It seems like the most likely option is Russia just trying to sustain the conflict by pouring more resources and will into it. But it also might just lose in the end. I think “partial mobilisation” can be seen through that lens._ +> * _Maybe Putin’s move is just to wait until the winter, when the European energy crisis will be most acutely felt?_ +> * _I think the nuclear pretext might be important for Western leadership, because they can’t just make a deal with Putin right now; he is far beyond redemption. But making deals to “avoid nuclear holocaust” — while also giving citizens cheaper gas — might be manageable._    +> +> _If things go nuclear:_ +> +> * _I think it might be with the “least” scary nuke, because every escalation step, every credibly ambiguous situation could be turned into concessions, pauses, etc. Giving up intermediate steps is not wise._ +> * _Other forecasters discussed just “testing,” nuking a small island, or dumping one in the Black Sea._ +> * _I am worried about the multi-step conditional probabilities we are using here. 
While I think we have some ability to model the present situation, if the nuclear taboo were to be broken, we would be in uncharted territory._ +> * _In this case, people would still push for de-escalation and would try to avoid a Russia–NATO conflict (and especially a full-out Russia–NATO nuclear war). It's just hard to think about._ +> * _(A) Because evidently previous diplomatic efforts would have failed catastrophically, and it’s unclear if there would be any remaining diplomatic tricks up their sleeves;_ +> * _(B) we haven’t been at this level of tension for a while, and we just don’t know how everyone would react;_ +> * _(C) the situation is likely to worsen for Putin (both internally and externally), and Putin might be likely to increase risk-taking as his likelihood of attaining a “win” diminishes._  +> +> _I feel uncomfortable about my estimation process for a few reasons:_ +> +> * _We are in the territory where the “proven technique” of carefully crafting base-rates is less applicable._ +> * _There is a good GJOpen “rule of thumb”: if a decision depends on one person, don’t go below 5%. This is because other people are not transparent to us; we don’t know their constraints, and we don’t know the bulk of their incentives. In this case:_ +> * _It's not inconceivable that the decision to invade Ukraine in late February was misinformed (and ~unilateral). Relevant actors might be misinformed now, and they might be misinformed in surprising-to-us ways due to Putin being partly “siloed.”_   + +## Forecaster probabilities and comments + +See a later section for a comment on our aggregation method. + +### Russia using a nuclear weapon in Ukraine + +_**What is the probability that Russia will use a nuclear weapon in Ukraine in the next MONTH?**_ + +* Aggregate probability: 0.053025 (5.303%) +* All probabilities: 0.27, 0.04, 0.02, 0.001, 0.09, 0.08, 0.07 + +_**What is the probability that Russia will use a nuclear weapon in Ukraine in the next YEAR?**_ + +* Aggregate probability: 0.16388 (16%) +* All probabilities: 0.38, 0.11, 0.11, 0.005, 0.42, 0.2, 0.11 + +_**Conditional on Russia using a nuclear weapon in Ukraine in the next year, will it be a tactical nuclear weapon?**_ + +* Aggregate probability: 0.96356 (96%) +* All probabilities: 0.97, 0.93, 0.97, “Yes”, 0.98, 0.95, 0.8 + +_**Forecaster comments**_ + +These have been lightly edited. Reading them is probably indicative of the level at which we are thinking, which has the flavor of “we have a lot of uncertainty about this.” + +> _This is a particularly dangerous time. Many of the gambles Putin has taken so far have gone badly, and now he stands a real risk of losing power as the war drags on and he has nothing to show for it. Even still, for Putin, even without moral guardrails, the risks of using nuclear weapons of any kind should still outweigh the benefits if he is seeing things clearly. If things continue to deteriorate, the situation may change, but for now, it seems that although Putin has been weakened, he still has a very good chance of remaining in power if he can simply get to a stalemate in the territories he now controls. Although I've frontloaded a lot of the risk into the next month, if a nuclear weapon is going to be used, there will probably be some build-up before it is deployed, with warning signs along the way. 
It is likely Putin will try to prepare his population, and, while declaring territories within Ukraine to be part of Russia may provide some pretense of a justification, each stage of escalation brings heightened risk. At each stage, it makes sense to escalate slowly to attempt to extract the maximum possible concessions before taking on the increased risk of further escalation. I would expect to see nuclear tests or warning shots before seeing nuclear attacks, and for the first nuclear attack, tactical nuclear weapons would be the most logical starting point._  + +> _I think the use of nuclear weapons tactically would be a lot easier for Putin to explain to the Russian people. Perhaps strategic use could come afterwards, if he is in a desperate situation._ +> +> _I think that Putin is 100% committed to conquering Ukraine. His "special military action" has largely failed so far, so he is expanding his military efforts with a "partial" mobilization. If that fails, or perhaps in combination with increased military mobilization, it looks possible to me that he could detonate a tactical nuclear weapon in the mistaken belief that it would make NATO countries back off at least from territory that Russia currently controls. In reality, I think detonating a tactical nuclear weapon would have the opposite effect, though._ + +  + +> _\[My uncertainty is\] primarily methodological and from skewing to uncertainty. The main errors in the_ [_Superforecaster post-mortem_](https://goodjudgment.com/wp-content/uploads/2022/03/1570-Post-Mortem-v2.pdf) _for predicting invasion were overreliance on certain base rates and underestimating Putin’s willingness to take major risks. I’m hesitant to make the same mistakes twice._ +> +> _I also think Putin and Kremlin officials are less analyzable than most seem to think. I still don’t have a compelling explanation for why Putin wants Ukraine so badly and why he’s taken so much risk up until this point, which to me says my mental model of their decision-making isn’t good enough to do much with._ + +> _Plausible scenarios exist where Putin uses a tactical nuke, probably to scare Ukraine, divide NATO, etc._  + +  + +> _I would be higher with my first two estimates if they included an attack on a nuclear plant that could lead to a radiation disaster. This might be Putin's preferred method because he could keep a level of ambiguity as to Russia being responsible. That said, Putin's reason for using a tactical nuclear weapon might precisely be to let Ukraine and the world know how serious he is about not backing down. I think Putin wants to win the Ukraine War at pretty much any cost._ +> +> _\> \[…\] I think Putin would almost definitely use a tactical nuke instead of a strategic one because it would make Ukraine and America/NATO more fearful of the situation without as high of a chance of a nuclear apocalypse (when compared to a strategic nuke being detonated in Ukraine)._ + +> _Putin has established a land bridge to Crimea, which is a major strategic goal for Russia. In recent speeches, he has explicitly said that Russia will use everything it has on the table to protect the newly annexed region._ + +> _Using nuclear weapons would drastically upend the current geopolitical order. 
But I can't confidently reject that outcome._ + +### Nuclear conflict escalating beyond Ukraine after Russia uses a nuclear weapon in Ukraine + +**Conditional on Russia using a nuclear weapon in Ukraine, what is the probability that nuclear conflict will scale beyond Ukraine in the next MONTH after the initial nuclear weapon use?**  + +* Aggregate probability: 0.0254 (2.5%) +* All probabilities: 0.15, 0.09, 0.0013, 10^(-5), 0.01, 0.3, 0.05 + +**Conditional on Russia using a nuclear weapon in Ukraine, what is the probability that nuclear conflict will scale beyond Ukraine in the next YEAR after the initial nuclear weapon use?** + +* Aggregate probability: 0.095685 (9.6%) +* All probabilities: 0.2, 0.15, 0.0151, 10^(-5), 0.15, 0.4, 0.1 + +**Forecaster comments** + +> I think nuclear war happening as a result of Russia using a tactical nuke in Ukraine is not extremely unlikely: the world would be in somewhat unprecedented territory, and miscalculations on one or both sides could make for a catastrophe. + +> If Russia uses a nuclear weapon, the West probably would not respond with a nuclear strike, but would probably try to use other channels, which I won't speculate about publicly. Depending on the type, scale, and impact of the attack, a nuclear response is possible. If there is no Russian nuclear attack, there is a minuscule chance of either a preemptive strike (based on intelligence that Russia is likely to launch a nuclear attack) or a false signal based on something that looks like an attack triggering a nuclear strike against Russia. The fact of heightened tensions makes these kinds of accidents more likely than they would otherwise be. + +> I don't think Russia nuking Ukraine raises the global nuclear risk by much. I think most of the risk still comes from accidental launches due to false alarms, which I think is probably elevated currently. + +> I think that [MAD](https://en.wikipedia.org/wiki/Mutual_assured_destruction) precludes nuclear conflict scaling up. And I think that if nuclear conflict were to expand following Russia detonating a nuclear weapon in Ukraine (or elsewhere), then that would likely happen close to immediately. + +> The payload and target of tactical nukes are widely variable; if one is used, I’d imagine those parameters would be chosen to minimize the risk of a nuclear response.  +> +> NATO isn’t currently directly involved in the war; it's hard to imagine them deciding to send troops or especially to send nukes in response to a hit on a military target or a demonstration blast on Snake Island or the Black Sea. +> +> It’s possible Putin miscalculates or actually wants nuclear war, but to me the most likely outcome is negotiations (for better or for worse). + +> I have high confidence that nuclear weapons will not be used outside this conflict. +> +> I don't have high confidence that nuclear weapons will not be used in areas close to the strategic landscape (e.g., areas supporting either side in NATO, Belarus, inner Russia, etc.) + +> No one wants it to escalate. Escalating to NATO is suicidal, just clearly a loss for Putin and folks. +> +> Also, I expect a revolt of elites or something. As they would feel that this is totally suicidal, not worth it. I expect a lot of people to fear that nuclear war would mean guaranteed death or misery for their families etc.  
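+
+The aggregates here and throughout are computed as described in “A brief note on the aggregation method” near the end of this post: a geometric mean after dropping the minimum and maximum. As a minimal sketch of that operation in Squiggle, using the illustrative numbers from that note (and assuming Squiggle’s `List.length` builtin behaves as expected, alongside the `List.reduce` used elsewhere in this blog):
+
+```
+// Minimal sketch of the aggregation method, not the exact script we used;
+// the minimum and maximum are assumed to have been dropped by hand.
+geomean(xs) = {
+  n = List.length(xs)
+  product = List.reduce(xs, 1, {|acc, x| acc * x})
+  product ^ (1 / n)
+}
+geomean([1, 10, 100]) // [0.1, 1, 10, 100, 1000] with min and max removed; gives 10
+```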
+ +### London being hit with a nuclear weapon, conditional on nuclear conflict escalating beyond Ukraine + +**Conditional on the nuclear conflict expanding to NATO, what is the chance that London would get hit, one MONTH after the first non-Ukraine nuclear bomb is used?**  + +* Aggregate probability: 0.1424 (14%) +* All probabilities: 0.4, 0.15, 0.9985, 0.05, 0.02, 0.002, 0.5 + +**Conditional on the nuclear conflict expanding to NATO, what is the chance that London would get hit, one YEAR after the first non-Ukraine nuclear bomb is used?** + +* Aggregate probability: 0.232015 (23%) +* All probabilities: 0.45, 0.3, 0.9985, 0.05, 0.12, 0.01, 0.5 + +**What is the unconditional probability of London being hit with a nuclear weapon in October?** + +* Aggregate probability: 0.00066 (0.066%) +* All probabilities: 0.01, 0.00056, 0.001251, 10^-8, 0.000144, 0.0012, 0.001 + +**Forecaster comments** +  + +> If nuclear conflict expands outside of Ukraine, it seems quite likely that London would get hit because I think that the UK would be the second choice of a Russian nuclear attack—the first choice being America. I also think that in the case of a nuclear war, it is a likely scenario that Russia launches a general nuclear attack on most, if not all of, NATO. + +> Barring accidents and other unlikely circumstances, London will only be a target in the event of full-scale nuclear war. At each stage of escalation, prior to full-scale war, there would be attempts to take off-ramps. But, it is possible, even if unlikely, that predetermined nuclear response protocols could kick in, or, in the fog of war, mistakes and miscalculations could result in rapid escalation. + +> If there is a nuclear exchange between NATO and Russia, London will be hit very quickly. + +> If a nuclear conflict does expand to NATO, I would still hold out some hope that it doesn't turn into an all-out nuclear war. Thus, my forecast for London getting hit in the event of nuclear conflict with NATO is relatively low. And, if the nuclear conflict expanded to NATO, I'd expect that if London were to get hit, then it would happen within a month. My forecast for the unconditional chance of London getting hit in October is about 10% of my forecast for any nuclear conflict in October and is barely above my forecast conditional on Russia not dropping a nuclear weapon in Ukraine. + +> Conflict likely wouldn’t expand to the exchange of strategic nukes after a tactical nuke exchange. Large cities are where the leaders making decisions are. It's one thing to kill soldiers and civilians but it's another to put your own life on the line. Unlike other questions, we have a fairly strong historical track record here for mutually assured destruction during the Cold War. Time has passed and tactical nukes are a key difference, but I think the core concept still applies.  +> +> London getting targeted is also a very foreseeable scenario; I’d be surprised if NATO’s military systems aren’t ready and sophisticated enough to detect and shoot down a missile or submarine.  +> +> There are also layers of complication from assassination, coups, and civil unrest. The risk to Putin feels much more personal than in other scenarios.  + +> Escalation is still possible, e.g. maybe Putin just really hates the West and that’s his true motivation, or maybe conflict simply keeps escalating once nukes are exchanged. But that type of dramatic escalation feels unlikely. + +> Escalation beyond Ukraine doesn't help Russia achieve its strategic goals. 
+ +> Hard to see intermediate escalation. + +## Comparison vs other sources + +A few other sources that have forecasts on this are: + +* Back in 2019, [Luisa Rodríguez’s analysis](https://forum.effectivealtruism.org/posts/PAYa6on5gJKwAywrF/how-likely-is-a-nuclear-exchange-between-the-us-and-russia) put the chance of a US/Russia nuclear exchange at 0.38%/year (if taking the arithmetic mean of her samples), or 0.13%/year if taking the geometric mean of odds. +* Back in March, [we gave](https://forum.effectivealtruism.org/posts/KRFXjCqqfGQAYirm5/samotsvety-nuclear-risk-forecasts-march-2022) a 0.067%/month to a “NATO/Russia nuclear exchange killing at least one person in the next month”, and an 18% probability of London being hit with a nuclear weapon after that, for an implied 0.012% monthly probability. +* Back at the end of March, [Peter Scoblic](https://forum.effectivealtruism.org/posts/W8dpCJGkwrwn7BfLk/nuclear-expert-comment-on-samotsvety-nuclear-risk-forecast-2) gave a **heavily caveated** 5% to a “NATO/Russia nuclear exchange killing at least one person in the next month”, and a likewise heavily caveated 65% probability to London being hit with a nuclear weapon after that, for an implied 3.2% probability. +* [Zvi](https://thezvi.substack.com/p/ukraine-post-8-risk-of-nuclear-war) and [Daniel Filan](https://danielfilan.com/2022/03/10/prob_smart_londoner_dies_of_russian_nuke.html) also gave their probabilities using our decomposition.  +* Metaculus has several questions on nuclear weapons, such as: + * [Will there be at least one fatality due to deliberate nuclear detonation by 2024?](https://www.metaculus.com/questions/7407/deliberate-nuclear-detonation-by-2024/) (7%) + * [Will there be an offensive nuclear detonation on a nation's capital by 2024, if an offensive nuclear detonation occurs anywhere by 2024?](https://www.metaculus.com/questions/8127/nuclear-detonation-on-a-capital-by-2024/) (20%) + * [Will the first offensive nuclear detonation by 2024 be against a battlefield target, if there's an offensive detonation by then?](https://www.metaculus.com/questions/8585/bt-as-the-first-nuclear-detonation-by-2024/) (53%) + * [Will at least one nuclear weapon be detonated in Ukraine before 2023?](https://www.metaculus.com/questions/12591/nuclear-detonation-in-ukraine-by-2023/) (7%) + * [Will a Russian nuclear weapon be detonated in the US before 2023?](https://www.metaculus.com/questions/12593/2022-russian-nuclear-detonation-in-the-us/) (<1%; note that Metaculus doesn’t accept probabilities below 1%) + * [Will a non-test nuclear detonation cause at least 1 fatality before 2024?](https://www.metaculus.com/questions/7404/nuclear-detonation-fatality-by-2024/) (12%) + * [Will >2 countries offensively detonate nuclear weapons by 2024, if any offensive detonation of a country's nuclear weapon occurs by then?](https://www.metaculus.com/questions/8145/conditional-2-countries-detonate-by-2024/) (35%) + * [Will >2 countries have nuclear weapons offensively detonated on or over their territories by 2024, if any country offensively detonates a nuclear weapon by then?](https://www.metaculus.com/questions/8146/conditional-2-countries-attacked-by-2024/) (49%) +* Manifold Markets also has [a few markets](https://manifold.markets/search?s=24-hour-vol&f=open&q=nuclear) on this, such as: + * [Will a nuclear weapon be launched in combat by the end of 2023?](https://manifold.markets/AndyMartin/will-a-nuclear-weapon-be-launched-i-015e44ed91f5) (7%) + * [Will Russia give a nuclear ultimatum to Ukraine and/or it's 
Western allies during 2022?](https://manifold.markets/Nostradamnedus/will-russia-give-a-nuclear-ultimatu) (80%) + +There is internal discord within Samotsvety about the degree to which the magnitude of the difference between our current and former probabilities is indicative of a lack of accuracy. We Samotsvety updated our endline monthly probability of London being hit with a nuclear weapon by ~2x (~0.02% vs. 0.067% \* 0.18 = 0.012%). The difference was higher before correcting an aggregation error, so I've moved discussion to a footnote[\[1\]](#fn2tohbl1ecsm). + +In addition, a [former senior U.S. government official](https://en.wikipedia.org/wiki/Andrew_C._Weber) previously gave me a 20% probability of Russia using nuclear weapons by the end of the year, and at the time I thought that this was too high, but now think that this was a reasonable belief to have, and I regret not having deferred more to him. + +## Estimating the value of leaving London or other major cities + +[Here](https://www.squiggle-language.com/playground/#code=eNqVU9tu00AQ%2FZVRnqBqklZAhSz6ADSCiKitlIQKyS8be2yPupk1e2mwqv4740tayFV9sbyzZ86ZMzP72HOFWU3Dcqls1Yu8DXjahEYpeWPXEWLypPT0d6A81zj1ljjvRb3hEK5DolFZCPdWEWPMNjhSc4euu7lDVRp2Y563CLgESfvUh4y07hPHXLMYj%2BAL5eWDUKIlk4LJwNMSgRxozDwEdiUmlBGmcZfWiV9%2Fnt3EjC5RWnkyfBO8oxQ7xW%2F0gGv1OyV13WO6XcVL9szUfHNODEsXJKC0wI%2F6OokZ4DU17DAB3gB354nIGx4KupBueyPoBeXzH%2FPpV%2FJVw9CwjvZIblv8J33T3WH3J1B7e5V6Z268LFXigRi0cR4KE6yTHcGlYLTYmlCGoz8yVq84qcb8S5w7qef9Wd2Ki7POQlIozoUzVZVrQZfw7uJDzKU1aUi8VPK9ph7zlaok%2FaLOPv%2B4ka0WGmemrrjEL5gZu6NH2BSD6UTKbSgFsrdvzcjfnEMftqnfdl075rXhePHVZe3yVcebpt5asxDBCnIjz0ScKu2MjFBei5J3RCkZV3FiZSwJZNJ%2FIzouJAUo1zL0YSz3qJy8BGMhxfXhWVe81lNTkMjf4Rx6nvF%2F6J9KBxRFRzljU6YL9kGYOW%2BxK2N1CirzaOW0XvsVsQQ6jsFgUC9Sc7oiV%2Bq6C491I%2BDog4yOIk5booO7Hx2%2B7ij2bUi092atvblu0XaoRcb81Hv6C9nOCyo%3D) is a template for calculating risk, given one’s probabilities (also saved [here](https://nunosempere.com/.secret/nuclear-2022-10.squiggle) and [here](https://gist.githubusercontent.com/NunoSempere/42e44c33e4be8c973b49b154e5c0b4d8/raw/1e0ae12d0b0eba7fa784f7747f5cd79d08f41c1c/nuclear-2022-10.squiggle)).  
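+
+To show the shape of that calculation without opening the template, here is a minimal sketch; the three probabilities are the monthly aggregates from above, while the remaining-lifetime and adjustment figures are hypothetical placeholders rather than the template’s actual defaults:
+
+```
+// Minimal sketch; the linked template is more detailed.
+p_russia_uses_nuke_in_ukraine = 0.053 // next month, aggregate from above
+p_escalation_beyond_ukraine = 0.025 // conditional on the previous step
+p_london_hit = 0.14 // conditional on the previous step
+p_london_hit_unconditional = p_russia_uses_nuke_in_ukraine * p_escalation_beyond_ukraine * p_london_hit
+
+remaining_life_hours = (20 to 60) * 365 * 24 // hypothetical remaining life expectancy, in hours
+proportion_lost_if_hit = 0.3 to 0.95 // hypothetical adjustment, e.g., for chances of leaving in time
+expected_lost_hours = p_london_hit_unconditional * remaining_life_hours * proportion_lost_if_hit
+expected_lost_hours
+```
+
+Readers with different probabilities can substitute them directly; the result scales linearly in each factor.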
+ +If we input **the full range** of our forecasters’ probabilities together with some default values, we get [the following estimate](https://develop--squiggle-documentation.netlify.app/playground#code=eNqVVO9P2zAQ%2FVdO%2FURRW1Kg%2FKjGh22gUa0CpNKhadkkk1ySE66d2Q4sQvzvuzhpYe3aii9OfT6%2F9%2Fzurs8tm%2BmnSTGbCVO2hs4U2PGhi5icNvMIKXIk5OR3QWkqceIMqbQ1bO3twSXKHI0NlbU7wpg2nMFEzHJOQtdLjJ6NyTp%2FEqpQ8YWrIpIoDBQPRpDCUJnCkphatM3JHYpcKztS0zqDERn7R9DbP%2B5A0AsO%2Fbrv16DvP6d%2BPfHr8c82s3zoQkJSdqkh1Q7BZcLxgsCCScegE3A0QyALEhMHhbI5RpQQxktarz7eXocKbSSkcKTVdeEsxdgI%2FEKPOBd7J%2FgZDxgvRPcHbxQG%2FYMO9INfO91B2we8%2FAP%2Fc7Ci%2B5XvVlcKpirSiuvCASGZYKtxu6ECeI%2Fq%2FzwbnAbV7MdMr9UeZ2dcf6c5%2B57S6dfp5DO50iN41Is1lAtTfAlrZ05PT2qHBm%2FL6j%2BrjrxhWzZjs1m7UFnxLrGNF6NZLiIHpEBq6yDTRdXsBmecwyaMKcGLP9w2TqioHKnv7JNlOYdBZdxRAPULokyolCFjUdo66QwOjgahyo2Oi8ixkMsKeaTORcnXj6rb%2FZOl2%2BJe4q2uBOf4CRNtKkfZpyWXGDPXxr8ouUHNszhSdenuMn1O9aVj7i%2BvG%2BMxP8yz88Fah30v7fShC6sq2o2%2FW4nrtC3meaZXo16xV4yq4r5IN0bfs6wSUs1zzdYJaTW3BI%2B34MGnmLQtVWS4zBEkXE%2FNPLaIMhC2RujCiM9RWB5dbSDG%2BWbBy45UXSAg4l%2Bb79CiZ%2F7J%2FiZkgcxoKVXoZdrCPDKySuvcJ21kDCJxaHg3n7onUhxoMHq9XtWYfndONpeVC8%2BVEbD1%2F2C4NaNTA22cpeHm4wZiXR8N157MuZebcrgaqjND9dJ6%2BQs8VVWn) of how many lost hours one loses in expectation as a result of staying in London in the medium term—where, because of the way we prompted forecasters, the “medium term” can range from one to three months: + + + +If we instead input the **forecasters’ aggregate**, rather than the range, we arrive at:  + + + +A [mixture of both estimates](https://develop--squiggle-documentation.netlify.app/playground/#code=eNqVVWFP2zAQ%2FSunfmpRW1KgFKrxYRtoVKsAqXRoWjbJJJf0RGJntgNUiP%2B%2Bs5MW1q6t9sWJ7fN7z%2B%2FukpeGmamnSZnnQs8bQ6tLbPuli5is0osVkmRJZJPfJaVphhOrSaaNYWN%2FHy4xK1CbUBrTFFq34AwmIi84CG030Sofk7F%2BJ5Sh5ANXZZSh0FA%2BaEESQ6lLQ2Jq0NQ7dygKJc1ITqsIRmTsH0H3YNCGoBsc%2BfHAj0HPP079eOLHwc8Ws3zoQEJZ1qGaVFkEOxOWBwQWTCoGlYClHIEMZJhYKKUpMKKEMF7RevXx9jqUaCKRCUtKXpfWUIy1wC%2F0iAuxd4Kv8YDxUnSv%2F05h0DtsQy%2F41ez0W37Byz%2F0r%2F013W98t8opmMpISc4LL4iMCXYatxdKgP9R%2FY9rg1Ug6%2FmY6ZXc5%2BgZ598qjr6ndPp1OvlMdu4RPOrFBsqlKT6FlTOnpyeVQ%2F33afWPdUfesa2asd2sPXBW%2FJfY2otRXojIAknIlLEwU6Urdo05x7AJY0rw4pnLxgoZzUfyO%2FtkWM5R4Iw7DqC6QTQTMmXIWMxNFXQGh8f9UBZaxWVkWcilQx7JczHn48fudO9k5bS4z%2FBWOcEFfsJEaeco%2B7TiEmMWSvsbJTeouBdHskrd3UydU3VowPXldWM85ot5dt7Y6LCvpWYPOrCuolX7u5O4Ctthnmd6M%2BoNe80ot%2B6TdKPVPcuaQ6q4r9k6kRnFJcHtLbjxKSZl5jLSnOYIEs6nYh5TRjMQpkLowIj3URhuXaUhxsVkycuOuCoQEPHb9jO0rJm%2Bor%2BJrERmNJRK9DJNqR8ZWaZV7JPSWQwisah5tui6J5K8UGN0u11XmH52TqbInAsvzgjY%2BT0Y7oxoV0Bbe2m4fbuG2FRHw407C%2B7VohwC%2F4NkxHkc80e6mT8310LaMHBuHhy4j2qrwgnla%2BP1D%2BYEXqU%3D) gives a 90% confidence interval of ~2 to 300 hours lost. Personally, I would use this second estimate, but it's hard to say why: maybe because I think that taking the minimum and maximum out of each question does a good job of filtering the least accurate forecasts. + +Compare with a [previous estimate](https://forum.effectivealtruism.org/posts/KRFXjCqqfGQAYirm5/samotsvety-nuclear-risk-forecasts-march-2022#Nu_o_Sempere) back in March: + + + +So, the danger of staying in London has increased by ~1-10x since March. We’d guess that for most people reading this post, moving out of the city for 1-3 months would still cost more in lost productivity than the updated estimates of expected lost life hours, but it might be a closer call than it was previously. 
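+
+To make that comparison concrete, here is a minimal sketch; every input below other than the mixture estimate is a hypothetical placeholder, and readers should substitute their own numbers:
+
+```
+// Hypothetical comparison of the cost of leaving vs. the risk of staying.
+expected_lost_hours_if_staying = 2 to 300 // the mixture estimate from above
+productivity_hours_lost_per_day_away = 0.5 to 3 // hypothetical
+days_away = 30 to 90 // i.e., 1-3 months
+cost_of_leaving = productivity_hours_lost_per_day_away * days_away
+cost_of_leaving - expected_lost_hours_if_staying // positive favors staying, on this crude accounting
+```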
+ +For personal purposes, we probably don’t have a better decision rule than “leave major cities if any tactical nukes are dropped in Ukraine” (as this would ~10x the risk). + +## Miscellanea + +### A sanity check + +We can compare the directly elicited probability of nuclear war reaching London in October with the conditional steps multiplied directly: + +The conditional steps are: + +1. What is the probability that Russia will use a nuclear weapon in Ukraine in the next MONTH?  0.053025 (5.303%) +2. Conditional on Russia using a nuclear weapon in Ukraine, what is the probability that nuclear conflict will scale beyond Ukraine in the next MONTH after the initial nuclear weapon use? 0.0254 (2.5%) +3. Conditional on the nuclear conflict expanding to NATO, what is the chance that London would get hit, one MONTH after the first non-Ukraine nuclear bomb is used? 0.1424 (14%) + +And if we multiply these together, we get 0.053025 \* 0.0254 \* 0.1424 = 0.00019178930400 (0.019% ~ 0.02%), versus 0.00066 (0.066%) when elicited directly.  + +I think that the conditionals multiplied directly should be higher, because the directly elicited probability assumes a scenario where escalation happens within one month, whereas the conditionals multiplied directly would include that scenario, but also scenarios where each escalation step is more staggered. + +One way to think about this difference is that a ~3x difference when eliciting unlikely, <1% events is relatively normal. Personally, I (Nuño) would give more weight to the conditionals multiplied directly. + +### Counterfactual baseline risk + +Forecasters also predicted on these counterfactual questions.  + +* Conditional on Russia NOT using a nuclear weapon in Ukraine, what is the probability of a nuclear conflict outside Ukraine in the next MONTH? (0.036%) +* Conditional on Russia NOT using a nuclear weapon in Ukraine, what is the probability that nuclear conflict will scale beyond Ukraine in the next YEAR? (0.132%) +* Conditional on Russia NOT dropping a nuclear weapon in Ukraine in October, what is the probability that London will be hit with a nuclear weapon in October? (0.006%) + * All probabilities: 0.1%, 0.002%, 0.125%, 0.000001%, 0.001%, 0.01%, 0.005%. + +The first two probabilities are dwarfed by the corresponding probabilities conditional on Russia using a nuclear weapon in Ukraine. The third probability indicates a very low baseline risk, but is also very sensitive to the individual forecasts. + +### A brief note on the aggregation method + +We used the geometric mean of the samples with the minimum and maximum removed to better deal with extreme outliers, as described in [our previous post](https://forum.effectivealtruism.org/posts/KRFXjCqqfGQAYirm5/samotsvety-nuclear-risk-forecasts-march-2022#fnt1dm5d62pkl). Note that the minimum (resp. maximum) still matters. For example, in \[0.1, 1, 10, 100, 1000\], the aggregate would be (1 \* 10 \* 100) ^ (1/3) = 10. But if 0.1 weren't in the sample, the aggregate would become (10 \* 100) ^ (1/2) = 31.6.  + +## Acknowledgements + +This is a project by [Samotsvety](https://samotsvety.org/). Thanks to Jared Leibowich, Jonathan Mann, Tolga Bilge, belikewater, Greg Justice (@slapthepancake), Misha Yagudin and Nuño Sempere for providing updates. Thanks as well to Eli Lifland for comments and suggestions, and to Daniel Kokotajlo and Bhuvan Singla for their [probability mass app](https://daniel-kokotajlo.vercel.app/).  + +1. 
**[^](#fnref2tohbl1ecsm)** + + Dropping into the first person, I (Nuño) felt that the degree to which we updated, or at least the degree to which I personally updated, is indicative that our/my probability wasn’t a [martingale](https://en.wikipedia.org/wiki/Martingale_(probability_theory)), i.e., that it didn’t accurately price the likelihood of future movements. See some discussion about this [here](https://arxiv.org/pdf/1703.06351.pdf), in the context of Nassim Taleb criticizing Nate Silver. Overall, that update to me suggests we should give probabilities closer to 50%, to better adjust for future unknowns, which we maybe aren’t pricing in. + + On the other hand, other proud Samotsvety forecasters point out that our previous forecast was only for March, even though we presented the risk in annualized units. It’s also just straight-out possible that we are in the bottom 10-20% of scenarios. So overall we are not done with our post-mortem, which would also include personal updates in April &c. diff --git a/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/06633c775c2b042e5a8fd3f9d9d0e4499e149cfe.png b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/06633c775c2b042e5a8fd3f9d9d0e4499e149cfe.png new file mode 100644 index 0000000..603214e Binary files /dev/null and b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/06633c775c2b042e5a8fd3f9d9d0e4499e149cfe.png differ diff --git a/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/0708f55c68651daf536e2ae7eeb37987cabf0b11.png b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/0708f55c68651daf536e2ae7eeb37987cabf0b11.png new file mode 100644 index 0000000..04008e6 Binary files /dev/null and b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/0708f55c68651daf536e2ae7eeb37987cabf0b11.png differ diff --git a/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/3f3f7b089b9618633e30a06d3e6c90803e771a1d.png b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/3f3f7b089b9618633e30a06d3e6c90803e771a1d.png new file mode 100644 index 0000000..1715d75 Binary files /dev/null and b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/3f3f7b089b9618633e30a06d3e6c90803e771a1d.png differ diff --git a/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/5a29230b80153babfef7b6221e4cba4c5da83d62.png b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/5a29230b80153babfef7b6221e4cba4c5da83d62.png new file mode 100644 index 0000000..cbf29c6 Binary files /dev/null and b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/5a29230b80153babfef7b6221e4cba4c5da83d62.png differ diff --git a/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/62677acf8444693766bf74ca549ea1bbf18c5158.png b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/62677acf8444693766bf74ca549ea1bbf18c5158.png new file mode 100644 index 0000000..03a9f8f Binary files /dev/null and b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/62677acf8444693766bf74ca549ea1bbf18c5158.png differ diff --git a/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/6387c34ff11873bc15617be69cc6c2cd2d00b02a.png b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/6387c34ff11873bc15617be69cc6c2cd2d00b02a.png new file mode 100644 index 0000000..7739b21 Binary files /dev/null and 
b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/6387c34ff11873bc15617be69cc6c2cd2d00b02a.png differ diff --git a/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/7bd904654db9305e0869c00ba271e9b75d9bbde6.png b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/7bd904654db9305e0869c00ba271e9b75d9bbde6.png new file mode 100644 index 0000000..a9911bd Binary files /dev/null and b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/7bd904654db9305e0869c00ba271e9b75d9bbde6.png differ diff --git a/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/97f16b7633338b44208a48a31a26d45df30069a3.png b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/97f16b7633338b44208a48a31a26d45df30069a3.png new file mode 100644 index 0000000..7d43c11 Binary files /dev/null and b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/.images/97f16b7633338b44208a48a31a26d45df30069a3.png differ diff --git a/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/index.md b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/index.md new file mode 100644 index 0000000..2cc6d5a --- /dev/null +++ b/blog/2022/10/10/five-slightly-more-hardcore-squiggle-models/index.md @@ -0,0 +1,294 @@ +Five slightly more hardcore Squiggle models. +============== + +Following up on [Simple estimation examples in Squiggle](https://forum.effectivealtruism.org/posts/vh3YvCKnCBp6jDDFd/simple-estimation-examples-in-squiggle), this post goes through some more complicated models in Squiggle. + +## Initial setup + +As well as in the [playground](https://www.squiggle-language.com/playground), Squiggle can also be used inside [VS Code](https://code.visualstudio.com/), after one installs [this extension](https://github.com/quantified-uncertainty/squiggle/tree/develop/packages/vscode-ext), following the instructions [here](https://github.com/quantified-uncertainty/squiggle/blob/develop/packages/vscode-ext/README.md). This is more convenient when working with more advanced models because models can be more quickly saved, and the overall experience is nicer. + + + +## Models + +### AI timelines at every point in time + +Recently, when talking [about AI timelines](https://forum.effectivealtruism.org/posts/W7C5hwq7sjdpTdrQF/announcing-the-future-fund-s-ai-worldview-prize), people tend to give probabilities of AGI by different points in time, and about slightly different operationalizations. This makes different numbers more difficult to compare.  + +But the problem with people giving probabilities about different years could be solved by asking or producing probabilities for all years. For example, we could write something like this: + +``` +// Own probability +_sigma(slope, top, start, t) = { +    f(t) = exp(slope*(t - start))/(1 + exp(slope*(t-start))) +    result = top * (f(t) - f(start))/f(start) +    result +} + +advancedPowerSeekingAIBy(t) = { +    sigma_slope = 0.02 +    max_prob = 0.6 +    first_year_possible = 3 + +    // sigma(t) = exp(sigma_slope*(t - first_year_possible))/(1 + exp(sigma_slope*(t-first_year_possible))) +    // t < first_year_possible ? 0 :  (sigma(t) - sigma(first_year_possible))/sigma(first_year_possible)*max_prob +    sigma(t) = _sigma(sigma_slope, max_prob, first_year_possible, t) +    t < first_year_possible ? 
0 : sigma(t) +} +instantaneousAPSrisk(t) = { +    epsilon = 0.01 +    (advancedPowerSeekingAIBy(t) - advancedPowerSeekingAIBy(t-epsilon))/epsilon +} + +xriskIfAPS(t) = { +    0.5 +} + +xriskThroughAps(t) = advancedPowerSeekingAIBy(t) * xriskIfAPS(t) +``` + +This produces the cumulative and instantaneous probability of “advanced power-seeking AI” by/at each point in time: + + + +And then, assuming a constant 50% probability of x-risk given advanced power-seeking AGI (the `xriskIfAPS` value in the code above), we can get the probability of such risk by every point in time: + + + +Now, the fun part is that the x-risk is in fact not constant. If AGI happened tomorrow, we’d be much less prepared than if it happens in 70 years, and a better model would incorporate that.  + +For individual forecasts, rather than for models which combine different forecasts, <[forecast.elicit.org](https://forecast.elicit.org/)\> had a more intuitive interface. Some forecasts produced using that interface can be seen [here](https://www.lesswrong.com/posts/hQysqfSEzciRazx8k/forecasting-thread-ai-timelines). However, that interface is currently unmaintained. Open Philanthropy has also produced a number of models, generally written in Python. + +### More detailed expected value estimates for potential career pathways + +In the [preceding post](https://forum.effectivealtruism.org/posts/vh3YvCKnCBp6jDDFd/simple-estimation-examples-in-squiggle#Expected_value_for_a_list_of_things__complexity___2_10_), I presented some quick relative estimates for possible career pathways. Shortly after that, Benjamin Todd reached out about estimating the value of various career pathways he was considering. As a result, I created [this more complicated spreadsheet](https://docs.google.com/spreadsheets/d/1QATMTzLUdmxBqD2snhiAkH-_KvwbhGdlYaU8Ho7kjDY/edit?usp=sharing): +  + + + +You can see a higher quality version of this image here: <[https://i.imgur.com/hvq0SeM.png](https://i.imgur.com/hvq0SeM.png)\> + +Instead of using relative values, each row estimates the value of a broad career option in dollars, i.e., in relation to how much EA should be prepared to pay for particular outcomes (e.g., the creation of 80,000 Hours, or CSET). One interesting feature of dollars is that they are a pretty intuitive measure, but this breaks down a bit under interrogation (dollars in which year? adjusted for inflation? are we implicitly assuming that twice as many dollars are twice as good?). But as long as the ratios between estimates are meaningful, they are still useful for prioritization. 
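+
+To illustrate the kind of estimate each row contains, here is a minimal sketch in Squiggle; every number below is a made-up placeholder, not a figure from the actual spreadsheet:
+
+```
+// Hypothetical sketch of one dollar-denominated career-pathway estimate.
+value_of_reference_outcome = 1e8 to 2e9 // hypothetical: what EA might pay for a flagship org to exist, in $
+p_comparable_success = 0.001 to 0.05 // hypothetical chance the pathway produces something comparable
+counterfactual_adjustment = 0.2 to 0.8 // hypothetical: someone else might have done it anyway
+value_of_pathway = value_of_reference_outcome * p_comparable_success * counterfactual_adjustment
+value_of_pathway
+```
+
+Because each row is built the same way, the ratios between rows remain meaningful even if the dollar anchor itself is shaky.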
+ +For a model in the above style which is more hardcore and more complex, see [here](https://github.com/quantified-uncertainty/squiggle-models/blob/master/bahamas/hierarchical-impact-estimate-ftx-bahamas-joel2.squiggle) (or [here](https://www.squiggle-language.com/playground#code=eNrNWW2P47YR%2Fiu8%2FRLL1sqyHV8uToOiQdPkQ4MGveZTNxfQEmUTK5E6kvLGPe9%2FzwwpyXqhdo0CQWLgvLI4MxzOPPNwyPt0p4%2Fy6X1VFFSd73ZGVSy0r75NuZGqecMFN5zm7z9W%2FHDI2XujuDjc7e6WS%2FI9y0um9IP4H1OSfE2EVAXNZ3FI4iiO41XwIHRVwMCnS861uRDF0iphM%2FwRElQKYYgmSUhONL8QeCILfHwOnh%2FEg4Ap%2FlGJxHApYJLkKKVmMxE%2BBmAyowk4CY7NRECWZNZ9Qe4JyMw7r%2BAnOLPnQhb2ZyjCEq3wjDz%2B5WtBzJEJ0k7gtMsPj%2FA9W4G1MvjQWGW5ZiSuvfueM0VVcuQJzUnO9w%2FiwAz4b5dslwRfUTaD74A8N6OKJZXS%2FMSsGGjnqWLi8iBI%2FUmkMEwYDeMgXjE9a4SCqxCMoEBBy1kjHxI3wUDqQTTR%2FMnwnJvzg6jcAzrgZBNZlFKgkV3zCj%2BlkpAxw08g3BvAj6iKkskyZzsCUBEJNRA8B4F1HL4LwlUcbrZBCBNTQ6g4E5iBGF6wvh2j6InlOzJbQyK3McZ%2BaG%2BFkFoFYRzCV187rRRFhOzIOiQwVQHBOOqRp0cJMd%2BzTKquuxtiJPkSjIdk9XZoudGimWGqqxStrFq0mVBkWcYwaEwwrYsqN7zMASlPkETGRcbyXD7pIy93sKh3aGoVba3vsBYoLkQjuUr1bVtAyOxJqscdKX6drR%2BtgTh%2BDIl73trn%2F8bRlzZoPw%2B9wxj19baxU%2Fyhr4bVgRl7oohFEACcaZ4yRXKJXlKe62hkmwuSyjynSi9tKojMCHrbF8wATRdE16X%2F3sYdYIlDUQuwsUzayDTpH4uYRsThayzwsYKFuSpoprvme9l75YDjNZF3LNya%2BLEhFAYrrUvz1vRY1hGMnbCDBl8cgb5SoC2DJWVnmKPyWBLS9qOtdKaBNLnKz%2BTID0fCNFQrQB6YRUhDdKWYBcWJqXoI4aoLc%2BzbfL7%2BfO6gL%2BcZw%2FLfA8%2Ba341OIDClVIgJmTmDkLunoyzstFyXUmu%2B900DsN8i%2BNeBLew4%2BnxYOy%2FbPtKyZEInsgI%2BVrj7QA7zc2ei2FFHjPWO1HEjc3ABWyPLYbMxLD0zqC1b%2B6u6hENyv5q7X5s4DnpVDOlfDSZpbcEyuKjNNS7%2BG%2FJuwM81WgNG%2FSIe%2BogK%2BdkizzHQ1lEJsEf9OCSSPyv%2F2KRNEJAde4mBylUrdSviPEbWNxqZgNbYImsN3owlP3U6Ix6seAipFe9gw89HELU5rnoOjsI2j4T0h9FRgjDZS7c4%2FbvxkaLQ93WM%2FRO636gq%2FyNnYICghSGGURiWDS61am%2Bwx12%2FQUZv30Fpr4M3Q3VYVbsowNQeOnQ0Bn9Set65znztSCiON7btgJ4f%2BkhYgJ0VC8ZN8Ndpy30g%2Bqh0E8bR28BPc11LVyJZdSiheR5SQq%2FAX164NvTMUiggVq961ax6%2FZo7XdWXHbP7Rbx9xbVBj%2BprUS0hWUcuvoJASmrOJ03%2BgW8DP3mt49jDNWgCJ4hehMhYMcm8mn0ITLUpY70Jdigx3JMuXjPinWhSd5BOP9U5tSZJUGObGLMJ3xCPehvxshSwEFT8EcrfHLmG3DMtPoMf9BGISQBOgD4xSrajp4V9bKvrK%2BK1ieY%2B04QXwNYJN4gp1MYg0L07tYEJ2mcvMKnQ2chrEs6Asj7KAaOWihkwAig9NE55OF3MZ0k2L%2BeYxXuXnHkd6GBubuJXLROO3PBp1EPtTaVgOzmwE89dcTa1uRo3XHtwGDYx05O%2FFvNQw2b8IGXqpueIUZq46osHspn5FTZZUwmo81X8DZr8PP5mIOTsWLN4F6JZnqElDKySsiBggaRc04NirAA%2FyRG2o8hT3c7Qxb9nu8FoEBx%2FFXdk%2B7Hxn5Nq8etix2L6KuZZrs9j2MXhX0oWRE%2BCAT9%2BaOz5IeP6OMKGxT9UkoZTV56SPeBecq2HxzugwNbCaG%2FewsZsN%2BdtfNNhYT82hUB0%2BFr5NzAben6ShuZe7a3TXvu1bXBbPf8GM9pf%2Fr8dpZ5lqsl1o9E1nlN9bi3YCddUM9uR7Edoom%2Bs5bsx8cO4kWx5%2BnWK7necJwTsTcwFZ3VFNSQUtja8L%2FIwmGvIoe7ogXLha4De2aZli5dWCAF0txIcSrCYufNf9M4HK2uPinTS7mZk92rW3pONOspO81%2FAFCdLUzyrZ2jPpJvQAyAbiguZwIUdjcbxmADHVby3Ss91SCv8ku9jPTgEL9CzBVnPy3UAh%2BJFEdyU8pQW9DBOdXMh88u3f1OsrExddNea%2FaGp2G3dEr61Fwc%2Fe1Pr9u%2BBLQ9vTNwO2IQ4R30Z%2BfiLgrC58cjruCctXZ0pH%2F1TzVF5OrbusQmxdb2%2Bde7ddDfXzuBFPRxdX14F%2FVfSoORuvGdXnagrEEyd9ryafZFg2Ml4ddxQMOAOr2ifVYLRbuhVqseCiVs0r05PIhhi3KvihjqywFZRFPkSMOtlYDEM7KKJ1qJd12Lg833tSTC3IamR4v7U%2F1lx6oa7BkZQ%2Fy%2FG37kuc4o3o8tlXyJ6KZnL5atoWy4%2F2RksyHavYSt0sm6i3et4quWbtngSS7Vc2yBMg6KRdPeprwGhlm6YbhIEtZzNze4WHC%2BXTc5chlpGgzchuYeN7jtgs%2B%2BsJPlA%2FiXyc9M1YyK%2FIooVwOzaHnTwWu%2FN3fNvDguDhg%3D%3D) in the Squiggle playground.) 
+ +### A sketch of a more parsimonious estimate of AMF’s impact + +The estimates in [this post](https://forum.effectivealtruism.org/posts/4Qdjkf8PatGBsBExK/adding-quantified-uncertainty-to-givewell-s-cost), and overall GiveWell’s [estimates](https://docs.google.com/spreadsheets/d/1tytvmV_32H8XGGRJlUzRDTKTHrdevPIYmb_uc6aLeas/edit#gid=1377543212) of the value of AMF, had been bothering me because they divide the population into very coarse chunks. This is somewhat suboptimal because the first chunk ranges from 0 to 4 years, but malaria mortality [differs a fair amount between a newborn and a four-year-old](https://nunosempere.com/blog/2022/09/28/granular-AMF/). + +Instead, we could express impact estimates in a more elegant functional form. I’ll sketch what this would look like, but I’ll stop halfway through, because at some point the functional form would require more research about mortality at each age. + +The core of the impact estimate is a function that takes the number of beneficiaries, the age distribution of a population, and the benefit of that intervention for someone of a given age, and outputs an estimate of impact. + +In Squiggle, this would look as follows: + +``` +valueOfInterventionInPopulation(num_beneficiaries, population_age_distribution, benefitForPersonOfAge) = { +  // draw an age for each beneficiary from the population age distribution +  age_of_beneficiaries = sampleN(population_age_distribution, num_beneficiaries) +  // compute the benefit for each beneficiary, given their age +  benefits_array = List.map(age_of_beneficiaries, {|a| benefitForPersonOfAge(a)}) +  // sum the individual benefits +  total_benefits = List.reduce(benefits_array, 0, {|acc, value| acc + value}) +  total_benefits +} +``` + +So for example, if we [feed](https://www.squiggle-language.com/playground/#code=eNq9kk1rhDAQhv9K8KQ0LRaWHhZ66KGFhdJd8BqQWR1tIE5sTLZd1P%2FeqLWwXXePvU3m43nnI23QvOvPxFUVmGOwtsYhH13PubTazB5J0kpQyYeTZakwsUZSGayDAyiH22JDFs0ByUpNG9rp2ikY7JBcle6RsJCZBCOx4az%2BjaZQYprLxsP2bnBwNuXaF212aBpN2%2BKpxIg9slYQY0O%2BLk6BPtZAVSt8C6%2BSzzqJBuKPXpOCMXD0rFdfdFdBHS5pcdZ20C03GULUj0irLah0Bs9Ig7nLMDzV4ywekVnG2bjJjnmb3UyPJZygXpCgs2G8zH0cx4Ku7GDM8Ty28nlKFpjiV42ZBcqGyVdj7MHHLsw3HeJv4e1wFUH%2F8xGC%2Fhu08wBc) the following variables to our function: + +``` +num_beneficiaries = 1000 +population_age_distribution = 10 to 40 +life_expectancy = 40 to 60 +benefitForPersonOfAge(age) = life_expectancy - age +valueOfInterventionInPopulation(num_beneficiaries, population_age_distribution, benefitForPersonOfAge) +``` + +Then we are saying that we are reaching 1000 people, whose age distribution looks like this: + + + +(This could use a bit more work to resemble an actual population pyramid.) + +And that the benefit is just the remaining life expectancy. This produces the following estimate, in person-years: + + + +But the assumptions we have used aren’t very realistic. We are essentially assuming that we are creating clones of people at different ages, and that they wouldn’t die until the end of their natural 40 to 60-year lifespan. + +To shed these unrealistic assumptions, and produce something we can use to estimate the value of the AMF, we have to: + +1. Add uncertainty about the shape of the population, i.e., uncertainty about how the population pyramid looks +2. Add uncertainty about how many people a distribution reaches +3. Change the shape of the uncertainty about the benefit to more closely resemble the effects of bednet distribution + +The first two are relatively easy to do.
+ +For uncertainty about the number of beneficiaries, we could naïvely write: + +``` +valueWithUncertaintyAboutNumBeneficiaries(num_beneficiaries_dist, population_age_distribution, benefitForPersonOfAge) = { +  numSamples = 1000 +  // sample many possible numbers of beneficiaries... +  num_beneficiaries_samples_list = sampleN(num_beneficiaries_dist, numSamples) +  // ...compute the value of the intervention for each, and mix the results +  benefits_list = List.map(num_beneficiaries_samples_list, {|n| valueOfInterventionInPopulation(n, population_age_distribution, benefitForPersonOfAge)}) +  result = mixture(benefits_list) +  result +} +``` + +However, that would be very slow, because we would be repeating an expensive calculation unnecessarily. Instead, we can do [this](https://www.squiggle-language.com/playground/#code=eNrFk9tKw0AQhl9lyVWqUVPwAIIXFRQKUoWg3gTCNp2kC5vZuAe1tH13dzc9m1bQC%2B82M7PfzM7%2FZxqosfhITFVROQmutTQQ%2BdDdiGkhlxGGTDPKkzfDypJDoiXDMrgO3ik38Fj0UYN8B9RMYB%2BfRG04decQTZUNAaFgOaOSgYpIvcpmtIRsxJSFDY0LRKSp1fdCPoFUAh%2BLXgkdckOmKRLi6kWxDbQ5RauawyA8SP42SccRF%2F1URqWkE8t6sJdOK1qHbb0iMp3RWfuQIe3MPVILTXm2BC%2BREkYmh3C7X0Rij8zziPhNzog9k%2BPmow2X4jzFFH3%2BlenxM%2BYgNWWoJ72hMHpgqtvNib8L4NfyJxUkFCDBNh7YSDeO463gixvNJn5yxpryq2HcFogTNfHiq2aU7iK48%2BTGICrjFrzhl327WVO3PbK4vrLI4UZOWZztLOYIz9YPbwSWoAx33Ip9aiM3POIoGyUL7VNsn9sSruLYGYZ0L5woB7bqt%2BUqz20dZwVk8FlDrinm7ic497lLm9tj9cYNuxdP3A%2F6T%2BYM5l83mcoE): + +``` +valueWithUncertaintyAboutNumBeneficiaries(num_beneficiaries_dist, population_age_distribution, benefitForPersonOfAge) = { +  // compute the expensive estimate once, for a reference population size... +  referenceN = 1000 +  referenceValue = valueOfInterventionInPopulation(referenceN, population_age_distribution, benefitForPersonOfAge) + +  // ...and then rescale it linearly to each sampled number of beneficiaries +  numSamples = 1001 +  num_beneficiaries_samples_list = sampleN(num_beneficiaries_dist, numSamples) +  benefits_list = List.map(num_beneficiaries_samples_list, {|n| referenceValue*n/referenceN}) +  result = mixture(benefits_list) +  result +} +``` + +That is, we are calculating the value for a beneficiary population of 1000, and then scaling it up or down linearly. This takes about 6 seconds to compute in Squiggle. + +Now, when adding uncertainty about the shape of the population, we are not going to be able to use that trick, and computation will become more expensive. In the limit, maybe I would want to have a distribution of distributions.
But in the meantime, I’ll just have a list of possible population shapes, and [compute the shape of uncertainty over those](https://www.squiggle-language.com/playground/#code=eNrFVN9r2zAQ%2FleEn5zOa5VRd1DoQwctFEZaMFsf5mEU5%2BwI5JMnS11DnP99ku3UTpofkD307Xz36btP95dWvWou%2FkamKJhaeNdaGQia1N2Ma6nWGY5ccyai%2F4bnuYBIK465d%2B29MGHgMXtADeoFUHOJD%2FgkSyOYi300RTIFhIynnCkOVUDKt2rCckhmvLJkU%2BMSAWmx%2Bl6qJ1CVxMfsNocRuSHLGAlxeJltEtpaxYpSwMQ%2FyPxOycgxdv2qhCnFFpbru310XrDS39UrIMua1btF%2Bmy0aii11EwkayJ15QKZiYFf7NfQGhDmaYBaSZZExuTT%2B3HLroYVzHG2NSfuZ7%2FwBSUZhz14nYqjZ6Y4ttQ8XsDmrH8lwsKMlBgG09sZkwp3Uj%2BdNJs4dhm9CwniXFTIM7UqDG%2FaqWMu%2BTWldsFqRJhiQf7sm82PevmjnTH31bkcCPnLNZbgznDi%2F7ircEKKiMcb8FftVGDHXEsA8gx7%2FvhRnNWwinWd8IP%2B793HseJl%2FUBTE0%2BZK1Pc%2BHYXS3TLy39LwG5pKOAtGHYh1ddGPaAsAeEPWBMe4SLw0FsMb9j3D0LK%2BArpe79IOPQ%2FaOCZ5DAawmpZpi6t%2B6Suuo1re150VrTtw9%2Bdu%2FwB%2B5hjN7qH6CfGq0%3D): + +``` +valueWithUncertaintyAboutPopulationShape( + num_beneficiaries_dist, + population_age_distribution_list, + benefitForPersonOfAge + ) = { +  // compute the value under each candidate population shape, then mix the results +  benefits_list = List.map(population_age_distribution_list, + {|population_age_distribution| + valueWithUncertaintyAboutNumBeneficiaries( + num_beneficiaries_dist, + population_age_distribution, + benefitForPersonOfAge + )}) +  result = mixture(benefits_list) +  result +} +population_age_distribution_list = [ + to(2, 40), to(2, 50), to(2, 60), + to(5, 40), to(5, 50), to(5, 60), + to(10, 40), to(10, 50), to(10, 60) +] +``` + +We still have to tweak the benefits to better capture the effects of distributing malaria nets. One first attempt might look as follows: + +``` +benefitForPersonOfAge(age) = { +  // no benefit after age 5; before that, the benefit is the extra life expectancy +  // from halving a fairly high counterfactual child mortality +  result = age > 5 ? mixture(0) : { +    counterfactual_child_mortality = SampleSet.fromDist(0.01 to 0.07) +    // https://apps.who.int/gho/data/view.searo.61200?lang=en +    child_mortality_after_intervention = counterfactual_child_mortality/2 +    chance_live_before = (1-(counterfactual_child_mortality))^(5-age) +    chance_live_after = (1-(child_mortality_after_intervention))^(5-age) +    value = (chance_live_after - chance_live_before) * (life_expectancy - age) +    value +  } +  result +} +``` + +That is, we are modelling an example intervention that halves child mortality, and taking counterfactual child mortality to be pretty high. The final result looks like [this](https://www.squiggle-language.com/playground/#code=eNrFVFFvmzAQ%2FiunPEFHwDRJK0Xqqk7bpEpTWyna9rBsyCUHWAKbGTttleS%2FzwZSkjRNpO6hTxx35%2B8%2B33e%2BRa%2FKxMNEFwWVT72xkhq92vVlxpSQaw%2FjTDGaT%2F5qlqY5TpRkPO2Ne3Oaa7xNrrlCOUeumODX%2FE6UOqfWdrguonvkmLCYUcmw8qB8jkY0xWjGKgN2r63DgyZXfRXyDmUl%2BG1ylaILF7CYcgCbL5JtQBOraFHmeOMcRH7BxLWIbb0qolLSJ4P1zRzyC1o6%2B2p5sFjS5X6SDnVXNaQSiubRGngNKXGmY3S263lAasg49qDu5BKMDR%2Ban31wU76a8imv4z%2BZyr7zGKWijKunq3uh1Y0uPm0yfilA3Zb%2FUkFighJN4RvjCQkhW84flpoJHJuMDuVNZGwXwIo6qcWvGiph69y5cjMgVZQb4I15ea03Her2jLTHn0fkcCGrLF%2FuNOaEB93FG4ElVjq3uAV7VFpuzIhF2Ug5pn3X3ElGS3yL9C3xw%2Fq%2F2o%2FjwIvlgZwlvMtYv02FY3c1SL%2BUcEIPBsT1Gmv4bJ0Zy7xs57QJt%2BawM9cJgy5h0CUMuoShByPi%2Fp7y%2FR0xNM4JsVsEwpF9qTlLMMLHEmNFeWw33pDY6JmJvbLXtp5%2B2yXjhI8wgksoHh3iwriJA8RC2xef0Fhps7fijOWzqBBGzJwpW655WRNUfiJF8dlwdIhPQsvBfM%2FdBiYIIFOqrMZBQMuy8h8y4ZtpCNJMBDOqaDBn%2BOBXSKXwz8JTQi5zytML5C2L7bIRTQyniG3sIsPkMNXgdA1l%2BoRG0zma%2FiZC2uXmhH3n8HHX%2FeOM%2BrTeVLswNRuDAg3OUa67WPN2xTovQft7%2BLpwAs6u7n3YwbPmamfM323VTHlv9Q%2B7JiAl) (archived [here](https://gist.github.com/NunoSempere/715fd697ff3ebbb704e4a239e559d148)). The output unit is years of life saved.
As is, it doesn’t particularly correspond to the impact of any actual intervention, but hopefully it could be a template that GiveWell could use, after some research. + +But for reference, the distribution’s impact looks as follows: + + + +### Calculate optimal allocation given diminishing marginal values + +Suppose that we have some diminishing marginal return functions. Then, we may want to estimate the optimal allocation to each opportunity. + +We can express diminishing marginal return functions using two possible syntaxes: + +``` +diminishingMarginalReturns1(funds) = 1/funds +diminishingMarginalReturns2 = {|funds| 1/(funds^2)} +``` + +The first syntax is more readable, but the second one can be used without a function definition, which is useful for manipulating functions as objects and defining them programmatically, as explained in this footnote[^1]. + +Once we have a few diminishing marginal return curves, we can put them in a list/array: + +``` +diminishingMarginalReturns1(funds) = 1/(100 + funds) +diminishingMarginalReturns2 = {|funds| 1/(funds^1.1)} +diminishingMarginalReturns3 = {|funds| 100/(1k + funds^1.5)} +diminishingMarginalReturns4 = {|funds| 200/(funds^2.2)} +diminishingMarginalReturns5 = {|funds| 2/(100*funds + 1)} +uselessDistribution(funds) = 0 +negativeOpportunity(funds) = 0 + +listOfDiminishingMarginalReturns = [ +  diminishingMarginalReturns1, diminishingMarginalReturns2, +  diminishingMarginalReturns3, diminishingMarginalReturns4, +  diminishingMarginalReturns5, uselessDistribution, +  negativeOpportunity, {|funds| {1/(1 + funds + funds^2)}} +] +``` + +And then we can specify our amount of funds: + +``` +availableFunds = 1M // dollars +calculationIncrement = 1 // calculate dollar by dollar +Danger.optimalAllocationGivenDiminishingMarginalReturnsForManyFunctions(listOfDiminishingMarginalReturns, availableFunds, calculationIncrement) +``` + +So in this case, the difficulty comes not from applying a function, but from adding that function to Squiggle. This can be seen [here](https://github.com/quantified-uncertainty/squiggle/blob/develop/packages/squiggle-lang/src/rescript/FR/FR_Danger.res#L278). + +Other software (e.g., Python, R) could also do this, but the usefulness of the above comes from integrating that into Squiggle. For example, we could have an uncertain function produced by some other program, then take its mean (representing its expected value), and feed it to that calculator. + +The [Survival and Flourishing Fund](https://survivalandflourishing.fund/) has some [software](https://youtu.be/jWivz6KidkI?t=487) to do something like this. It has a graphical interface which people can tweak, at the expense of being a bit simpler—their diminishing marginal return curves are determined by only three points. + +### Defining a toy world + +Lastly, we will define a simple toy world, which has some population growth and some economic growth, as well as some chance of extinction each year. And its value is defined as a function of the consumption of each person, times the chance that the world is still standing. + +For practical purposes, after some set point we stop calculating, and we estimate the remaining value as some function of the current value. We can understand this as either a) the heat death of the universe, or b) an arbitrary limit such that we are interested in the behaviour of the system as that limit goes to infinity, but we can only extend that limit with more computation.
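+ +As a minimal sketch of this setup (every number below is made up for illustration, and using the logarithm of consumption as each person's value is just one arbitrary choice), the core calculation might look like this: + +``` +// Illustrative toy-world sketch; the real model is in the repository linked just below +yearlyRiskOfExtinction = 0.001 +yearlyPopulationGrowth = 0.01 +yearlyEconomicGrowth = 0.02 +cutoffYear = 200 // the arbitrary limit after which we stop calculating + +valueInYear(t) = { +  population = 8B * (1 + yearlyPopulationGrowth)^t +  consumptionPerCapita = (1 + yearlyEconomicGrowth)^t +  chanceWorldStillStanding = (1 - yearlyRiskOfExtinction)^t +  population * log(1 + consumptionPerCapita) * chanceWorldStillStanding +} + +valueUntilCutoff = List.reduce(List.upTo(0, cutoffYear), 0, {|acc, t| acc + valueInYear(t)}) +remainingValue = valueInYear(cutoffYear) * 20 // crude stand-in for the value after the cutoff +totalValue = valueUntilCutoff + remainingValue +``` + +The actual model linked just below has more moving parts, but this is the shape of the computation.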
+ +This setup allows us to coarsely compare an increase in consumption vs an increase in economic growth vs a reduction in existential risk. In particular, given this setup, existential risk and economic growth would be valued less than in the infinite horizon case, so if their value is greater than some increase in consumption in this toy world, we will have reason to think that this would also be the case in the real world. + +The code is a bit too large to simply paste into an EA Forum post, but it can be seen [here](https://github.com/quantified-uncertainty/squiggle-models/blob/master/toy-world/toy-world.squiggle). For a further tweak, you can see leaner code [here](https://github.com/quantified-uncertainty/squiggle-models/blob/master/toy-world/toy-world.squiggleU) which relies on the import functionality of the [squiggle-cli-experimental package](https://github.com/quantified-uncertainty/squiggle/tree/develop/packages/cli). + +We can also look at the impact that various interventions have on our toy world, with further details [here](https://docs.google.com/spreadsheets/d/1WnplTYJJMeh0zXVUTPBaihE7n1kneW5LDidLvJcGcv4/edit?usp=sharing): + + + +We see that of the sample interventions, increasing population growth by 0.5% has the highest impact. But 0.5%/year is a pretty large amount, and it would be pretty difficult to engineer. So further work could look at the relative difficulty of each of those interventions. Still, that table may serve to make a qualitative argument that interventions such as increasing population growth, economic growth, or reducing existential risk are probably more valuable than directly increasing consumption. + +## Conclusion + +I presented a few more advanced Squiggle models. + +A running theme was that expressing estimates as functions—e.g., the chance of AGI at every point in time, the impact of an intervention for all possible ages, a list of diminishing marginal return functions for a list of interventions, a toy world with a population assigned some value at every point in time—might allow us to come up with better and more accurate estimates. Squiggle is not the only software that can do this, but hopefully it will make such estimation easier. + +[^1]: We can write: +    ``` +    listOfFunctions = [ {|funds| 1/(funds^2)},  {|funds| 1/(funds^3)}] +    ``` +    or even: +    ``` +    multiplyByI(i) = {|x| x*i} +    listOfFunctions = List.map(List.upTo(0,10), {|i| multiplyByI(i)}) +    ``` +    or, without the need for a helper: +    ``` +    listOfFunctions2 = List.map(List.upTo(0,10), {|i| {|x| x*i}}) +    listOfFunctions2[4](2) // 4 * 2 = 8 +    ``` +    This is standard functional programming stuff, and some functionality is missing from Squiggle, such as a _List.length_ function. But still.
diff --git a/blog/2022/10/12/forecasting-newsletter-september-2022/.images/b3204abd62b736a6e6f183574c20b5c4fef3bc52.png b/blog/2022/10/12/forecasting-newsletter-september-2022/.images/b3204abd62b736a6e6f183574c20b5c4fef3bc52.png new file mode 100644 index 0000000..c2f71b2 Binary files /dev/null and b/blog/2022/10/12/forecasting-newsletter-september-2022/.images/b3204abd62b736a6e6f183574c20b5c4fef3bc52.png differ diff --git a/blog/2022/10/12/forecasting-newsletter-september-2022/.images/e59e88a893bcad65921f367ca07305f714b1faa8.jpg b/blog/2022/10/12/forecasting-newsletter-september-2022/.images/e59e88a893bcad65921f367ca07305f714b1faa8.jpg new file mode 100644 index 0000000..e3c57c3 Binary files /dev/null and b/blog/2022/10/12/forecasting-newsletter-september-2022/.images/e59e88a893bcad65921f367ca07305f714b1faa8.jpg differ diff --git a/blog/2022/10/12/forecasting-newsletter-september-2022/index.md b/blog/2022/10/12/forecasting-newsletter-september-2022/index.md new file mode 100644 index 0000000..c9065e9 --- /dev/null +++ b/blog/2022/10/12/forecasting-newsletter-september-2022/index.md @@ -0,0 +1,188 @@ +Forecasting Newsletter: September 2022. +============== + +## Highlights + +* PredictIt vs Kalshi vs CFTC saga [continues](https://comments.cftc.gov/Handlers/PdfHandler.ashx?id=34691#?w=sapqmnxoxn) +* Future Fund announces [$1M+ prize](https://forum.effectivealtruism.org/posts/W7C5hwq7sjdpTdrQF/announcing-the-future-fund-s-ai-worldview-prize#comments) for arguments which shift their probabilities about AI timelines and dangers +* Dan Luu [looks at the track record of futurists](https://danluu.com/futurist-predictions/) + +## Index + +* Prediction Markets, Forecasting Platforms &co + * PredictIt, Kalshi & the CFTC + * Metaculus + * Manifold Markets + * Squiggle + * Odds and Ends +* Research + * Shortform + * Longform + +Browse past newsletters [here](https://forecasting.substack.com/), or view this newsletter on Substack [here](https://forecasting.substack.com/p/forecasting-newsletter-september-57b). If you have a content suggestion or want to reach out, you can leave a comment or find me on [Twitter](https://twitter.com/NunoSempere). + +## Prediction Markets and Forecasting Platforms + +### PredictIt, Kalshi & the CFTC + + + +America, Land of the Free + +Previously: + +* Kalshi hired a former [CFTC commissioner](https://kalshi.com/blog/former-cftc-commissioner-brian-quintenz-joins-our-board) ([a](https://web.archive.org/web/20220201175613/https://kalshi.com/blog/former-cftc-commissioner-brian-quintenz-joins-our-board)). +* The CFTC [withdrew its no-action letter](https://www.cftc.gov/PressRoom/PressReleases/8567-22) ([a](https://web.archive.org/web/20220805010244/https://www.cftc.gov/PressRoom/PressReleases/8567-22)) from PredictIt. +* Kalshi applied to the CFTC for permission to host a market on which party will control the US Congress after the 2022 mid-term elections. The CFTC [asked the public for comments](https://comments.cftc.gov/PublicComments/CommentList.aspx?id=7311) ([a](https://web.archive.org/web/20220828210656/https://comments.cftc.gov/PublicComments/CommentList.aspx?id=7311)) ([secondary source](https://www.politico.com/news/2022/09/05/voters-betting-elections-trading-00054723) ([a](https://web.archive.org/web/20220924141931/https://www.politico.com/news/2022/09/05/voters-betting-elections-trading-00054723))).
+ +Since then, on September 9th, [PredictIt sued the CFTC](https://www.jdsupra.com/legalnews/unpredictable-future-of-political-1333136/) ([a](http://web.archive.org/web/20220925015149/https://www.jdsupra.com/legalnews/unpredictable-future-of-political-1333136/)). Richard Hanania comments on [why he is joining the lawsuit](https://richardhanania.substack.com/p/why-im-suing-the-federal-government) ([a](http://web.archive.org/web/20221001194707/https://richardhanania.substack.com/p/why-im-suing-the-federal-government)). + +Solomon Sia and Pratik Chougule—in collaboration with others like myself—wrote [this extremely thorough letter to the CFTC](https://comments.cftc.gov/Handlers/PdfHandler.ashx?id=34691#?w=sapqmnxoxn) ([a](https://web.archive.org/web/20221012143802/https://comments.cftc.gov/Handlers/PdfHandler.ashx?id=34691#w=sapqmnxoxn)), examining many aspects of the decision. + +There has been [a range of newspaper articles](https://news.google.com/search?q=PredictIt%20CFTC&hl=en-GB&gl=GB&ceid=GB%3Aen) ([a](https://archive.ph/uQEvL)) commenting on the PredictIt spat (e.g., [1](https://www.wsj.com/articles/why-wont-the-cftc-let-you-take-a-position-on-the-election-11582933734), [2](https://slate.com/business/2022/08/predictit-cftc-shut-down-politics-forecasting-gambling.html), [3](https://www.coindesk.com/policy/2021/10/28/the-cftc-vs-the-truth/), [4](https://www.chicagotribune.com/opinion/commentary/ct-opinion-political-prediction-markets-public-discourse-20220906-lfuvziy3fnfkfgw33lzhsno4h4-story.html), etc.), and on Kalshi’s. I particularly liked [this article](https://www.chicagotribune.com/opinion/commentary/ct-opinion-political-prediction-markets-public-discourse-20220906-lfuvziy3fnfkfgw33lzhsno4h4-story.html) ([a](http://web.archive.org/web/20220907164742/https://www.chicagotribune.com/opinion/commentary/ct-opinion-political-prediction-markets-public-discourse-20220906-lfuvziy3fnfkfgw33lzhsno4h4-story.html)) in the Chicago Tribune about how prediction markets are an antidote to degraded public discourse. + +Kalshi has an [interesting newsletter issue](https://www.kalshikit.co/p/obamas-cabinet-used-prediction-markets) ([a](https://web.archive.org/web/20221012110423/https://www.kalshikit.co/p/obamas-cabinet-used-prediction-markets)) in which they briefly report on how the Obama administration used prediction markets for their decision-making. Note that these would have probably been PredictIt's markets. + +### Metaculus + +Per their newsletter, Metaculus reached 1 million predictions. They have also [reorganized](https://nitter.privacy.com.de/fianxu/status/1569537636917825536) as a [public benefit corporation](https://en.wikipedia.org/wiki/Benefit_corporation) ([a](http://web.archive.org/web/20221001234507/https://en.wikipedia.org/wiki/Benefit_corporation)), i.e., a for-profit entity that aims to pursue some positive impact, as distinct from shareholder value. I think this leaves Metaculus in a better position, and decreases the (already pretty small) chance that Metaculus starts doing some damaging gatekeeping, etc.
+ +Metaculus is also building an AI Forecasting team, and hiring for [a number of positions](https://apply.workable.com/metaculus/) ([a](http://web.archive.org/web/20220913093930/https://apply.workable.com/metaculus/)), growing its 12-person [strong team](https://www.metaculus.com/about/) ([a](http://web.archive.org/web/20220925082358/https://www.metaculus.com/about/)), presumably using its [2022 Open Philanthropy Grant](https://www.openphilanthropy.org/grants/metaculus-platform-development/) ([a](http://web.archive.org/web/20220929072721/https://www.openphilanthropy.org/grants/metaculus-platform-development/)). + +### Manifold Markets + +Manifold continued its high development speed, e.g., they added a [Twitch bot](https://manifold.markets/twitch) ([a](http://web.archive.org/web/20221005181649/https://manifold.markets/twitch)) and ran their [first tournaments](https://manifold.markets/tournaments) ([a](https://web.archive.org/web/20221012144555/https://manifold.markets/tournaments)), which I was really glad to see. They have an experimental projects page at [manifold.markets/labs](https://manifold.markets/labs) ([a](http://web.archive.org/web/20221005182149/https://manifold.markets/labs)). And they have added a few reputational features: + +> If a resolved market receives enough reports relative to the number of traders, it will be considered a “bad” market. Creators with enough bad markets will have a warning next to their name on any of their markets. This is just a first step towards reputational features which is a highly requested feature. + +Manifold Markets removed and deprioritized their [numeric markets](https://news.manifold.markets/p/above-the-fold-updates-and-join-our) ([a](http://web.archive.org/web/20220908215157/https://news.manifold.markets/p/above-the-fold-updates-and-join-our)), citing difficulties with user uptake. But from the post, the decision to do so seems like it was evaluated on the wrong grounds: It's not that numeric markets will immediately prove popular and intuitive, it's that experimenting with them is a public good that could unlock value in the medium term. + +More generally, as I’ve been seeing over the past few years, I think that there is a huge attractor of sports and wall-street-type bets. And new prediction-market startups tend to flirt with these a bit. I think this is a mistake, because it’s hard to differentiate oneself from competitors on the basis of better sports betting: traditional sports betting houses like Betfair in Europe or DraftKings in the US are already catering to a similar user base. Instead, my recommendation would be to target virgin communities, to which existing betting houses don’t yet cater. + +You can also see their job board [here](https://www.notion.so/Manifold-Markets-Job-Board-e1b932b3bb2c4ec2b5a95865ec8f0f61) ([a](https://web.archive.org/web/20221012093824/https://www.notion.so/Manifold-Markets-Job-Board-e1b932b3bb2c4ec2b5a95865ec8f0f61)). + +### Squiggle + +[Squiggle](https://www.squiggle-language.com/#code=eNqrVirOyC8PLs3NTSyqVLIqKSpN1QELuaZkluQXQURqARlkDng%3D) is a web-capable language for manipulating probabilities and probability distributions that we at the [Quantified Uncertainty Research Institute](https://quantifieduncertainty.org/) have been working on. In August, we announced a $1k [Squiggle experimentation prize](https://forum.effectivealtruism.org/posts/ZrWuy2oAxa6Yh3eAw/usd1-000-squiggle-experimentation-challenge), which has now been resolved.
The winners are: + +* 1st prize of $600 to [Tanae Rao](https://twitter.com/tanaerao?lang=en-GB) for [Adding Quantified Uncertainty to GiveWell's Cost Effectiveness Analysis of the Against Malaria Foundation](https://forum.effectivealtruism.org/posts/4Qdjkf8PatGBsBExK/adding-quantified-uncertainty-to-givewell-s-cost) +* 2nd prize of $300 to [Dan Wahl](https://danwahl.net/) for [CEA LEEP Malawi](https://forum.effectivealtruism.org/posts/BK7ze3FWYu38YbHwo/squiggle-experimentation-challenge-cea-leep-malawi) +* 3rd prize of $100 to [Erich Grunewald](https://www.erichgrunewald.com/posts/how-many-effective-altruist-billionaires-five-years-from-now/) for [How many EA billionaires five years from now?](https://forum.effectivealtruism.org/posts/Ze2Je5GCLBDj3nDzK/how-many-ea-billionaires-five-years-from-now) + +Congrats! + +We also announced a larger [$5k challenge to quantify the impact of 80,000 hours' top career paths](https://forum.effectivealtruism.org/posts/noDYmqoDxYk5TXoNm/usd5k-challenge-to-quantify-the-impact-of-80-000-hours-top). I think that participation in this contest is fairly valuable in itself, but it also has a fairly high expected monetary value. I invite readers to do a quick estimation: e.g., the contest will have 3 to 15 participants, implying each participant will get between ~$300 and $1.6k. + +I also wrote two posts introducing Squiggle: [Simple estimation examples in Squiggle](https://forum.effectivealtruism.org/posts/vh3YvCKnCBp6jDDFd/simple-estimation-examples-in-squiggle) and a follow-up at [Five slightly more hardcore Squiggle models](https://forum.effectivealtruism.org/posts/BDXnNdBm6jwj6o5nc/five-slightly-more-hardcore-squiggle-models). + +### Odds and Ends + +The FTX Future Fund announces a [$1M+ prize](https://forum.effectivealtruism.org/posts/W7C5hwq7sjdpTdrQF/announcing-the-future-fund-s-ai-worldview-prize) ([a](https://web.archive.org/web/20221002051012/https://forum.effectivealtruism.org/posts/W7C5hwq7sjdpTdrQF/announcing-the-future-fund-s-ai-worldview-prize)) for arguments that shift their probabilities around AGI timelines and dangers. + +Friend of the newsletter Walter Frick has started a [newsletter](https://nonrival.pub/) ([a](https://web.archive.org/web/20221005181423/https://nonrival.pub/)) that combines analysis of a newsworthy topic with an invitation and a prompt for readers to forecast on a related event. The newsletter then reports readers’ forecasts and resolves them when they come due. Readers might remember Walter from [his excellent coverage of the shutdown of Facebook’s Forecast platform](https://qz.com/2069284/facebook-is-shutting-down-its-experimental-app-forecast/) ([a](https://web.archive.org/web/20220730061335/https://qz.com/2069284/facebook-is-shutting-down-its-experimental-app-forecast/)) at Quartz. + +The [Autocast competition](https://forecasting.mlsafety.org/) ([a](http://web.archive.org/web/20221011173753/https://forecasting.mlsafety.org/)) offers $625k in prizes for improving the forecasting abilities of machine learning models. This builds on the [Autocast](https://arxiv.org/abs/2206.15474) ([a](http://web.archive.org/web/20220914001702/https://arxiv.org/abs/2206.15474)) paper. It might be that the contest has a connection to AI safety, but I'm not really seeing it. The deadline to submit results for the warmup round is February 10th. + +Adam Sherman reports on his frustrations with the [UMA project](https://twitter.com/Squee451/status/1579647834957451264) ([a](https://archive.org/details/uma-unreliable-market-assumption-protocol)).
These rhyme somewhat with previous complaints about [Kleros](https://deepfivalue.substack.com/p/the-kleros-experiment-has-failed) ([a](https://web.archive.org/web/20220701003955/https://deepfivalue.substack.com/p/the-kleros-experiment-has-failed)). Abstracting away from the specifics, the UMA oracle is a [Keynesian Beauty Contest](https://en.wikipedia.org/wiki/Keynesian_beauty_contest), meaning that consensus is valued over truth. In this case, a powerful but not dictatorial participant announced that he was going to vote one way, and because the protocol rewards people who vote with the consensus, he convinced others to vote with him. My sense is that a Keynesian Beauty Contest might still be a worthy tradeoff for some crypto protocols because of the added decentralization. But if too many of these events happen, the tradeoff might stop being worth it. + +[Quantified Intuitions](https://www.quantifiedintuitions.org/) is an [epistemics training website](https://forum.effectivealtruism.org/posts/W6gGKCm6yEXRW5nJu/quantified-intuitions-an-epistemics-training-website). Readers might be familiar with the [pastcasting](https://www.pastcasting.com/) app, by the same group. + +The Social Science prediction platform has [added some large-for-graduate-students forecaster incentives](https://socialscienceprediction.org/forecaster_incentives) ([a](http://web.archive.org/web/20220916011552/https://socialscienceprediction.org/forecaster_incentives)). They are offering $100 per 10 surveys completed—a survey is usually just a set of predictions that will be used in a future paper. I welcome this development. I used to view it as annoying that participation was restricted to graduate students and faculty. But the thought came to mind that restriction to academics is just a socially acceptable—if coarse—way of selecting for intelligence without saying as much. + +Reddit has [r/polls/predictions](https://www.reddit.com/r/polls/predictions/) ([a](http://web.archive.org/web/20220709055805/https://www.reddit.com/r/polls/predictions/)), an embryonic implementation of a prediction market tournament inside Reddit. This builds on Reddit's past prediction functionality, as reported [previously](https://forecasting.substack.com/p/forecasting-newsletter-july-2021) ([a](http://web.archive.org/web/20211229170227/https://forecasting.substack.com/p/forecasting-newsletter-july-2021)) in [this newsletter](https://forecasting.substack.com/p/forecasting-newsletter-october-2021) ([a](http://web.archive.org/web/20220217162710/https://forecasting.substack.com/p/forecasting-newsletter-october-2021)). It would be useful to talk to whoever is building this functionality at Reddit. They probably have some different goals, more geared towards being a social media site. But some cross-pollination might still be interesting. + +The Swift Centre has an analysis of [Biden's chances in the 2024 election](https://www.swiftcentre.org/can-biden-win-in-2024/) ([a](http://web.archive.org/web/20220916112924/https://www.swiftcentre.org/can-biden-win-in-2024/)). 
See also some other forecasts on [Metaforecast](https://metaforecast.org/?query=US+president) ([a](https://archive.ph/4n30X#from=https://metaforecast.org/?query=US+president)), e.g., on [Polymarket](https://polymarket.com/market/will-joe-biden-win-the-us-2024-democratic-presidential-nomination) ([a](http://web.archive.org/web/20220128214008/https://polymarket.com/market/will-joe-biden-win-the-us-2024-democratic-presidential-nomination)) or on [Betfair](https://www.betfair.com/exchange/plus/politics/market/1.178176964) ([a](http://web.archive.org/web/20210831231714/https://www.betfair.com/exchange/plus/politics/market/1.178176964)). + +[Craze](https://www.ycombinator.com/companies/craze) ([a](https://web.archive.org/web/20221012093558/https://www.ycombinator.com/companies/craze)) is a Y-Combinator-funded company which brings prediction markets to India. + +I was surprised to see that famous rapper Nicki Minaj has [partnered](https://maximbet.com/nicki-minaj) ([a](http://web.archive.org/web/20220531210904/https://maximbet.com/nicki-minaj)) with a [sports](https://nitter.privacy.com.de/nickiminaj/status/1531670747399065600) ([a](https://web.archive.org/web/20221012110813/https://nitter.privacy.com.de/nickiminaj/status/1531670747399065600)) betting [site](https://nitter.privacy.com.de/nickiminaj/status/1531670747399065600) ([a](https://web.archive.org/web/20221012110813/https://nitter.privacy.com.de/nickiminaj/status/1531670747399065600)). Curious. + +INFER continues to offer small-money incentives for forecasters, to send me [mildly cringy emails](https://i.imgur.com/j0Ar3BH.png) ([a](http://web.archive.org/web/20221012093838/https://i.imgur.com/j0Ar3BH.png)), and to talk about a ["Global AI Race"](https://mailchi.mp/cultivatelabs/cset-foretell-launch-9372521) ([a](http://web.archive.org/web/20221012112754/https://mailchi.mp/cultivatelabs/cset-foretell-launch-9372521)). Still, I'd continue to recommend it for university students, because it's one of the few sites that have a team functionality. + +On Good Judgment Open, [Will Amazon.com begin to accept any cryptocurrency for purchases on the US site before 1 October 2022?](https://www.gjopen.com/questions/2090-will-amazon-com-begin-to-accept-any-cryptocurrency-for-purchases-on-the-us-site-before-1-october-2022) ([a](http://web.archive.org/web/20220529175114/https://www.gjopen.com/questions/2090-will-amazon-com-begin-to-accept-any-cryptocurrency-for-purchases-on-the-us-site-before-1-october-2022)) just resolved negatively. I remember it being at 30% a year ago. Crazy times. + +## Research + +### Shortform + +Nostalgebraist looks at [AI forecasting one year in](https://nostalgebraist.tumblr.com/post/695521414035406848/on-ai-forecasting-one-year-in) ([a](http://web.archive.org/web/20220917144833/https://nostalgebraist.tumblr.com/post/695521414035406848/on-ai-forecasting-one-year-in)) and warns against taking it as a [stylized fact](https://en.wikipedia.org/wiki/Stylized_fact) ([a](http://web.archive.org/web/20220927235855/https://en.wikipedia.org/wiki/Stylized_fact)) that AI progress is going faster than forecasters expected.
+ +[Samotsvety Forecasting](https://samotsvety.org/), my forecasting group, looks at the probability of [various AI catastrophes](https://forum.effectivealtruism.org/posts/EG9xDM8YRz4JN4wMN/samotsvety-s-ai-risk-forecasts) in the future, and at the [risk of a nuclear bomb being used](https://forum.effectivealtruism.org/posts/2nDTrDPZJBEerZGrk/samotsvety-nuclear-risk-update-october-2022) ([a](https://web.archive.org/web/20221012124008/https://forum.effectivealtruism.org/posts/2nDTrDPZJBEerZGrk/samotsvety-nuclear-risk-update-october-2022)) in the coming months (see also a [follow-up](https://forum.effectivealtruism.org/posts/8k9iebTHjdRCmzR5i/overreacting-to-current-events-can-be-very-costly) by Kelsey Piper). + + + +Taken from [Polymarket](https://polymarket.com/market/will-russia-use-a-nuclear-weapon-before-2023). Note that money is worth less in the event of a nuclear war. + +Some researchers at the University of Pennsylvania are [looking for forecasters to predict replication outcomes](https://nitter.privacy.com.de/rajtmajer_sarah/status/1573465138300059649) ([a](https://web.archive.org/web/20221012114519/https://nitter.privacy.com.de/rajtmajer_sarah/status/1573465138300059649)). They are paying a $20 base incentive and $25 per market. This is low in absolute terms, but high if you enjoy doing this kind of thing anyways. h/t Ago Lajko. + +Richard Hanania argues that [the problem with polling might be unfixable](https://richardhanania.substack.com/p/the-problem-with-polling-might-be/), i.e., that Republican nonresponse bias might be very hard to estimate. I left a comment with some suggestions, but I agree that the situation [looks grim](https://richardhanania.substack.com/p/the-problem-with-polling-might-be/comment/9327296) ([a](http://web.archive.org/web/20220927183755/https://richardhanania.substack.com/p/the-problem-with-polling-might-be/comment/9327296)). + +[Two](https://www.lesswrong.com/posts/YQ8H4e7z3q8ngev7J/raising-the-forecasting-waterline-part-1) ([a](http://web.archive.org/web/20220710073545/https://www.lesswrong.com/posts/YQ8H4e7z3q8ngev7J/raising-the-forecasting-waterline-part-1)) [posts](https://www.lesswrong.com/posts/YEKHh5nyqhpE3E4Bm/raising-the-forecasting-waterline-part-2) ([a](http://web.archive.org/web/20220927155721/https://www.lesswrong.com/posts/YEKHh5nyqhpE3E4Bm/raising-the-forecasting-waterline-part-2)) from ten years ago look at the lessons learnt by someone who participated in the IARPA forecasting tournament that led to the Superforecasting book.
+* As part of their decision-making, Open Philanthropy commissioned research by [friends of the newsletter Arb research](https://arbresearch.com/) ([a](http://web.archive.org/web/20221011153414/https://arbresearch.com/)) on the [track record of the three biggest science-fiction authors of the 20th century](https://arbresearch.com/files/big_three.pdf) ([a](http://web.archive.org/web/20220711161231/https://arbresearch.com/files/big_three.pdf)) (Asimov, Heinlein and Clarke) +* The CEO of Open Philanthropy later [used that analysis](https://www.cold-takes.com/the-track-record-of-futurists-seems-fine/) ([a](http://web.archive.org/web/20220914130350/https://www.cold-takes.com/the-track-record-of-futurists-seems-fine/)), as well as other research Open Philanthropy had been doing, to justify and explain Open Philanthropy's investments in AI Safety. + +In his own analysis of futurists' track record, Dan Luu seems to point out that this process has some characteristics of a shit show. Here is a long extract from Luu's post, minimally edited for readability: + +> We've seen, when evaluating futurists with an eye towards evaluating longtermists, Karnofsky heavily rounds up in the same way Kurzweil and other futurists do, to paint the picture they want to create.  +> +> There's also the matter of his summary of a report on Kurzweil's predictions being incorrect because he didn't notice the author of that report used a methodology that produced nonsense numbers that were favorable to the conclusion that Karnofsky favors.  +> +> It's true that Karnofsky and the reports he cites do the superficial things that the forecasting literature notes is associated with more accurate predictions, like stating probabilities. But for this to work, the probabilities need to come from understanding the data.  +> +> If you take a pile of data, incorrectly interpret it and then round up the interpretation further to support a particular conclusion, throwing a probability on it at the end is not likely to make it accurate.  +> +> Although he doesn't use these words, a key thing Tetlock notes in his work is that people who round things up or down to conform to a particular agenda produce low accuracy predictions. Since Karnofsky's errors and rounding heavily lean in one direction, that seems to be happening here. + +> We can see this in other analyses as well. Although digging into material other than futurist predictions is outside of the scope of this post, nostalgebraist has done this and he said (in a private communication that he gave me permission to mention) that Karnofsky's summary of [Could Advanced AI Drive Explosive Economic Growth?](https://www.openphilanthropy.org/research/could-advanced-ai-drive-explosive-economic-growth/) is substantially more optimistic about AI timelines than the underlying report in that there's at least one major concern raised in the report that's not brought up as a "con" in Karnofsky's summary. +> +> And nostalgebraist later wrote [this post](https://nostalgebraist.tumblr.com/post/693718279721730048/on-bio-anchors) ([a](http://web.archive.org/web/20221004024842/https://nostalgebraist.tumblr.com/post/693718279721730048/on-bio-anchors)), where he (implicitly) notes that the methodology used in a report he examined in detail is fundamentally not so different than what the futurists we discussed used. 
There are quite a few things that may make the report appear credible (it's hundreds of pages of research, there's a complex model, etc.), but when it comes down to it, the model boils down to a few simple variables.  +> +> In particular, a huge fraction of the variance of whether or not TAI is likely or not likely comes down to the amount of improvement will occur in terms of hardware cost, particularly FLOPS/$. The output of the model can range from 34% to 88% depending how much improvement we get in FLOPS/$ after 2025. Putting in arbitrarily large FLOPS/$ amounts into the model, i.e., the scenario where infinite computational power is free (since other dimensions, like storage and network aren't in the model, let's assume that FLOPS/$ is a proxy for those as well), only pushes the probability of TAI up to 88%, which I would rate as too pessimistic, although it's hard to have a good intuition about what would actually happen if infinite computational power were on tap for free.  +> +> Conversely, with no performance improvement in computers, the probability of TAI is 34%, which I would rate as overly optimistic without a strong case for it. But I'm just some random person who doesn't work in AI risk and hasn't thought about too much, so your guess on this is as good as mine (and likely better if you're the equivalent of Yegge or Gates and work in the area). + +I'm sympathetic to both sides of this. + +On the one hand, I worry that the side concerned about AI safety acts like a machine that predictably surfaces and amplifies arguments in favor of its side, and predictably discounts arguments for the other side. + +On the other hand, I also see Luu's analysis as perhaps too harsh, e.g.: + +* not giving partial credit for predictions that are missed by a few years or that only happen in rich countries rather than worldwide, +* considering predictions that have a "may" as unfalsifiable (instead of, e.g., assigning a probability of 50% and looking at the resulting Brier or log score), +* evaluating two propositions connected by an "and" as one failed prediction instead of one correct and one incorrect prediction, +* evaluating predictions about the "twenty-first century" as having already failed, +* generally being on the harsh side of things. + +Overall, it seems like there is a garden of forking paths with regard to the more specific question of how accurate past futurists were, but also with regard to the more general question about the degree to which it is possible to make predictions about future events, particularly about transformative technologies. + +One way to navigate that garden of forking paths would be an [adversarial collaboration](https://en.wikipedia.org/wiki/Adversarial_collaboration) ([a](http://web.archive.org/web/20220725190412/https://en.wikipedia.org/wiki/Adversarial_collaboration)). Funding for this would probably be available, if not from Open Philanthropy itself then from [the FTX Future Fund](https://ftxfuturefund.org/) ([a](http://web.archive.org/web/20221011034322/https://ftxfuturefund.org/)), from [Nonlinear](https://www.super-linear.org/#list2) ([a](https://web.archive.org/web/20221012112602/https://www.super-linear.org/#list2)), or even from [myself](https://nitter.privacy.com.de/NunoSempere). I mention funding because I personally view cold hard cash as an honest signal that some work is truly perceived to be valuable. But one could also choose to carry out an adversarial collaboration pro bono, for the sake of curiosity, etc.
+ +[Price Formation in Field Prediction Markets](https://arxiv.org/abs/2209.08778) is an arXiv preprint which discusses where the accuracy of prediction markets comes from. The two hypotheses it considers are: + +1. from averaging the different pieces of information that each participant has +2. from traders who are able to individually do more research than everyone else, and profit from this. + +They have a method, which I'm not completely convinced by, for identifying "price sensitive" traders, whom they equate with informed traders, and they use their dataset to conclude that hypothesis 2 is mostly what’s going on. They use data from [Almanis](https://www.almanisprivate.com/) ([a](http://web.archive.org/web/20220202051215/https://www.almanisprivate.com/)), one of the smaller prediction market sites that still have some liquidity. + +The paper has some interesting elements. And for all I know, it's better than 99% of the papers in its field. But I'm left with the impression that the topic of research is a bit of a bad fit for academic investigation, because one could get a better idea of the dynamics of prediction markets by listening to the [Star Spangled Gamblers](https://starspangledgamblers.com/) ([a](http://web.archive.org/web/20221001143818/https://starspangledgamblers.com/)) guys. + +--- + +Note to the future: All links are added automatically to the Internet Archive, using this [tool](https://github.com/NunoSempere/longNowForMd) ([a](http://web.archive.org/web/20220711161908/https://github.com/NunoSempere/longNowForMd)). "(a)" for archived links was inspired by [Milan Griffes](https://www.flightfromperfection.com/) ([a](http://web.archive.org/web/20220814131834/https://www.flightfromperfection.com/)), [Andrew Zuckerman](https://www.andzuck.com/) ([a](http://web.archive.org/web/20220316214638/https://www.andzuck.com/)), and [Alexey Guzey](https://guzey.com/) ([a](http://web.archive.org/web/20220901135024/https://guzey.com/)). + +--- + +> — What are you waiting for? +> — I don't know... Something amazing, I guess.
+> — Me too, kid + +[The Incredibles](https://en.wikipedia.org/wiki/The_Incredibles), 30'50'' \ No newline at end of file diff --git a/blog/2022/10/13/legalize-acetylcysteine/.images/cold_years_per_year.png b/blog/2022/10/13/legalize-acetylcysteine/.images/cold_years_per_year.png deleted file mode 100644 index 84a4427..0000000 Binary files a/blog/2022/10/13/legalize-acetylcysteine/.images/cold_years_per_year.png and /dev/null differ diff --git a/blog/2022/10/13/legalize-acetylcysteine/.images/gains-to-be-had.png b/blog/2022/10/13/legalize-acetylcysteine/.images/gains-to-be-had.png deleted file mode 100644 index afb21d7..0000000 Binary files a/blog/2022/10/13/legalize-acetylcysteine/.images/gains-to-be-had.png and /dev/null differ diff --git a/blog/2022/10/13/legalize-acetylcysteine/.images/population-pyramid.png b/blog/2022/10/13/legalize-acetylcysteine/.images/population-pyramid.png deleted file mode 100644 index 0807352..0000000 Binary files a/blog/2022/10/13/legalize-acetylcysteine/.images/population-pyramid.png and /dev/null differ diff --git a/blog/2022/10/13/legalize-acetylcysteine/.old/index.old.md b/blog/2022/10/13/legalize-acetylcysteine/.old/index.old.md deleted file mode 100644 index a4a83da..0000000 --- a/blog/2022/10/13/legalize-acetylcysteine/.old/index.old.md +++ /dev/null @@ -1,90 +0,0 @@ -Legalize acetylcysteine: An open letter to the UK's MHRA -======================================================== - -## Part I: Demagoguery - -This is the map of maximum Celtic expansion, in circa 270 BC, per [Wikipedia](https://upload.wikimedia.org/wikipedia/commons/0/08/Celtic_expansion_in_Europe.svg): - -![](https://upload.wikimedia.org/wikipedia/commons/0/08/Celtic_expansion_in_Europe.svg) - -Since then, the Spaniards have further developed into Gazpacho-drinking siesta-sleepers and the Britons have developed into tea-drinking weather-contemplators[^1]. Still, my understanding is that population differences are to a great degree cultural, and that the basic plumbing remains pretty much the same. My understanding is also that - -Imagine, then, my surprise, when in the middle of being sick in the UK, I find out that an extremely common medicine used to treat the cold in Spain throughout my childhood just wasn't commonly available in the UK. This medicine is [acetylcysteine](https://en.wikipedia.org/wiki/Acetylcysteine)—known in Spain under the brand name "Fluomicil". It's purpose is to decrease the thickness of the mucus so that it can be expelled, so that the patient can better breathe. In my experience, this is particularly crucial at night, because if the nose is blocked, you will breathe through the mouth and end up having a sore throat, and generally not sleep as well. - -Instead of using acetylcysteine, the UK uses other less efficaceous medicaments, such as nose sprays, which don't work as well through the night. They aren't as useful once the nose is already blocked. And they are more annoying to use, which means that people may forget or use them less. - -## Part II: Cost-effectiveness analysis - -I work as a forecaster, not as a doctor or as a medical researcher. So there are surely factors I'm missing. For instance, maybe living for two milenia under lousy weather has maybe made the population of Britain more immune to having blocked noses, and this could mean that nose sprays are a better tradeoff than acetylcysteine. I really wouldn't know, though it would surprise me. 
- -Still, as a forecaster I can offer the following estimation: - -Per the [NHS inform website](https://www.nhsinform.scot/illnesses-and-conditions/infections-and-poisoning/common-cold#colds-in-children): - -> Children get colds far more often than adults. While adults usually have two to four colds a year, children can catch as many as 8 to 12. - -According to the [latest data from the Office of National Statistics](https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/bulletins/annualmidyearpopulationestimates/mid202), the population pyramid of the UK looks as follows: - -![](./population.png) - -meaning that there are 67,081,234 people, of which 20.1% (13,468,262) is under 16. I also estimate that acetylcysteine makes an illness somewhere between 1% and 10% better. - -Putting this together, I can [estimate the following](https://www.squiggle-language.com/playground#code=eNqFkV9rwjAUxb%2FKpbChnfPfnAPBR5%2FGXiZ7C4TYxBpIb7r0dlLE775E7aa1c2%2Fl9OR3TnJ2UbGx22WZZcJV0YxcqXoHaSE1WVcrGjVpYZafpU5To5bkNKbRLGI4GMCiIJ0JUrAqnVQIdg1SF0oUimFu89II0ha5XfOPV5jD9OXN687m1h30ZKONdP7cHIb98XAEHjke9kd3DMmSMFzI0hBPrJEFz5XjlRLOmztjIAuTLsRwlRJDZwSP0JLSram10AKeBPBo%2FAe5BfrDDChPuNH7AW7GM5Sl%2B8kL%2F8KLHfuEh5GiKuq08M23mja177xEDE3QeclD1m%2FTBmkAT9Nnhpfj6iwXCYVxRaKoMklVkNKowGLb8N7u7JfKFNKR3DgVxvZb%2B4v5qRmmQmPByfKV4htxeZlT2Rj%2BYZ4avysqHcJ96JIbUTHcMYQr3uxK6QVbo8isKXjTnmG0%2FwYbRy4m): - -``` -// Estimate burden of disease -population_of_UK = 67M -proportion_children = 0.201 // 20.1% -total_adult_colds_per_year = (2 to 4) * population_of_UK * (1 - proportion_children) -total_children_colds_per_year = (4 to 12) * population_of_UK * proportion_children -total_colds = total_adult_colds_per_year + total_children_colds_per_year -duration_of_cold = 6 to 12 // days -total_days_with_cold = total_colds * duration_of_cold -total_cold_years = total_days_with_cold / 365 - -// Estimate impact of acetylcysteine on burden of disease -improvement_with_acetylcysteine = 0.01 to 0.1 -gains_to_be_had = total_cold_years * improvement_with_acetylcysteine - -// Return & display -{ - total_cold_years: total_cold_years, - gains_to_be_had: gains_to_be_had, -} -``` - -That is, I arrive at an estimate of 6M (1.7M to 9.2M) cummulative person-years spent having a cold in Britain: - -![](./cold_years_per_year.png) - -and a potential improvement from adopting acetylcysteine of 250,000 (53,000 to 640,000) "quality-adjusted-sickness-years"—an intutitive, ad-hoc unit that I just made up: - -![](./gains-to-be-had.png) - -The weakness of the method is that my subjective estimates of the 1% to 10% quality of life improvement might be off, or that my estimates of how often people are sick might be inaccurate—6M years of cold per year does seem a bit high. I'm also not really familiar with how potential alternatives, such as carbocisteine, are used in the UK. Still, I think that this rough calculation does show that having better medicaments is of great importance. And the Spanish doctors I've spoken expressed shock and disbelief that acetylcysteine was not available in the UK. - -But while an abstract argument may have been made, the action and followup remains. And it falls on the brave and hardworking souls at the [MHRA](https://www.gov.uk/government/organisations/medicines-and-healthcare-products-regulatory-agency) to send [a Message to Garcia](https://courses.csail.mit.edu/6.803/pdf/hubbard1899.pdf): Legalize acetylcysteine. 
- ---- - -Sources - -https://www.nhs.uk/medicines/carbocisteine/#:~:text=A%20mucolytic%20helps%20you%20cough,chronic%20obstructive%20pulmonary%20disease%20(COPD) -https://www.cochrane.org/CD003124/ARI_acetylcysteine-and-carbocysteine-to-treat-acute-upper-and-lower-respiratory-tract-infections-in-children-without-chronic-broncho-pulmonary-disease -https://www.medicines.org.uk/emc/product/2916/smpc -https://www.gov.uk/government/organisations/medicines-and-healthcare-products-regulatory-agency -https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/bulletins/annualmidyearpopulationestimates/mid202 -https://www.boots.com/sitesearch?searchTerm=mucolytics -https://www.drugs.com/drug-interactions/acetylcysteine-with-fenesin-dm-97-0-846-8166.html -https://www.amazon.co.uk/N-Acetyl-Cysteine-Nutritional-Supplements/b?ie=UTF8&node=5977697031 -https://www.dbth.nhs.uk/wp-content/uploads/2017/10/Patient-Information-Leaflet-Acetylcysteine.pdf -https://bnf.nice.org.uk/drugs/acetylcysteine/ -https://www.waymade.co.uk/contact-us/ -https://www.medicines.org.uk/emc/product/11366/smpc -https://www.medicines.org.uk/emc/product/11366/smpc -https://www.boots.com/boots-pharmaceuticals-effervescent-powder-10-sachets-10049868 -https://www.waymade.co.uk/wp-content/uploads/2021/05/PIL-Acetylcysteine-200mg.pdf -https://mhraproducts4853.blob.core.windows.net/docs/1a9869bfeac300ad6cb531ff4434f1cf90243624 -https://products.mhra.gov.uk/search/?search=acetylcysteine&page=1&ter=UK&rerouteType=0 - -[^1]: As in, "Isn't the weather nice today, darling?" diff --git a/blog/2022/10/13/legalize-acetylcysteine/index.md b/blog/2022/10/13/legalize-acetylcysteine/index.md deleted file mode 100644 index 945d424..0000000 --- a/blog/2022/10/13/legalize-acetylcysteine/index.md +++ /dev/null @@ -1,95 +0,0 @@ -Legalize acetylcysteine: An open letter to the UK's MHRA -======================================================== - -Executive summary: Acetylcysteine is a common medicine used in Spain without prescription that I believe is a better alternative than the medications used in the UK to relieve some symptoms of the flu. There is a legal framework for importing it from Europe, but it onerous enough that I'm not going to personally do it, so the idea is there for the taking. A few companies have bothered to go through the process already, so it might make sense to partner with them. It might also be valuable to streamline the process of importing medicines into the UK from the EU, but this seems harder. - -## Part I: Demagoguery - -This is the map of maximum Celtic expansion, in circa 270 BC, per [Wikipedia](https://upload.wikimedia.org/wikipedia/commons/0/08/Celtic_expansion_in_Europe.svg): - -![](https://upload.wikimedia.org/wikipedia/commons/0/08/Celtic_expansion_in_Europe.svg) - -Since then, the Spaniards have further developed into Gazpacho-drinking siesta-sleepers and the Britons have developed into tea-drinking weather-contemplators[^1]. Still, my understanding is that population differences are to a great degree cultural, and that the basic plumbing remains pretty much the same[^2]. - -Imagine, then, my surprise, when in the middle of being sick in the UK, I find out that an extremely common medicine used to treat the cold in Spain throughout my childhood just wasn't commonly available in the UK. This medicine is [acetylcysteine](https://en.wikipedia.org/wiki/Acetylcysteine)—known in Spain under the brand name "Fluimicil"[^3]. 
Its purpose is to decrease the thickness of the mucus so that it can be expelled and the patient can breathe more easily. In my experience, this is particularly crucial at night: if the nose is blocked, you will breathe through the mouth, end up with a sore throat, and generally not sleep as well.
-
-Instead of using acetylcysteine, the UK uses other, less efficacious medicaments, such as nose sprays, which don't work as well through the night. They aren't as useful once the nose is already blocked. And they are more annoying to use, which means that people may forget them completely—or just use them less. Brits also have access to [carbocisteine](https://www.nhs.uk/medicines/carbocisteine/#:~:text=A%20mucolytic%20helps%20you%20cough,chronic%20obstructive%20pulmonary%20disease%20), though only with a prescription, and in practice it doesn't seem to be the standard of care.
-
-## Part II: Cost-effectiveness analysis
-
-I work as a forecaster, not as a doctor or a medical researcher. So there are surely factors I'm missing. For instance, maybe living for two millennia under lousy weather has made the population of Britain more immune to having blocked noses, and this could mean that nose sprays are a better tradeoff than acetylcysteine. I really wouldn't know, though it would surprise me.
-
-Still, as a forecaster I can offer the following estimate:
-
-Per the [NHS inform website](https://www.nhsinform.scot/illnesses-and-conditions/infections-and-poisoning/common-cold#colds-in-children):
-
-> Children get colds far more often than adults. While adults usually have two to four colds a year, children can catch as many as 8 to 12.
-
-According to the [latest data from the Office of National Statistics](https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/timeseries/ukpop/pop), the population pyramid of the UK looks as follows:
-
-![](https://i.imgur.com/lWeyggC.png)
-
-meaning that there are 67,081,234 people, of whom 20.1% (13,468,262) are under 16. I also estimate that acetylcysteine makes an illness somewhere between 1% and 10% better. 
-
-Putting this together, I can [estimate the following](https://www.squiggle-language.com/playground#code=eNqFkV9rwjAUxb%2FKpbChnfPfnAPBR5%2FGXiZ7C4TYxBpIb7r0dlLE775E7aa1c2%2Fl9OR3TnJ2UbGx22WZZcJV0YxcqXoHaSE1WVcrGjVpYZafpU5To5bkNKbRLGI4GMCiIJ0JUrAqnVQIdg1SF0oUimFu89II0ha5XfOPV5jD9OXN687m1h30ZKONdP7cHIb98XAEHjke9kd3DMmSMFzI0hBPrJEFz5XjlRLOmztjIAuTLsRwlRJDZwSP0JLSram10AKeBPBo%2FAe5BfrDDChPuNH7AW7GM5Sl%2B8kL%2F8KLHfuEh5GiKuq08M23mja177xEDE3QeclD1m%2FTBmkAT9Nnhpfj6iwXCYVxRaKoMklVkNKowGLb8N7u7JfKFNKR3DgVxvZb%2B4v5qRmmQmPByfKV4htxeZlT2Rj%2BYZ4avysqHcJ96JIbUTHcMYQr3uxK6QVbo8isKXjTnmG0%2FwYbRy4m):
-
-```
-// Estimate burden of disease
-population_of_UK = 67M
-proportion_children = 0.201 // 20.1%
-total_adult_colds_per_year = (2 to 4) * population_of_UK * (1 - proportion_children)
-total_children_colds_per_year = (4 to 12) * population_of_UK * proportion_children
-total_colds = total_adult_colds_per_year + total_children_colds_per_year
-duration_of_cold = 6 to 12 // days
-total_days_with_cold = total_colds * duration_of_cold
-total_cold_years = total_days_with_cold / 365
-
-// Estimate impact of acetylcysteine on burden of disease
-improvement_with_acetylcysteine = 0.01 to 0.1
-gains_to_be_had = total_cold_years * improvement_with_acetylcysteine
-
-// Return & display
-{
-  total_cold_years: total_cold_years,
-  gains_to_be_had: gains_to_be_had,
-}
-```
-
-That is, I arrive at an estimate of 6M (1.7M to 9.2M) cumulative person-years per year spent having a cold in Britain:
-
-![](https://i.imgur.com/pdaXZN1.png)
-
-and a potential improvement from adopting acetylcysteine of 250,000 (53,000 to 640,000) "quality-adjusted-sickness-years"—an intuitive, ad-hoc unit that I just made up:
-
-![](https://i.imgur.com/79KLmnU.png)
-
-The weakness of the method is that my subjective estimate of the 1% to 10% quality-of-life improvement might be off, or that my estimates of how often people are sick might be inaccurate—6M years of cold per year does seem a bit high. I'm also not really familiar with how potential alternatives, such as carbocisteine, are used in the UK. Still, I think that this rough calculation does show that having better medicaments is of great importance. And the Spanish doctors I've spoken with expressed shock and disbelief that acetylcysteine was not available in the UK.
-
-One particular way my estimate could be wrong is if patients are taking carbocisteine instead of acetylcysteine, and if the two medicaments closely resemble each other. If that is the case, the above estimates might be much lower. Still, they would point to the broader correct point that really nailing the standard of care for the flu is likely to be very valuable.
-
-But while the abstract argument has been made, the action and follow-up remain. And it falls on the brave and hardworking souls at the [MHRA](https://www.gov.uk/government/organisations/medicines-and-healthcare-products-regulatory-agency) to send [a Message to Garcia](https://courses.csail.mit.edu/6.803/pdf/hubbard1899.pdf): Legalize acetylcysteine.
-
-## Part III: The invisible hand defeated
-
-But in fact, acetylcysteine is already legal in the UK. Well, pseudo-legal. Quasi-legal. Legal in name, but not legal enough for the invisible hand of the market to do its work.
-
-By this I mean that you could in theory sell acetylcysteine if you have a number of licenses which look very annoying to get. 
Per the MHRA's website:
-
-> If you want to parallel import a product you must make sure that:
-> - the product is manufactured to [good manufacturing practice (GMP) standards](https://www.gov.uk/guidance/good-manufacturing-practice-and-good-distribution-practice)
-> - you hold a [wholesale dealer's licence](https://www.gov.uk/guidance/apply-for-manufacturer-or-wholesaler-of-medicines-licences) covering importing, storage and sale for each product
-> - you hold the correct parallel import licence
->
-> To assemble and repackage the product you will also need to have a [manufacturer's licence](https://www.gov.uk/guidance/apply-for-manufacturer-or-wholesaler-of-medicines-licences) covering product assembly.
-
-You know what this prevents me from doing? This prevents me from buying 1,000 packages of acetylcysteine, selling them to friends on the side, and then relying on word of mouth. I would have been the invisible hand of the market, if only I hadn't been stymied by government regulations.
-
-In fact, the regulations aren't so bad. It seems conceivable that I could figure these requirements out during a summer. Though I'm probably not going to, so this idea is free for the taking. In practice, there are already [a few companies](https://products.mhra.gov.uk/search/?search=acetylcysteine&page=1&ter=UK&rerouteType=0) that have gone through the trouble, like [Waymade](https://www.waymade.co.uk/), and so it might make more sense to partner with them.
-
-And yet, the situation remains suboptimal. Ideally, the regulatory framework of the UK would be such that importing medicines from the EU would be painless. But that would be a much larger project.
-
-[^1]: As in, "Isn't the weather nice today, darling?"
-
-[^2]: My understanding is that in general there are *some* differences in the efficacy of medical treatments across ethnic groups. I previously knew that [lactose intolerance](https://en.wikipedia.org/wiki/Lactose_intolerance#Frequency) is more common among people of East Asian descent. And some brief Googling leads me to a few papers on the topic ([1](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2594139/), [2](https://www.degruyter.com/document/doi/10.1515/DMDI.1995.12.2.77/html), [3](https://www.tandfonline.com/doi/abs/10.1517/17425255.2011.585969)). So it's conceivable that that consideration is part of what is going on. But it would surprise me.
-
-[^3]: Also Normofludil, etc. diff --git a/blog/2022/10/17/the-commons/.images/eiwMyEI.jpg b/blog/2022/10/17/the-commons/.images/eiwMyEI.jpg new file mode 100644 index 0000000..6cf17ef Binary files /dev/null and b/blog/2022/10/17/the-commons/.images/eiwMyEI.jpg differ diff --git a/blog/2022/10/17/the-commons/index.md b/blog/2022/10/17/the-commons/index.md new file mode 100644 index 0000000..f21ca0b --- /dev/null +++ b/blog/2022/10/17/the-commons/index.md @@ -0,0 +1,23 @@ +Sometimes you give to the commons, and sometimes you take from the commons +==========================================================================
+
+Sometimes you give to the commons, and sometimes you take from the commons. And through this giving and taking, people are able to smooth consumption. This is good because getting resources from the commons when you temporarily have few of them helps you more than giving resources away when you temporarily have many of them hurts you.
+
+![](.images/eiwMyEI.jpg)
+
+Engraving depicting the curse of [Tantalus](https://en.wikipedia.org/wiki/Tantalus)
+
+Anyways, a phenomenon I've noticed is that sometimes you can only give to the commons, but you can't take from the commons. This is dysfunctional, and defeats the whole purpose of the commons.
+
+Some examples, vaguely based on real life:
+
+- You generally have thoughtful opinions, but sometimes you make mistakes. Your aggregate effect is to make a group's models of the world better. One day you have an opinion that is wrong, and people pile on against you, without remembering the previous times that you added information to the shared pool.
+- You generally give emotional support to people. But when you need emotional support, people don't give it to you.
+- You are glad to help people with your time, but when you need other people to lend you their time, they don't.
+- There is a shared pool of resources that status-poor people are expected to fill and that high-status people are welcome to partake of.
+- Taking from the commons is socially punished, such that people *can't even think* of taking from the commons as an option that they have.
+
+Overall, there might be reasons for these kinds of dynamics. For example:
+
+- Maybe there are types of people who would predictably take too much from the commons, and a group prevents those kinds of people from taking any part of the commons as a preventative measure. Maybe people can smell the desperation.
+- Or maybe there was a veil-of-ignorance type of deal going on, where some people only give to the commons, but would have received if they had had worse luck.
+- Or maybe there is a totally reasonable period between when one starts giving to the commons and when one can start taking from it, to disallow free-riders.
+
+But in practice, I think that the reasonable explanations aren't what's going on. And instead there are really weird effects where "for he that hath, to him shall be given: and he that hath not, from him shall be taken even that which he hath". So now, when I see this kind of dynamic around a supposed commons, I tend to run. And after seeing this kind of dynamic a few times, I've become more sympathetic to a cluster of ideas around self-sufficiency and libertarianism. diff --git a/blog/2022/10/21/brief-evaluations-of-top-10-billionnaires/index.md b/blog/2022/10/21/brief-evaluations-of-top-10-billionnaires/index.md new file mode 100644 index 0000000..6b30883 --- /dev/null +++ b/blog/2022/10/21/brief-evaluations-of-top-10-billionnaires/index.md @@ -0,0 +1,144 @@ +Brief evaluations of top-10 billionaires +========================================
+
+As part of my work with the [Quantified Uncertainty Research Institute](https://quantifieduncertainty.org/), I am experimenting with speculative evaluations that could potentially be scalable. Billionaires were an interesting evaluation target because there are a fair number of them, and at least some are nominally aiming to do good.
+
+For now, for each of the top-10 billionaires, I have tried to get an idea of:
+
+1. How much value have they created through their business activities?
+2. How much impact have they created through their philanthropic activities?
+
+I then assigned a subjective score based on my understanding of the answers to the above questions.
+
+### Elon Musk (B)
+
+Elon Musk changes the world through:
+
+* His businesses: Tesla, SpaceX, The Boring Company and Neuralink. Tesla makes better cars, SpaceX advances interplanetary expansion.
+* His cultural influence: Twitter shitposting, conceiving and pushing for brighter futures, etc. 
+* His philanthropy: Little of it is publicly known so far. He will probably end up buying Twitter, partially with the intention of making it a better public good. OpenAI, which he helped found, might end up having greatly positive or greatly negative impact.
+
+Overall, Musk seems like he has produced large amounts of value, and might produce even more through SpaceX. But he also seems to be oddly nonstrategic at times.
+
+### Bernard Arnault (D-)
+
+> I see myself as an ambassador of French heritage and French culture. What we create is emblematic. It's linked to Versailles, to Marie Antoinette. -- [Bernard Arnault, as quoted in Forbes](https://www.forbes.com/profile/bernard-arnault/?sh=2d3799e966fa)
+
+His business produces little counterfactual value, and instead serves as a vehicle for conspicuous consumption. That is, if his luxury brands didn't exist, his customers would simply buy from others, and the world would look extremely similar.
+
+His company describes its philanthropy as ["an ideal expression of financial success"](https://www.lvmh.com/group/lvmh-commitments/art-culture/lvmh-corporate-philanthropy/), and supports art installations or students attending concerts. Thus his philanthropy seems non-strategic, aimed at the display of wealth rather than at the firm pursuit of improving and fortifying French culture. He also donated $200M to restore Notre-Dame, which probably saved French taxpayers a similar amount.
+
+### Jeff Bezos (B)
+
+The market value of Amazon is circa $1T, meaning that it has managed to capture at least that much value, and it has likely produced much more consumer surplus. His other ventures, like Blue Origin, are as of yet nowhere near as valuable.
+
+### Gautam Adani (B?)
+
+Adani seems to be a skilled manager, administrator, and deal-maker who, working in a developing country, has unlocked heaps of value.
+
+He has ties to Narendra Modi, and their fortunes have risen together. From the outside, it's hard to say to what extent he has [created his wealth](https://nairametrics.com/2022/05/11/how-gautam-adani-went-from-being-a-school-dropout-to-becoming-the-richest-man-in-india/):
+
+> The governor of Gujarat announced managerial outsourcing of the Mundra Port in 1994, and Adani got the contract in 1995, after which he set up the first jetty, which was originally run by Mundra Port & Special Economic Zone but was later transferred to Adani Ports and SEZ (APSEZ).
+
+> Adani decided to turn it into a commercial port, building rail and road links to it by individually negotiating with over 500 landowners across India to create the largest port in India.
+
+or [simply acquired it](https://asiatimes.com/2022/05/gautam-adani-master-of-the-art-of-modern-monopolies/):
+
+> Mr Adani's friendship with Prime Minister Narendra Modi is time-tested. Their friendship goes back to 2003, when none of the country's leading businessmen publicly stood by Modi's side because of the handling of the Gujarat riots. But Adani broke ranks with the old business elite, potentially risking his future. And this gamble paid off.
+
+> Gautam Adani is today one of the most visible tycoons in the country, whose prominence has accelerated in the years since Narendra Modi was elected prime minister in 2014. 
Since Modi came into office, Adani's net worth has increased 17.5 times in less than eight years, from $7 billion to $125 billion.
+
+### Bill Gates (A)
+
+It's unclear whether Microsoft itself has had a positive or negative impact on the world compared to what would have counterfactually happened (e.g., Apple and Linux would be more popular). However, Gates' impact-focused philanthropy has helped millions. He moreover co-founded the [Giving Pledge](https://givingpledge.org/), which probably multiplied his impact. Too bad that he couldn't prevent the covid pandemic.
+
+### Warren Buffett (A)
+
+Berkshire Hathaway is probably a force for good in the capitalist ecosystem. Buffett has also contributed $32+ billion to the [Gates Foundation](https://www.gatesfoundation.org/about/leadership/warren-buffett). In addition, he co-founded the Giving Pledge, which probably multiplied his impact.
+
+### Larry Ellison (C?)
+
+Ellison has made his wealth by selling [universally reviled database software](https://libreddit.foss.wtf/r/business/comments/di5j2/im_always_surprised_to_see_the_oracle_chieflarry/) and other products that work at Fortune 500 and government scale.
+
+He has signed the [Giving Pledge](https://givingpledge.org/pledger?pledgerId=192), though his giving may have been [erratic at times](https://www.vox.com/recode/2020/9/2/21409530/larry-ellison-foundation-disband-london-philanthropy-coronavirus). It's also possible that his closeness to Trump at times improved the quality of Trump's decision-making while in office.
+
+### Mukesh Ambani (B?)
+
+His wealth originally came from a vertically integrated commodity business, which has since expanded. Although skilled at navigating government bureaucracies, he also ate his own brother alive in the competitive communications business, providing millions of Indian consumers with cheaper internet access. Overall, most of his impact is going to come from his contribution to Indian economic growth, and that contribution is probably highly positive.
+
+### Larry Page (B)
+
+By making a better search engine and providing other Google products for free to millions, he has provided heaps of value. However, in recent times, he has disengaged from Google, and Google has abandoned its "don't be evil" motto. His philanthropy, while [large](https://www.vox.com/recode/2019/12/18/21010108/larry-page-philanthropy-foundation-donor-advised-fund-christmas), is somewhat secretive.
+
+### Sergey Brin (B-)
+
+Like Larry Page, by making a better search engine and providing other Google products for free to millions, he has provided heaps of value. However, in recent times, he has disengaged from Google, and Google has abandoned its "don't be evil" motto. He has donated at least [$1.4 billion](https://www.influencewatch.org/non-profit/sergey-brin-family-foundation/) to his family foundation, and seems to donate to left-of-center causes.
+
+## Reflections
+
+### Comparisons with alternatives
+
+From some brief Googling, two other rankings are the [Forbes 400](https://www.forbes.com/forbes-400), which assigns a philanthropy score to America's 400 richest people, and the [Philanthropy 50](https://www.philanthropy.com/article/the-philanthropy-50/#id=browse_2021), which is paywalled.
+
+**Forbes' philanthropy score**
+
+The methodology for the Forbes 400 philanthropy score can be seen [here](https://www.forbes.com/sites/rachelsandler/2022/09/27/the-forbes-philanthropy-score-2022-how-charitable-are-the-richest-americans/?sh=587daeea0980). 
In short, Forbes does some [intensive investigative work](https://www.forbes.com/sites/chasewithorn/2022/09/27/2022-forbes-400-methodology-how-we-crunch-the-numbers/?sh=1f88cfe5d0eb) to determine what billionaires' wealth actually _is_. Then,
+
+> To see how philanthropic the ultrawealthy are, Forbes dug into their known charitable giving and assigned a philanthropy score, ranging from 1 to 5, to each member of The Forbes 400. If we couldn't find any information about a person's giving and they declined to provide details, they received a score of N/A.
+
+> To calculate the scores, we added the value of each person's total out-the-door lifetime giving to their 2022 Forbes 400 net worth, then divided their lifetime giving by that number. Each score corresponds to a range of giving as a percentage of a person's net worth. We once again counted only out-the-door giving, rather than cash sitting in billionaires' private foundations or tax-advantaged donor-advised funds that have not yet made it to those in need. We reached out to every list member for feedback.
+
+This is already fairly sophisticated. If I had to suggest one improvement, it would be to incorporate whether billionaires have signed the [Giving Pledge](https://givingpledge.org/).
+
+Personally, I would also:
+
+* Score individuals on the _amount_ of money donated, rather than on the _percentage_
+* Accommodate [patient philanthropy](https://80000hours.org/podcast/episodes/phil-trammell-patient-philanthropy/) (see also [1](https://docs.google.com/document/d/1NcfTgZsqT9k30ngeQbappYyn-UO4vltjkm64n4or5r4/edit), [2](https://globalprioritiesinstitute.org/wp-content/uploads/Trammell-Dynamic-Public-Good-Provision-under-Time-Preference-Heterogeneity.pdf)), and not look only at money out the door.
+
+### Possible further work
+
+If I had access to a legion of researchers, I would try to move first towards a legible rubric and then to a quantified impact estimate.
+
+**An initial rubric**
+
+An initial rubric might incorporate:
+
+Some subjective estimate of how much value the individual has created through business:
+
+* Are the business activities more like value creation or more like resource extraction?
+* How much value has the individual created?
+
+Some mechanistic estimate of how much value the individual will create through philanthropy:
+
+* How much money will the individual end up donating?
+* How much has the individual donated so far?
+* Has the individual joined the [Giving Pledge](https://givingpledge.org/)?
+* Are the individual's donations done with some reference to impact?
+  * This would require some finesse in order to incorporate different philosophical stances. But there is certainly a substantial difference between Bill Gates' and Bernard Arnault's giving.
+
+Possibly, also some estimate of additional sources of impact, like cultural influence or using a position of prominence to positively impact the world.
+
+Crucially, the above categories could be complementary. For instance, a skilled administrator and industrialist like Mukesh Ambani is already creating heaps of value through business in India, and he probably creates more value by deploying his capital through business than he would through philanthropy. So an individual could get top marks by being excellent in any one domain.
+
+**A quantified estimate**
+
+Eventually, a quantified estimate might move beyond being a rubric and directly attempt to estimate each part of an individual's impact, and then put them all together in a common linear unit, as sketched below. 
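+
+As a minimal illustration, here is what such an estimate could look like in Squiggle. Every name and number below is a placeholder I made up for the sake of the sketch, not an actual evaluation of anyone:
+
+```
+// Illustrative placeholders only, denominated in some common impact unit
+value_from_business = 100 to 10k // e.g., consumer surplus created
+value_from_philanthropy = 10 to 1k // e.g., impact of donations so far
+value_from_culture = -100 to 500 // cultural influence, which could be negative
+total_impact = value_from_business + value_from_philanthropy + value_from_culture
+total_impact
+```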
+
+For example, in the case of Elon Musk, I would estimate how valuable each of his ventures is, either in an impact unit like [Open Philanthropy dollars](https://www.openphilanthropy.org/research/update-on-our-planned-allocation-to-givewells-recommended-charities-in-2022/#f+9715+1+6)—$1 given to someone earning $50k a year—or in terms of [relative values](https://forum.effectivealtruism.org/posts/9hQFfmbEiAoodstDA/simple-comparison-polling-to-create-utility-functions)—where you compare how much each element is worth relative to the other elements, and you either don't need a unit or can easily construct one once you've done that.
+
+### Things I personally struggled with
+
+Some billionaires were harder to estimate than others. I particularly struggled with Gautam Adani and Mukesh Ambani, where I'm probably lacking a whole lot of context. Thanks to Chinmay Ingalavi for giving me some of it.
+
+I am also uncertain about Larry Ellison. [Here](https://teddit.nunosempere.com/r/linux/comments/2e2c1o/what_do_we_hate_oracle_for/) is a thread on shady Oracle corporate practices. But [here](https://givingpledge.org/pledger?pledgerId=192) is Ellison's Giving Pledge letter. I'm unclear on how to square the two.
+
+The whole exercise took longer than I was expecting.
+
+I'm also unclear on whether to use gossip and private information, and ended up not doing so.
+
+I was also unclear on which philosophical assumptions to use. For instance:
+
+* I'm partial to [Patient Philanthropy](https://docs.google.com/document/d/1NcfTgZsqT9k30ngeQbappYyn-UO4vltjkm64n4or5r4/edit).
+* I think it's plausible that most of a billionaire's impact could come from business rather than from philanthropy.
+* I think that Amazon's [union busting](https://www.commondreams.org/news/2022/10/18/following-brutal-union-busting-campaign-albany-amazon-workers-reject-unionization) is an evil practice, but not nearly enough to move the needle on my evaluation of Amazon as overall having produced very large heaps of value.
+* I didn't incorporate MacKenzie Bezos' giving into Jeff Bezos' estimate, although one could argue that he created a big chunk of the wealth she is giving away.
\ No newline at end of file
diff --git a/blog/2022/10/27/are-flimsy-evaluations-worth-it/index.md b/blog/2022/10/27/are-flimsy-evaluations-worth-it/index.md new file mode 100644 index 0000000..ca39546 --- /dev/null +++ b/blog/2022/10/27/are-flimsy-evaluations-worth-it/index.md @@ -0,0 +1,81 @@ +Are flimsy evaluations worth it? +================================
+
+*Status: Draft. I'll cross-post this to the [EA Forum](https://forum.effectivealtruism.org/) in a few days. In the meantime, I've enabled comments below.*
+
+I recently received a bit of grief over my [brief evaluations of the impact of the top-10 billionaires](https://nunosempere.com/blog/2022/10/21/brief-evaluations-of-top-10-billionnaires/). It seems possible that this topic is worth discussing. In what follows I outline a few non-exhaustive considerations, as well as a few questions of interest.
+
+*"Duty Calls", by xkcd*
+
+### Value of flimsy evaluations
+
+Right now, I see the value of flimsy evaluations or estimations as coming from:
+
+#### 1. Value of experimentation
+
+There are many things we don't have estimates or evaluations for. Trying different evaluation methods and topics can be informative about which are more valuable. Individual flimsy evaluations can serve as proofs of concept that can be built upon if the preliminary version appears valuable, and as testing grounds for new evaluation methodologies.
+
+#### 2. Flimsy evaluations considered better than no evaluation
+
+When estimating a probability or a quantity, sometimes a quick BOTEC (back-of-the-envelope calculation) or a Fermi estimate might be worth having despite its imprecision, because there isn't time or it isn't worth the effort to conjure a more complex estimate.
+
+For evaluations, oftentimes the tradeoff isn't between a flimsy evaluation and a more accurate in-depth evaluation, but rather between a flimsy evaluation and no evaluation at all.
+
+In particular, I don't think that the case of a ranking of billionaires was that important. But the case of evaluations of EA organizations is. For example, a longstanding [annual evaluation of AI safety organizations](https://forum.effectivealtruism.org/posts/qdKhLcJmGQuYmzBoz/larks-s-shortform?commentId=e4h2yjCrK9kncfGTf) by Larks is not happening partly because it would be too expensive to produce. But in that case we are getting no evaluation rather than a flimsier evaluation.
+
+#### 3. Less sure: The world being complicated enough that epistemics is for now a community effort
+
+I consider myself a reasonably knowledgeable individual, but I still regularly read things in the EA Forum and elsewhere that surprise me. Similarly, when forecasting, one usually gets a better result when combining different individual perspectives.
+
+Adjacently, [Cunningham's law](https://meta.wikimedia.org/wiki/Cunningham%27s_Law) states that:
+
+> the best way to get the right answer on the internet is not to ask a question; it's to post the wrong answer
+
+So it doesn't seem crazy that, for a given number of hours of research, a better answer can be found by posting a flimsy evaluation and relying on commenters to point out flaws that would have been hard for the author to identify on their own.
+
+This feels true, but too adversarial for my taste. If I were relying on this, I would explicitly signpost it.
+
+### Disvalue of flimsy evaluations
+
+#### 1. Reduced epistemics
+
+In a previous post, a commenter mentioned:
+
+> I think posting this was probably net negative EV but it was really funny
+> ...
+> Your methodology looks pretty flimsy but it looks like other EAs are taking it seriously
+> ...
+> I think the harm from posting things with flimsy methodology and get a lot of upvotes/uncritical comments is something like "lower epistemic rigour on the forum in general", rather than this article in particular causing a great deal of harm. I think the impact of this article whether positive or negative is likely to be small.
+
+It's possible that factors such as these could be present for flimsy evaluations.
+
+#### 2. People and organizations are really touchy about evaluations
+
+People and organizations tend to get a bit angsty when being evaluated. I think this is a real cost. I also think that, generally, it's a cost worth paying for communities to have better models of the world. But for very flimsy evaluations, it's very possible that the cost is just not worth paying.
+
+#### 3. 
Evaluations having some chance of error
+
+Evaluations have some rate of error that rises the flimsier they are. It's possible that negative errors are fairly harmful, e.g., by reducing an organization's ability to fundraise through no fault of its own.
+
+### Discussion
+
+Some questions:
+
+1. In which contexts are flimsy evaluations worth it?
+2. How should one signal that an evaluation could be flimsy?
+3. Is there inflation of words going on? Open Philanthropy uses "shallow evaluations" for documents that can be fairly comprehensive.
+4. What is the expected error rate beyond which it's not worth publishing a flimsy evaluation? 1 in 20 seems too low, 1 in 2 too high.
+
+### Personal thoughts
+
+One likely conclusion is that flimsy evaluations might be valuable if they clearly signal how much research has gone into them, and give an accurate impression of how flimsy they are.
+
+One possible way of doing this would be to have a prediction about what the expected error rate is. For instance, one could have a prediction like: "I expect that there is a 5% chance of an egregious error that switches the main conclusion, and 1 to 4 minor errors that flip secondary considerations".
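+
+As a minimal sketch, with made-up numbers, one could even write such a prediction down in Squiggle:
+
+```
+// A made-up flimsiness prediction, not a claim about any actual evaluation
+chance_of_egregious_error = 0.05 // main conclusion flips
+minor_errors = 1 to 4 // secondary considerations that flip
+evaluations_published = 20
+chance_of_egregious_error * evaluations_published // ~1 expected egregious error in 20 evaluations
+```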