master
Nuno Sempere 2 years ago
parent 395344b1b3
commit 654f17abb4

@ -1,5 +1,7 @@
## Changelog
Page was getting too annoying to update.
### Most recent pieces
- 2022/04/17: [Simple Squiggle](https://nunosempere.com/blog/2022/04/17/simple-squiggle/)

@ -0,0 +1,6 @@
Werc bounties
=============
[werc](https://werc.cat-v.org/) is the web technology that I am using to run this blog. It is relatively solid, but it has some annoyances that I don't always have the time to figure out. I intend to use this post as a reference for myself, and eventually send it to the werc people with monetary bounties attached once I have a few elements.

@ -1,6 +1,6 @@
<br class="doNotDisplay doNotPrint" />
<div style="margin-right: auto;">Powered by <a href="http://werc.cat-v.org/">werc</a>, <a href="https://alpinelinux.org/">Alpine Linux</a> and <a href="https://nginx.org/en/">nginx</a></div>
<div style="margin-right: auto;">Powered by <a href="http://werc.cat-v.org/">werc</a>, <a href="https://alpinelinux.org/">alpine</a> and <a href="https://nginx.org/en/">nginx</a></div>
<!-- TODO: wait until duckduckgo indexes site
<form action="https://duckduckgo.com/" method="get">

@ -0,0 +1,29 @@
<!DOCTYPE HTML>
<html>
<head>
<title>%($pageTitle%)</title>
<link rel="stylesheet" href="/pub/style/style.css" type="text/css" media="screen, handheld" title="default">
<link rel="shortcut icon" href="/favicon.ico" type="image/vnd.microsoft.icon">
% if(test -f $sitedir/_werc/pub/style.css)
% echo ' <link rel="stylesheet" href="/_werc/pub/style.css" type="text/css" media="screen" title="default">'
<meta charset="UTF-8">
% # Legacy charset declaration for backwards compatibility with non-html5 browsers.
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
% if(! ~ $#meta_description 0)
% echo ' <meta name="description" content="'$"meta_description'">'
% if(! ~ $#meta_keywords 0)
% echo ' <meta name="keywords" content="'$"meta_keywords'">'
% h = `{get_lib_file headers.inc}
% if(! ~ $#h 0)
% cat $h
%($"extraHeaders%)
</head>
<body>

@ -1,10 +0,0 @@
<div class="hidden-mobile">
<div>
<a href="https://forum.effectivealtruism.org/users/nunosempere">EA forum</a> |
<a href="https://forecasting.substack.com/">forecasting newsletter</a> |
<a href="https://github.com/NunoSempere">github</a> |
<a href="https://metaforecast.org/">metaforecast</a> |
<a href="https://quantifieduncertainty.org/">quantified uncertainty</a> |
<a href="https://twitter.com/NunoSempere">twitter</a>
</div>
</div>

@ -1,15 +0,0 @@
<div class="hidden-mobile">
<div>
<a href="http://gsoc.cat-v.org">gsoc</a> |
<a href="http://doc.cat-v.org">doc archive</a> |
<a href="http://repo.cat-v.org">software repo</a> |
<a href="http://ninetimes.cat-v.org">ninetimes</a> |
<a href="http://harmful.cat-v.org">harmful</a> |
<a href="http://9p.cat-v.org/">9P</a> |
<a href="http://cat-v.org">cat-v.org</a>
</div>
<div>
<a href="http://cat-v.org/update_log">site updates</a> |
<a href="/sitemap">site map</a>
</div>
</div>

@ -115,32 +115,32 @@ Three most important actionables:
## B. Exploratory plots.
![](images/8a73163df20e38856be2ab2212d484b1a58b7fe9.png)
![](images/b8025a0103dc70a976ecf604a12de6ddf2cd0ddb.png)
![](images/02891164325de4d34fcd3ca026b4c8c74b26fdcc.png)
![](images/df0dd280b2cd33327cfc81eef8de696e015c7562.png)
![](images/18ba7e3dce520c9137538bab707b1b585b2f671c.png)
![](images/aa717ffb66b96e2edc8a032dd179f8f17f33181a.png)
![](images/8f9995e951ce75147421f5233794555542db867a.png)
![](images/918999de00da42973f752ecb7d47a68f3c599a99.png)
![](images/b44c2ad3d57250ceec6a5f3eede0d375c649ab16.png)
![](images/d4989159c8011be94310198f52367c3d61189fcc.png)
![](images/05f019dc1b238486c93e78b31e2262ba34ac80b7.png)
![](images/86a9fc7e89b8a6e69c47c8e5fef0b79a2ad045d8.png)
![](images/bd015fe73fb51b04114e4c87931792a2100e7674.png)
![](images/06c70e8eeb9d81d0ecf08a9ab21f8f821f4a5980.png)
![](images/42ca65e3812fb1007f2070e11ca5e4bab0ed9c9e.png)
![](images/3540e439e51010f97f9423445b47440c8643ea1a.png)
![](images/f52e2963f147e74f31654bf13e3ab6b3098d2996.png)
![](images/1364361431dfd66aec4fb7fb9fba0473bd359e67.png)
![](images/b65235999c982b9414bcaf59e48ce39827bfe201.png)
![](images/38b656906584ea6736e321d7dc14c7c4be7b0b4d.png)
![](images/13dd0da9f9a4007915097af5919e45d72a278c66.png)
![](images/3e772f2cd88e85a0658b2b7498b9ff1311d50c58.png)
![](images/b9fd7bdd1c95cf0d8429312b77cbceae85db049f.png)
![](images/502b5cc72b9cee7b6330d5e01cf2894c5283a67a.png)
![](images/8ff3e88439796d0868b305fa0c5eaef294df7c5f.png)
![](images/ec1ab2e026861dc21df76876c5254bc94f13f10e.png)
![](.images/8a73163df20e38856be2ab2212d484b1a58b7fe9.png)
![](.images/b8025a0103dc70a976ecf604a12de6ddf2cd0ddb.png)
![](.images/02891164325de4d34fcd3ca026b4c8c74b26fdcc.png)
![](.images/df0dd280b2cd33327cfc81eef8de696e015c7562.png)
![](.images/18ba7e3dce520c9137538bab707b1b585b2f671c.png)
![](.images/aa717ffb66b96e2edc8a032dd179f8f17f33181a.png)
![](.images/8f9995e951ce75147421f5233794555542db867a.png)
![](.images/918999de00da42973f752ecb7d47a68f3c599a99.png)
![](.images/b44c2ad3d57250ceec6a5f3eede0d375c649ab16.png)
![](.images/d4989159c8011be94310198f52367c3d61189fcc.png)
![](.images/05f019dc1b238486c93e78b31e2262ba34ac80b7.png)
![](.images/86a9fc7e89b8a6e69c47c8e5fef0b79a2ad045d8.png)
![](.images/bd015fe73fb51b04114e4c87931792a2100e7674.png)
![](.images/06c70e8eeb9d81d0ecf08a9ab21f8f821f4a5980.png)
![](.images/42ca65e3812fb1007f2070e11ca5e4bab0ed9c9e.png)
![](.images/3540e439e51010f97f9423445b47440c8643ea1a.png)
![](.images/f52e2963f147e74f31654bf13e3ab6b3098d2996.png)
![](.images/1364361431dfd66aec4fb7fb9fba0473bd359e67.png)
![](.images/b65235999c982b9414bcaf59e48ce39827bfe201.png)
![](.images/38b656906584ea6736e321d7dc14c7c4be7b0b4d.png)
![](.images/13dd0da9f9a4007915097af5919e45d72a278c66.png)
![](.images/3e772f2cd88e85a0658b2b7498b9ff1311d50c58.png)
![](.images/b9fd7bdd1c95cf0d8429312b77cbceae85db049f.png)
![](.images/502b5cc72b9cee7b6330d5e01cf2894c5283a67a.png)
![](.images/8ff3e88439796d0868b305fa0c5eaef294df7c5f.png)
![](.images/ec1ab2e026861dc21df76876c5254bc94f13f10e.png)
The code to produce them can be found [here](https://nunosempere.github.io/rat/eamentalhealth/)
@ -150,11 +150,11 @@ The code to produce them can be found [here](https://nunosempere.github.io/rat/e
#### 1.1. According to age
![](images/e8466ca8f7555d37b25e7046bf22f45c575e7c95.png)
![](.images/e8466ca8f7555d37b25e7046bf22f45c575e7c95.png)
#### 1.2. According to country.
![](images/ff22d1678c0868146825ca03f26c1352d11bb850.png)
![](.images/ff22d1678c0868146825ca03f26c1352d11bb850.png)
#### 1.3. According to gender.
@ -172,7 +172,7 @@ Editorial bottom line: With respect to age, country, and gender, there seem to b
Initially, I was intending to find out the different mental disorder rates in the different countries, and combine that with the distribution in the data. The webpage [Our World in Data](https://ourworldindata.org/mental-health) provides the necessary data:
![](images/241ccf3251fdd6e74c882b31aa9985cb5898c44e.png)
![](.images/241ccf3251fdd6e74c882b31aa9985cb5898c44e.png)
We see that the distribution is broadly similar across the countries among which EA has a presence. Most importantly, it doesn't surpass ~25% in any country, whereas among survey respondents:
@ -290,7 +290,7 @@ The effect is very robust to different modelizations: regressing instead on empt
Here is the above, presented visually
![](images/67f6b5d117bdbcb6c7b56e5afd798d859ee7966c.png)
![](.images/67f6b5d117bdbcb6c7b56e5afd798d859ee7966c.png)
The following table is the cold, raw, hard data for the graphic.
@ -1357,7 +1357,7 @@ With regards to pathway 2, we have a rough upwards cross-sectional estimate of 2
* People who recover from a mental illness because of help from the EA community might not donate 10% of the counterfactual gain, but more.
* The gain is not only in the form of hours missed but in productivity gained while working.
![](images/0d7aea62609944b3674d36602b1dd50f2d91c8fe.png)
![](.images/0d7aea62609944b3674d36602b1dd50f2d91c8fe.png)
With regards to pathway 3, it is my impression that people in top charities and EA organizations already get good mental healthcare, though about rogue effective altruists I cannot say much.
@ -1397,6 +1397,6 @@ SlateStarCodex's list of [mental health professionals](https://slatestarcodex.co
The point being that there are a lot of mental health resources and information online, if only one knew where to find them, and >10% of survey respondents answered that finding information on mental health resources was hard or very hard.
![](images/3540e439e51010f97f9423445b47440c8643ea1a.png)
![](.images/3540e439e51010f97f9423445b47440c8643ea1a.png)
Thus, it is a high tractability, high neglectedness, medium impact task to create such a list of mental resources for the EA community, even if collated and scavenged from other sources. If such a resource already exists, plenty of respondents do not seem to know about it.

@ -106,7 +106,7 @@ And an extra property:
are enough to force the Shapley value function to take the form it takes:
![](images/aeef390de90cb7c6fe07fcad852578fbebe162b1.svg)
![](.images/aeef390de90cb7c6fe07fcad852578fbebe162b1.svg)
At this point, the reader may want to consult [Wikipedia](https://en.wikipedia.org/wiki/Shapley_value) to familiarize themselves with the mathematical formalism, or, for a book-length treatment, [_The Shapley value: Essays in honor of Lloyd S. Shapley_](http://www.library.fa.ru/files/Roth2.pdf). Ultimately, a quick way to understand it is as "the function uniquely determined by the properties above".
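As a rough illustration of the formalism, the Shapley value can be computed by averaging each player's marginal contribution over every order in which the players could join the coalition. The sketch below is a minimal version of that idea; the function name `shapley_values` and the two-player game in `toy_values` are made up for this example and do not come from the post.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders.

    `value` maps a frozenset of players to the worth of that coalition.
    """
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += Fraction(value(with_p)) - Fraction(value(coalition))
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical two-player game (numbers invented for illustration only).
toy_values = {
    frozenset(): 0,
    frozenset({"A"}): 10,
    frozenset({"B"}): 20,
    frozenset({"A", "B"}): 50,
}
result = shapley_values(["A", "B"], toy_values.__getitem__)
print(result)
```

In this toy game the surplus from cooperating is 50 - (10 + 20) = 20, and each player ends up with their standalone value plus half of that surplus. Enumerating all orderings grows factorially with the number of players, so this approach is only practical for small games.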

@ -39,11 +39,11 @@ A two-sentence version would be:
> Forecasters predicted the conclusions that would be reached by Elizabeth Van Nostrand, a generalist researcher, before she conducted a study on the accuracy of various historical claims. We randomly sampled a subset of research claims for her to evaluate, and since we can set that probability arbitrarily low, this method is not bottlenecked by her time.
![](images/4a235d14d0177ec92050af5b2551cdbc337f2d1e.png)
![](.images/4a235d14d0177ec92050af5b2551cdbc337f2d1e.png)
The graph below shows the evolution of the accuracy of the crowd prediction over time, starting from Elizabeth Van Nostrand's prior. Predictions were submitted separately by two groups of forecasters: one based on a mailing list with participants interested in participating in forecasting experiments (recruited from effective altruism-adjacent events and other forecasting platforms), and one recruited from Positly, an online platform for crowdworkers.
![](images/c7e041d8fab837233a9cc4d03c6166c54da04020.png)
![](.images/c7e041d8fab837233a9cc4d03c6166c54da04020.png)
The y-axis shows the accuracy score on a logarithmic scale, and the x-axis shows how far along the experiment is. For example, 14 out of 28 days would correspond to 50%. The thick lines show the average score of the aggregate prediction, across all questions, at each time-point. The shaded areas show the standard error of the scores, so that the graph might be interpreted as a guess of how the two communities would predict a random new question.
@ -61,7 +61,7 @@ We measured “value provided” as the reduction in uncertainty weighted by the
Results were as follows.
![](images/50453b84385fa25f5a934570cfa2bc6702869748.png)
![](.images/50453b84385fa25f5a934570cfa2bc6702869748.png)
In other words, each unit of resource invested in the network-adjacent forecasters provided 72% of the returns of investing it in Elizabeth directly, and each unit invested in the crowdworkers provided negative returns, as they tended to be less accurate than Elizabeth's prior.
@ -229,7 +229,7 @@ For future experiments, we're considering obtaining an objective data-set with
In order to participate in the experiment, a forecaster has to turn their mental models (represented in whichever way the human brain represents models) into quantitative distributions (which is a format quite unlike that native to our brains), as shown in the following diagram:
![](images/fd632951009d3c978277000a6ba9f3834cb4922a.png)
![](.images/fd632951009d3c978277000a6ba9f3834cb4922a.png)
Each step in this chain is quite challenging, requires much practice to master, and can result in a loss of information.
@ -307,4 +307,4 @@ Funding for this project was provided by the Berkeley Existential Risk Initiativ
We thank Beth Barnes and Owain Evans for helpful discussion.
We are also very thankful to all the participants.
We are also very thankful to all the participants.

@ -37,7 +37,7 @@ A two-sentence version would be:
> Forecasters predicted the conclusions that would be reached by Elizabeth Van Nostrand, a generalist researcher, before she conducted a study on the accuracy of various historical claims. We randomly sampled a subset of research claims for her to evaluate, and since we can set that probability arbitrarily low, this method is not bottlenecked by her time.
![](images/4a235d14d0177ec92050af5b2551cdbc337f2d1e.png)
![](.images/4a235d14d0177ec92050af5b2551cdbc337f2d1e.png)
**1\. Evaluator extracts claims from the book and submits priors**
@ -49,7 +49,7 @@ All claims were assigned an importance rating from 1-10 based on their relevance
Elizabeth also spent 3 minutes per claim submitting an initial estimate (referred to as a “prior”).
![](images/533cbee1908697a3c0338e0f5c83b7f960d73551.png)
![](.images/533cbee1908697a3c0338e0f5c83b7f960d73551.png)
Beliefs were typically encoded as distributions over the range 0% to 100%, representing where Elizabeth expected the mean of her posterior credence in the claim to be after 10 more hours of research_._ For more explanation, see this footnote \[4\].
@ -63,7 +63,7 @@ A key part of the design was that forecasters _did_ _not know_ which questi
Two groups of forecasters participated in the experiment: one based on a mailing list with participants interested in participating in forecasting experiments (recruited from effective altruism-adjacent events and other forecasting platforms) \[6\], and one recruited from Positly, an online platform for crowdworkers. The former group is here called “Network-adjacent forecasters” and the latter “Online crowdworkers”.
![](images/5e456cfc58967fc5c074c0287c806653d978b84b.png)
![](.images/5e456cfc58967fc5c074c0287c806653d978b84b.png)
**3\. The evaluator judges the claims**
@ -125,17 +125,17 @@ The aggregate prediction was computed as the average of all forecasters' final p
The following graph shows how the aggregate performed on each question:
![](images/f6ebc350474cbb51b21c0fa5184716c6f2e3eceb.png)
![](.images/f6ebc350474cbb51b21c0fa5184716c6f2e3eceb.png)
The opaque bars represent the scores from the crowdworkers, and the translucent bars, which have higher scores throughout, represent the scores from the network-adjacent forecasters. It's interesting that the order is preserved, that is, that the question difficulty was the same for both groups. Finally, we don't see any correlation between question difficulty and the importance weights Elizabeth assigned to the questions.
However, the comparison is confounded by the fact that the network-adjacent forecasters spent more effort. The above graph also doesn't compare performance to Elizabeth's priors. Hence we also plot the evolution of the aggregate score over prediction number and time (the first data-point in the graphs below represents Elizabeth's priors):
![](images/53f456d57fe63c3f65b05b21cffa42b69d45ed87.png)
![](.images/53f456d57fe63c3f65b05b21cffa42b69d45ed87.png)
![](images/a547c3816ddef37ee3d560ac2d05ec50071df615.png)
![](.images/a547c3816ddef37ee3d560ac2d05ec50071df615.png)
![](images/c7e041d8fab837233a9cc4d03c6166c54da04020.png)
![](.images/c7e041d8fab837233a9cc4d03c6166c54da04020.png)
For the last graph, the y-axis shows the score on a logarithmic scale, and the x-axis shows how far along the experiment is. For example, 14 out of 28 days would correspond to 50%. The thick lines show the average score of the aggregate prediction, across all questions, at each time-point. The shaded areas show the standard error of the scores, so that the graph might be interpreted as a guess of how the two communities would predict a random new question \[10\].
@ -147,9 +147,9 @@ One way to see this qualitatively is by observing the graphs below, where we dis
The x-axis \[12\] refers to Elizabeth's best estimate of the accuracy of a claim, from 0% to 100% (see section “Mechanism design, 1. Evaluator extracts claims” for more detail).
![](images/b10eb78f4b874e299a1a14ef331821ebb47b042a.png)
![](.images/b10eb78f4b874e299a1a14ef331821ebb47b042a.png)
![](images/449cbaaa18d85ac7e1fbf3e7d70defc290367b94.png)
![](.images/449cbaaa18d85ac7e1fbf3e7d70defc290367b94.png)
Another way to understand the performance of the aggregate is to note that the aggregate of network-adjacent forecasters had an average log score of -0.5. To get a rough sense of what that means, it's the score you'd get by being 70% confident in a binary event, and being correct (though note that this binary comparison merely serves to provide intuition, there are technical details making the comparison to a distributional setting a bit tricky).
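To make the binary intuition concrete, here is a small sketch of the log score for a single binary event. The text does not state which logarithm base the scores use, so the base is left as a parameter; the helper name `binary_log_score` is an assumption made for this example. Under base 2, 70% confidence in a correct outcome scores roughly -0.51, whereas under the natural logarithm it scores roughly -0.36.

```python
import math


def binary_log_score(p_assigned_to_outcome, base=2.0):
    """Log of the probability assigned to what actually happened.

    0 is a perfect score; more negative is worse.
    The base is an assumption: the source does not specify it.
    """
    return math.log(p_assigned_to_outcome, base)


score_base2 = binary_log_score(0.7, base=2.0)
score_natural = binary_log_score(0.7, base=math.e)
print(round(score_base2, 3), round(score_natural, 3))
```

As the passage notes, this only builds intuition: the experiment scored distributions rather than binary events, so the comparison is loose.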
@ -189,7 +189,7 @@ The value is computed using the following model (interactive calculation linked
Results were as follows.
![](images/50453b84385fa25f5a934570cfa2bc6702869748.png)
![](.images/50453b84385fa25f5a934570cfa2bc6702869748.png)
_(Links to models: network-adjacent_ _[cost ratio](https://www.getguesstimate.com/models/14521)_ _and_ _[value ratio](https://observablehq.com/@jjj/amplification-effectiveness), online crowdworker_ _[cost ratio](https://www.getguesstimate.com/models/14614)_ _and_ _[value ratio](https://observablehq.com/@jjj/amplification-effectiveness-positly).)_
@ -199,7 +199,7 @@ This observation is in tension with the some of the above graphs, which show a t
Another question to consider when thinking about cost-effectiveness is diminishing returns. The following graph shows how the information gain from additional predictions diminished over time.
![](images/6181a43348526199a4746e6da4bbdc96afdd823b.png)
![](.images/6181a43348526199a4746e6da4bbdc96afdd823b.png)
The x-axis shows the number of predictions after Elizabeth's prior (which would be prediction number 0). The y-axis shows how much closer to a perfect score each prediction moved the aggregate, as a percentage of the distance between the previous aggregate and the perfect log score of 0 \[15\].
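The y-axis quantity described above can be sketched as follows. This is one reading of the description, not the authors' actual code: for each new prediction, take the improvement in the aggregate score and divide it by the remaining gap between the previous aggregate and the perfect score of 0. The function name and the example scores are invented for illustration.

```python
def percent_of_gap_closed(aggregate_scores, perfect_score=0.0):
    """Percentage of the remaining gap to a perfect score closed at each step.

    aggregate_scores[0] is the score of the prior (prediction number 0).
    Assumes no aggregate score is already exactly perfect.
    """
    gains = []
    for previous, current in zip(aggregate_scores, aggregate_scores[1:]):
        gap = perfect_score - previous
        gains.append(100.0 * (current - previous) / gap)
    return gains


# Hypothetical aggregate log scores after predictions 0, 1, 2, and 3.
example_scores = [-2.0, -1.0, -0.75, -0.6]
gains = percent_of_gap_closed(example_scores)
print(gains)
```

With these made-up scores, successive predictions close about 50%, 25%, and 20% of the remaining gap, which is the kind of diminishing pattern the graph is meant to show.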

@ -25,9 +25,9 @@ When the professors deviate from that general structure, they become less convin
The question presented is whether there is a nutrition based poverty trap. A mathematical formalism for a poverty trap is presented, in which wealth at time t+1 depends on wealth at time t. A poverty trap appears if falling below a wealth threshold leads to a further sliding down, that is, if the relationship between wealth at time t and t+1 looks like:
![](images/be7f99df5b8aa33ef7aadd37a7560aa24505e5d9.png) as opposed to like
![](.images/be7f99df5b8aa33ef7aadd37a7560aa24505e5d9.png) as opposed to like
![](images/8b0e4c8fbb9b0400998fbbc58158fa7c79aaeebb.png)
![](.images/8b0e4c8fbb9b0400998fbbc58158fa7c79aaeebb.png)
(In both cases, you start at y0 at time 0, move to y1 at time 1, to y2 at time 2, etc.)
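The difference between the two shapes can be sketched by iterating a wealth transition function. Both functions below are hypothetical stand-ins chosen only for their shape (an S-curve with a threshold at 0.5, and a concave curve without one); they are not taken from the course.

```python
import math


def s_shaped(y):
    """Hypothetical S-shaped transition: fixed points at 0, 0.5 (unstable), and 2."""
    return 2.5 * y**2 / (1.0 + y**2)


def concave(y):
    """Hypothetical concave transition: every positive start converges to 2."""
    return math.sqrt(2.0 * y)


def iterate(transition, y0, steps=100):
    """Apply wealth(t+1) = transition(wealth(t)) repeatedly, starting from y0."""
    y = y0
    for _ in range(steps):
        y = transition(y)
    return y


below_threshold = iterate(s_shaped, 0.4)   # starts just under the threshold
above_threshold = iterate(s_shaped, 0.6)   # starts just over the threshold
no_trap = iterate(concave, 0.4)            # same low start, no trap
print(below_threshold, above_threshold, no_trap)
```

Under the S-shaped curve, starting slightly below the threshold slides wealth towards zero while starting slightly above it climbs to the high equilibrium; under the concave curve the same low start still ends up at the high equilibrium.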
@ -83,4 +83,4 @@ The [MIT edx courses on Data, Economics, and Development Policy](https://microma
Regarding my epistemic status, I can vouch for the quality of the content and of the pedagogy, but not for the signalling value. Models of which kinds of EAs are likely to get career capital from this kind of online course are very welcome; Charity Entrepreneurship mentions [taking online courses](https://forum.effectivealtruism.org/posts/QxCpXjGmHbpX45nxo/how-to-increase-your-odds-of-starting-a-career-in-charity#Possible_actions) as a factor which would increase your odds of starting a career in charity entrepreneurship.
I also strongly suspect that this post is the result of a selection effect, both in terms of liking these courses in particular, and online courses in general, more than average. For example, I much prefer the flexibility of online courses and I'm happy to provide my own motivation structures, perhaps to an unusual degree. I thought that the review was worth posting anyways.
I also strongly suspect that this post is the result of a selection effect, both in terms of liking these courses in particular, and online courses in general, more than average. For example, I much prefer the flexibility of online courses and I'm happy to provide my own motivation structures, perhaps to an unusual degree. I thought that the review was worth posting anyways.

@ -17,7 +17,7 @@ In [Shapley values: Better than counterfactuals](https://forum.effectivealtruism
Because this post is long, images will be interspersed throughout to clearly separate sections and provide rest for tired eyes. This is a habit I have from my blogging days, though one I have not seen used on this forum.
![](images/bc4a6add2d82c0297031b883af215a4dff297d94.png)
![](.images/bc4a6add2d82c0297031b883af215a4dff297d94.png)
## Philanthropic Coordination Theory:
@ -42,7 +42,7 @@ If we try to calculate the Shapley value in this case, we notice that it depends
In any case, their Shapley values are:
![](images/0b51c9065e85902d93abc4de5c676a162431fd9e.png) ![](images/48294ee8a66ec565a22576254823d2292d1d6f7b.png)
![](.images/0b51c9065e85902d93abc4de5c676a162431fd9e.png) ![](.images/48294ee8a66ec565a22576254823d2292d1d6f7b.png)
One can understand this as follows: Player A has Value({A}) already in their hand, and Player B has Value({B}), and they're considering whether to cooperate. If they do, then the surplus from cooperating is Value({A,B}) - (Value({A}) + Value({B})), and it gets shared equally:
@ -136,7 +136,7 @@ Note that, because GiveWell's alternative is known, GiveWell doesn't have to see
* When buying a certificate of impact, the donor would in fact be willing to pay _more_ than $1, because $1 can't get him that much value any more, due to diminishing returns. Similarly, GiveWell would be willing to sell it for _less_ than $1, for the same reason; once diminishing returns start setting in, they would have to donate less than $1 to their best alternative to get the equivalent of $1 of donations to SCI. I've thus pretended that in this market with one seller and one buyer, the price is agreed to be $1. Another solution would be to have an efficient market in certificates of impact.
* The value certificate equilibrium is very similar regardless of whether one is thinking in terms of Shapley values or counterfactuals. I feel, but can't prove, that Shapley values add a kind of clarity and crispness to the reasoning, if only because they force you to consider all the moving parts.
![](images/14578566a70fe4029fdf4fc0b37253dce1b735d2.jpg)
![](.images/14578566a70fe4029fdf4fc0b37253dce1b735d2.jpg)
## Shapley values of forecasters.
@ -202,7 +202,7 @@ while the new forecaster gets rewarded in proportion to
This still preserves some of the same incentives as above, though in this case, attribution becomes more tricky. Further, anecdotally, seeing someone's distribution before updating gives more information than seeing someone's distribution after they've updated, so just seeing the contrast between g(0) and g(1) might be useful to future forecasters.
![](images/4bf82fbcce46ac1e5520ce4070c8883525c80bde.jpg)
![](.images/4bf82fbcce46ac1e5520ce4070c8883525c80bde.jpg)
## A value attribution impossibility theorem.
@ -279,7 +279,7 @@ With that in mind, one of the most interesting facts about Arrow's impossibility
As such, I'm hedging my bets: impossibility theorems must be taken with a grain of salt; they can be stepped over if their assumptions do not hold.
![](images/3edb1377c2726d6123865c43360d67c08adb9aca.jpg)
![](.images/3edb1377c2726d6123865c43360d67c08adb9aca.jpg)
## Parfit's [_Five Mistakes in Moral Mathematics_](http://www.stafforini.com/docs/Parfit%20-%20Five%20mistakes%20in%20moral%20mathematics.pdf).
@ -289,7 +289,7 @@ The Indian Mathematician Brahmagupta describes the solution to the quadratic equ
to describe the solution to the quadratic equation ax^2 + bx = c.
![](images/5c7efddde4aa1a7119a15d93783f6c682d7212cf.svg)
![](.images/5c7efddde4aa1a7119a15d93783f6c682d7212cf.svg)
I read Parfit's piece with the same admiration, sadness and sorrow with which I read the above paragraph. On the one hand, he is often clearly right. On the other hand, he's just working with very rudimentary tools: mere words.
@ -356,7 +356,7 @@ Overall, I think that Shapley values do pretty well on the problems posed by Par
> (C7) Even if an act harms no one, this act may be wrong because it is one of a set of acts that together harm other people. Similarly, even if some act benefits no one, it can be what someone ought to do, because it is one of a set of acts that together benefit other people.
![](images/c1c6470e1f4712d38d21ce31a6a2a4baa3671ab9.jpg)
![](.images/c1c6470e1f4712d38d21ce31a6a2a4baa3671ab9.jpg)
## Shapley value puzzles
@ -433,7 +433,7 @@ Calculate the Shapley values for:
* You (Emma)
* Lucy
![](images/16fb6f6948ae9600cee6ae878a46ea21172e3821.jpg)
![](.images/16fb6f6948ae9600cee6ae878a46ea21172e3821.jpg)
## Speculative Shapley extensions.
@ -478,7 +478,7 @@ If you the agent's information for those expected values, this allows you to pun
However, Shapley values are probably too unsophisticated to be used in situations which are primarily about social incentives.
![](images/8260e1c392a2a710e9421817175e84156922de3d.jpeg)
![](.images/8260e1c392a2a710e9421817175e84156922de3d.jpeg)
## Some complimentary (and yet obligatory) ramblings about Shapley Values, Goodhart's law, and Stanovich's disrationalia.
@ -537,7 +537,7 @@ Suppose that you have two players, player a and Player B, and three charities: G
Or, in graph form:
![](images/b640f8bfd4ce1e54a632d7bb4a3c4bd9e9695418.jpg)
![](.images/b640f8bfd4ce1e54a632d7bb4a3c4bd9e9695418.jpg)
Caveat:

@ -105,7 +105,7 @@ This month they've organized a flurry of activities, most notably:
PredictIt is a prediction platform restricted to US citizens, but also accessible with a VPN. This month, they present a map about the electoral college result in the USA. States are colored according to the market prices:
![](images/654d6212cd170b9287738a89bd6b4535248ed6e1.png)
![](..images/654d6212cd170b9287738a89bd6b4535248ed6e1.png)
Some of the predictions I found most interesting follow. The market probabilities can be found below; the engaged reader might want to write down their own probabilities and then compare.
@ -259,7 +259,7 @@ This section contains items which have recently come to my attention, but which
> Our judges in this study were eight individuals, carefully selected for their expertise as handicappers. Each judge was presented with a list of 88 variables culled from the past performance charts. He was asked to indicate which five variables out of the 88 he would wish to use when handicapping a race, if all he could have was five variables. He was then asked to indicate which 10, which 20, and which 40 he would use if 10, 20, or 40 were available to him.
> We see that accuracy was as good with five variables as it was with 10, 20, or 40. The flat curve is an average over eight subjects and is somewhat misleading. Three of the eight actually showed a decrease in accuracy with more information, two improved, and three stayed about the same. All of the handicappers became more confident in their judgments as information increased.
> ![](images/e8ac191e43364ff35bdc19361dd92c9a74e7109a.png)
> ![](..images/e8ac191e43364ff35bdc19361dd92c9a74e7109a.png)
* The study contains other nuggets, such as:
* An experiment on trying to predict the outcome of a given equation. When the feedback has a margin of error, this confuses respondents.
@ -275,4 +275,4 @@ This section contains items which have recently come to my attention, but which
Vale.
Conflicts of interest: Marked as (c.o.i) throughout the text.
Note to the future: All links are automatically added to the Internet Archive. In case of link rot, go [there](https://archive.org/).
Note to the future: All links are automatically added to the Internet Archive. In case of link rot, go [there](https://archive.org/).

@ -115,7 +115,7 @@ Ordered in subjective order of importance:
* The International Energy Agency had terrible forecasts on solar photo-voltaic energy production, until [recently](https://pv-magazine-usa.com/2020/07/12/has-the-international-energy-agency-finally-improved-at-forecasting-solar-growth/):
> ![](images/7244132c6380f86a5fc5327b5c6abb70e741097a.jpg)
> ![](.images/7244132c6380f86a5fc5327b5c6abb70e741097a.jpg)
> ...It's a scenario assuming current policies are kept and no new policies are added.
@ -241,7 +241,7 @@ Ordered in subjective order of importance:
* [Taleb](https://forecasters.org/blog/2020/06/14/on-single-point-forecasts-for-fat-tailed-variables/): _On single point forecasts for fat tailed variables_. Leitmotiv: Pandemics are fat-tailed.
> ![](images/d263195904a7942604599ff703fcb71f28d0a156.png) ![](images/860ccc6875dd7044a884708cd8c34c6bb3d70506.png)
> ![](.images/d263195904a7942604599ff703fcb71f28d0a156.png) ![](.images/860ccc6875dd7044a884708cd8c34c6bb3d70506.png)
> We do not need more evidence under fat tailed distributions — it is there in the properties themselves (properties for which we have ample evidence) and these clearly represent risk that must be killed in the egg (when it is still cheap to do so). Secondly, unreliable data — or any source of uncertainty — should make us follow the most paranoid route. \[...\] more uncertainty in a system makes precautionary decisions very easy to make (if I am uncertain about the skills of the pilot, I get off the plane).
@ -300,4 +300,4 @@ Note to the future: All links are added automatically to the Internet Archive. I
> "horses for courses, the way you do the forecast, the way you present it depends on what you're trying to achieve with it"
---
---

@ -179,7 +179,7 @@ The Foresight Institute organizes weekly talks; here is one with Samo Burja on [l
Last, but not least, Ozzie Gooen on [Multivariate estimation & the Squiggly language](https://www.lesswrong.com/posts/kTzADPE26xh3dyTEu/multivariate-estimation-and-the-squiggly-language):
![](images/fef39d9a14a8ca8986c984ba2f8227d1581d9421.jpg)
![](.images/fef39d9a14a8ca8986c984ba2f8227d1581d9421.jpg)
---
@ -189,4 +189,4 @@ Note to the future: All links are added automatically to the Internet Archive. I
> [Littlewood's law](https://en.wikipedia.org/wiki/Littlewood%27s_law) states that a person can expect to experience events with odds of one in a million (defined by the law as a "miracle") at the rate of about one per month.
---
---

@ -70,7 +70,7 @@ Boeing [releases](https://www.fool.com/investing/2020/10/15/boeings-commercial-m
The [World Agricultural Supply and Demand Estimates](https://www.usda.gov/oce/commodity/wasde) is a monthly report by the US Department of Agriculture. It provides monthly estimates and past figures for crops worldwide, and for livestock production in the US specifically (meat, poultry, dairy), which might be of interest to the animal suffering movement. It also provides estimates of the past reliability of those forecasts. The October report can be found [here](https://www.usda.gov/oce/commodity/wasde/wasde1020.pdf), along with a summary [here](https://www.feedstuffs.com/markets/usda-raises-meat-poultry-production-forecast). The image below presents the 2020 and 2021 predictions, as well as the 2019 numbers:
![](images/c1fa5c2d58e4b0b76d0c7afab896ee89a7289559.png)
![](.images/c1fa5c2d58e4b0b76d0c7afab896ee89a7289559.png)
The Atlantic considers scenarios under which [Trump refuses to concede](https://www.theatlantic.com/magazine/archive/2020/11/what-if-trump-refuses-concede/616424/). Warning: very long, very chilling.
@ -105,4 +105,4 @@ Note to the future: All links are added automatically to the Internet Archive. I
Using actuarial life tables and an adjustment for covid, the implied probability that all 246 readers of this newsletter drop dead before the next month is at least 10^(-900) (if they were uncorrelated). See [this Wikipedia page](https://en.wikipedia.org/wiki/Orders_of_magnitude_(probability)) or [this xkcd comic](https://xkcd.com/2379/) for a comparison with other low probability events, such as asteroid impacts.
---
---

@ -13,7 +13,7 @@ For example if only the top forecaster wins a prize, you might want to predict a
Consider for example [this prediction contest](https://predictingpolitics.com/2020/08/02/the-predictions-are-in/), which only had a prize for #1. The following question asks about the margin Biden would win or lose Georgia by:
![](images/f736a63590b98d329a456b5ff1cc055da86d416c.png)
![](.images/f736a63590b98d329a456b5ff1cc055da86d416c.png)
Then the most likely scenario might be a close race, but the prediction which would have maximized your odds of coming in #1st might be much more extremized, because other predictors are more tightly packed in the middle.
@ -73,4 +73,4 @@ In this case, information gatherers might then be upvoted by prediction producer
* One might explicitly reward reasoning rather than accuracy (this has been tried on Metaculus for the insight tournament, and also for the [El Paso series).](https://pandemic.metaculus.com/contests/?selected=el-paso) This has its own downsides, notably that its not obvious that reasoning which looks good/reads well is actually correct.
* One might make objectives more fuzzy, like the Metaculus points system, hoping this would make it more difficult to hack.
* One might reward activity, i.e., frequency of updates, or some other proxy expected to correlate with forecasting accuracy. This might work better if the correlation is causal (i.e., better forecasters have higher accuracy because they forecast more often), rather than due to a confounding factor. The obvious danger with any such strategy is that rewarding the proxy [is likely to break the correlation](https://www.lesswrong.com/posts/YtvZxRpZjcFNwJecS/the-importance-of-goodhart-s-law#:~:text=Goodhart's%20law%20states%20that%20once,to%20the%20Bank%20of%20England.).
* One might reward activity, i.e., frequency of updates, or some other proxy expected to correlate with forecasting accuracy. This might work better if the correlation is causal (i.e., better forecasters have higher accuracy because they forecast more often), rather than due to a confounding factor. The obvious danger with any such strategy is that rewarding the proxy [is likely to break the correlation](https://www.lesswrong.com/posts/YtvZxRpZjcFNwJecS/the-importance-of-goodhart-s-law#:~:text=Goodhart's%20law%20states%20that%20once,to%20the%20Bank%20of%20England.).

@ -73,7 +73,7 @@ I also had a payout for insightful comments, such that the nth most upvoted comm
Note that the projects were _not_ chosen so as to maximize impact, but rather as to maximize information about whether their value could be predicted.
![](images/0e2bfcbc0fa993098421b08b67af411ec30e8ffb.png)
![](.images/0e2bfcbc0fa993098421b08b67af411ec30e8ffb.png)
We observe that:
@ -96,7 +96,7 @@ Predictions:
* Centered 50% confidence interval: 33 to 74
* Centered 95% confidence interval: 11 to 111
![](images/4a4419c6711561edc7b61a8c0633da2717c0913f.png)
![](.images/4a4419c6711561edc7b61a8c0633da2717c0913f.png)
Actual upvotes after one month: 31
@ -114,7 +114,7 @@ Predictions:
* Centered 50% confidence interval: 28 to 68
* Centered 95% confidence interval: 9 to 106
![](images/a71da8f5dd6340d4f81828cb1cb5b20b292c3188.png)
![](.images/a71da8f5dd6340d4f81828cb1cb5b20b292c3188.png)
Actual upvotes after one month: 20 for the [first post](https://www.lesswrong.com/posts/yWmmYLCJft7u7XL5o/some-examples-of-technology-timelines), 49 for the [second one](https://www.lesswrong.com/posts/FaCqw2x59ZFhMXJr9/a-prior-for-technological-discontinuities). I'm taking 49 as the resolution.
@ -132,7 +132,7 @@ Predictions:
* Centered 50% confidence interval: 15 to 57
* Centered 95% confidence interval: 4 to 91
![](images/e1d6f9b8bfd27b8f9102143b112b9d9c7fc75fb7.png)
![](.images/e1d6f9b8bfd27b8f9102143b112b9d9c7fc75fb7.png)
Actual upvotes after one month: 22
@ -150,7 +150,7 @@ Predictions:
* Centered 50% confidence interval: 17 to 60
* Centered 95% confidence interval: 6 to 102
![](images/7da9d104143c060272f96cbd12bcc27e2c71dbf9.png)
![](.images/7da9d104143c060272f96cbd12bcc27e2c71dbf9.png)
Actual upvotes after one month: 20
@ -167,7 +167,7 @@ Predictions:
* Centered 50% confidence interval: 19 to 59
* Centered 95% confidence interval: 7 to 92
![](images/89ff6294dfe357bfebc3640584d42060934103c4.png)
![](.images/89ff6294dfe357bfebc3640584d42060934103c4.png)
Actual upvotes after one month: 27
@ -182,7 +182,7 @@ A further idea was my proposal for my Summer Research Fellowship at FHI, though
* Centered 50% confidence interval: 24 to 70
* Centered 95% confidence interval: 10 to 103
![](images/9e6a4434c603e9fd94e9a6819fff247011a5ab93.png)
![](.images/9e6a4434c603e9fd94e9a6819fff247011a5ab93.png)
I take these data-points as further evidence that this setup is interesting or worth it; arguably a major take-away for this project is “a fairly simple forecasting system is able to produce a project which gets accepted to the FHI summer fellowship.” Because the program got ~300 [applications](https://forum.effectivealtruism.org/posts/EPGdwe6vsCY7A9HPa/review-of-fhi-s-summer-research-fellowship-2020), but only 27 participants were accepted, this puts this forecasting setup on the top 9% of applicants in terms of some fuzzy “optimization power” (though this is a simplification, because the project proposal was probably one of many factors.)
@ -250,4 +250,4 @@ If you would be interested in participating in a scaled-up version of this exper
---
Conflict of interest: I've worked in the past as a paid contractor for foretold/Ozzie Gooen. Jacob Lagerros provided 2/3rds of the payout funding.
Conflict of interest: I've worked in the past as a paid contractor for foretold/Ozzie Gooen. Jacob Lagerros provided 2/3rds of the payout funding.

@ -11,7 +11,7 @@ As my primary conclusion, I found that assessing the relative value of projects
The reader can see my results [here](https://docs.google.com/spreadsheets/d/1BGmkDrQzDCYdMXEvIv7DhMusG-7EwBTkz8MiaEBq9VE/edit#gid=0), though Id like to note that theyre preliminary and experimental, as is the rest of this post. Towards the end of this post, there is a section which mentions more caveats.
![](images/16bb08699adfac73a6545a3e056f19480a486da7.png)
![](.images/16bb08699adfac73a6545a3e056f19480a486da7.png)
Related work:

Some files were not shown because too many files have changed in this diff.
