fix: rationalize image directories
|
@@ -57,11 +57,11 @@ CSET's Michael Page also published [Wisdom of the Crowd as Arbiter of Expert Dis
|
||||||
|
|
||||||
As in the first season, Samotsvety Forecasting, a team made up of Eli Lifland, Misha Yagudin, and myself, completely demolished the competition. We were around twice as good as the next-best team in terms of the relative Brier score.
|
As in the first season, Samotsvety Forecasting, a team made up of Eli Lifland, Misha Yagudin, and myself, completely demolished the competition. We were around twice as good as the next-best team in terms of the relative Brier score.
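For readers unfamiliar with the metric, here is a minimal sketch of a plain binary Brier score. The `relative_brier_score` helper is a hypothetical illustration that assumes "relative" means the difference from some reference forecaster's score; the platform's exact definition is not given in this text.

```python
from typing import Sequence


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Lower is better: 0.0 is a perfect forecaster, and always answering 50%
    scores 0.25.
    """
    if len(forecasts) == 0 or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equally long")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


def relative_brier_score(own_score: float, reference_score: float) -> float:
    """Hypothetical helper: own score minus a reference score (negative is better).

    Assumption: the reference could be, for example, a median forecaster.
    """
    return own_score - reference_score
```

Under this assumed definition, a team that is "twice as good" would have a relative score twice as far below zero as the next-best team.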
|
||||||
|
|
||||||
![](images/d26defbf40ccec176cb4c6234d4c03e4bce6f5dc.png)
|
![](.images/d26defbf40ccec176cb4c6234d4c03e4bce6f5dc.png)
|
||||||
|
|
||||||
Amusingly, all three of us are in the top 3 of all time (out of 1035 contenders.)
|
Amusingly, all three of us are in the top 3 of all time (out of 1035 contenders.)
|
||||||
|
|
||||||
![](images/50628176a24bba3c42a1b2e7888497a63a748502.png)
|
![](.images/50628176a24bba3c42a1b2e7888497a63a748502.png)
|
||||||
|
|
||||||
### Metaculus
|
### Metaculus
|
||||||
|
|
||||||
|
@@ -94,7 +94,7 @@ In particular, most other prediction markets/forecasting platforms—like Hyperm
|
||||||
|
|
||||||
Moreover, the team has a couple of ex-Googlers, so beating everyone else in the technology front seems like a plausible pathway to dominance. I encourage people to [give it a try](https://manifold.markets/) ([a](https://web.archive.org/web/20220109142701/https://manifold.markets/)). Some of the markets are entertaining, and for now, it's just play money.
|
Moreover, the team has a couple of ex-Googlers, so beating everyone else in the technology front seems like a plausible pathway to dominance. I encourage people to [give it a try](https://manifold.markets/) ([a](https://web.archive.org/web/20220109142701/https://manifold.markets/)). Some of the markets are entertaining, and for now, it's just play money.
|
||||||
|
|
||||||
![](images/14ee30f907ad1872a88a012639eadefe0c30e0e3.png)
|
![](.images/14ee30f907ad1872a88a012639eadefe0c30e0e3.png)
|
||||||
|
|
||||||
For those curious, an explanation of Manifold Markets' tricky dynamic parimutuel betting system can be found [here](https://manifoldmarkets.notion.site/Technical-Overview-b9b48a09ea1f45b88d991231171730c5) ([a](https://web.archive.org/web/20220109142731/https://manifoldmarkets.notion.site/Technical-Overview-b9b48a09ea1f45b88d991231171730c5)).
|
For those curious, an explanation of Manifold Markets' tricky dynamic parimutuel betting system can be found [here](https://manifoldmarkets.notion.site/Technical-Overview-b9b48a09ea1f45b88d991231171730c5) ([a](https://web.archive.org/web/20220109142731/https://manifoldmarkets.notion.site/Technical-Overview-b9b48a09ea1f45b88d991231171730c5)).
|
||||||
|
|
||||||
|
@@ -122,7 +122,7 @@ A [white paper by the Good Judgment project](https://goodjudgment.com/wp-content
|
||||||
|
|
||||||
The paper also doesn’t have the visceral impact of the comparison in the Superforecasting book, where the original Good Judgment project beat intelligence analysts with classified information. This time, the paper compares paid superforecasters against unpaid hobbyists. I guess I’d have liked to see a comparison between different platforms, e.g., a Good Judgment vs Metaculus or vs PredictIt head-to-head fight.
|
The paper also doesn’t have the visceral impact of the comparison in the Superforecasting book, where the original Good Judgment project beat intelligence analysts with classified information. This time, the paper compares paid superforecasters against unpaid hobbyists. I guess I’d have liked to see a comparison between different platforms, e.g., a Good Judgment vs Metaculus or vs PredictIt head-to-head fight.
|
||||||
|
|
||||||
![](images/8e30a71d8031dc5de68eb6fed3d5460400aa878e.png)
|
![](.images/8e30a71d8031dc5de68eb6fed3d5460400aa878e.png)
|
||||||
|
|
||||||
Jaime Sevilla looks at [aggregating forecasts in a principled way](https://forum.effectivealtruism.org/s/hjiBqAJNKhfJFq7kf/p/biL94PKfeHmgHY6qe) ([a](https://web.archive.org/web/20220109143140/https://forum.effectivealtruism.org/s/hjiBqAJNKhfJFq7kf/p/biL94PKfeHmgHY6qe)), building on his [previous work](https://forum.effectivealtruism.org/s/hjiBqAJNKhfJFq7kf) ([a](https://web.archive.org/web/20220109143221/https://forum.effectivealtruism.org/s/hjiBqAJNKhfJFq7kf)). This time, he explains [a result by Neyman et al.](https://arxiv.org/abs/2111.03153) ([a](https://web.archive.org/web/20220109143254/https://arxiv.org/abs/2111.03153)), and tests it on past Metaculus data. He finds that it beats Metaculus' own prediction, as well as all other aggregation methods commonly considered.
|
Jaime Sevilla looks at [aggregating forecasts in a principled way](https://forum.effectivealtruism.org/s/hjiBqAJNKhfJFq7kf/p/biL94PKfeHmgHY6qe) ([a](https://web.archive.org/web/20220109143140/https://forum.effectivealtruism.org/s/hjiBqAJNKhfJFq7kf/p/biL94PKfeHmgHY6qe)), building on his [previous work](https://forum.effectivealtruism.org/s/hjiBqAJNKhfJFq7kf) ([a](https://web.archive.org/web/20220109143221/https://forum.effectivealtruism.org/s/hjiBqAJNKhfJFq7kf)). This time, he explains [a result by Neyman et al.](https://arxiv.org/abs/2111.03153) ([a](https://web.archive.org/web/20220109143254/https://arxiv.org/abs/2111.03153)), and tests it on past Metaculus data. He finds that it beats Metaculus' own prediction, as well as all other aggregation methods commonly considered.
|
||||||
|
|
||||||
|
@@ -157,7 +157,7 @@ Google has revealed the existence of a new [internal prediction market](https://
|
||||||
|
|
||||||
The Bank Al-Maghrib—the central bank of Morocco—has a very clearly written report on [monetary policy](https://www.bkam.ma/en/content/download/744906/8479246/RPM%20%20ang%20Q3%202021.pdf) ([a](https://web.archive.org/web/20220109144144/https://www.bkam.ma/en/content/download/744906/8479246/RPM%20%20ang%20Q3%202021.pdf)). I found the international comparisons on growth, interest and inflation rates to be particularly interesting.
|
The Bank Al-Maghrib—the central bank of Morocco—has a very clearly written report on [monetary policy](https://www.bkam.ma/en/content/download/744906/8479246/RPM%20%20ang%20Q3%202021.pdf) ([a](https://web.archive.org/web/20220109144144/https://www.bkam.ma/en/content/download/744906/8479246/RPM%20%20ang%20Q3%202021.pdf)). I found the international comparisons on growth, interest and inflation rates to be particularly interesting.
|
||||||
|
|
||||||
![](images/322406ae40fd3e2a7738dad8cfb22bc6f2ccd95f.png)
|
![](.images/322406ae40fd3e2a7738dad8cfb22bc6f2ccd95f.png)
|
||||||
|
|
||||||
Some news media & individuals wrote some quantified predictions for 2022: [Vox](https://www.vox.com/future-perfect/22824620/predicting-midterms-covid-roe-wade-oscars-2022) ([a](https://web.archive.org/web/20220109144227/https://www.vox.com/future-perfect/22824620/predicting-midterms-covid-roe-wade-oscars-2022)), [UnHerd](https://unherd.com/2022/01/my-predictions-for-2022/) ([a](https://web.archive.org/web/20220110184541/https://unherd.com/2022/01/my-predictions-for-2022/)), [The Economist](https://www.economist.com/graphic-detail/2022/01/01/what-prediction-markets-suggest-will-happen-in-2022) ([a](https://web.archive.org/web/20220110184632/https://www.economist.com/graphic-detail/2022/01/01/what-prediction-markets-suggest-will-happen-in-2022)), [Ipsos](https://www.ipsos.com/en/global-predictions-2022) ([a](https://web.archive.org/web/20220110184849/https://www.ipsos.com/en/global-predictions-2022)), [Matt Rickard](https://matt-rickard.com/2022-predictions/) ([a](https://web.archive.org/web/20220110185055/https://matt-rickard.com/2022-predictions/)), [Avraham Eisenberg](https://misinfounderload.substack.com/p/predictions-for-2022) ([a](https://web.archive.org/web/20220110185128/https://misinfounderload.substack.com/p/predictions-for-2022)), [Mathew Yglesias](https://www.slowboring.com/p/predictions-are-hard) ([a](https://web.archive.org/web/20220110185219/https://www.slowboring.com/p/predictions-are-hard)), [The Atlantic Council](https://www.atlanticcouncil.org/content-series/atlantic-council-strategy-paper-series/the-top-twelve-risks-and-opportunities-for-2022/) ([a](https://web.archive.org/web/20220110185255/https://www.atlanticcouncil.org/content-series/atlantic-council-strategy-paper-series/the-top-twelve-risks-and-opportunities-for-2022/)), and [Blackrock](https://www.blackrock.com/corporate/insights/blackrock-investment-institute/interactive-charts/geopolitical-risk-dashboard) 
([a](https://web.archive.org/web/20220110185321/https://www.blackrock.com/corporate/insights/blackrock-investment-institute/interactive-charts/geopolitical-risk-dashboard)). h/t to Clay Graubard for [this longer list of 2022 predictions](https://docs.google.com/document/d/1cvKZZ6PKdJh6WIcdyQ-k8kNjq7tOLjV-t9IRaCHCp7Y/edit#), from which some of the aforementioned were taken. It feels like there are more of these than last year, and the Mathew Yglesias piece is by a particularly mainstream author, which might be indicative that forecasting is becoming something less niche.
|
Some news media & individuals wrote some quantified predictions for 2022: [Vox](https://www.vox.com/future-perfect/22824620/predicting-midterms-covid-roe-wade-oscars-2022) ([a](https://web.archive.org/web/20220109144227/https://www.vox.com/future-perfect/22824620/predicting-midterms-covid-roe-wade-oscars-2022)), [UnHerd](https://unherd.com/2022/01/my-predictions-for-2022/) ([a](https://web.archive.org/web/20220110184541/https://unherd.com/2022/01/my-predictions-for-2022/)), [The Economist](https://www.economist.com/graphic-detail/2022/01/01/what-prediction-markets-suggest-will-happen-in-2022) ([a](https://web.archive.org/web/20220110184632/https://www.economist.com/graphic-detail/2022/01/01/what-prediction-markets-suggest-will-happen-in-2022)), [Ipsos](https://www.ipsos.com/en/global-predictions-2022) ([a](https://web.archive.org/web/20220110184849/https://www.ipsos.com/en/global-predictions-2022)), [Matt Rickard](https://matt-rickard.com/2022-predictions/) ([a](https://web.archive.org/web/20220110185055/https://matt-rickard.com/2022-predictions/)), [Avraham Eisenberg](https://misinfounderload.substack.com/p/predictions-for-2022) ([a](https://web.archive.org/web/20220110185128/https://misinfounderload.substack.com/p/predictions-for-2022)), [Mathew Yglesias](https://www.slowboring.com/p/predictions-are-hard) ([a](https://web.archive.org/web/20220110185219/https://www.slowboring.com/p/predictions-are-hard)), [The Atlantic Council](https://www.atlanticcouncil.org/content-series/atlantic-council-strategy-paper-series/the-top-twelve-risks-and-opportunities-for-2022/) ([a](https://web.archive.org/web/20220110185255/https://www.atlanticcouncil.org/content-series/atlantic-council-strategy-paper-series/the-top-twelve-risks-and-opportunities-for-2022/)), and [Blackrock](https://www.blackrock.com/corporate/insights/blackrock-investment-institute/interactive-charts/geopolitical-risk-dashboard) 
([a](https://web.archive.org/web/20220110185321/https://www.blackrock.com/corporate/insights/blackrock-investment-institute/interactive-charts/geopolitical-risk-dashboard)). h/t to Clay Graubard for [this longer list of 2022 predictions](https://docs.google.com/document/d/1cvKZZ6PKdJh6WIcdyQ-k8kNjq7tOLjV-t9IRaCHCp7Y/edit#), from which some of the aforementioned were taken. It feels like there are more of these than last year, and the Mathew Yglesias piece is by a particularly mainstream author, which might be indicative that forecasting is becoming something less niche.
|
||||||
|
|
||||||
|
|
|
@@ -16,7 +16,7 @@ It is 1964. Sherman Kent is a senior intelligence analyst. While doing some rudi
|
||||||
|
|
||||||
It is the 30th of April of 1975. With the [fall of Saigon](https://en.wikipedia.org/wiki/Fall_of_Saigon), the US finally pulls out of a bloody war with Vietnam. There are embarrassing images of people flying out of the US embassy at the last moment. Biden is a newly-minted [senator from Delaware](https://en.wikipedia.org/wiki/US_Senate_career_of_Joe_Biden).
|
It is the 30th of April of 1975. With the [fall of Saigon](https://en.wikipedia.org/wiki/Fall_of_Saigon), the US finally pulls out of a bloody war with Vietnam. There are embarrassing images of people flying out of the US embassy at the last moment. Biden is a newly-minted [senator from Delaware](https://en.wikipedia.org/wiki/US_Senate_career_of_Joe_Biden).
|
||||||
|
|
||||||
![](images/dd78f1183249ae5dafd7578361c66a800c17f842.jpeg)
|
![](.images/dd78f1183249ae5dafd7578361c66a800c17f842.jpeg)
|
||||||
|
|
||||||
It is 2001. The US intelligence agencies are very embarrassed by not having been able to predict the September 11 attacks. The position of the [Director of National Intelligence](https://en.wikipedia.org/wiki/Director_of_National_Intelligence#Founding), and an associated [Office of the Director of National Intelligence](https://en.wikipedia.org/wiki/Director_of_National_Intelligence#Office_of_the_Director_of_National_Intelligence), is established to coordinate all intelligence agencies to do better in the future.
|
It is 2001. The US intelligence agencies are very embarrassed by not having been able to predict the September 11 attacks. The position of the [Director of National Intelligence](https://en.wikipedia.org/wiki/Director_of_National_Intelligence#Founding), and an associated [Office of the Director of National Intelligence](https://en.wikipedia.org/wiki/Director_of_National_Intelligence#Office_of_the_Director_of_National_Intelligence), is established to coordinate all intelligence agencies to do better in the future.
|
||||||
|
|
||||||
|
@@ -34,7 +34,7 @@ It is the summer of 2021. Biden makes incredibly overconfident assertions about
|
||||||
|
|
||||||
> **“**There’s going to be no circumstance where you see people being lifted off the roof of an embassy in the — of the United States from Afghanistan. \[...\] the likelihood there’s going to be the Taliban overrunning everything and owning the whole country is highly unlikely.**”** — Biden, [July 08, 2021](https://www.whitehouse.gov/briefing-room/speeches-remarks/2021/07/08/remarks-by-president-biden-on-the-drawdown-of-u-s-forces-in-afghanistan/)
|
> **“**There’s going to be no circumstance where you see people being lifted off the roof of an embassy in the — of the United States from Afghanistan. \[...\] the likelihood there’s going to be the Taliban overrunning everything and owning the whole country is highly unlikely.**”** — Biden, [July 08, 2021](https://www.whitehouse.gov/briefing-room/speeches-remarks/2021/07/08/remarks-by-president-biden-on-the-drawdown-of-u-s-forces-in-afghanistan/)
|
||||||
|
|
||||||
![](images/300e18f673b8a4e05c5bd2b09b556ae77ffddea9.jpeg)
|
![](.images/300e18f673b8a4e05c5bd2b09b556ae77ffddea9.jpeg)
|
||||||
|
|
||||||
Come Christmas of 2021, the CFTC gives Americans the gift of disappointment by shutting down Polymarket in the US, one of the few places where real money was being traded around topics of extreme interest to Americans, like US covid cases.
|
Come Christmas of 2021, the CFTC gives Americans the gift of disappointment by shutting down Polymarket in the US, one of the few places where real money was being traded around topics of extreme interest to Americans, like US covid cases.
|
||||||
|
|
||||||
|
|
|
@@ -26,7 +26,7 @@ Since Polymarket [settled with the CFTC last month](https://www.cftc.gov/media/6
|
||||||
|
|
||||||
[Manifold Markets](https://manifold.markets/) ([a](https://web.archive.org/web/20220202214710/https://manifold.markets/)), a new play-money prediction market, has kept its frantic development pace. They will be "launching" on February the 8th, from their current open beta. They outline some of their hopes and plans in their [substack newsletter](https://manifold.markets/) ([a](https://web.archive.org/web/20220202214710/https://manifold.markets/)).
|
[Manifold Markets](https://manifold.markets/) ([a](https://web.archive.org/web/20220202214710/https://manifold.markets/)), a new play-money prediction market, has kept its frantic development pace. They will be "launching" on February the 8th, from their current open beta. They outline some of their hopes and plans in their [substack newsletter](https://manifold.markets/) ([a](https://web.archive.org/web/20220202214710/https://manifold.markets/)).
|
||||||
|
|
||||||
![](images/21abfb15da48ac2352f1b9902216f6549fbfcc58.png)
|
![](.images/21abfb15da48ac2352f1b9902216f6549fbfcc58.png)
|
||||||
|
|
||||||
I created some markets on whether I will consider the [Principles of Intelligent Behavior in Biological and Social Systems](https://www.pibbss.ai/) fellowship to be ["a success"](https://manifold.markets/Nu%C3%B1oSempere/will-i-find-that-the-pibbss-fellows) ([a](https://web.archive.org/web/20220202214733/https://manifold.markets/Nu%C3%B1oSempere/will-i-find-that-the-pibbss-fellows)), on [the number of subscribers to this newsletter](https://manifold.markets/Nu%C3%B1oSempere/how-many-additional-subscribers-wil) ([a](https://web.archive.org/web/20220202214804/https://manifold.markets/Nu%C3%B1oSempere/how-many-additional-subscribers-wil)), and on whether [my current employer will still be alive by the end of 2022](https://manifold.markets/Nu%C3%B1oSempere/will-the-quantified-uncertainty-res) ([a](https://web.archive.org/web/20220202214836/https://manifold.markets/Nu%C3%B1oSempere/will-the-quantified-uncertainty-res)). The process was extremely painless, and I recommend that readers [give it a try](https://manifold.markets/).
|
I created some markets on whether I will consider the [Principles of Intelligent Behavior in Biological and Social Systems](https://www.pibbss.ai/) fellowship to be ["a success"](https://manifold.markets/Nu%C3%B1oSempere/will-i-find-that-the-pibbss-fellows) ([a](https://web.archive.org/web/20220202214733/https://manifold.markets/Nu%C3%B1oSempere/will-i-find-that-the-pibbss-fellows)), on [the number of subscribers to this newsletter](https://manifold.markets/Nu%C3%B1oSempere/how-many-additional-subscribers-wil) ([a](https://web.archive.org/web/20220202214804/https://manifold.markets/Nu%C3%B1oSempere/how-many-additional-subscribers-wil)), and on whether [my current employer will still be alive by the end of 2022](https://manifold.markets/Nu%C3%B1oSempere/will-the-quantified-uncertainty-res) ([a](https://web.archive.org/web/20220202214836/https://manifold.markets/Nu%C3%B1oSempere/will-the-quantified-uncertainty-res)). The process was extremely painless, and I recommend that readers [give it a try](https://manifold.markets/).
|
||||||
|
|
||||||
|
@@ -38,7 +38,7 @@ This is a fairly large amount of [EA](https://en.wikipedia.org/wiki/Effective_al
|
||||||
|
|
||||||
In [the December issue](https://forecasting.substack.com/p/forecasting-newsletter-december-2021) of this newsletter, I assessed that the move from CSET to ARLIS was probably a negative development, partially because I thought that funding from Open Philanthropy was much better than government funding. As it happens, ARLIS has just now received funding from Open Philanthropy as well. Multiple people also reached out to comment that the move was probably neutral or positive, on account of ARLIS' deeper involvement with the US government. My independent impression is that I still dislike the move, but my all-things-considered view is that it's probably ok and I was wrong. As to how wrong, we’ll see.
|
In [the December issue](https://forecasting.substack.com/p/forecasting-newsletter-december-2021) of this newsletter, I assessed that the move from CSET to ARLIS was probably a negative development, partially because I thought that funding from Open Philanthropy was much better than government funding. As it happens, ARLIS has just now received funding from Open Philanthropy as well. Multiple people also reached out to comment that the move was probably neutral or positive, on account of ARLIS' deeper involvement with the US government. My independent impression is that I still dislike the move, but my all-things-considered view is that it's probably ok and I was wrong. As to how wrong, we’ll see.
|
||||||
|
|
||||||
![](images/d26defbf40ccec176cb4c6234d4c03e4bce6f5dc.png)
|
![](.images/d26defbf40ccec176cb4c6234d4c03e4bce6f5dc.png)
|
||||||
|
|
||||||
Separately, because their Pro Forecaster program still only pays $20/hour, my team—Samotsvety Forecasting, which overwhelmingly won the last two seasons—might not be participating going forward, though we are trying to negotiate with them. I talked to a few super-forecasters about this, and $20/hour isn't going to get ARLIS the best forecasters. Their open call for pro forecasters can be found [here](https://www.infer-pub.com/open-call-pro-forecasters) ([a](https://web.archive.org/web/20220202214636/https://www.infer-pub.com/open-call-pro-forecasters)).
|
Separately, because their Pro Forecaster program still only pays $20/hour, my team—Samotsvety Forecasting, which overwhelmingly won the last two seasons—might not be participating going forward, though we are trying to negotiate with them. I talked to a few super-forecasters about this, and $20/hour isn't going to get ARLIS the best forecasters. Their open call for pro forecasters can be found [here](https://www.infer-pub.com/open-call-pro-forecasters) ([a](https://web.archive.org/web/20220202214636/https://www.infer-pub.com/open-call-pro-forecasters)).
|
||||||
|
|
||||||
|
@@ -72,7 +72,7 @@ Insight Prediction has launched a real-money beta with limited access and is loo
|
||||||
|
|
||||||
On the negative side, Insight Prediction had previously been stuck in development for a long time. They were originally planning to launch in June or July 2021, though at the time one of the funders refused to take a $20 bet I offered on their timelines. Reliable anonymous sources have also expressed some skepticism about the project—not necessarily in the sense of being a scam, but rather in terms of their plans being unfocused.
|
On the negative side, Insight Prediction had previously been stuck in development for a long time. They were originally planning to launch in June or July 2021, though at the time one of the funders refused to take a $20 bet I offered on their timelines. Reliable anonymous sources have also expressed some skepticism about the project—not necessarily in the sense of being a scam, but rather in terms of their plans being unfocused.
|
||||||
|
|
||||||
![](images/dd7724278ea2aeafd57e743659444ff147c23f43.png)
|
![](.images/dd7724278ea2aeafd57e743659444ff147c23f43.png)
|
||||||
|
|
||||||
Prediction market players who want to participate in the early access beta can reach out per [email](mailto:insightprediction@gmail.com).
|
Prediction market players who want to participate in the early access beta can reach out per [email](mailto:insightprediction@gmail.com).
|
||||||
|
|
||||||
|
|
|
@@ -15,21 +15,21 @@ Under my understanding of the many-worlds interpretation of quantum mechanics, w
|
||||||
|
|
||||||
This in itself doesn't buy you much. If you choose which color to paint your house based on quantum randomness, you still split the timeline. But this doesn't affect the probability in any of the branches that an AI, asteroid or pandemic comes and kills you and everything you care about.
|
This in itself doesn't buy you much. If you choose which color to paint your house based on quantum randomness, you still split the timeline. But this doesn't affect the probability in any of the branches that an AI, asteroid or pandemic comes and kills you and everything you care about.
|
||||||
|
|
||||||
![](images/af599776389cd95e938e9bbbfe1c7871c8d10260.png)
|
![](.images/af599776389cd95e938e9bbbfe1c7871c8d10260.png)
|
||||||
|
|
||||||
Quantum randomness over small-scale choices doesn't buy you anything...
|
Quantum randomness over small-scale choices doesn't buy you anything...
|
||||||
|
|
||||||
![](images/31cfa9a47d9f974cafeb93b77e3985be35ff0722.png)
|
![](.images/31cfa9a47d9f974cafeb93b77e3985be35ff0722.png)
|
||||||
|
|
||||||
...because the extinction risk in both worlds is (almost perfectly) correlated
|
...because the extinction risk in both worlds is (almost perfectly) correlated
|
||||||
|
|
||||||
But we could also split the timeline such that the chances of extinction risk happening are less than perfectly correlated in each of the branches.
|
But we could also split the timeline such that the chances of extinction risk happening are less than perfectly correlated in each of the branches.
|
||||||
|
|
||||||
![](images/eafaeb98efca722131b5d2a97c37e31223f885c5.png)
|
![](.images/eafaeb98efca722131b5d2a97c37e31223f885c5.png)
|
||||||
|
|
||||||
We choose a significant event according to the result of a random quantum measurement, and we hope this makes extinction risk in each of the branches less correlated.
|
We choose a significant event according to the result of a random quantum measurement, and we hope this makes extinction risk in each of the branches less correlated.
|
||||||
|
|
||||||
![](images/889304c3cf810a29bcd65e9e1a3ba4a6fccca45e.png)
|
![](.images/889304c3cf810a29bcd65e9e1a3ba4a6fccca45e.png)
|
||||||
|
|
||||||
If the chances of an x-risk in each branch are less correlated, the chances of at least one branch surviving are higher.
|
If the chances of an x-risk in each branch are less correlated, the chances of at least one branch surviving are higher.
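As a rough worked example of the claim above, the sketch below treats extinction in each of two branches as a Bernoulli event with the same probability `p_extinction` and a Pearson correlation `correlation` between the two events. Both the two-branch setup and the function name are illustrative assumptions, not something the original post specifies.

```python
def p_at_least_one_survives(p_extinction: float, correlation: float) -> float:
    """Probability that at least one of two branches avoids extinction.

    Assumes two branches with identical extinction probability and a
    non-negative correlation between their extinction events. For two
    Bernoulli(p) variables with correlation rho:
        P(both extinct) = p^2 + rho * p * (1 - p)
    """
    if not 0.0 <= p_extinction <= 1.0:
        raise ValueError("p_extinction must be in [0, 1]")
    if not 0.0 <= correlation <= 1.0:
        raise ValueError("correlation must be in [0, 1] for this sketch")
    p_both_extinct = p_extinction ** 2 + correlation * p_extinction * (1.0 - p_extinction)
    return 1.0 - p_both_extinct
```

With a 20% extinction risk per branch, perfectly correlated branches give an 80% chance that at least one survives, while fully independent branches give 96%; intermediate correlations fall in between.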
|
||||||
|
|
||||||
|
@@ -79,7 +79,7 @@ I haven't really thought all that much about how this would interact with existe
|
||||||
|
|
||||||
## Acknowledgments
|
## Acknowledgments
|
||||||
|
|
||||||
<p><img src="images/77c41e9298562badddf94d2069d5fdf00cf0cf26.png" alt="QURI logo" class="img-frontpage-center"></p>
|
<p><img src=".images/77c41e9298562badddf94d2069d5fdf00cf0cf26.png" alt="QURI logo" class="img-frontpage-center"></p>
|
||||||
|
|
||||||
This article is a project by the [Quantified Uncertainty Research Institute](https://quantifieduncertainty.org/),
|
This article is a project by the [Quantified Uncertainty Research Institute](https://quantifieduncertainty.org/),
|
||||||
|
|
||||||
|
|
|
@@ -76,7 +76,7 @@ As we generalize our units, our confidence intervals become wider. We might be v
|
||||||
|
|
||||||
It’s unclear whether this growing uncertainty will pose a practical problem. If uncertainty is extremely high, we might want to invest more time into applied global priorities research, or into [revelatory grantmaking](https://en.wikipedia.org/wiki/Multi-armed_bandit) (making funding decisions not only in terms of their raw expected value, but also in terms of the expected value of the information they provide). Conversely, if each distribution is spread across many orders of magnitude, it might still be relatively clear which among many options is optimal. I’d imagine we’d find a mix of the two, but also that quantification would be much better than human intuition at differentiating the two cases[\[3\]](#fntgic4g8u64r).
|
It’s unclear whether this growing uncertainty will pose a practical problem. If uncertainty is extremely high, we might want to invest more time into applied global priorities research, or into [revelatory grantmaking](https://en.wikipedia.org/wiki/Multi-armed_bandit) (making funding decisions not only in terms of their raw expected value, but also in terms of the expected value of the information they provide). Conversely, if each distribution is spread across many orders of magnitude, it might still be relatively clear which among many options is optimal. I’d imagine we’d find a mix of the two, but also that quantification would be much better than human intuition at differentiating the two cases[\[3\]](#fntgic4g8u64r).
|
||||||
|
|
||||||
![](images/64e9f944a88327f3c6aa05fe45b185ed654edd3e.png)
|
![](.images/64e9f944a88327f3c6aa05fe45b185ed654edd3e.png)
|
||||||
|
|
||||||
The red and the green distribution could range over many orders of magnitude, and it might still be clear which one is the better bet.
|
The red and the green distribution could range over many orders of magnitude, and it might still be clear which one is the better bet.
|
||||||
|
|
||||||
|
@@ -102,13 +102,13 @@ As two interventions grow more and more different, the relevant considerations f
|
||||||
|
|
||||||
For the crucial considerations that are particularly value-dependent, quantification tooling could ask the user about their best guesses on some of these controversial parameters (e.g., the discount rate, the value drift rate, the probability of success of various interventions, the value of human vs animal lives, etc.), and then carry out calculations using those guesses. [Food Impacts has such a tool](https://foodimpacts.org/) for animal suffering prioritization:
|
For the crucial considerations that are particularly value-dependent, quantification tooling could ask the user about their best guesses on some of these controversial parameters (e.g., the discount rate, the value drift rate, the probability of success of various interventions, the value of human vs animal lives, etc.), and then carry out calculations using those guesses. [Food Impacts has such a tool](https://foodimpacts.org/) for animal suffering prioritization:
|
||||||
|
|
||||||
![](images/367408ed94cbddc8de5e35497238f5505aaebc27.png)
|
![](.images/367408ed94cbddc8de5e35497238f5505aaebc27.png)
|
||||||
|
|
||||||
[Food Impacts](https://foodimpacts.org/ ): Which animal products should we avoid? h/t Vivian Belenky.
|
[Food Impacts](https://foodimpacts.org/ ): Which animal products should we avoid? h/t Vivian Belenky.
|
||||||
|
|
||||||
More sophisticated versions of this kind of tool could let users bake in different assumptions. For example, we could add in the [time horizon](https://en.wikipedia.org/wiki/Global_warming_potential#Importance_of_time_horizon) over which greenhouse gases have a warming effect, the value of saving the life of a five-year-old child vs a ten-year-old child, or the ratio of the value of various animal lives to a human life.
|
More sophisticated versions of this kind of tool could let users bake in different assumptions. For example, we could add in the [time horizon](https://en.wikipedia.org/wiki/Global_warming_potential#Importance_of_time_horizon) over which greenhouse gases have a warming effect, the value of saving the life of a five-year-old child vs a ten-year-old child, or the ratio of the value of various animal lives to a human life.
|
||||||
|
|
||||||
![](images/4cfe3b0177090481ed39215e8ead0d9fc88cf035.png)
|
![](.images/4cfe3b0177090481ed39215e8ead0d9fc88cf035.png)
|
||||||
|
|
||||||
As another example, [GiveWell's spreadsheet](https://docs.google.com/spreadsheets/d/1B1fODKVbnGP4fejsZCVNvBm5zvI1jC7DhkaJpFk6zfo/edit#gid=1362437801) allows one to tweak how valuable "a statistical life saved from malaria" is compared to "a doubling in consumption". But it doesn't allow for changing the assumption that different doublings of consumption are differently valuable.
|
As another example, [GiveWell's spreadsheet](https://docs.google.com/spreadsheets/d/1B1fODKVbnGP4fejsZCVNvBm5zvI1jC7DhkaJpFk6zfo/edit#gid=1362437801) allows one to tweak how valuable "a statistical life saved from malaria" is compared to "a doubling in consumption". But it doesn't allow for changing the assumption that different doublings of consumption are differently valuable.
|
||||||
|
|
||||||
|
@ -146,7 +146,7 @@ I believe that a rigorous evaluation framework for quantified uncertainty is wel
|
||||||
|
|
||||||
## Acknowledgments
|
## Acknowledgments
|
||||||
|
|
||||||
<p><img src="images/7385a0f4bc3ff0ac194d9b0054b8a3b0fa9cae77.png" alt="QURI logo" class="img-frontpage-center"></p>
|
<p><img src=".images/7385a0f4bc3ff0ac194d9b0054b8a3b0fa9cae77.png" alt="QURI logo" class="img-frontpage-center"></p>
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -1,4 +1,4 @@
|
||||||
Forecasting Newsletter: February 2022
|
Forecasting Newsletter: February 2022
|
||||||
==============
|
==============
|
||||||
|
|
||||||
## Highlights
|
## Highlights
|
||||||
|
@ -73,7 +73,7 @@ In particular, they created the [Ukraine conflict tournament](https://www.metacu
|
||||||
|
|
||||||
Metaculus also added a feature allowing forecasters to predict on the same question at different points in time. So far, it seems to only be available on [questions](https://www.metaculus.com/questions/9869/flu-hospitalizations-for-ca/) ([a](http://web.archive.org/web/20220304223132/https://www.metaculus.com/questions/9869/flu-hospitalizations-for-ca/)) in the [Flu Sight](https://www.metaculus.com/tournament/flusight-challenge/) ([a](http://web.archive.org/web/20220224002831/https://www.metaculus.com/tournament/flusight-challenge/)) tournament.
|
Metaculus also added a feature allowing forecasters to predict on the same question at different points in time. So far, it seems to only be available on [questions](https://www.metaculus.com/questions/9869/flu-hospitalizations-for-ca/) ([a](http://web.archive.org/web/20220304223132/https://www.metaculus.com/questions/9869/flu-hospitalizations-for-ca/)) in the [Flu Sight](https://www.metaculus.com/tournament/flusight-challenge/) ([a](http://web.archive.org/web/20220224002831/https://www.metaculus.com/tournament/flusight-challenge/)) tournament.
|
||||||
|
|
||||||
![](images/dab3d20fd331f290a238bef8016976fb459748d4.png)
|
![](.images/dab3d20fd331f290a238bef8016976fb459748d4.png)
|
||||||
|
|
||||||
### Polymarket
|
### Polymarket
|
||||||
|
|
||||||
|
@ -91,7 +91,7 @@ Polymarket is currently using [UMA](https://medium.com/uma-project/polymarket-in
|
||||||
>
|
>
|
||||||
> Here's an example. The market for "below 100k cases before April 15" has a current proposed answer of "yes" (meaning it did in fact happen before April 15). If you click on the market you get to this page:
|
> Here's an example. The market for "below 100k cases before April 15" has a current proposed answer of "yes" (meaning it did in fact happen before April 15). If you click on the market you get to this page:
|
||||||
>
|
>
|
||||||
> ![](images/55c2d1e1672c04ccdecc2521cebdef89d6c41c16.png)
|
> ![](.images/55c2d1e1672c04ccdecc2521cebdef89d6c41c16.png)
|
||||||
>
|
>
|
||||||
> If nobody contests the proposed answer, the market will resolve "yes" in 42 minutes. Anyone can dispute an outcome but as you can see, it costs $11500 to contest, and of course you lose that amount if you're wrong (if you're right however, you get it back, and it's the original proposer who loses the 11k). This is quite expensive and should deter people from trying dumb contests, like the ones that plagued Augur during the 2020 election aftermath.
|
> If nobody contests the proposed answer, the market will resolve "yes" in 42 minutes. Anyone can dispute an outcome but as you can see, it costs $11500 to contest, and of course you lose that amount if you're wrong (if you're right however, you get it back, and it's the original proposer who loses the 11k). This is quite expensive and should deter people from trying dumb contests, like the ones that plagued Augur during the 2020 election aftermath.
|
||||||
>
|
>
|
||||||
|
@ -107,7 +107,7 @@ On account of reading this, I bought a medium amount of the UMA governance token
|
||||||
|
|
||||||
Manifold markets [received an EA grant](https://manifold.markets/AustinChen/will-manifold-markets-win-an-ea-gra) ([a](http://web.archive.org/web/20220304020932/https://manifold.markets/AustinChen/will-manifold-markets-win-an-ea-gra))
|
Manifold markets [received an EA grant](https://manifold.markets/AustinChen/will-manifold-markets-win-an-ea-gra) ([a](http://web.archive.org/web/20220304020932/https://manifold.markets/AustinChen/will-manifold-markets-win-an-ea-gra))
|
||||||
|
|
||||||
![](images/7d4ee6510852f996b9b365409185e5bc91cb3b81.png)
|
![](.images/7d4ee6510852f996b9b365409185e5bc91cb3b81.png)
|
||||||
|
|
||||||
They report on their updates at [Above the fold](https://manifoldmarkets.substack.com/) ([a](http://web.archive.org/web/20220304020959/https://manifoldmarkets.substack.com/)): they've been adding new features at a steady, fast pace. For instance, Manifold now supports free-form answers. So when betting on the 2024 election, one could have an initial lineup including the expected candidates, but if a dark horse candidate rises to prominence, it could later be added.
|
They report on their updates at [Above the fold](https://manifoldmarkets.substack.com/) ([a](http://web.archive.org/web/20220304020959/https://manifoldmarkets.substack.com/)): they've been adding new features at a steady, fast pace. For instance, Manifold now supports free-form answers. So when betting on the 2024 election, one could have an initial lineup including the expected candidates, but if a dark horse candidate rises to prominence, it could later be added.
|
||||||
|
|
||||||
|
@ -117,7 +117,7 @@ Manifold also released a [beautifully documented API](https://manifoldmarkets.no
|
||||||
|
|
||||||
INFER released a [few blogposts](https://www.infer-pub.com/the-pub) ([a](https://web.archive.org/web/20220305153942/https://www.infer-pub.com/the-pub)) outlining their current thinking and future plans. Of these, [Understanding strategic question decomposition](https://www.infer-pub.com/the-pub/question-issue-decomposition) ([a](http://web.archive.org/web/20220222154450/https://www.infer-pub.com/the-pub/question-issue-decomposition)) is worth reading as a cute illustrated recap of the [best current approach](https://cset.georgetown.edu/publication/future-indices/) ([a](http://web.archive.org/web/20211202012714/https://cset.georgetown.edu/publication/future-indices/)) for using forecasting systems to give insight on big picture questions.
|
INFER released a [few blogposts](https://www.infer-pub.com/the-pub) ([a](https://web.archive.org/web/20220305153942/https://www.infer-pub.com/the-pub)) outlining their current thinking and future plans. Of these, [Understanding strategic question decomposition](https://www.infer-pub.com/the-pub/question-issue-decomposition) ([a](http://web.archive.org/web/20220222154450/https://www.infer-pub.com/the-pub/question-issue-decomposition)) is worth reading as a cute illustrated recap of the [best current approach](https://cset.georgetown.edu/publication/future-indices/) ([a](http://web.archive.org/web/20211202012714/https://cset.georgetown.edu/publication/future-indices/)) for using forecasting systems to give insight on big picture questions.
|
||||||
|
|
||||||
![](images/fbd731a54e1406a25b48e0439eb4c90153f3dc0a.png)
|
![](.images/fbd731a54e1406a25b48e0439eb4c90153f3dc0a.png)
|
||||||
|
|
||||||
They are also running a lottery to give $2,000 to one lucky forecasting team. Teams must have 6 people, and the lottery is such that chances are maximized if they predict every day. Suppose that making a forecast one is not ashamed of takes 5 minutes and that 5 new teams are created. Then the expected prize winnings per hour are $2000 \* 60 mins per hour / ( 5 teams \* 5 mins per forecast per day \* 30 days \* 6 forecasters per team ) ≈ $27 / hour, or not enough for me to do it.
|
They are also running a lottery to give $2,000 to one lucky forecasting team. Teams must have 6 people, and the lottery is such that chances are maximized if they predict every day. Suppose that making a forecast one is not ashamed of takes 5 minutes and that 5 new teams are created. Then the expected prize winnings per hour are $2000 \* 60 mins per hour / ( 5 teams \* 5 mins per forecast per day \* 30 days \* 6 forecasters per team ) ≈ $27 / hour, or not enough for me to do it.
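
The back-of-the-envelope estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in the paragraph: the function name and its parameters are made up for illustration, and it assumes every team is equally likely to win and that each forecaster predicts once a day for 30 days. With the stated six-person teams the figure comes out at about $26.67 per hour; with five forecasters per team it would be $32.

```python
def expected_hourly_winnings(prize, teams, minutes_per_forecast, days, forecasters_per_team):
    """Expected prize per hour of forecasting effort.

    Assumes (as in the text) that each team is equally likely to win,
    so the prize is spread over all the minutes spent by all teams.
    """
    total_minutes = teams * minutes_per_forecast * days * forecasters_per_team
    return prize * 60 / total_minutes


# Six forecasters per team, as the lottery rules require.
six_per_team = expected_hourly_winnings(2000, 5, 5, 30, 6)
# Five forecasters per team, for comparison.
five_per_team = expected_hourly_winnings(2000, 5, 5, 30, 5)

print(f"{six_per_team:.2f}")   # 26.67
print(f"{five_per_team:.2f}")  # 32.00
```

The result is very sensitive to the guessed number of new teams and to the minutes per forecast, so it is worth re-running with one's own numbers.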
|
||||||
|
|
||||||
|
@ -139,7 +139,7 @@ Finally, an anonymous benefactor increased the size of this newsletter's [microg
|
||||||
|
|
||||||
Clay Graubard collects how the different forecasting platforms did at predicting the invasion of Ukraine. He [describes the situation](https://inews.co.uk/news/world/russia-ukraine-crisis-super-forecasters-putin-troops-1475721) ([a](http://web.archive.org/web/20220224205549/https://inews.co.uk/news/world/russia-ukraine-crisis-super-forecasters-putin-troops-1475721)) as "not the forecasting community’s finest hour". It's not clear to me that this is a fair assessment:
|
Clay Graubard collects how the different forecasting platforms did at predicting the invasion of Ukraine. He [describes the situation](https://inews.co.uk/news/world/russia-ukraine-crisis-super-forecasters-putin-troops-1475721) ([a](http://web.archive.org/web/20220224205549/https://inews.co.uk/news/world/russia-ukraine-crisis-super-forecasters-putin-troops-1475721)) as "not the forecasting community’s finest hour". It's not clear to me that this is a fair assessment:
|
||||||
|
|
||||||
![](images/4aacaf5a3ebf60ff1599f13b9034bb1749bb9282.png)
|
![](.images/4aacaf5a3ebf60ff1599f13b9034bb1749bb9282.png)
|
||||||
|
|
||||||
Not pictured there are prediction markets such as [Insight Markets](https://insightprediction.com/markets/129) ([a](http://web.archive.org/web/20220224130825/https://insightprediction.com/markets/129)), where my forecasting group and I won $20k betting on the Russian invasion, or [Futuur](https://futuur.com/q/tag/ukraine-conflict) ([a](http://web.archive.org/web/20220225193740/https://futuur.com/q/tag/ukraine-conflict)), which likewise has real money markets on Ukraine.
|
Not pictured there are prediction markets such as [Insight Markets](https://insightprediction.com/markets/129) ([a](http://web.archive.org/web/20220224130825/https://insightprediction.com/markets/129)), where my forecasting group and I won $20k betting on the Russian invasion, or [Futuur](https://futuur.com/q/tag/ukraine-conflict) ([a](http://web.archive.org/web/20220225193740/https://futuur.com/q/tag/ukraine-conflict)), which likewise has real money markets on Ukraine.
|
||||||
|
|
||||||
|
@ -147,11 +147,11 @@ Although I'm fairly sure they're not, they could yet be scams, so prospective pa
|
||||||
|
|
||||||
The forecasting community also saw a few over-the-counter bets on Ukraine:
|
The forecasting community also saw a few over-the-counter bets on Ukraine:
|
||||||
|
|
||||||
![](images/1f81697825db923e2ef6216d4bb991215543eabb.png)
|
![](.images/1f81697825db923e2ef6216d4bb991215543eabb.png)
|
||||||
|
|
||||||
![](images/d8d74b08f73c58593450a493bd44b3fcebb39548.png)
|
![](.images/d8d74b08f73c58593450a493bd44b3fcebb39548.png)
|
||||||
|
|
||||||
![](images/b4647161d86a19ee593162b4c59796e166c1ecf0.png)
|
![](.images/b4647161d86a19ee593162b4c59796e166c1ecf0.png)
|
||||||
|
|
||||||
TarasBob paid them all. He also happens to have a surprisingly interesting [website](https://taras.com/) ([a](https://web.archive.org/web/20220225093559/https://taras.com/)).
|
TarasBob paid them all. He also happens to have a surprisingly interesting [website](https://taras.com/) ([a](https://web.archive.org/web/20220225093559/https://taras.com/)).
|
||||||
|
|
||||||
|
|
|
@ -28,7 +28,7 @@ However, different forecasters preferred different decompositions. In particular
|
||||||
|
|
||||||
## Our aggregate forecast
|
## Our aggregate forecast
|
||||||
|
|
||||||
![](images/edc67c8614df4e8216a052ca5d623084edc5791c.png)
|
![](.images/edc67c8614df4e8216a052ca5d623084edc5791c.png)
|
||||||
|
|
||||||
We use the aggregate with min/max removed as our all-things-considered forecast for now given the extremity of outliers. We aggregated forecasts using the geometric mean of odds[\[8\]](#fnt1dm5d62pkl).
|
We use the aggregate with min/max removed as our all-things-considered forecast for now given the extremity of outliers. We aggregated forecasts using the geometric mean of odds[\[8\]](#fnt1dm5d62pkl).
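
For readers unfamiliar with the aggregation method, here is a minimal sketch of a geometric mean of odds with the minimum and maximum forecasts dropped. The function name, the `trim_extremes` flag, and the example probabilities are hypothetical and are not the team's actual forecasts or code; the sketch also assumes every probability is strictly between 0 and 1.

```python
import math


def geometric_mean_of_odds(probabilities, trim_extremes=False):
    """Aggregate probabilities by taking the geometric mean of their odds.

    If trim_extremes is True, the single lowest and single highest
    forecasts are dropped first (only when more than two are given).
    """
    ps = sorted(probabilities)
    if trim_extremes and len(ps) > 2:
        ps = ps[1:-1]
    # The geometric mean of odds is the exponential of the mean log-odds.
    log_odds = [math.log(p / (1 - p)) for p in ps]
    mean_odds = math.exp(sum(log_odds) / len(log_odds))
    # Convert the aggregated odds back into a probability.
    return mean_odds / (1 + mean_odds)


# Made-up forecasts with two extreme outliers.
example = [0.001, 0.2, 0.2, 0.99]
print(geometric_mean_of_odds(example))
print(geometric_mean_of_odds(example, trim_extremes=True))
```

Working in odds rather than probabilities means that a forecast of 1% and a forecast of 99% cancel out to 50%, which is one reason this aggregate tends to behave sensibly for very small probabilities.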
|
||||||
|
|
||||||
|
@ -38,7 +38,7 @@ Note that we are forecasting one month ahead and it’s quite likely that the cr
|
||||||
|
|
||||||
We compared the decomposition of our forecast to [Jacob Hilton’s](https://docs.google.com/document/d/17q-Ok4EVV42IscLMFOLztht7i0iLiALx0DFcX3xLn-A/edit?pli=1#) to understand the main drivers of the difference. We compare to Jacob’s revised forecast he made after reading comments on his document. Note that Jacob forecasted on the time horizon of the whole crisis then estimated 10% of the risk was incurred in the upcoming week. We guess that he would put roughly 25% over the course of a month which we forecasted (adjusting down some from weekly \* 4), and assume so in the table below. The numbers we assign to him are also approximate in that our operationalizations are a bit different than his.
|
We compared the decomposition of our forecast to [Jacob Hilton’s](https://docs.google.com/document/d/17q-Ok4EVV42IscLMFOLztht7i0iLiALx0DFcX3xLn-A/edit?pli=1#) to understand the main drivers of the difference. We compare to Jacob’s revised forecast he made after reading comments on his document. Note that Jacob forecasted on the time horizon of the whole crisis then estimated 10% of the risk was incurred in the upcoming week. We guess that he would put roughly 25% over the course of a month which we forecasted (adjusting down some from weekly \* 4), and assume so in the table below. The numbers we assign to him are also approximate in that our operationalizations are a bit different than his.
|
||||||
|
|
||||||
![](images/afb763248cc2ad79ba5948d1c1f24ff644d33a5a.png)
|
![](.images/afb763248cc2ad79ba5948d1c1f24ff644d33a5a.png)
|
||||||
|
|
||||||
We are ~an order of magnitude lower than Jacob. This is primarily driven by (a) a ~4x lower chance of a nuclear exchange in the next month and (b) a ~2x lower chance of dying in London, given a nuclear exchange.
|
We are ~an order of magnitude lower than Jacob. This is primarily driven by (a) a ~4x lower chance of a nuclear exchange in the next month and (b) a ~2x lower chance of dying in London, given a nuclear exchange.
|
||||||
|
|
||||||
|
@ -107,7 +107,7 @@ lostHours=lostDays*24
|
||||||
lostHours ## Replace with mean(lostDays) to get an estimate in days instead
|
lostHours ## Replace with mean(lostDays) to get an estimate in days instead
|
||||||
```
|
```
|
||||||
|
|
||||||
![](images/5f001f3c45c8a48a083871362bde46eb55862e81.png)
|
![](.images/5f001f3c45c8a48a083871362bde46eb55862e81.png)
|
||||||
|
|
||||||
**Eli Lifland**
|
**Eli Lifland**
|
||||||
|
|
||||||
|
@ -128,7 +128,7 @@ lostHours=lostDays*24
|
||||||
lostHours ## Replace with mean(lostDays) to get an estimate in days instead
|
lostHours ## Replace with mean(lostDays) to get an estimate in days instead
|
||||||
```
|
```
|
||||||
|
|
||||||
![](images/71160022319350cdba174346ddbdfae0ac80b88e.png)
|
![](.images/71160022319350cdba174346ddbdfae0ac80b88e.png)
|
||||||
|
|
||||||
## Footnotes
|
## Footnotes
|
||||||
|
|
||||||
|
|
|
@ -15,7 +15,7 @@ My guess is that EA funders also have inconsistent preferences and similarly wid
|
||||||
|
|
||||||
Current aggregate estimates look as follows:
|
Current aggregate estimates look as follows:
|
||||||
|
|
||||||
![](images/c7addcb750e48d25f9c34e2083d96bddffe2e300.png)
|
![](.images/c7addcb750e48d25f9c34e2083d96bddffe2e300.png)
|
||||||
|
|
||||||
## Motivation
|
## Motivation
|
||||||
|
|
||||||
|
@ -35,7 +35,7 @@ Further, core decision-makers might be similarly inconsistent and might be makin
|
||||||
|
|
||||||
I asked six researchers to use [the application](https://utility-function-extractor.quantifieduncertainty.org/research) described in [Simple comparison polling to create utility functions](https://forum.effectivealtruism.org/posts/9hQFfmbEiAoodstDA/simple-comparison-polling-to-create-utility-functions) to compare 15 pieces of research. These pieces ranged from [a comment](https://forum.effectivealtruism.org/posts/3PjNiLLkCMzAN2BSz/when-setting-up-a-charity-should-you-employ-a-lawyer?commentId=YNKNcp6nKqxqkZgCu) on the EA Forum to Shannon's foundational text, "The Mathematical Theory of Communication".
|
I asked six researchers to use [the application](https://utility-function-extractor.quantifieduncertainty.org/research) described in [Simple comparison polling to create utility functions](https://forum.effectivealtruism.org/posts/9hQFfmbEiAoodstDA/simple-comparison-polling-to-create-utility-functions) to compare 15 pieces of research. These pieces ranged from [a comment](https://forum.effectivealtruism.org/posts/3PjNiLLkCMzAN2BSz/when-setting-up-a-charity-should-you-employ-a-lawyer?commentId=YNKNcp6nKqxqkZgCu) on the EA Forum to Shannon's foundational text, "The Mathematical Theory of Communication".
|
||||||
|
|
||||||
![](images/09f6245837ac758b2d952e4a0c4a2b9613f43f7b.png)
|
![](.images/09f6245837ac758b2d952e4a0c4a2b9613f43f7b.png)
|
||||||
|
|
||||||
The app presents the user with pairwise comparisons. Each comparison asks the user how valuable the first element is, compared to the second (e.g., 10 times as valuable, 0.01 times as valuable). The app internally uses [merge sort](https://en.wikipedia.org/wiki/Merge_sort) to ensure that there can be no cyclical comparisons—so that the user cannot express a preference that A > B > C > A. Readers are encouraged to [play around with it](https://utility-function-extractor.quantifieduncertainty.org/research).
|
The app presents the user with pairwise comparisons. Each comparison asks the user how valuable the first element is, compared to the second (e.g., 10 times as valuable, 0.01 times as valuable). The app internally uses [merge sort](https://en.wikipedia.org/wiki/Merge_sort) to ensure that there can be no cyclical comparisons—so that the user cannot express a preference that A > B > C > A. Readers are encouraged to [play around with it](https://utility-function-extractor.quantifieduncertainty.org/research).
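
To see why a merge sort rules out cycles, consider the following sketch. It is not the app's actual code: `merge_sort_by_ratio`, the simulated `ratio` function, and the example values are all hypothetical. The point is that the sort only ever asks for the comparisons it needs to merge two already-ordered lists, and whatever the answers are, the output is a single linear ordering.

```python
def merge_sort_by_ratio(items, ratio):
    """Sort items from least to most valuable.

    ratio(a, b) stands in for the user's answer to
    "how many times as valuable is a compared to b?".
    """
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort_by_ratio(items[:mid], ratio)
    right = merge_sort_by_ratio(items[mid:], ratio)
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # One pairwise question per step of the merge.
        if ratio(left[i], right[j]) <= 1:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


# Simulated user whose answers come from made-up underlying values.
hidden_values = {"comment": 1, "blog post": 10, "paper": 100, "book": 10000}
asked = []


def ratio(a, b):
    asked.append((a, b))
    return hidden_values[a] / hidden_values[b]


ordering = merge_sort_by_ratio(["book", "comment", "paper", "blog post"], ratio)
print(ordering)
print(len(asked), "comparisons asked")
```

Note that the ordering being acyclic does not make the stated ratios consistent with each other: the magnitudes along different paths can still disagree, which is what the later sections look at.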
|
||||||
|
|
||||||
|
@ -45,13 +45,13 @@ The app presents the user with pairwise comparisons. Each comparison asks the us
|
||||||
|
|
||||||
For individual researchers, results can be visualized as follows:
|
For individual researchers, results can be visualized as follows:
|
||||||
|
|
||||||
![](images/4d8364726d8879bab0ace9ae58db66e3d0125da3.png)
|
![](.images/4d8364726d8879bab0ace9ae58db66e3d0125da3.png)
|
||||||
|
|
||||||
The green lines represent how much more valuable the element to the right is than the element to the left. The table below the graph uses the geometric mean to combine the user’s guesses into an average guess. See the appendix for the method behind this.
|
The green lines represent how much more valuable the element to the right is than the element to the left. The table below the graph uses the geometric mean to combine the user’s guesses into an average guess. See the appendix for the method behind this.
|
||||||
|
|
||||||
When combining the results of all the individuals using the geometric mean—see the appendix for the method—we get a table such as the following:
|
When combining the results of all the individuals using the geometric mean—see the appendix for the method—we get a table such as the following:
|
||||||
|
|
||||||
![](images/2b9614c75de9ed5262f8a3c25950917c3b951f4c.png)
|
![](.images/2b9614c75de9ed5262f8a3c25950917c3b951f4c.png)
|
||||||
|
|
||||||
The coefficient of variation is the standard deviation divided by the geometric mean. “OOM range” stands for “order of magnitude range”, where an order of magnitude is a difference of 10x. The method to calculate the relative values is in the first appendix.
|
The coefficient of variation is the standard deviation divided by the geometric mean. “OOM range” stands for “order of magnitude range”, where an order of magnitude is a difference of 10x. The method to calculate the relative values is in the first appendix.
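
As a rough illustration of the two summary statistics, the sketch below computes them for a made-up list of relative-value estimates. The function name and the numbers are hypothetical. Two assumptions go beyond the text: it uses the population standard deviation (the text does not say whether the sample or population version was used), and it takes the "OOM range" to be the base-10 logarithm of the largest estimate divided by the smallest.

```python
import math
import statistics


def summarize_estimates(estimates):
    """Summarize several researchers' relative-value estimates for one item."""
    geometric_mean = statistics.geometric_mean(estimates)
    # Assumption: population standard deviation; the source does not specify.
    standard_deviation = statistics.pstdev(estimates)
    return {
        "geometric_mean": geometric_mean,
        "coefficient_of_variation": standard_deviation / geometric_mean,
        # Assumption: orders of magnitude spanned between lowest and highest estimate.
        "oom_range": math.log10(max(estimates) / min(estimates)),
    }


# Made-up estimates from three hypothetical researchers.
summary = summarize_estimates([1, 10, 100])
for name, value in summary.items():
    print(name, round(value, 3))
```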
|
||||||
|
|
||||||
|
@ -61,7 +61,7 @@ To create such a table, we need a reference element, which by construction has a
|
||||||
|
|
||||||
In the app, users stated their value ranges for the differences between elements. In a preliminary analysis, we simplified this data by simply calculating the ordering for each evaluator. The different orderings were as follows:
|
In the app, users stated their value ranges for the differences between elements. In a preliminary analysis, we simplified this data by simply calculating the ordering for each evaluator. The different orderings were as follows:
|
||||||
|
|
||||||
![](images/b2b65f1e73a16b3e65bd37352b647e7b7df2a672.png)
|
![](.images/b2b65f1e73a16b3e65bd37352b647e7b7df2a672.png)
|
||||||
|
|
||||||
These are pretty consistent. Some of the most salient differences are:
|
These are pretty consistent. Some of the most salient differences are:
|
||||||
|
|
||||||
|
@ -74,25 +74,25 @@ These are pretty consistent. Some of the most salient differences are:
|
||||||
|
|
||||||
Yet, what we care about is not relative ordinal position—A is in the first position, but B is in the fifth position. Instead, we care about relative value—A is 10x better than B. The results are as follows:
|
Yet, what we care about is not relative ordinal position—A is in the first position, but B is in the fifth position. Instead, we care about relative value—A is 10x better than B. The results are as follows:
|
||||||
|
|
||||||
![](images/2b9614c75de9ed5262f8a3c25950917c3b951f4c.png)
|
![](.images/2b9614c75de9ed5262f8a3c25950917c3b951f4c.png)
|
||||||
|
|
||||||
### Inconsistency within the same researcher
|
### Inconsistency within the same researcher
|
||||||
|
|
||||||
Consider Misha Yagudin’s results:
|
Consider Misha Yagudin’s results:
|
||||||
|
|
||||||
![](images/0789cbb6695c45b020ae9090b08fc6ba5d0bb0fe.png)
|
![](.images/0789cbb6695c45b020ae9090b08fc6ba5d0bb0fe.png)
|
||||||
|
|
||||||
Zooming in, we see that element #M is 2x as valuable as element #L, #L is 100x as valuable as #K, and #K is 2x as valuable as #J. So overall, #M should be 2\*100\*2 = 400x as valuable as #J. However, Yagudin evaluates it as only 33x as valuable in a face-to-face comparison.
|
Zooming in, we see that element #M is 2x as valuable as element #L, #L is 100x as valuable as #K, and #K is 2x as valuable as #J. So overall, #M should be 2\*100\*2 = 400x as valuable as #J. However, Yagudin evaluates it as only 33x as valuable in a face-to-face comparison.
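
The size of this inconsistency can be made explicit with a short calculation. The sketch below multiplies the step-by-step ratios quoted above and compares the product with the direct comparison; the function name is made up for illustration.

```python
import math


def chained_ratio(step_ratios):
    """Value ratio implied by multiplying a chain of pairwise ratios."""
    return math.prod(step_ratios)


# #M vs #L, #L vs #K, #K vs #J, as quoted in the text.
implied = chained_ratio([2, 100, 2])
# #M vs #J when compared directly.
direct = 33

inconsistency_factor = implied / direct
orders_of_magnitude = math.log10(inconsistency_factor)

print(implied)                         # 400
print(round(inconsistency_factor, 1))  # 12.1
print(round(orders_of_magnitude, 2))   # 1.08
```

So the two routes from #J to #M disagree by a factor of roughly 12, or a bit more than one order of magnitude.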
|
||||||
|
|
||||||
![](images/0789cbb6695c45b020ae9090b08fc6ba5d0bb0fe.png)
|
![](.images/0789cbb6695c45b020ae9090b08fc6ba5d0bb0fe.png)
|
||||||
|
|
||||||
Gavin Leech was generally consistent.
|
Gavin Leech was generally consistent.
|
||||||
|
|
||||||
![](images/2bf104abc7763c2667dc0137d667ccca07b45277.png)
|
![](.images/2bf104abc7763c2667dc0137d667ccca07b45277.png)
|
||||||
|
|
||||||
This was because he was paying particular attention to producing consistent estimates. On the other hand, the distance between, for example, #H and #K, was 10 when calculated one way, but 10,000\*1\*5=50,000 when calculated another way.
|
This was because he was paying particular attention to producing consistent estimates. On the other hand, the distance between, for example, #H and #K, was 10 when calculated one way, but 10,000\*1\*5=50,000 when calculated another way.
|
||||||
|
|
||||||
![](images/2bf104abc7763c2667dc0137d667ccca07b45277.png)
|
![](.images/2bf104abc7763c2667dc0137d667ccca07b45277.png)
|
||||||
|
|
||||||
It would be interesting to calculate the coefficients of variation for each user in future iterations and see which user is the most inconsistent (or whether they are comparably so) and which item elicits the most inconsistency in the users.
|
It would be interesting to calculate the coefficients of variation for each user in future iterations and see which user is the most inconsistent (or whether they are comparably so) and which item elicits the most inconsistency in the users.
|
||||||
|
|
||||||
|
@ -135,7 +135,7 @@ Further work and clarification in this area could be highly valuable. We could d
|
||||||
|
|
||||||
## Acknowledgements
|
## Acknowledgements
|
||||||
|
|
||||||
<p><img src="images/7385a0f4bc3ff0ac194d9b0054b8a3b0fa9cae77.png" alt="QURI logo" class="img-frontpage-center"></p>
|
<p><img src=".images/7385a0f4bc3ff0ac194d9b0054b8a3b0fa9cae77.png" alt="QURI logo" class="img-frontpage-center"></p>
|
||||||
|
|
||||||
This post is a project by the [Quantified Uncertainty Research Institute](https://quantifieduncertainty.org/). It was written by Nuño Sempere. Thanks to Ozzie Gooen and Gavin Leech for comments and suggestions and Finn Moorhouse, Gavin Leech, Jaime Sevilla, Linch Zhang, Misha Yagudin and Ozzie Gooen for participation in this experiment, and for permission to share their results.
|
This post is a project by the [Quantified Uncertainty Research Institute](https://quantifieduncertainty.org/). It was written by Nuño Sempere. Thanks to Ozzie Gooen and Gavin Leech for comments and suggestions and Finn Moorhouse, Gavin Leech, Jaime Sevilla, Linch Zhang, Misha Yagudin and Ozzie Gooen for participation in this experiment, and for permission to share their results.
|
||||||
|
|
||||||
|
|
|
@ -39,7 +39,7 @@ As foreseen by prediction markets and pundits alike, Keine Davon has been electe
|
||||||
|
|
||||||
UN Secretary-General Yan Zhang vows to move prediction markets to at least a 30% implied probability that the Spanish military junta will not be in power by the end of the decade. Prediction markets rose to 35% upon announcement (source: Metacortex), up from an early estimate of 28%. The move is widely considered to be an attempt by Zhang to distract attention away from an embezzlement scandal, in which famine prediction systems were manipulated to show increasing risk in areas that were actually safe, leading to the deployment of additional funds which could then safely be stolen.
|
UN Secretary-General Yan Zhang vows to move prediction markets to at least a 30% implied probability that the Spanish military junta will not be in power by the end of the decade. Prediction markets rose to 35% upon announcement (source: Metacortex), up from an early estimate of 28%. The move is widely considered to be an attempt by Zhang to distract attention away from an embezzlement scandal, in which famine prediction systems were manipulated to show increasing risk in areas that were actually safe, leading to the deployment of additional funds which could then safely be stolen.
|
||||||
|
|
||||||
<p><img src="images/02034183e0de16f853fc004dab47ef5a01ca677f.png" alt="QURI logo" class="img-frontpage-center"></p>
|
<p><img src=".images/02034183e0de16f853fc004dab47ef5a01ca677f.png" alt="QURI logo" class="img-frontpage-center"></p>
|
||||||
|
|
||||||
Netflix releases a new Korean soap opera, [Forecasting Love and Weather](https://en.wikipedia.org/wiki/Forecasting_Love_and_Weather), which tells the gripping tale of how a young man with an affinity and talent for weather forecasting falls in love with an analytical woman of comparable forecasting prowess. "It was as if an occult hand had reached into Korean society and made forecasting cool and mainstream", mentions a spokesbeing for the Korean Forecasting Congregation. It further seems that a lot of [attention to detail](https://www.soompi.com/article/1512045wpp/real-life-meteorological-administration-spokesperson-explains-how-forecasting-love-and-weather-was-made-realistic) went into making the show realistic.
|
Netflix releases a new Korean soap opera, [Forecasting Love and Weather](https://en.wikipedia.org/wiki/Forecasting_Love_and_Weather), which tells the gripping tale of how a young man with an affinity and talent for weather forecasting falls in love with an analytical woman of comparable forecasting prowess. "It was as if an occult hand had reached into Korean society and made forecasting cool and mainstream", mentions a spokesbeing for the Korean Forecasting Congregation. It further seems that a lot of [attention to detail](https://www.soompi.com/article/1512045wpp/real-life-meteorological-administration-spokesperson-explains-how-forecasting-love-and-weather-was-made-realistic) went into making the show realistic.
|
||||||
|
|
||||||
|
@ -57,7 +57,7 @@ Rootclaim has a new feature analyzing the reasons for Peter Thiel's extraordinar
|
||||||
|
|
||||||
## Long Content
|
## Long Content
|
||||||
|
|
||||||
![](images/57ec1c1ce52c9b097c46dedb0ca9233fda286405.png)
|
![](.images/57ec1c1ce52c9b097c46dedb0ca9233fda286405.png)
|
||||||
|
|
||||||
[Robin Hanson To Represent Sweden At 2021 Olympic Games In Tokyo](https://calbears.com/news/2021/5/27/mens-swimming-diving-robin-hanson-heading-to-tokyo). To settle a bet about whether he would have found a career in sports more meaningful than his intellectual career, Robin Hanson has agreed to spin up universe afea6ef9628fcb91771abc9f799cf15. You can bet on the outcome [here](https://polymarket.com/market/will-robin-hanson-find-a-swimming-career-more-meaningful-than-an-intellectual-career). [United Nations Security Council Resolution 26280](https://en.wikipedia.org/wiki/United_Nations_Security_Council_Resolution_26280) requires us to inform you that if there are two or more Robin Hansons in your universe, you might be in a simulation (probability depends on the specific [anthropic question being asked](https://www.lesswrong.com/posts/LARmKTbpAkEYeG43u/anthropics-different-probabilities-different-questions) and on how much credence one lends to the [simulation hypothesis](https://www.simulation-argument.com/).)
|
[Robin Hanson To Represent Sweden At 2021 Olympic Games In Tokyo](https://calbears.com/news/2021/5/27/mens-swimming-diving-robin-hanson-heading-to-tokyo). To settle a bet about whether he would have found a career in sports more meaningful than his intellectual career, Robin Hanson has agreed to spin up universe afea6ef9628fcb91771abc9f799cf15. You can bet on the outcome [here](https://polymarket.com/market/will-robin-hanson-find-a-swimming-career-more-meaningful-than-an-intellectual-career). [United Nations Security Council Resolution 26280](https://en.wikipedia.org/wiki/United_Nations_Security_Council_Resolution_26280) requires us to inform you that if there are two or more Robin Hansons in your universe, you might be in a simulation (probability depends on the specific [anthropic question being asked](https://www.lesswrong.com/posts/LARmKTbpAkEYeG43u/anthropics-different-probabilities-different-questions) and on how much credence one lends to the [simulation hypothesis](https://www.simulation-argument.com/).)
|
||||||
|
|
||||||
|
|
|
@ -21,7 +21,7 @@ You can sign up for this newsletter on [substack](https://forecasting.substack.c
|
||||||
|
|
||||||
On account of getting a plug on one of Spain's most-read newspapers, this newsletter has reached 1,000 subscribers:
|
On account of getting a plug on one of Spain's most-read newspapers, this newsletter has reached 1,000 subscribers:
|
||||||
|
|
||||||
![](images/4d6eb12702b96fa17bc7715d995c4e9de83eba60.png)
|
![](.images/4d6eb12702b96fa17bc7715d995c4e9de83eba60.png)
|
||||||
|
|
||||||
You can find a market on when it will reach 2000 [here](https://manifold.markets/Nu%C3%B1oSempere/when-will-my-newsletter-reach-2000).
|
You can find a market on when it will reach 2000 [here](https://manifold.markets/Nu%C3%B1oSempere/when-will-my-newsletter-reach-2000).
|
||||||
|
|
||||||
|
@ -33,7 +33,7 @@ What is the alternative? The alternative is to develop better models of the worl
|
||||||
|
|
||||||
But how do we know which models of the world are good? How do we differentiate real understanding from fake understanding? It's tricky, but to a first approximation, we make our hypotheses about the world output predictions, and we [reduce our confidence in the hypotheses that make worse predictions](https://arbital.com/p/bayes_rule/?l=1zq) ([a](http://web.archive.org/web/20220402005820/https://arbital.com/p/bayes_rule/?l=1zq)). The book _Superforecasting_ is a neat introduction to the practices involved. E.T. Jaynes' _Probability Theory: The Logic of Science_ is a hardcore introduction to the math behind it. Both books are probably available for free in the [z library](https://b-ok.org/) ([a](http://web.archive.org/web/20210129172145/https://b-ok.org/)).
|
But how do we know which models of the world are good? How do we differentiate real understanding from fake understanding? It's tricky, but to a first approximation, we make our hypotheses about the world output predictions, and we [reduce our confidence in the hypotheses that make worse predictions](https://arbital.com/p/bayes_rule/?l=1zq) ([a](http://web.archive.org/web/20220402005820/https://arbital.com/p/bayes_rule/?l=1zq)). The book _Superforecasting_ is a neat introduction to the practices involved. E.T. Jaynes' _Probability Theory: The Logic of Science_ is a hardcore introduction to the math behind it. Both books are probably available for free in the [z library](https://b-ok.org/) ([a](http://web.archive.org/web/20210129172145/https://b-ok.org/)).
|
||||||
|
|
||||||
![](images/357eb016aaa462558a014aa83debc25d46192d28.png)
|
![](.images/357eb016aaa462558a014aa83debc25d46192d28.png)
|
||||||
|
|
||||||
A graphical representation of [Bayes' rule](https://en.wikipedia.org/wiki/Bayes%27_theorem), from [Arbital](https://arbital.com/p/bayes_waterfall_diagram/?l=1x1&pathId=84358).
|
A graphical representation of [Bayes' rule](https://en.wikipedia.org/wiki/Bayes%27_theorem), from [Arbital](https://arbital.com/p/bayes_waterfall_diagram/?l=1x1&pathId=84358).
|
||||||
|
|
||||||
|
@ -45,13 +45,13 @@ To my new Spanish readers, I would recommend that you start forecasting on [Meta
|
||||||
|
|
||||||
Something that has been on my mind is that forecasting platforms tend to either have institutional partnerships or be nice to use. But generally not both. I think this can be explained by older websites using worse technology but having had more time to develop partnerships:
|
Something that has been on my mind is that forecasting platforms tend to either have institutional partnerships or be nice to use. But generally not both. I think this can be explained by older websites using worse technology but having had more time to develop partnerships:
|
||||||
|
|
||||||
![](images/64c123da69f1a003597c0ce0f222e7c9989bb602.png)
|
![](.images/64c123da69f1a003597c0ce0f222e7c9989bb602.png)
|
||||||
|
|
||||||
I generally tend to take a _technology maximalist_ perspective toward that tradeoff in this newsletter. I tend to express the view that platforms with better technology will outcompete the others because they will be able to move and experiment faster, add new features, and retain more users.
|
I generally tend to take a _technology maximalist_ perspective toward that tradeoff in this newsletter. I tend to express the view that platforms with better technology will outcompete the others because they will be able to move and experiment faster, add new features, and retain more users.
|
||||||
|
|
||||||
Recently, two interesting developments have been affecting the forecasting ecosystem. First, the war between Russia and Ukraine has sparked broader interest in whether forecasting platforms or prediction markets have anything to say about it:
|
Recently, two interesting developments have been affecting the forecasting ecosystem. First, the war between Russia and Ukraine has sparked broader interest in whether forecasting platforms or prediction markets have anything to say about it:
|
||||||
|
|
||||||
![](images/c2892cda125e9c804203100d2016ab29cf72bde3.png)
|
![](.images/c2892cda125e9c804203100d2016ab29cf72bde3.png)
|
||||||
|
|
||||||
Popularity of the search term "Metaculus" in Google trends. h/t Metaculus user [UgandaMaximum](https://www.metaculus.com/accounts/profile/116440/)
|
Popularity of the search term "Metaculus" in Google trends. h/t Metaculus user [UgandaMaximum](https://www.metaculus.com/accounts/profile/116440/)
|
||||||
|
|
||||||
|
@ -67,7 +67,7 @@ And with this, we are left to discuss recent developments:
|
||||||
|
|
||||||
[Global Guessing](https://twitter.com/GlobalGuessing) continues to do a great job following developments in the Ukraine war through shifts in probabilities. For example:
|
[Global Guessing](https://twitter.com/GlobalGuessing) continues to do a great job following developments in the Ukraine war through shifts in probabilities. For example:
|
||||||
|
|
||||||
![](images/ec09dfbc3b8498f5f5cea784815e18119327a1c6.jpg)
|
![](.images/ec09dfbc3b8498f5f5cea784815e18119327a1c6.jpg)
|
||||||
|
|
||||||
Global Guessing's tracking of probabilities about the Ukraine conflict.
|
Global Guessing's tracking of probabilities about the Ukraine conflict.
|
||||||
|
|
||||||
|
@ -94,7 +94,7 @@ INFER is organizing a tournament for [EA university groups](https://www.infer-pu
|
||||||
|
|
||||||
[Insight predictions](https://insightprediction.com/markets/206) ([a](http://web.archive.org/web/20220404154746/https://insightprediction.com/markets/206)) continues to have the guts to ask the important questions, such as: "Will Russia Conquer the Donbass by the End of July 2022?". Though liquidity (the opportunity to trade on both sides of a question) is a bit thin.
|
[Insight predictions](https://insightprediction.com/markets/206) ([a](http://web.archive.org/web/20220404154746/https://insightprediction.com/markets/206)) continues to have the guts to ask the important questions, such as: "Will Russia Conquer the Donbass by the End of July 2022?". Though liquidity (the opportunity to trade on both sides of a question) is a bit thin.
|
||||||
|
|
||||||
![](images/eb0db333c24fcfab0eb74133238e1fe5c6a925a3.png)
|
![](.images/eb0db333c24fcfab0eb74133238e1fe5c6a925a3.png)
|
||||||
|
|
||||||
The ¿founder? of Insight Predictions also [objected](https://forum.effectivealtruism.org/posts/xpkpXq57mXmLbgkSC/forecasting-newsletter-february-2022?commentId=jCn8ri7ux7Q28WmTP) ([a](https://web.archive.org/web/20220320031942/https://forum.effectivealtruism.org/posts/xpkpXq57mXmLbgkSC/forecasting-newsletter-february-2022#comments)) to me characterizing Insight as possibly but most likely not a scam in a previous newsletter. One of the key elements that made me suspicious was that he had previously remained anonymous. But he has now de-anonymized himself, and he turns out to be [Douglas Campbell](https://twitter.com/TradeandMoney), who previously served in Obama’s Council of Economic Advisors. So there’s that.
|
The ¿founder? of Insight Predictions also [objected](https://forum.effectivealtruism.org/posts/xpkpXq57mXmLbgkSC/forecasting-newsletter-february-2022?commentId=jCn8ri7ux7Q28WmTP) ([a](https://web.archive.org/web/20220320031942/https://forum.effectivealtruism.org/posts/xpkpXq57mXmLbgkSC/forecasting-newsletter-february-2022#comments)) to me characterizing Insight as possibly but most likely not a scam in a previous newsletter. One of the key elements that made me suspicious was that he had previously remained anonymous. But he has now de-anonymized himself, and he turns out to be [Douglas Campbell](https://twitter.com/TradeandMoney), who previously served in Obama’s Council of Economic Advisors. So there’s that.
|
||||||
|
|
||||||
|
@ -102,7 +102,7 @@ The ¿founder? of Insight Predictions also [objected](https://forum.effectivealt
|
||||||
|
|
||||||
Hypermind has a small [$5k tournament on African developments](https://prod.hypermind.com/ngdp/en/welcomeHA.html) ([a](http://web.archive.org/web/20211128100119/https://prod.hypermind.com/ngdp/en/welcomeHA.html))
|
Hypermind has a small [$5k tournament on African developments](https://prod.hypermind.com/ngdp/en/welcomeHA.html) ([a](http://web.archive.org/web/20211128100119/https://prod.hypermind.com/ngdp/en/welcomeHA.html))
|
||||||
|
|
||||||
![](images/ae398c69f1958af3f7c93b64ab0de9bef84bad65.png)
|
![](.images/ae398c69f1958af3f7c93b64ab0de9bef84bad65.png)
|
||||||
|
|
||||||
Polymarket has been offering rewards for trading. Trading incurs a fee, but trading rewards are higher, which incentivizes wash trading (trading back-and-forth at high volumes.) The thing is, Polymarket developers are not stupid, so I'm guessing that they are doing this because they want the volume to be as high as possible ¿possibly to impress or appease investors? The non-nefarious explanation is that they deeply want to attract new traders and keep the engagement of old ones, and are ok paying wash traders as the cost of doing business.
|
Polymarket has been offering rewards for trading. Trading incurs a fee, but trading rewards are higher, which incentivizes wash trading (trading back-and-forth at high volumes.) The thing is, Polymarket developers are not stupid, so I'm guessing that they are doing this because they want the volume to be as high as possible ¿possibly to impress or appease investors? The non-nefarious explanation is that they deeply want to attract new traders and keep the engagement of old ones, and are ok paying wash traders as the cost of doing business.
|
||||||
|
|
||||||
|
@ -112,7 +112,7 @@ In any case, I have downgraded [my estimates](https://github.com/QURIresearch/me
|
||||||
|
|
||||||
## Research
|
## Research
|
||||||
|
|
||||||
![](images/be76720e2918d0be6803b4839a2dbc1b498389a3.png)
|
![](.images/be76720e2918d0be6803b4839a2dbc1b498389a3.png)
|
||||||
|
|
||||||
Source: [goodjudgment.com](https://goodjudgment.com/) frontpage.
|
Source: [goodjudgment.com](https://goodjudgment.com/) frontpage.
|
||||||
|
|
||||||
|
@ -126,7 +126,7 @@ My forecasting group recently estimated the [risks of nuclear war](https://forum
|
||||||
|
|
||||||
Now a subject matter expert who served as deputy staff director of the Senate Committee on Foreign Relations, where he worked on approval of the New START agreement, [criticized our estimates](https://forum.effectivealtruism.org/posts/W8dpCJGkwrwn7BfLk/) ([a](http://web.archive.org/web/20220326095536/https://forum.effectivealtruism.org/posts/W8dpCJGkwrwn7BfLk/)). Our answer can be seen in [the comments](https://forum.effectivealtruism.org/posts/W8dpCJGkwrwn7BfLk/?commentId=PRkbcuTRDi6s2seLj) ([a](https://web.archive.org/web/20220405134835/https://forum.effectivealtruism.org/posts/W8dpCJGkwrwn7BfLk/?commentId=PRkbcuTRDi6s2seLj)).
|
Now a subject matter expert who served as deputy staff director of the Senate Committee on Foreign Relations, where he worked on approval of the New START agreement, [criticized our estimates](https://forum.effectivealtruism.org/posts/W8dpCJGkwrwn7BfLk/) ([a](http://web.archive.org/web/20220326095536/https://forum.effectivealtruism.org/posts/W8dpCJGkwrwn7BfLk/)). Our answer can be seen in [the comments](https://forum.effectivealtruism.org/posts/W8dpCJGkwrwn7BfLk/?commentId=PRkbcuTRDi6s2seLj) ([a](https://web.archive.org/web/20220405134835/https://forum.effectivealtruism.org/posts/W8dpCJGkwrwn7BfLk/?commentId=PRkbcuTRDi6s2seLj)).
|
||||||
|
|
||||||
![](images/1fdd63c407a3fd9083bfee556bbed6fc95d13d9c.png)
|
![](.images/1fdd63c407a3fd9083bfee556bbed6fc95d13d9c.png)
|
||||||
|
|
||||||
[Why short-range forecasting can be useful for longtermism](https://forum.effectivealtruism.org/posts/zjMeGcgWpvDcm3CkH/why-short-range-forecasting-can-be-useful-for-longtermism) ([a](http://web.archive.org/web/20220322085048/https://forum.effectivealtruism.org/posts/zjMeGcgWpvDcm3CkH/why-short-range-forecasting-can-be-useful-for-longtermism))
|
[Why short-range forecasting can be useful for longtermism](https://forum.effectivealtruism.org/posts/zjMeGcgWpvDcm3CkH/why-short-range-forecasting-can-be-useful-for-longtermism) ([a](http://web.archive.org/web/20220322085048/https://forum.effectivealtruism.org/posts/zjMeGcgWpvDcm3CkH/why-short-range-forecasting-can-be-useful-for-longtermism))
|
||||||
|
|
||||||
|
@ -136,7 +136,7 @@ Now a subject matter expert who served as deputy staff director of the Senate Co
|
||||||
|
|
||||||
In [Cryptoepistemology](https://www.lesswrong.com/posts/sDk3RziupmzShN2RN/cryptoepistemology) ([a](http://web.archive.org/web/20220307222715/https://www.lesswrong.com/posts/sDk3RziupmzShN2RN/cryptoepistemology)), davidad maps different theories of justified beliefs to different styles of cryptographic proof.
|
In [Cryptoepistemology](https://www.lesswrong.com/posts/sDk3RziupmzShN2RN/cryptoepistemology) ([a](http://web.archive.org/web/20220307222715/https://www.lesswrong.com/posts/sDk3RziupmzShN2RN/cryptoepistemology)), davidad maps different theories of justified beliefs to different styles of cryptographic proof.
|
||||||
|
|
||||||
![](images/1884d4bff206fcdeca154b51b2fbaa121872ee89.png)
|
![](.images/1884d4bff206fcdeca154b51b2fbaa121872ee89.png)
|
||||||
|
|
||||||
Lastly, I really enjoyed two prediction-market related April Fool's jokes: [Using prediction markets to generate LessWrong posts](https://www.lesswrong.com/posts/stefz96G9ycfMhjD2/using-prediction-markets-to-generate-lesswrong-posts) ([a](http://web.archive.org/web/20220404092319/https://www.lesswrong.com/posts/stefz96G9ycfMhjD2/using-prediction-markets-to-generate-lesswrong-posts)) and [Anti-Corruption Market](https://www.lesswrong.com/posts/px8ha4wSXcmfejEF9/anti-corruption-market) ([a](http://web.archive.org/web/20220404092344/https://www.lesswrong.com/posts/px8ha4wSXcmfejEF9/anti-corruption-market)). I'm also pretty proud of my own April Fool's: [Forecasting Newsletter: April 2222](https://forecasting.substack.com/p/forecasting-newsletter-april-2222?s=w) ([a](https://web.archive.org/web/20220405155605/https://forecasting.substack.com/p/forecasting-newsletter-april-2222?s=w)).
|
Lastly, I really enjoyed two prediction-market related April Fool's jokes: [Using prediction markets to generate LessWrong posts](https://www.lesswrong.com/posts/stefz96G9ycfMhjD2/using-prediction-markets-to-generate-lesswrong-posts) ([a](http://web.archive.org/web/20220404092319/https://www.lesswrong.com/posts/stefz96G9ycfMhjD2/using-prediction-markets-to-generate-lesswrong-posts)) and [Anti-Corruption Market](https://www.lesswrong.com/posts/px8ha4wSXcmfejEF9/anti-corruption-market) ([a](http://web.archive.org/web/20220404092344/https://www.lesswrong.com/posts/px8ha4wSXcmfejEF9/anti-corruption-market)). I'm also pretty proud of my own April Fool's: [Forecasting Newsletter: April 2222](https://forecasting.substack.com/p/forecasting-newsletter-april-2222?s=w) ([a](https://web.archive.org/web/20220405155605/https://forecasting.substack.com/p/forecasting-newsletter-april-2222?s=w)).
|
||||||
|
|
||||||
|
|
|
@ -65,7 +65,7 @@ INFER is hosting a discussion tomorrow under the ominous headline “[Reassertin
|
||||||
|
|
||||||
Good Judgment Open adds [a few new features](https://mailchi.mp/goodjudgment/gjo6apr22newsletter-1148858?e=72ee153cd5) ([a](https://web.archive.org/web/20220510032016/https://mailchi.mp/goodjudgment/gjo6apr22newsletter-1148858?e=72ee153cd5)), chiefly, a neat slider for visualizing probabilities.
|
Good Judgment Open adds [a few new features](https://mailchi.mp/goodjudgment/gjo6apr22newsletter-1148858?e=72ee153cd5) ([a](https://web.archive.org/web/20220510032016/https://mailchi.mp/goodjudgment/gjo6apr22newsletter-1148858?e=72ee153cd5)), chiefly, a neat slider for visualizing probabilities.
|
||||||
|
|
||||||
![](images/039d2f9f57c41e54559b23f1e7246ed3ecf5e9aa.png)
|
![](.images/039d2f9f57c41e54559b23f1e7246ed3ecf5e9aa.png)
|
||||||
|
|
||||||
PredictIt bettor [refuses to pay](https://twitter.com/PeePeePooPooPI3/status/1521968019433525255) ([a](https://web.archive.org/web/20220510032133/https://twitter.com/PeePeePooPooPI3/status/1521968019433525255)) $15k worth of over-the-counter bets.
|
PredictIt bettor [refuses to pay](https://twitter.com/PeePeePooPooPI3/status/1521968019433525255) ([a](https://web.archive.org/web/20220510032133/https://twitter.com/PeePeePooPooPI3/status/1521968019433525255)) $15k worth of over-the-counter bets.
|
||||||
|
|
||||||
|
@ -109,7 +109,7 @@ Overall, their method performs worse than current machine techniques. It is also
|
||||||
|
|
||||||
I released [three papers on scoring rules](https://github.com/SamotsvetyForecasting/optimal-scoring) ([a](https://web.archive.org/web/20220510032155/https://github.com/SamotsvetyForecasting/optimal-scoring)). The motivation behind them is my frustration with scoring rules as used in current forecasting platforms. I was also frustrated with the "reciprocal scoring" method recently proposed in [Karger et al](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3954498) ([a](https://web.archive.org/web/20220510032158/https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3954498)) and now used in Tetlock's Hybrid Forecasting-Persuasion tournament (see above). These new scoring rules incentivize collaboration, and although not quite [ready for production](https://github.com/SamotsvetyForecasting/optimal-scoring/issues) ([a](https://web.archive.org/web/20220510032202/https://github.com/SamotsvetyForecasting/optimal-scoring/issues)), I hope they could eventually provide a better incentive scheme for the forecasting ecosystem.
|
I released [three papers on scoring rules](https://github.com/SamotsvetyForecasting/optimal-scoring) ([a](https://web.archive.org/web/20220510032155/https://github.com/SamotsvetyForecasting/optimal-scoring)). The motivation behind them is my frustration with scoring rules as used in current forecasting platforms. I was also frustrated with the "reciprocal scoring" method recently proposed in [Karger et al](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3954498) ([a](https://web.archive.org/web/20220510032158/https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3954498)) and now used in Tetlock's Hybrid Forecasting-Persuasion tournament (see above). These new scoring rules incentivize collaboration, and although not quite [ready for production](https://github.com/SamotsvetyForecasting/optimal-scoring/issues) ([a](https://web.archive.org/web/20220510032202/https://github.com/SamotsvetyForecasting/optimal-scoring/issues)), I hope they could eventually provide a better incentive scheme for the forecasting ecosystem.
|
||||||
|
|
||||||
![](images/bde404d545bc59818be1549d1b50e5b5d87f37e9.png)
|
![](.images/bde404d545bc59818be1549d1b50e5b5d87f37e9.png)
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
|
@ -64,7 +64,7 @@ More interestingly, he is creating a [speculative financial instrument](https://
|
||||||
|
|
||||||
Czech Priorities has an [update on their forecasting work](https://forum.effectivealtruism.org/posts/6meqpK339FnQpZ4kv/czech-forecasting-project-summary) ([a](http://web.archive.org/web/20220518092351/https://forum.effectivealtruism.org/posts/6meqpK339FnQpZ4kv/czech-forecasting-project-summary)) trying to influence the Czech government.
|
Czech Priorities has an [update on their forecasting work](https://forum.effectivealtruism.org/posts/6meqpK339FnQpZ4kv/czech-forecasting-project-summary) ([a](http://web.archive.org/web/20220518092351/https://forum.effectivealtruism.org/posts/6meqpK339FnQpZ4kv/czech-forecasting-project-summary)) trying to influence the Czech government.
|
||||||
|
|
||||||
![](images/d4bfc0829791b7017bd2a8c5074a81818ee63e06.png)
|
![](.images/d4bfc0829791b7017bd2a8c5074a81818ee63e06.png)
|
||||||
|
|
||||||
Note that incidents are declassified 10–25 years after they happen.
|
Note that incidents are declassified 10–25 years after they happen.
|
||||||
|
|
||||||
|
@ -95,7 +95,7 @@ Jotto, an experienced Metaculus forecaster, cautions against [boasting about non
|
||||||
|
|
||||||
Nick Bosse and others release a [comprehensive R package for scoring forecasts](https://github.com/epiforecasts/scoringutils) ([a](http://web.archive.org/web/20220603012339/https://github.com/epiforecasts/scoringutils)) ([twitter](https://twitter.com/nikosbosse/status/1526511848144642051) ([a](http://web.archive.org/web/20220603012652/https://twitter.com/nikosbosse/status/1526511848144642051)), [CRAN](https://cran.r-project.org/web/packages/scoringutils/index.html) ([a](http://web.archive.org/web/20220603013058/https://cran.r-project.org/web/packages/scoringutils/index.html)), [accompanying arxiv paper](https://arxiv.org/abs/2205.07090) ([a](http://web.archive.org/web/20220603012725/https://arxiv.org/abs/2205.07090))). Per the [CRAN logs](https://cranlogs.r-pkg.org/downloads/total/2022-05-10:2025-01-02/scoringutils) ([a](http://web.archive.org/web/20220603012812/https://cranlogs.r-pkg.org/downloads/total/2022-05-10:2025-01-02/scoringutils)) so far it's seeing a smallish to medium number of downloads (791 so far, and 151 in the last week). But once a library is well-engineered, I think it will tend to last. And it makes developments in other languages easier.
|
Nick Bosse and others release a [comprehensive R package for scoring forecasts](https://github.com/epiforecasts/scoringutils) ([a](http://web.archive.org/web/20220603012339/https://github.com/epiforecasts/scoringutils)) ([twitter](https://twitter.com/nikosbosse/status/1526511848144642051) ([a](http://web.archive.org/web/20220603012652/https://twitter.com/nikosbosse/status/1526511848144642051)), [CRAN](https://cran.r-project.org/web/packages/scoringutils/index.html) ([a](http://web.archive.org/web/20220603013058/https://cran.r-project.org/web/packages/scoringutils/index.html)), [accompanying arxiv paper](https://arxiv.org/abs/2205.07090) ([a](http://web.archive.org/web/20220603012725/https://arxiv.org/abs/2205.07090))). Per the [CRAN logs](https://cranlogs.r-pkg.org/downloads/total/2022-05-10:2025-01-02/scoringutils) ([a](http://web.archive.org/web/20220603012812/https://cranlogs.r-pkg.org/downloads/total/2022-05-10:2025-01-02/scoringutils)) so far it's seeing a smallish to medium number of downloads (791 so far, and 151 in the last week). But once a library is well-engineered, I think it will tend to last. And it makes developments in other languages easier.
|
||||||
|
|
||||||
![](images/a021b9280ea6e599748f6145cf2f47a435012268.png)
|
![](.images/a021b9280ea6e599748f6145cf2f47a435012268.png)
|
||||||
|
|
||||||
Bayesian method (in bright blue and in shaded confidence intervals) beats previous method (in black) at predicting Marathon records (in red).
|
Bayesian method (in bright blue and in shaded confidence intervals) beats previous method (in black) at predicting Marathon records (in red).
|
||||||
|
|
||||||
|
@ -103,7 +103,7 @@ Jaime Sevilla and Jonathan Lindbloom publish some research on [Bayesian models o
|
||||||
|
|
||||||
This approach might work for some problems, like Olympic records. But it would do less well over other problems where assuming identically distributed draws would not be a good assumption. For instance, Moore's law—or technological progress more generally—doesn't lend itself well to being modelled using this approach, because new approaches tend to build on top of previous approaches. The authors are planning to address this in future work.
|
This approach might work for some problems, like Olympic records. But it would do less well over other problems where assuming identically distributed draws would not be a good assumption. For instance, Moore's law—or technological progress more generally—doesn't lend itself well to being modelled using this approach, because new approaches tend to build on top of previous approaches. The authors are planning to address this in future work.
|
||||||
|
|
||||||
![](images/d0fb1dc864cc1a90611d8673f1402c76f12d3958.png)
|
![](.images/d0fb1dc864cc1a90611d8673f1402c76f12d3958.png)
|
||||||
|
|
||||||
My colleague Sam Nolan looks at [Quantifying Uncertainty in GiveWell's GiveDirectly Cost-Effectiveness Analysis](https://observablehq.com/@hazelfire/givewells-givedirectly-cost-effectiveness-analysis) ([a](http://web.archive.org/web/20220529032327/https://observablehq.com/@hazelfire/givewells-givedirectly-cost-effectiveness-analysis)). He takes point estimates of impact by GiveDirectly, and transforms them into estimates using distributions.
|
My colleague Sam Nolan looks at [Quantifying Uncertainty in GiveWell's GiveDirectly Cost-Effectiveness Analysis](https://observablehq.com/@hazelfire/givewells-givedirectly-cost-effectiveness-analysis) ([a](http://web.archive.org/web/20220529032327/https://observablehq.com/@hazelfire/givewells-givedirectly-cost-effectiveness-analysis)). He takes point estimates of impact by GiveDirectly, and transforms them into estimates using distributions.
|
||||||
|
|
||||||
|
|
|
@ -14,4 +14,11 @@ Some examples of past markets in this spirit:
|
||||||
- [Will I receive a grant of $50,000 USD before June 1st, 2022?](https://manifold.markets/TimothyRooney/will-i-receive-a-grant-of-50000-usd)
|
- [Will I receive a grant of $50,000 USD before June 1st, 2022?](https://manifold.markets/TimothyRooney/will-i-receive-a-grant-of-50000-usd)
|
||||||
- [Will I find a new job by the end of August 2022?](https://manifold.markets/dukeGartzea/will-i-find-a-new-job-by-the-end-of)
|
- [Will I find a new job by the end of August 2022?](https://manifold.markets/dukeGartzea/will-i-find-a-new-job-by-the-end-of)
|
||||||
|
|
||||||
To let me know about a new such market you want me to bet on, you can find me on [Twitter](https://twitter.com/NunoSempere).
|
To let me know about a new such market you want me to bet on, you can find me on [Twitter](https://twitter.com/NunoSempere) or find my email in the *gossip* section of this website.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
Brave people who have taken me up on this offer:
|
||||||
|
|
||||||
|
- [@BenRCongdon](https://twitter.com/BenRCongdon) for [this market](https://manifold.markets/bcongdon/will-i-run-a-halfmarathon-in-2022).
|
||||||
|
- [@Cosmojg](https://twitter.com/cosmojg/status/1549927067055460352) for [this market](https://manifold.markets/cos/will-i-launch-my-digital-futarchy-b).
|
||||||
|
|
76
blog/2022/07/23/thoughts-on-turing-julia/index.md
Normal file
|
@ -0,0 +1,76 @@
|
||||||
|
Some thoughts on Turing.jl
|
||||||
|
==========================
|
||||||
|
|
||||||
|
[Turing](https://turing.ml/stable/) is a cool new probabilistic programming language written on top of [Julia](https://julialang.org/). Mostly I just wanted to play around with a different probabilistic programming language, and discard the low-probability hypothesis that things that I am currently doing in [Squiggle](https://www.squiggle-language.com/) could be better implemented in it.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
My thoughts after downloading it and playing with it a tiny bit are:
|
||||||
|
|
||||||
|
1\. **Installation is annoying**: The program is pretty heavy, and it requires several steps (you have to install Julia and then Turing as a package, which is annoying; e.g., I had to figure out where in the filesystem to put the Julia binaries!!).
|
||||||
|
|
||||||
|
- Node.js installations can also be pretty gnarly (though there is [nvm](https://github.com/nvm-sh/nvm)), but Turing doesn't have an equivalent online playground. My sense is that running Julia online would also be pretty annoying (?).
|
||||||
|
|
||||||
|
2\. **Compiling and running the thing is _slow_**: 9 seconds until I got an error (I hadn't installed a necessary package), and then 1 min 26 seconds to run their [_simplest example_](https://turing.ml/dev/docs/using-turing/get-started) (!!):
|
||||||
|
|
||||||
|
```
|
||||||
|
using Turing
|
||||||
|
using StatsPlots
|
||||||
|
|
||||||
|
# Define a simple Normal model with unknown mean and variance.
|
||||||
|
@model function gdemo(x, y)
|
||||||
|
  s² ~ InverseGamma(2, 3)
|
||||||
|
  m ~ Normal(0, sqrt(s²))
|
||||||
|
  x ~ Normal(m, sqrt(s²))
|
||||||
|
  y ~ Normal(m, sqrt(s²))
|
||||||
|
end
|
||||||
|
|
||||||
|
# Run sampler, collect results
|
||||||
|
chn = sample(gdemo(1.5, 2), HMC(0.1, 5), 1000)
|
||||||
|
|
||||||
|
# Summarise results
|
||||||
|
describe(chn)
|
||||||
|
|
||||||
|
# Plot and save results
|
||||||
|
p = plot(chn)
|
||||||
|
savefig("gdemo-plot.png")
|
||||||
|
```
|
||||||
|
|
||||||
|
This seems to be a problem with [Julia more generally](<https://www.reddit.com/r/Julia/comments/lmznx7/running_scripts_is_terribly_slow_im_new_to_julia/>). Btw, the [Julia webpage](https://julialang.org/) mentions that Julia "feels like a scripting language", which seems like a bald-faced lie.
|
||||||
|
|
||||||
|
A similar but not equivalent [^1] model in Squiggle would run in seconds, and allow for the fast iteration that I know and love:
|
||||||
|
|
||||||
|
```
|
||||||
|
s = (0.1 to 1)^(1/2) // squiggle doesn't have the inverse gamma function yet
|
||||||
|
m = normal(0, s)
|
||||||
|
x = normal(m, s)
|
||||||
|
y = normal(m, s)
|
||||||
|
```
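
For comparison, here is a minimal sketch of what "drawing samples" from that Squiggle model amounts to, written in plain Julia with only the standard library (no Turing). It assumes that Squiggle's `0.1 to 1` means a lognormal whose 5th and 95th percentiles are 0.1 and 1; the function name `sample_prior` is made up for illustration. Note that it only simulates forward and never conditions on observed data.

```julia
using Random, Statistics

# Forward (prior) sampling of the Squiggle-style model: no conditioning on data.
# Assumption: `0.1 to 1` is a lognormal with 5th/95th percentiles at 0.1 and 1.
function sample_prior(rng, n)
    z95 = 1.6448536269514722                  # 95th percentile of a standard normal
    mu = (log(0.1) + log(1.0)) / 2            # mean of the underlying normal
    sigma = (log(1.0) - log(0.1)) / (2 * z95) # std of the underlying normal
    xs = Vector{Float64}(undef, n)
    for i in 1:n
        s = sqrt(exp(mu + sigma * randn(rng)))  # s = (0.1 to 1)^(1/2)
        m = s * randn(rng)                      # m = normal(0, s)
        xs[i] = m + s * randn(rng)              # x = normal(m, s)
    end
    return xs
end

xs = sample_prior(MersenneTwister(1), 10_000)
println("mean(x) = ", mean(xs), ", std(x) = ", std(xs))
```

This runs essentially instantly once Julia itself has started, which suggests the long wait above comes from loading and compiling Turing and the plotting packages rather than from the model.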
|
||||||
|
|
||||||
|
3\. Turing is able to do **Bayesian inference** over parameters, which seems cool and which I intend to [learn more about](https://github.com/rmcelreath/stat_rethinking_2022).
|
||||||
|
|
||||||
|
It's probably kind of weird that Squiggle, as a programming language that manipulates distributions, doesn't allow for Bayesian inference.
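
To make "inference over parameters" concrete: the `gdemo` example above is one of the few cases where the posterior can be written down by hand, because an inverse-gamma prior on the variance plus a normal prior on the mean is conjugate to a normal likelihood. The sketch below is plain Julia, not Turing, and the helper name `gdemo_posterior` is made up. It computes the posterior means of `m` and `s²` after observing `x = 1.5` and `y = 2`.

```julia
using Statistics

# Closed-form posterior means for the gdemo model (hypothetical helper, not Turing API).
# Prior:      s² ~ InverseGamma(alpha, beta),  m | s² ~ Normal(0, sqrt(s²))
# Likelihood: each observation ~ Normal(m, sqrt(s²))
function gdemo_posterior(data; alpha = 2.0, beta = 3.0)
    n = length(data)
    xbar = mean(data)
    kappa = 1 + n                    # the prior on m acts like one pseudo-observation at 0
    m_post = n * xbar / kappa        # posterior mean of m
    alpha_post = alpha + n / 2
    beta_post = beta + sum((data .- xbar) .^ 2) / 2 + n * xbar^2 / (2 * kappa)
    s2_post = beta_post / (alpha_post - 1)  # mean of an inverse gamma, valid for alpha_post > 1
    return (m = m_post, s2 = s2_post)
end

post = gdemo_posterior([1.5, 2.0])
println(post)  # m is 7/6 ≈ 1.167 and s2 is 49/24 ≈ 2.042
```

If the sampler is working, the means that `describe(chn)` reports should land near these values, up to Monte Carlo error. Turing earns its keep on models where no such closed form exists.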
|
||||||
|
|
||||||
|
4\. Turing seems **pretty integrated with Julia**, and the documentation seems to assume familiarity with Julia. This can have pros and cons, but it made it difficult for me to grasp what they are doing.
|
||||||
|
|
||||||
|
- The pros are that it can use all the Julia libraries, and this looks like it is _very_ powerful.
|
||||||
|
- The cons are that it requires familiarity with Julia.
|
||||||
|
|
||||||
|
It's possible that there could be some workflows with Squiggle where we go back and forth between Squiggle and JavaScript in Node; Turing seems like it has that kind of integration down pat.
|
||||||
|
|
||||||
|
5\. Turing seems like it could drive some **hardcore setups**. E.g., [here](https://withdata.io/election/media/) is a project using it to generate election forecasts.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
Overall, I dislike the slowness and, as an outsider, the integration with Julia, but I respect the effort. It's possible but not particularly likely that we may want to first script models in Squiggle and then translate them to a more powerful language like Turing when speed is not a concern and we need capabilities not natively present in Squiggle (like Bayesian inference).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
See also:
|
||||||
|
|
||||||
|
- [Why We Use Julia, 10 Years Later](https://julialang.org/blog/2022/02/10years/)
|
||||||
|
- [What's bad about Julia?](https://viralinstruction.com/posts/badjulia/#whats_bad_about_julia)
|
||||||
|
- [Julia in HackerNews](https://hn.algolia.com/?dateRange=all&page=0&prefix=false&query=Julia%20&sort=byPopularity&type=story)
|
||||||
|
|
||||||
|
|
||||||
|
[^1]: Even if Squiggle had the inverse gamma function, it's not clear to me that the two programs are doing the same thing, because Turing could be doing something trickier even in that simple example (?). E.g., Squiggle is drawing samples whereas Turing is (?) representing the space of distributions with those parameters. This is something I didn't understand from the documentation.
|