ESPR RCT.
Introduction
There is a certain valuable way of thinking, which is not yet taught in schools, in this present day. This certain way of thinking is not taught systematically at all. It is just absorbed by people who grow up reading books like Surely You’re Joking, Mr. Feynman or who have an unusually great teacher in high school.
Most famously, this certain way of thinking has to do with science, and with the experimental method. The part of science where you go out and look at the universe instead of just making things up. The part where you say “Oops” and give up on a bad theory when the experiments don’t support it.
But this certain way of thinking extends beyond that. It is deeper and more universal than a pair of goggles you put on when you enter a laboratory and take off when you leave. It applies to daily life, though this part is subtler and more difficult. But if you can’t say “Oops” and give up when it looks like something isn’t working, you have no choice but to keep shooting yourself in the foot. You have to keep reloading the shotgun and you have to keep pulling the trigger. You know people like this. And somewhere, someplace in your life you’d rather not think about, you are people like this. It would be nice if there was a certain way of thinking that could help us stop doing that.
- Eliezer Yudkowsky, https://www.lesswrong.com/rationality/preface
The evidence on CFAR.
The evidence for/against CFAR interests me, because I take it as likely that it is strongly correlated with the evidence on ESPR. For example, if reading programs in India show that dividing students by initial level improves their learning outcomes, then you'd expect similar processes to be at play in Kenya. Thus, if the evidence on CFAR were robust, we might be able to afford being less rigorous when it comes to ESPR.
I've mainly looked over the CFAR 2015 Longitudinal Study and the more recent Case Studies and 2017 CFAR Impact Report.
With regards to the first, I consider the data to be weak evidence on causal questions about the effects of the workshop. The study notes that a control group would be difficult to implement, since it would require finding people who would like to come to the program and forbidding them to do so. The study tries to compensate for the lack of a control by being statistically clever, and to a certain extent, it succeeds.
I feel that the above is only partially sufficient, that is: it concludes that there is probably some kind of effect, but its magnitude could be wildly overestimated. Thus, I feel that an RCT can be delayed on the strength of the evidence that CFAR currently has, but not indefinitely. I suggest teaming up with MIT's J-PAL, i.e., [The Abdul Latif Jameel Poverty Action Lab](https://www.povertyactionlab.org/), which specializes in designing and implementing evaluations. J-PAL could help implement a design like the following: randomly admit applicants to either this year or the next, and take the group left waiting as the control.
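A minimal sketch of that waitlist design (the function name and setup are hypothetical, not an actual J-PAL protocol): randomize admitted applicants between this year's cohort and next year's, then compare outcomes across the two groups.

```python
import random

def assign_waitlist(applicants, seed=0):
    """Randomly split admitted applicants into this year's cohort
    (treatment) and next year's cohort (waitlist control).
    Hypothetical helper, not an actual J-PAL protocol."""
    rng = random.Random(seed)  # fixed seed so the assignment is auditable
    pool = list(applicants)
    rng.shuffle(pool)
    half = len(pool) // 2
    return pool[:half], pool[half:]  # (this_year, waitlist_control)
```

The point of the design is that the waitlisted group wanted to attend just as much as the admitted one, so comparing the two a year later estimates the workshop's effect without the self-selection problem the 2015 study struggled with.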
With regards to the second and third documents, I feel that they provide strong intuitions for why CFAR's logical model is not totally bullshit. That model would be something like: CFAR students are taught rationality techniques + have an environment in which they can question their current decisions and consider potentially better choices -> they go on to do more good in the world, e.g., by switching careers.
Eric described the mindset of people at CFAR as “the exact opposite of learned helplessness”, and found that experiencing more of this mindset, in combination with an increased ability to see what was going on with his mind, was particularly helpful for making this shift.
ESPR as distinct from CFAR.
It must be noted that ESPR gets little love from the main organization, being mainly run by volunteers, with some instructors coming in to give classes. Eventually, it might make sense to establish ESPR as a separate organization with a focus on Europe, instead of as an American side project.
ESPR's logical model.
I think that the logical model underpinning ESPR is fundamentally solid, i.e., as solid as CFAR's. In the words of a student who came back this year as a Junior Counselor:
ESPR [teaches] smart people not to make stupid mistakes. Examples: betting and prediction markets decrease overconfidence. The units of exchange class decreases the likelihood of spending time, money, or other currency in counterproductive ways. The whole asking-for-examples thing prevents people from hiding behind abstract terms and pretending to understand something when they don't. Some of this is learned in classes. A lot of good techniques come from just interacting with people at ESPR.
I've had conversations with otherwise really smart people and thought “you wouldn't be stuck with those beliefs if you'd gone through two weeks of ESPR”.
ESPR also increases self-awareness. A lot of ESPR classes / techniques / culture involve noticing things that happen in your head. This is good for avoiding stupid mistakes and also for getting better at accomplishing things.
It is nice to be surrounded by very smart, ambitious people. This might be less relevant for people who do competitions like the IMO or go to very selective universities. Personally, it is a fucking awesome and rare experience every time I meet someone really smart with a bearable personality in the real world. Being around lots of those people at ESPR was awesome. ESPR might have made a lot of participants consider options they wouldn't seriously have before talking to the instructors, like founding a startup, working on AI alignment, everything that Galit talked about, etc.
ESPR also increased the positive impact participants will have on the world in the future by introducing them to effective altruism ideas. I think last year's batch would have been affected more by this, because I remember there being more on x-risk and prioritizing causes and stuff [1].
I spent 15 mins =)
Additionally, ESPR gives some of its alumni the opportunity to come back as Junior Counselors, who take on a position of some responsibility, an aspect not present in CFAR workshops.
[1]. I am not sure I share this impression. In particular, this year, being in Edinburgh, we didn't bring in an FHI person to give a talk. We did have an AI risk panel, and EA/x-risk were an important (~10%) focus of conversations. However, I will make a note to bring someone from the FHI next year. We also continued grappling with the boundaries between presenting an important problem and indoctrinating and mindfucking impressionable young people.
Perverse incentives.
As with CFAR's, I think that the profiles in the following section provide useful intuitions. However, while perhaps narratively compelling, there is no control group, which is supremely shitty. These profiles may not allow us to falsify any hypothesis, i.e., to meaningfully change our priors. The evidence is weak in the sense that, with the evidence currently available, I would feel uncomfortable saying that ESPR should be scaled up.
To the extent that Open Philanthropy prefers these and other weak forms of evidence now, rather than stronger evidence two to three years later, Open Philanthropy is giving ESPR perverse incentives. Note that with 20-30 students per year, even after we start an RCT, several years must pass before we can amass meaningful statistical power. Furthermore, seeing the process of iterated improvement as an admission of failure would also be catastrophic.
Alternatives to ESPR: the cheapest option.
One question which interests me is: what is the cheapest version of the program which is still cost-effective? What happens if you just record the classes, send them to bright people, and answer their questions? What if you set up a course on edX? Interventions based at universities and high schools are likely to be much cheaper, given that neither board, nor flights, nor classrooms would have to be paid for. Is there a low-cost, scalable approach?
I'm told that some of the CFAR instructors have strong intuitions that in-person teaching is much more effective, based on their own experience and perhaps also on a small 2012 RCT, which is either unpublished or unfindable.
Still, I want to test this assumption because, almost by definition, doing so would be pretty cheap. As a plus, we could take the population which takes the cheaper course as a second control group.