From d54cff97b61503c9aedfd36f71fb2a1b8f66ca25 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Nu=C3=B1o=20Sempere?= <nuno.sempere@gmail.com>
Date: Mon, 6 May 2019 15:59:03 +0200
Subject: [PATCH] Update and rename Writeup.md to
 Self-experimentation-calibration.md

---
 ...md => Self-experimentation-calibration.md} | 61 +++++++++++--------
 1 file changed, 36 insertions(+), 25 deletions(-)
 rename rat/{Writeup.md => Self-experimentation-calibration.md} (85%)

diff --git a/rat/Writeup.md b/rat/Self-experimentation-calibration.md
similarity index 85%
rename from rat/Writeup.md
rename to rat/Self-experimentation-calibration.md
index f1a51bf..3f6d8ea 100644
--- a/rat/Writeup.md
+++ b/rat/Self-experimentation-calibration.md
@@ -21,11 +21,14 @@ I have 505 observations. The dataset is available if I know you or if you can ge
 ## 1. How well calibrated am I?
 A picture is worth a thousand words:
 
-In this case, two pictures: The second merges probabilities > and < than .5 in the obvious way: it interprets having assigned a probability of, say, 0.33 to "X" as having assigned a probability of 0.66 to "Not X". Working with odds, this is even more straightforward: if you offer 1:2 odds in favor of "X", you're also offering 2:1 odds in favor of "Not X".
+![](https://nunosempere.github.io/rat/Calibration1.png)
+![](https://nunosempere.github.io/rat/Calibration2.png)
+
+In this case, two pictures. The second merges probabilities above and below 0.5 in the obvious way: it treats a probability of, say, 0.33 assigned to "X" as a probability of 0.67 assigned to "Not X". Working with odds, this is also straightforward: if you think that 1:2 are fair odds in favor of "X", you also think that 2:1 are fair odds in favor of "Not X".
 
 I notice that my 1:5 is closer to 1:2.5 in reality, with n=28 observations. My 1:15 is also closer to 1:5, but I think that this particularity can be explained by 1:15 being the default value, i.e., the value which got written when I left that cell blank. I'll nonetheless pay attention to that in the future. On the bright side, my 1:2 and 1:3 odds are exactly on point.
 
-My Brier score is 0.0755985, which indicates that the questions were in general rather easy, or that I'm very sharp. 
+My Brier score is 0.0755985. The significant digits become relevant later.
 
 ## 2. How do I compare to some simple regression models?
 I create four simple linear regression models and interpret their output as probability. I also consider a really really dumb predictor, for comparison purposes.
@@ -37,14 +40,14 @@ I create four simple linear regression models and interpret their output as prob
 
 Here is a table:
 
-|                                                         | Which variable does the regression try to predict? | Variables regressed on (what information does this model work with)                                                                                                                                   | Brier score tested & trained on the whole set | Trained on 80% and tested on the rest (average value, 1000+ times) | Trained on 50% and tested on the rest (average value, 1000+ times) |
-|---------------------------------------------------------|----------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------|--------------------------------------------------------------------|--------------------------------------------------------------------|
-| 1. Dumbest model                                        | Binary outcome                                     | None. Empty regression, just the intercept                                                                                                                                                            | 0.095496                                      | 0.09538288                                                         | 0.09542671                                                         |
-| 2. Regression without any subjective factors            | Binary outcome                                     | Type of question Homework vs Exam vs Lecture question First vs second try BDC                                                                                                                         | 0.082598                                      | 0.131296                                                           | 0.1412152                                                          |
-| 3. Regression model with inner experience               | Binary outcome                                     | 1. Type of question 2. Homework vs Exam vs Lecture question 3. First vs second try 4. Becker Depression Checklist Score. 5. Inner experience: Hunch to Incredibly Confident                           | 0.076962                                      | 0.1040722                                                          | 0.1149272                                                          |
-| 4. Full regression model                                | Binary outcome                                     | 1. Type of question 2. Homework vs Exam vs Lecture question 3. First vs second try 4. Becker Depression Checklist Score. 5. Inner experience: Hunch to Incredibly Confident 6. Subjective probability | 0.073224                                      | 0.09260587                                                         | 0.1020023                                                          |
-| 5. Regression model with only my subjective probability | Binary outcome                                     | 1. Subjective probability                                                                                                                                                                             | 0.075541                                      | 0.07545493                                                         | 0.07538371                                                         |
-| 6. Subjective probability                               | Does not apply | Not a regression model            | Does not appy | Not a regression model                                                                                                                                                                | Does not apply | My Brier score was 0.0755985 | -                                                                  | -                                                                  |
+|                                                         | Which variable does the regression try to predict? | Variables regressed on (what information does this model work with)                                                                                                                                       | Brier score tested & trained on the whole set | Trained on 80% and tested on the rest (average value, 1000+ times) | Trained on 50% and tested on the rest (average value, 1000+ times) |
+|---------------------------------------------------------|----------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------|--------------------------------------------------------------------|--------------------------------------------------------------------|
+| 1. Dumbest model                                        | Binary outcome                                     | None. Empty regression, just the intercept                                                                                                                                                                | 0.095496                                      | 0.09538288                                                         | 0.09542671                                                         |
+| 2. Regression without any subjective factors            | Binary outcome                                     | 1. Type of question 2. Homework vs Exam vs Lecture question 3. First vs second try 4. BDC                                                                                                                 | 0.082598                                      | 0.131296                                                           | 0.1412152                                                          |
+| 3. Regression model with inner experience               | Binary outcome                                     | 1. Type of question 2. Homework vs Exam vs Lecture question 3. First vs second try 4. Becker Depression Checklist Score. 5. Inner experience: Hunch to Incredibly Confident                               | 0.076962                                      | 0.1040722                                                          | 0.1149272                                                          |
+| 4. Full regression model                                | Binary outcome                                     | 1. Type of question    2. Homework vs Exam vs Lecture question  3. First vs second try 4. Becker Depression Checklist Score. 5. Inner experience: Hunch to Incredibly Confident 6. Subjective probability | 0.073224                                      | 0.09260587                                                         | 0.1020023                                                          |
+| 5. Regression model with only my subjective probability | Binary outcome                                     | 1. Subjective probability                                                                                                                                                                                 | 0.075541                                      | 0.07545493                                                         | 0.07538371                                                         |
+| 6. Subjective probability                               | Does not apply                                     | Not a regression model | My Brier score was 0.0755985                  | -                                                                  | -                                                                  |
 
 ### 2.1. Dumb model
 
@@ -54,6 +57,7 @@ A dumb model which always outputs the overall base rate gets a Brier score of 0.
 
 #### 2.2.1. Code and output
 
+```
 > summary(LM6)
 
 Call:
@@ -65,21 +69,23 @@ Residuals:
 -1.00188  0.01493  0.04878  0.09309  0.49155 
 
 Coefficients:
-                                           Estimate Std. Error t value Pr(>|t|)    
-(Intercept)                               0.8408927  0.0893759   9.408  < 2e-16 ***
-as.factor(D$Type.of.question)MC           0.1898829  0.0590873   3.214 0.001396 ** 
-as.factor(D$Type.of.question)MS          -0.0080026  0.0618861  -0.129 0.897163    
-as.factor(D$Type.of.question)TF           0.1560246  0.0631872   2.469 0.013875 *  
-Is_Normal                                 0.0342562  0.0453966   0.755 0.450848    
-Is_Homework                              -0.0003015  0.0512829  -0.006 0.995311    
-BDC                                      -0.0020370  0.0015640  -1.302 0.193366    
-as.factor(Trial..1st.if.not.specified.)2 -0.2039517  0.0590451  -3.454 0.000599 ***
----
-Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+                                           Estimate Std. Error t value Pr(>|t|)   
 
-Residual standard error: 0.2897 on 497 degrees of freedom
-Multiple R-squared:  0.1349,	Adjusted R-squared:  0.1227 
-F-statistic: 11.07 on 7 and 497 DF,  p-value: 4.899e-13
+(Intercept)                               0.8408927  0.0893759   9.408  < 2e-16 ***  
+as.factor(D$Type.of.question)MC           0.1898829  0.0590873   3.214 0.001396 **  
+as.factor(D$Type.of.question)MS          -0.0080026  0.0618861  -0.129 0.897163  
+as.factor(D$Type.of.question)TF           0.1560246  0.0631872   2.469 0.013875 *  
+Is_Normal                                 0.0342562  0.0453966   0.755 0.450848  
+Is_Homework                              -0.0003015  0.0512829  -0.006 0.995311    
+BDC                                      -0.0020370  0.0015640  -1.302 0.193366  
+as.factor(Trial..1st.if.not.specified.)2 -0.2039517  0.0590451  -3.454 0.000599 ***  
+
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1  
+
+Residual standard error: 0.2897 on 497 degrees of freedom  
+Multiple R-squared:  0.1349,	Adjusted R-squared:  0.1227  
+F-statistic: 11.07 on 7 and 497 DF,  p-value: 4.899e-13  
+```
 
 #### 2.2.2. Interpretation
 I found it rather surprising that how depressed I was (BDC: Becker Depression Checklist) didn't seem to have that big of an effect. In particular, I try to adjust for my mood, but I wasn't particularly expecting to succeed. Anecdotally, I do see an effect of my mood on the extremity of my odds: the sadder I am, the more reluctant I am to give 1:1000, 1:10000 and higher odds, even about things which I'm really sure about.
@@ -94,6 +100,7 @@ If I use this model to output predicted probabilities for each question, I get a
 
 #### 2.3.1. Code and output
 
+```
 > lm(Result_Binary ~ as.factor(D$Type.of.question) + as.factor(D$One.word) + Is_Normal + Is_Homework + BDC + as.factor(Trial..1st.if.not.specified.), data=D) -> LM1
 > summary(LM1)
 
@@ -125,6 +132,7 @@ Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’
 Residual standard error: 0.2813 on 492 degrees of freedom
 Multiple R-squared:  0.193,	Adjusted R-squared:  0.1733 
 F-statistic: 9.804 on 12 and 492 DF,  p-value: < 2.2e-16
+```
 
 #### 2.3.2 Interpretation
 As expected, the coefficients associated with a measure of my inner confidence check out: Hunch < Somewhat confident < Confident < Very confident < Incredibly confident (IC). Note that all of the factors are present, instead of one of them having been swallowed by the intercept, because there were 3 times when I left that question blank, and I didn't want to remove that data.
@@ -135,7 +143,7 @@ If I instead train the model on a random selection of 80% of the data points, an
 
 ### 2.4. Including my subjective probability & everything else.
 #### 2.4.1. Code and output
-
+```
 > summary(LM3)
 
 Call:
@@ -169,6 +177,7 @@ Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’
 Residual standard error: 0.2747 on 491 degrees of freedom
 Multiple R-squared:  0.2316,	Adjusted R-squared:  0.2113 
 F-statistic: 11.39 on 13 and 491 DF,  p-value: < 2.2e-16
+```
 
 #### 2.4.2. Interpretation
 All the other factors become slightly less relevant. It seems that my subjective probability does add information, a lot of it. After having seen the graph at the beginning, this is not surprising.
@@ -179,6 +188,7 @@ If I use this model to output probabilities for each question, I get a Brier sco
 ### 2.5. Including only my subjective probability
 
 #### 2.5.1. Code and output
+```
 > summary(LM2)
 
 Call:
@@ -198,6 +208,7 @@ Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’
 Residual standard error: 0.2754 on 503 degrees of freedom
 Multiple R-squared:  0.209,	Adjusted R-squared:  0.2074 
 F-statistic: 132.9 on 1 and 503 DF,  p-value: < 2.2e-16
+```
 
 #### 2.5.2. Interpretation
 Multiply my probability by 1.005-ish and take 1.2% from that, and I'd be slightly more accurate. I'm not reading much into that. If I do this, I get a Brier score of 0.075541, slightly better than my own 0.0755985.
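
The comparison in that last interpretation can be sketched in R. This is only an illustration on made-up numbers, not the write-up's actual script: `brier_score`, `p` and `outcome` are hypothetical names, and the six data points are invented, since the real 505 observations are not part of this patch. It computes the Brier score of the raw probabilities, fits a one-variable regression of the same shape as `LM2`, and scores the fitted values.

```r
# Brier score: mean squared difference between the stated probability
# and the binary outcome (1 = correct, 0 = incorrect).
brier_score <- function(p, outcome) {
  stopifnot(length(p) == length(outcome))
  mean((p - outcome)^2)
}

# Hypothetical toy data, invented for illustration only.
p       <- c(0.95, 0.90, 0.80, 0.99, 0.60, 0.75)
outcome <- c(1,    1,    0,    1,    1,    1)

# Score of the subjective probabilities as stated.
raw <- brier_score(p, outcome)

# Regress the outcome on the stated probability, then treat the
# fitted values as linearly recalibrated probabilities.
fit <- lm(outcome ~ p)
recalibrated <- brier_score(fitted(fit), outcome)

print(c(raw = raw, recalibrated = recalibrated))
```

When trained and tested on the same data, the recalibrated score can never be worse than the raw one: least squares minimizes exactly this mean squared error over all linear rescalings of `p`, and leaving `p` unchanged is one of those rescalings. A small in-sample improvement is therefore expected by construction, which is one reason not to read much into it.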