From 7694124fec1d44709c80ec7f69f5ea0a2fc1d8ff Mon Sep 17 00:00:00 2001 From: NunoSempere Date: Sun, 23 Jul 2023 21:10:22 +0200 Subject: [PATCH] pontificate about tests with wide lognormals --- README.md | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++- test/test | Bin 26640 -> 26768 bytes 2 files changed, 61 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index d241d17..7e45ce6 100644 --- a/README.md +++ b/README.md @@ -176,6 +176,66 @@ int main(){ } ``` +### Tests and the long tail of the lognormal + +Distribution functions can be tested with: + +```sh +cd tests +make && make run +``` + +`make verify` is an alias that runs all the tests and just displays the ones that are failing. + +These tests are somewhat rudimentary: they get between 1M and 10M samples from a given sampling function, and check that their mean and standard deviations correspond to what they should theoretically should be. + +If you run `make run` (or `make verify`), you will see errors such as these: + +``` +[-] Mean test for normal(47211.047473, 682197.019012) NOT passed. +Mean of normal(47211.047473, 682197.019012): 46933.673278, vs expected mean: 47211.047473 +delta: -277.374195, relative delta: -0.005910 + +[-] Std test for lognormal(4.584666, 2.180816) NOT passed. +Std of lognormal(4.584666, 2.180816): 11443.588861, vs expected std: 11342.434900 +delta: 101.153961, relative delta: 0.008839 + +[-] Std test for to(13839.861856, 897828.354318) NOT passed. +Std of to(13839.861856, 897828.354318): 495123.630575, vs expected std: 498075.002499 +delta: -2951.371925, relative delta: -0.005961 +``` + +These tests I wouldn't worry about. Due to luck of the draw, their relative error is a bit over 0.005, or 0.5%, and so the test fails. But it would surprise me if that had some meaningful practical implication. + +The errors that should raise some worry are: + +``` +[-] Mean test for lognormal(1.210013, 4.766882) NOT passed. +Mean of lognormal(1.210013, 4.766882): 342337.257677, vs expected mean: 288253.061628 +delta: 54084.196049, relative delta: 0.157985 +[-] Std test for lognormal(1.210013, 4.766882) NOT passed. +Std of lognormal(1.210013, 4.766882): 208107782.972184, vs expected std: 24776840217.604111 +delta: -24568732434.631927, relative delta: -118.057730 + +[-] Mean test for lognormal(-0.195240, 4.883106) NOT passed. +Mean of lognormal(-0.195240, 4.883106): 87151.733198, vs expected mean: 123886.818303 +delta: -36735.085104, relative delta: -0.421507 +[-] Std test for lognormal(-0.195240, 4.883106) NOT passed. +Std of lognormal(-0.195240, 4.883106): 33837426.331671, vs expected std: 18657000192.914921 +delta: -18623162766.583248, relative delta: -550.371727 + +[-] Mean test for lognormal(0.644931, 4.795860) NOT passed. +Mean of lognormal(0.644931, 4.795860): 125053.904456, vs expected mean: 188163.894101 +delta: -63109.989645, relative delta: -0.504662 +[-] Std test for lognormal(0.644931, 4.795860) NOT passed. +Std of lognormal(0.644931, 4.795860): 39976300.711166, vs expected std: 18577298706.170452 +delta: -18537322405.459286, relative delta: -463.707799 +``` + +What is happening in this case is that you are taking a normal, like `normal(-0.195240, 4.883106)`, and you are exponentiating it to arrive at a lognormal. But `normal(-0.195240, 4.883106)` is going to have some noninsignificant weight on, say, 18. But `exp(18) = 39976300`, and points like it are going to end up a nontrivial amount to the analytical mean and standard deviation, even though they have little probability mass. + +Fortunately, the reader can also check that for more plausible real-world values, like the + ## Related projects - [Squiggle](https://www.squiggle-language.com/) @@ -185,7 +245,6 @@ int main(){ ## To do list -- [ ] Pontificate about lognormal tests - [ ] Have some more complicated & realistic example - [ ] Add summarization functions: 90% ci (or all c.i.?) - [ ] Systematize references @@ -232,3 +291,4 @@ int main(){ - https://link.springer.com/article/10.1007/bf02293108 - https://stats.stackexchange.com/questions/502146/how-does-numpy-generate-samples-from-a-beta-distribution - https://github.com/numpy/numpy/blob/5cae51e794d69dd553104099305e9f92db237c53/numpy/random/src/distributions/distributions.c +- [x] Pontificate about lognormal tests diff --git a/test/test b/test/test index d2f0a3465e48b439ada7f55c6272f5d4127535ba..af5c341a5c9b63447cc72d4b3905a61e28353c8c 100755 GIT binary patch delta 6200 zcmaJ_3wTu3wLWJi6J|m(k35)6UNcE1FA|s;2nGEwzL~Yx zzt&!Rt+n@F`<&C-&9@%oTVpM~BRHvfWx{p0W&O87Fvy(HcG4AAGg4-1Mp7g{+7}!y zvnu7xGS7pA;^&?v^X`S;rM()N7w9qGYVj&sUPa4GFY^}+xy%S!F74D5C;xwzx)n>^ zWY8{(&gy#h)cSE7PR?!n>UhTI4gdbLsyBzxBihWE7#LEW;<(2o!!h)_b_#!i&T6e> z4)OCr@`Y^RKa}|HkRIJZXr)kxCYtu?-qtm&UA?A}nnG8Q5IVl;GGr{)C)@r5afWiY z6h0vFA)Homp69e&y&?nQqU-7+I8MhIA!p;)#nmCbT28gsXk)yFi&e&}iPN*eXhEw) zBDwKOu2exYbqt$E;rj3hVUeUF0$Mz?|H7{Vj=3_*=F%3!4hK)Z&ZW?E>l z+AOMjpofM?qo~DT<)b@045^{p%qjYmVvf5erlrzI^Ec7UP0+q}@#_DIV(WEN(l-#i zNtKo(x#;{3Tg_z8=Qu_1(p~(nyZGlQHqu#fkBj5{?{R__B=D}li=yZc^a&@CEPu=7 zQIHB>Vzx_qlbzdiiEmuy->S=ZGv&ToziHJWu#>^Eeo!dGasY5rTV!;&br2pENga{t zOWqsJaet{s0Ky7p(O7G;ZcEbIN!5KfjEywj+Y$^y!AQ1isn91JET!8Vc?@pdFXm-3jx#2sLoBD#1 z14106wWhvPwIf65-$^ z|4H{oXB5())!qR_vsLH$uC+OX)=Fe+mhZPVhJFicKN_@FFI(#XFvPcWU}fKWO_FbI zvixq}-l6!IjM+=G)rdJ*yEGhYi=D_8_I@~8t(YDRXoYOD==)91qSly<{Ixl%$*teA zg<|1w6ccpP&tfNzEP&u` zR);-7;qonPrcDRDZ-;?j=v&}hc(D+WayqFYHiq|6v@MOFMiXsL{v%pso5UB=3$`3S zpAG|`L4CH-yo-|JW@ewzOEXTpRW#w#bzdM2dkqv5`VIKDm-+V=>vgezI3&!n?y7JMQt zcqZ)cuQiqH+9Y(|p=+jjx}#G6LETa6j?azCyeY-#FADTMAcr`DtO>sCauf^LxZ`da zd|>W=3p&)@Z_{W~!W90;v_GMe_fSG&nkUTpgYLd@o#oIR?JNqclp-EZYZ9kK{TpCp zdx$4>bRe;kucySMbcc>@tfBEEA45{Gm#UJ+YObWxj-+D#O%jrv7#Cq58+G}rWVP~6 ze4kL6y-s7+(_8j5$A;t|G#f`+hmZHtbccg4pjyW|{xJO={bsuD7#lUvcem{?Xj<}e zP4aO1U9!`$-Ts4V$f2vrR{jMtq}x;{{yW`UCmFSf22CY=6@-QY^~$$^lYXrDrCcY zSj5%9uy2yEW!BMn=Q7Rd%k-jiCRXjTa~Dta)3nsEKU|VWu1;hlsXHw^v?LJ&KA>}H z5Ama@B>iDcSpscMO9@;3g{ucf%`c*7Jta)2I6it%V`vDe#QWE zqK@jb?V5K-P+N9s(l)8iSc;fdz1jF7;Un0sjTn8*1lx;AFJ=C4M2YV0GU z{kSI9M?yGm271_L*G*40;vvM>6^i;9gdJ} zMqG`cKjr>e10Pn6>l&U8R5s@%&ppjj6oTS>kh5S4^kXISvJ}HV+ zpI07=(k-0$n%>Ww)70F<3VqTyzPY{%UwL6*;B9VPI9K5h+$_TsW??tvg@J?q!2Y6^ zJxs)I)eC%8Qs?i^0I8``6TW_?pi&e~b48 z_Q;8C6VBMCXS#L_XFOZ&mp*2`U~}m44+RuH^8*WOO^+U8*Ni{#4!S|#gZ_Qz9DYN$ z^t6GC5f2Vh0QVQQOrC~<=$l&@_{djL7`W*x_XP$j{4Jw$0hbkJ-DXJ>`aVQ~+IL&q zX!o2lb%ss&T=z%stJ)^^l7%YH)gKYX<>)sb#X~b_E9f3Z|B7KDc#->?D3*YCoDjtz z=-k7Q0`MvjMl;Mc~NwO_TtQxGaeb=4B86%D^L!#ri0 z>B{sud_VbS6!Rt2Hls{8fQ0U(tJ87K6imi7<12W8KS8?+%Fw@B;40mXgeEIC*{qdF z<{QAWhF~j!_1uLu1G@oisiGGPx*b>vE%aM$?dVniT@+UgksbhsbM3dL^rCkj3pbvH z<2SlBXTps7!r`0)>;Po?Gj!0Oq;p_-Hq%9awyp%dSIJfwZ>UG_JeI76N(-ZWQg$$i zIlcpul~}}w(Pxc0^a`*ZVE3|#w28dYG0(_OmLEU183w4Wut3L_zn4xI=IKh%t0vn_ zmyRucJ5xKW<`!MP(Y+}&-{{pe8{Ix*rqAe@VT_$>G#486`55vD#&5hrVfRhdA3%3E_6}Vw ziRug*>Ue|lO3pQtZ`Kq`{U)_N)G)i4&!l~`?fd|poxP0hj5$#h)P&ID;*Gqc^H{N+hvBOwR@(qv zySr&Q1D8q3rB;3q{$sSmx#d(^T0rfkBeW*YOC6=wIPcK!1BwAHXW=R*(z((ZHi2c} zuBPQ;I5#}8l6TGstrdnE=2+odV9v8Pv;2u-a0L6xpuy66$x~*<<5FRn%cg$4P?Z`L zoglSA$4xJn74UD+)iM_{mt1bOsb7Xv{o_)H`A}MZCzqJ3n1Xkw$Uw4!pH*-!UIy4L#o86> zsIc-kX+GCEtqyOrdvSN6p;1+Pz&^`)bT zZmELr9wXykMZcQOVrx{S-KnxPO_}9MCgpcXp8Q$SX-<}PiWQxG3f_+!zKnsin{KL&1Q+kG)Z@3iyex!>A!9a4+N#_1Q|kw`t14+}EEd1V}jTxurZEI;jd z>19o=90@8e6<&_ZfnC+c|E*F+OftP2-{o?nPCrSJl>4*Q~0lSyZ>Gv1ZNs zB`Z614?dykY+byV*O)bQZOLd$k4~ObrFfQI>awV%R;4-Au+(K>j}NS6(#uQp40{xA z4_#cEH(1%#3g1kgswUn{9aS!JRGX>4${ps#XEN5j7Wz$9I+awLBh>{|X-p=qsU97s z-ZCiBzp38B=aQ$!Wl;;YN_(iG#$`~yAfu?(i_@% delta 5736 zcmaJ_3s_V~mcG@{(2YQM^GMUYx`A%q^6&*Du+j<2g`kLlqGHl2BACohCqc8BF#~GW z*oh?QaTDFl12@UeWM=XWh?AJaQCyQ5cVe=b$(K=FFGE4jSbRa{h#TlA!{itbXMnIVa@0D^!67prx4+~+V;uqVBxmR#`%K; zud3rvOnZBTAm{}H2Il?O#kV7rL4wBDVdHI`5T#C33zC6%M(MT6B1$M%{`?OrlE-`sO*L2xEdkn_0a_jit$0*C6ab@cgO#=!!Xvx^Xb}Al=mDTz0cZ$a z1$qT&PXKD5B%_rQjo}f!0cbeQ16l*LF940CCxLDNIv9YO>2;&k7NxldITWCc9vCs& zLukxcy2G4m$gUNH>yj&nvdv%2K;?t^f?%>&2*M3h@)u~kCbg3$wPUH+a*QgX;u;s_ z3xfCu!5;+Sd(Ur?BuVsL^Pfai#O|_r(E4xRk)+E8lU>++nRToYpVhY%aN}#OqUj%x zqPZMRULW+Q!FvD|)E^ZSUNiyUw$SmY^pzL$1mT?)%*+2n&KTKhvOguyR^z?GefgUA z0-6f%OIT0v9#iz!O_mC$K^y(hgw;AeOnj#gu7h#iZuq}$V#ph}m5Kcd+XxCT6#lco zO!hAB1K6v+Ir_a5yar|TtZ$5Nx;7|#;J<{;-DF9m1Jut!6rLY|b5a63`g00Ua`_`}MC;PU8?upHqzbQxa%?!57Q?R($j(lYo!_lWy z>WK;6qfkEl?UYH>7n@N*xf7t~KaW4)9V=P*gJ$O$HMhcoVm=|{0{s)<_~vn!Kh zmS(5He-CWyxVy90Kyip?-oioum7Ca1o8K)z#B=I$ce8u>=n_Ec<4N1&V%Z9s zVRNxnwAAKg|3jU&Qr1ofY`Lt3MuE4IDL#+Y(m%vk=e%N&N4)Q5P045LE+Gsl8T|Rk z)PD(CeR38vRg&8OrA&jkOGR#St%Uq+W4y>42fzbe-s&|jsogc;Grf0O4Z z*7&Y1Q>K_sMTzblCz1s?zIOL9m*Q>$x(B+OkWMcr&SpQL&k~y$qtYZ-VYc%--92`L z$Iu+(tnoc8&p3y6B+ZU~7BHe6;K>toKBl;?DYJ#$r(`0}l_)7ea0bnxTrv8R#lIaZC4{IAkh=Neta zRT_6zW8ajze!@PZQ?9hIvzL{ghmv?ta-@fc^d_OhmozQ?du%!Nraz&3IgvhfrH1|N zl49DKIAQ8Zgy}ov{QDNOd)3aUS9e^+}qT>HgYRS%E zKH8Rjn$^?VoCM?ZxHs`&EvDyk;^Jrhm3PE>X;#AMcgcV`sgVYA?7CPhjpWoNpTo@; zRyEwJ>9GHPoLMlJo93kVXhCj8XeMa=X&30F+-ytH1xXs+A{(KP!T=rsSZ4(=yd#CTzC7bBP^bL8taE7dpf^y~ysNwv* zVS2g)Hw*WEe%VEOzJPpQ_)(4TBC_jc+B9P+8>7=RlEZAk(JjIqN&1;@>WU-i;LQC~ z)<VKfPOSw8y2 z?q;V~+E=p-{XrpeiaV;<4*BT$|Zrk#SIiw+!X9NDM>U~R3CZ^dD;K*Es8H* zIKA%#&*I5nxEI%Smn}mAaxbd#ed2Dc^8L--;P!phD0WY+0bE~`b(;_Fmp&OV-94@! z7&DsL-tv7xo&3cDNk;G zhb;KBB+bCS^}>^_kPf)XUuYbdLoRNSgP#J2>^lv6NC9iv2EGEw4Z%XE;6bGaz!2ms zkiv*04MKV$KZfi(CrM+FW1mSPr2ujpej+_FXFA(Sug|H~ef}}snX`}$Q{CKJ zr1m{?>-AR{=##3}<87rW=%4O_bdcCq9btdMQ^&o&*DkUGa3BmjIB(nA5-K43ekU$p9vp|+ns5UnBnvj3MebV*Fsha=EU6j7C< zFGX!Dl~pAqHKVp4!C$S?aA&;0V$NgbX+>4^)F%Q4*agk6p}B|em7pXR=~xnpJAfeK zw-34@`fXLEUciFvBUW9cFGa1G%Bu7AUeqe+k?M!^!?JdyI$@dKZFSKGt?uD^IRuDgemPo#Z&)de- z&l*}MToduyZ5A938Y(SAIGK6b&36sT8&uagGY3u;D{Xo0Q9THK z6r3H#3CYXds{Q*k(r9HL(cuPsy;x+JSYX9U5YBPUiaToW-V6oj`HdF?)|a#PDc=}` z_3C;%RNUcG?21+UJQZIsO~K1mdpGS_WR0p*wOwh7*2U;w7bUZGT)ASWs`R8N%3783 zrz$>#dj~K7s|ewjD(=Wr0Qs>(bsumu-@hK!P!)%1#bRsc5VRJZO)yN-p5{2FDwHec zcpl|)1vn3Y2k&3J@Z`qJEe)5mA09EZc`+bkaC>i&($q%E&Z51}CcmesEGCH4ATRnw})qu+JwmJpv zYo0n2fhR|K%_n*Ka~1DZL!sriy}->vykPdI#Hu)`*hdKt^+DaC2K5Mc-gX*l9`UyO z6niyw2sjCvtZbbB&kRBjzFL#1O~7Rrol0LVk0%31lF&utN!+u4UQSlyinYHhaURtr#YTKr2y*L_q8C7@aPCk~ApU~VV^FaU7w;BDYIeK7E z^AH=@_P|p*8q|dhm{)k1?k1y0SLRs;^$LY1lc|4YzD3K0n#`v0mH9@4noSLqwW`Rt zRc#@)t}2>HxLV?EqyAN$OrXM+e4_(bNM0P&-cm%K7Ben&kF?lnsKp$mEr}*w8FalR zFHFn+NTJ!Qx8M!3e|5e^OQf3IO5>~Z$0K(&lV9PIxG>sJ7;