Starting to pull out distributions functionality
parent c0ec3b02b7
commit 37047ac9ff

packages/website/docs/Features/Distributions.mdx (normal file, 258 lines)
@@ -0,0 +1,258 @@
---
title: "Creating Distributions"
sidebar_position: 8
---

import TOCInline from "@theme/TOCInline";
import { SquiggleEditor } from "../../src/components/SquiggleEditor";
import Admonition from "@theme/Admonition";
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";

<TOCInline toc={toc} maxHeadingLevel={2} />

## To

`(5thPercentile: float) to (95thPercentile: float)`
`to(5thPercentile: float, 95thPercentile: float)`

The `to` function is an easy way to generate simple distributions using predicted _5th_ and _95th_ percentiles.

If both values are above zero, a `lognormal` distribution is used. If not, a `normal` distribution is used.

<Tabs>
<TabItem value="ex1" label="5 to 10" default>
When `5 to 10` is entered, both numbers are positive, so it generates a
lognormal distribution with 5th and 95th percentiles at 5 and 10.
<SquiggleEditor initialSquiggleString="5 to 10" />
</TabItem>
<TabItem value="ex3" label="to(5,10)">
`5 to 10` does the same thing as `to(5,10)`.
<SquiggleEditor initialSquiggleString="to(5,10)" />
</TabItem>
<TabItem value="ex2" label="-5 to 5">
When `-5 to 5` is entered, one of the values is negative, so it generates a normal
distribution. This has 5th and 95th percentiles at -5 and 5.
<SquiggleEditor initialSquiggleString="-5 to 5" />
</TabItem>
<TabItem value="ex4" label="1 to 10000">
It's very easy to generate distributions with very long tails. If this
happens, you can click the "log x scale" box to view this using a log scale.
<SquiggleEditor initialSquiggleString="1 to 10000" />
</TabItem>
</Tabs>

### Arguments

- `5thPercentile`: Float
- `95thPercentile`: Float
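
The arithmetic behind `to` can be sketched in Python. This is an illustration under stated assumptions, not Squiggle's actual implementation: a 5th-to-95th percentile interval spans roughly ±1.645 standard deviations of a normal, so a normal can be fitted to the two bounds directly and a lognormal to their logarithms. The helper name `params_from_90_ci` is hypothetical.

```python
import math
from statistics import NormalDist
from typing import Tuple

# z-score of the 95th percentile of a standard normal (about 1.645).
Z_95 = NormalDist().inv_cdf(0.95)


def params_from_90_ci(low: float, high: float) -> Tuple[str, float, float]:
    """Hypothetical helper: fit a distribution to 5th/95th percentiles.

    Follows the rule described in the text: lognormal if both bounds are
    positive, normal otherwise. Returns (family, first_param, second_param).
    """
    if low >= high:
        raise ValueError("low must be smaller than high")
    if low > 0:
        # Fit a normal to the logs; its mean and stdev are the
        # lognormal's mu and sigma.
        log_low, log_high = math.log(low), math.log(high)
        mu = (log_low + log_high) / 2
        sigma = (log_high - log_low) / (2 * Z_95)
        return ("lognormal", mu, sigma)
    # At least one bound is zero or negative: fit a normal directly.
    mean = (low + high) / 2
    stdev = (high - low) / (2 * Z_95)
    return ("normal", mean, stdev)
```

Under these assumptions, `params_from_90_ci(5, 10)` yields a lognormal and `params_from_90_ci(-5, 5)` yields a normal centred on zero.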

<Admonition type="tip" title="Tip">
<p>
"<b>To</b>" is a great way to generate probability distributions very
quickly from your intuitions. It's easy to write and easy to read. It's
often a good place to begin an estimate.
</p>
</Admonition>

<Admonition type="caution" title="Caution">
<p>
If you haven't tried{" "}
<a href="https://www.lesswrong.com/posts/LdFbx9oqtKAAwtKF3/list-of-probability-calibration-exercises">
calibration training
</a>
, you're likely to be overconfident. We recommend doing calibration training
to get a feel for what a 90 percent confidence interval feels like.
</p>
</Admonition>

## Mixture

`mixture(...distributions: Distribution[], weights?: float[])`
`mx(...distributions: Distribution[], weights?: float[])`

The `mixture` function combines multiple distributions to create a mixture. You can optionally pass in a list of proportional weights.

<Tabs>
<TabItem value="ex1" label="Simple" default>
<SquiggleEditor initialSquiggleString="mixture(1 to 2, 5 to 8, 9 to 10)" />
</TabItem>
<TabItem value="ex2" label="With Weights">
<SquiggleEditor initialSquiggleString="mixture(1 to 2, 5 to 8, 9 to 10, [0.1, 0.1, 0.8])" />
</TabItem>
<TabItem value="ex3" label="With Continuous and Discrete Inputs">
<SquiggleEditor initialSquiggleString="mixture(1 to 5, 8 to 10, 1, 3, 20)" />
</TabItem>
</Tabs>

### Arguments

- `distributions`: A set of distributions or floats, each passed as a parameter. Floats will be converted into Delta distributions.
- `weights`: An optional array of floats, each representing the weight of its corresponding distribution. The weights will be re-scaled to add to `1.0`. If a weights array is provided, it must be the same length as the distribution parameters.
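
The weight handling described above can be sketched in Python. This is a rough illustration rather than Squiggle's implementation: each component is modelled as a function that draws one sample, bare floats become constant (Delta) components, and the weights are re-scaled to sum to `1.0`. The names `normalize_weights` and `sample_mixture` are hypothetical, and the equal-weight default when no weights are given is an assumption.

```python
import random
from typing import Callable, List, Optional, Sequence, Union

# A component is either a constant or a function that draws one sample.
Sampler = Callable[[random.Random], float]
Component = Union[float, Sampler]


def normalize_weights(weights: Sequence[float]) -> List[float]:
    """Re-scale proportional weights so that they add to 1.0."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return [w / total for w in weights]


def sample_mixture(
    components: Sequence[Component],
    weights: Optional[Sequence[float]] = None,
    n: int = 1000,
    seed: int = 0,
) -> List[float]:
    """Draw n samples from a weighted mixture of the given components."""
    if weights is None:
        # Assumption: components are weighted equally by default.
        weights = [1.0] * len(components)
    if len(weights) != len(components):
        raise ValueError("weights must match the number of components")
    probs = normalize_weights(weights)
    rng = random.Random(seed)
    # Convert bare numbers into constant (Delta) samplers.
    samplers = [
        c if callable(c) else (lambda _rng, value=c: float(value))
        for c in components
    ]
    # Pick a component for each draw, then sample from it.
    picks = rng.choices(samplers, weights=probs, k=n)
    return [pick(rng) for pick in picks]
```

For example, `sample_mixture([lambda r: r.uniform(5, 20), 0], [0.8, 0.2])` returns a list in which roughly one fifth of the samples are exactly `0.0`.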

### Aliases

- `mx`

### Special Use Cases of Mixtures

<details>
<summary>🕐 Zero or Continuous</summary>
<p>
One common reason to have mixtures of continuous and discrete distributions is to handle the special case of 0.
Say I want to model the time I will spend on some upcoming assignment. I think I have an 80% chance of doing it.
</p>

<p>
In this case, I have a 20% chance of spending 0 time on it. I might estimate my hours with:
</p>
<SquiggleEditor
initialSquiggleString={`hours_the_project_will_take = 5 to 20
chance_of_doing_anything = 0.8
mx(hours_the_project_will_take, 0, [chance_of_doing_anything, 1 - chance_of_doing_anything])`}
/>
</details>

<details>
<summary>🔒 Model Uncertainty Safeguarding</summary>
<p>
One technique several <a href="https://www.foretold.io/">Foretold.io</a> users used is to combine their main guess with a
"just-in-case distribution". This latter distribution would have very low weight, but would be
very wide, just in case they were dramatically off for some weird reason.
</p>
<SquiggleEditor
initialSquiggleString={`forecast = 3 to 30
chance_completely_wrong = 0.05
forecast_if_completely_wrong = -100 to 200
mx(forecast, forecast_if_completely_wrong, [1-chance_completely_wrong, chance_completely_wrong])`}
/>

</details>

## Normal

`normal(mean:float, standardDeviation:float)`

<Tabs>
<TabItem value="ex1" label="normal(5,1)" default>
<SquiggleEditor initialSquiggleString="normal(5, 1)" />
</TabItem>
<TabItem value="ex2" label="normal(1e11, 1e11)">
<SquiggleEditor initialSquiggleString="normal(100000000000, 100000000000)" />
</TabItem>
</Tabs>

### Arguments

- `mean`: Float
- `standardDeviation`: Float greater than zero

[Wikipedia entry](https://en.wikipedia.org/wiki/Normal_distribution)

## Log-normal

The log of `lognormal(mu, sigma)` is a normal distribution with mean `mu` and standard deviation `sigma`.

`lognormal(mu: float, sigma: float)`

<SquiggleEditor initialSquiggleString="lognormal(0, 0.7)" />

### Arguments

- `mu`: Float
- `sigma`: Float greater than zero

[Wikipedia entry](https://en.wikipedia.org/wiki/Log-normal_distribution)

An alternative format is also available. When both numbers are positive, the `to`
notation creates a lognormal distribution with a 90% confidence interval between
the two numbers. We add this convenience as lognormal distributions are commonly
used in practice.

<SquiggleEditor initialSquiggleString="2 to 10" />

#### Future feature:

Furthermore, it's also possible to create a lognormal from its actual mean
and standard deviation, using `lognormalFromMeanAndStdDev`.

TODO: interpreter/parser doesn't provide this in current `develop` branch

<SquiggleEditor initialSquiggleString="lognormalFromMeanAndStdDev(20, 10)" />
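
The conversion that a function like `lognormalFromMeanAndStdDev` needs can be sketched in Python using the standard lognormal moment formulas. This illustrates the math only and is not Squiggle's code; the helper name `lognormal_params_from_mean_and_stdev` is hypothetical.

```python
import math
from typing import Tuple


def lognormal_params_from_mean_and_stdev(mean: float, stdev: float) -> Tuple[float, float]:
    """Hypothetical helper: convert a lognormal's own mean and standard
    deviation into the (mu, sigma) of the underlying normal."""
    if mean <= 0 or stdev <= 0:
        raise ValueError("mean and stdev must both be greater than zero")
    # For a lognormal: mean = exp(mu + sigma^2 / 2) and
    # variance = (exp(sigma^2) - 1) * mean^2. Solve for sigma^2, then mu.
    sigma_squared = math.log(1 + (stdev / mean) ** 2)
    mu = math.log(mean) - sigma_squared / 2
    return mu, math.sqrt(sigma_squared)
```

With these formulas, the `(mu, sigma)` pair returned for a mean of 20 and a standard deviation of 10 could be passed to `lognormal(mu, sigma)` to get the same distribution.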

#### Validity

- `sigma > 0`
- In `x to y` notation, `x < y`

## Uniform

`uniform(low:float, high:float)`

<Tabs>
<TabItem value="ex1" label="uniform(3,7)" default>
<SquiggleEditor initialSquiggleString="uniform(3,7)" />
</TabItem>
<TabItem value="ex2" label="invalid: uniform(7,5)">
<SquiggleEditor initialSquiggleString="uniform(7,5)" />
</TabItem>
</Tabs>

### Arguments

- `low`: Float
- `high`: Float greater than `low`

## Beta

The `beta(a, b)` function creates a beta distribution with parameters `a` and `b`:

<SquiggleEditor initialSquiggleString="beta(10, 20)" />

#### Validity

- `a > 0`
- `b > 0`
- Empirically, we have noticed that numerical instability arises when `a < 1` or `b < 1`

## Exponential

The `exponential(rate)` function creates an exponential distribution with the given
rate.

<SquiggleEditor initialSquiggleString="exponential(1.11)" />

#### Validity

- `rate > 0`

## Triangular distribution

The `triangular(a,b,c)` function creates a triangular distribution with lower
bound `a`, mode `b` and upper bound `c`.

#### Validity

- `a < b < c`

<SquiggleEditor initialSquiggleString="triangular(1, 2, 4)" />
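
The validity rule and argument order for `triangular` can be illustrated with Python's standard library. This is only a sketch of the same distribution, not how Squiggle implements it, and the helper name `sample_triangular` is hypothetical. Note that Python's `random.triangular` takes its arguments as `(low, high, mode)`, whereas the signature above is `(a, b, c)` with the mode in the middle.

```python
import random
from typing import List


def sample_triangular(a: float, b: float, c: float, n: int = 1000, seed: int = 0) -> List[float]:
    """Hypothetical helper: draw n samples from a triangular distribution
    with lower bound a, mode b and upper bound c."""
    # Enforce the validity rule stated above.
    if not (a < b < c):
        raise ValueError("triangular requires a < b < c")
    rng = random.Random(seed)
    # random.triangular expects (low, high, mode), so b and c swap places.
    return [rng.triangular(a, c, b) for _ in range(n)]
```

The mean of a triangular distribution is `(a + b + c) / 3`, so samples from `sample_triangular(1, 2, 4)` should average close to 2.33.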

### Scalar (constant dist)

Squiggle, when the context is right, automatically casts a float to a constant distribution.

## `fromSamples`

The last distribution constructor takes an array of samples and constructs a sample set distribution.

<SquiggleEditor initialSquiggleString="fromSamples([1,2,3,4,6,5,5,5])" />

#### Validity

For `fromSamples(xs)`,

- `xs.length > 5`
- Strictly every element of `xs` must be a number.
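
The two validity rules for `fromSamples` can be written as a small check. The sketch below is in Python purely for illustration; the helper name `validate_samples` is hypothetical, and Squiggle's own checks and error messages may differ.

```python
from numbers import Real
from typing import List, Sequence


def validate_samples(xs: Sequence[object]) -> List[float]:
    """Hypothetical helper: check the two rules above and return floats."""
    # Rule 1: more than five samples are required.
    if len(xs) <= 5:
        raise ValueError("fromSamples needs more than 5 samples")
    # Rule 2: every element must be a number (booleans are rejected here,
    # since Python would otherwise treat them as integers).
    for x in xs:
        if isinstance(x, bool) or not isinstance(x, Real):
            raise TypeError(f"not a number: {x!r}")
    return [float(x) for x in xs]
```

Under these assumptions, `validate_samples([1, 2, 3, 4, 6, 5, 5, 5])` passes, while a three-element list or a list containing a string is rejected.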

@@ -113,31 +113,6 @@ For `fromSamples(xs)`,

Here are the ways we combine distributions.

### Mixture of distributions

The `mixture` function combines 2 or more other distributions to create a weighted
combination of the two. The first positional arguments represent the distributions
to be combined, and the last argument is how much to weigh every distribution in the
combination.

<SquiggleEditor initialSquiggleString="mixture(uniform(0,1), normal(1,1), [0.5, 0.5])" />

It's possible to create discrete distributions using this method.

<SquiggleEditor initialSquiggleString="mixture(0, 1, [0.2,0.8])" />

As well as mixed distributions:

<SquiggleEditor initialSquiggleString="mixture(3, 8, 1 to 10, [0.2, 0.3, 0.5])" />

An alias of `mixture` is `mx`

#### Validity

Using javascript's variable arguments notation, consider `mx(...dists, weights)`:

- `dists.length == weights.length`

### Addition

A horizontal right shift