---
sidebar_position: 3
title: Distribution
---
Distributions are the flagship data type in Squiggle. The distribution type is a generic data type that contains one of three different formats of distributions.
These subtypes are [point set](/docs/Api/DistPointSet), [sample set](/docs/Api/DistSampleSet), and symbolic. The first two of these have a few custom functions that only work on them. You can read more about the differences between these formats [here](/docs/Discussions/Three-Formats-Of-Distributions).

Several functions below only work on particular distribution formats. For example, scoring and pointwise math require the point set format. When a distribution is in a different format, it is automatically converted to the required one. These conversions are lossy.

import TOCInline from "@theme/TOCInline"

<TOCInline toc={toc} />

## Distribution Creation

These are functions for creating primitive distributions. Many of them optionally accept distributions as inputs. In these cases, Monte Carlo sampling is used to generate the resulting distribution. This can be used for simple hierarchical models.

See a longer tutorial on creating distributions [here](/docs/Guides/DistributionCreation).
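
A minimal sketch of such a hierarchical model, using only constructors documented on this page. The variable name is illustrative:

```javascript
// The mean is itself uncertain, so it is passed in as a distribution.
uncertainMean = uniform(5, 9)
// Sampling is then used to build the resulting distribution.
normal(uncertainMean, 3)
```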
### normal
```
normal: (distribution|number, distribution|number) => distribution
normal: (dict<{p5: distribution|number, p95: distribution|number}>) => distribution
normal: (dict<{mean: distribution|number, stdev: distribution|number}>) => distribution
```
**Examples**
```javascript
normal(5, 1)
normal({ p5: 4, p95: 10 })
normal({ mean: 5, stdev: 2 })
normal(5 to 10, normal(3, 2))
normal({ mean: uniform(5, 9), stdev: 3 })
```
### lognormal
```
lognormal: (distribution|number, distribution|number) => distribution
lognormal: (dict<{p5: distribution|number, p95: distribution|number}>) => distribution
lognormal: (dict<{mean: distribution|number, stdev: distribution|number}>) => distribution
```
**Examples**
```javascript
lognormal(0.5, 0.8)
lognormal({ p5: 4, p95: 10 })
lognormal({ mean: 5, stdev: 2 })
```
### uniform
```
uniform: (distribution|number, distribution|number) => distribution
```
**Examples**
```javascript
uniform(10, 12)
```
### beta
```
beta: (distribution|number, distribution|number) => distribution
```
**Examples**
```javascript
beta(20, 25)
```
### cauchy
```
cauchy: (distribution|number, distribution|number) => distribution
```
**Examples**
```javascript
cauchy(5, 1)
```
### gamma
```
gamma: (distribution|number, distribution|number) => distribution
```
**Examples**
```javascript
gamma(5, 1)
```
### logistic
```
logistic: (distribution|number, distribution|number) => distribution
```
**Examples**
```javascript
logistic(5, 1)
```
### exponential
```
exponential: (distribution|number) => distribution
```
**Examples**
```javascript
exponential(2)
```
### bernoulli
```
bernoulli: (distribution|number) => distribution
```
**Examples**
```javascript
bernoulli(0.5)
```
### triangular
```
triangular: (number, number, number) => distribution
```
**Examples**
```javascript
triangular(5, 10, 20)
```
### to / credibleIntervalToDistribution

The `to` function is an easy way to generate simple distributions using predicted _5th_ and _95th_ percentiles.
If both values are above zero, a `lognormal` distribution is used. If not, a `normal` distribution is used.

`to` is an alias for `credibleIntervalToDistribution`. Because it is used so frequently, the shorter name is recommended.
```
to: (distribution|number, distribution|number) => distribution
credibleIntervalToDistribution: (distribution|number, distribution|number) => distribution
```
**Examples**
```javascript
5 to 10
to(5,10)
-5 to 5
```
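
Based on the rule described above, each `to` expression below would be expected to match the explicit constructor named in its comment. This is a sketch of the documented behavior, not a statement about the implementation:

```javascript
5 to 10 // both values are above zero: expected to match lognormal({ p5: 5, p95: 10 })
-5 to 5 // not both above zero: expected to match normal({ p5: -5, p95: 5 })
```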
### mixture
```
mixture: (...distributionLike, weights?:list<float>) => distribution
mixture: (list<distributionLike>, weights?:list<float>) => distribution
```
**Examples**
```javascript
mixture(normal(5, 1), normal(10, 1), 8)
mx(normal(5, 1), normal(10, 1), [0.3, 0.7])
mx([normal(5, 1), normal(10, 1)], [0.3, 0.7])
```
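
A short sketch of how weights affect the result, assuming the weights act as relative proportions of each component. The variable name is illustrative:

```javascript
// 30% of the probability mass comes from the first component, 70% from the second.
weighted = mx(normal(5, 1), normal(10, 1), [0.3, 0.7])
mean(weighted) // expected to be close to 0.3 * 5 + 0.7 * 10 = 8.5
```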
## Functions
### sample
Returns one random sample from the distribution.
```
sample: (distribution) => number
```
**Examples**
```javascript
sample(normal(5, 2))
```
### sampleN
Returns N random samples from the distribution.
```
sampleN: (distribution, number) => list<number>
```
**Examples**
```javascript
sampleN(normal(5, 2), 100)
```
### mean
Returns the mean of the distribution.
```
mean: (distribution) => number
```
**Examples**
```javascript
mean(normal(5, 2))
```
### stdev
Standard deviation. Currently only works on sample set distributions, so other distributions are converted to the sample set format before the calculation.
```
stdev: (distribution) => number
```
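**Examples**

A minimal sketch. Because the value is computed from samples, the result is expected to be close to, but not exactly, the standard deviation passed to `normal`:

```javascript
stdev(normal(5, 2)) // expected to be close to 2
```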
### variance
Variance. Like `stdev`, this currently only works on sample set distributions.
```
variance: (distribution) => number
```
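**Examples**

A minimal sketch. As with `stdev`, the value is computed from samples, so it is expected to be approximate:

```javascript
variance(normal(5, 2)) // expected to be close to 2^2 = 4
```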
### mode
```
mode: (distribution) => number
```
### cdf
```
cdf: (distribution, number) => number
```
**Examples**
```javascript
cdf(normal(5, 2), 3)
```
### pdf
```
pdf: (distribution, number) => number
```
**Examples**
```javascript
pdf(normal(5, 2), 3)
```
### quantile
```
quantile: (distribution, number) => number
```
**Examples**
```javascript
quantile(normal(5, 2), 0.5)
```
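
`cdf` and `quantile` are inverses of each other, which can serve as a quick sanity check. A sketch, with illustrative variable names:

```javascript
dist = normal(5, 2)
x = quantile(dist, 0.9) // the value below which 90% of the probability mass lies
cdf(dist, x) // expected to be approximately 0.9
```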
### toPointSet
**TODO: Will soon be called "PointSet.make"**
Converts a distribution to the pointSet format
```
toPointSet: (distribution) => pointSetDistribution
```
**Examples**
```javascript
toPointSet(normal(5, 2))
```
### toSampleSet
**TODO: Will soon be called "SampleSet.make"**
Converts a distribution to the sampleSet format, with n samples
```
toSampleSet: (distribution, number) => sampleSetDistribution
```
**Examples**
```javascript
toSampleSet(normal(5, 2), 1000)
```
### truncateLeft
Truncates the left side of a distribution. Returns either a pointSet distribution or a symbolic distribution.
```
truncateLeft: (distribution, number) => distribution
```
**Examples**
```javascript
truncateLeft(normal(5, 2), 3)
```
### truncateRight
Truncates the right side of a distribution. Returns either a pointSet distribution or a symbolic distribution.
```
truncateRight: (distribution, number) => distribution
```
**Examples**
```javascript
truncateRight(normal(5, 2), 6)
```
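
The two truncation functions can be combined to keep only a bounded range. A sketch, assuming the output of `truncateLeft` can be passed directly to `truncateRight`, as the signatures above suggest:

```javascript
// Keep only the part of the distribution between 3 and 6.
truncateRight(truncateLeft(normal(5, 2), 3), 6)
```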
### klDivergence
[Kullback–Leibler divergence](https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence) between two distributions.
```
klDivergence: (distribution, distribution) => number
```
**Examples**
```javascript
klDivergence(normal(5, 2), normal(5, 4)) // returns 0.57
```
## Display
### toString
```
toString: (distribution) => string
```
**Examples**
```javascript
toString(normal(5, 2))
```
### toSparkline
Produce a sparkline of length n
```
toSparkline: (distribution, n = 20) => string
```
**Examples**
```javascript
toSparkline(normal(5, 2), 10)
```
### inspect
Prints the value of the distribution to the JavaScript console, then returns the distribution.
```
inspect: (distribution) => distribution
```
**Examples**
```javascript
inspect(normal(5, 2))
```
## Normalization
### normalize
Normalize a distribution. This means scaling it so that its cumulative sum is equal to 1.
```
normalize: (distribution) => distribution
```
**Examples**
```javascript
normalize(normal(5, 2))
```
### isNormalized
Checks whether a distribution is normalized. Most distributions are normalized, but some commands can produce non-normalized distributions.
```
isNormalized: (distribution) => bool
```
**Examples**
```javascript
isNormalized(normal(5, 2)) // returns true
```
### integralSum
Get the sum of the integral of a distribution. If the distribution is normalized, this will be 1.
```
integralSum: (distribution) => number
```
**Examples**
```javascript
integralSum(normal(5, 2))
```
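
The three normalization functions can be used together. A sketch, with illustrative variable names:

```javascript
dist = normalize(normal(5, 2))
normalized = isNormalized(dist) // expected to be true
integralSum(dist) // expected to be approximately 1
```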
## Regular Arithmetic Operations
Regular arithmetic operations cover the basic mathematical operations on distributions. They work much like their equivalent operations on numbers.
The infix operators `+`, `-`, `*`, `/`, and `^` are supported for addition, subtraction, multiplication, division, and power. The prefix operator `-` is supported for unaryMinus.
```javascript
pointMass(5 + 10) == pointMass(5) + pointMass(10)
```
### add
```
add: (distributionLike, distributionLike) => distribution
```
**Examples**
```javascript
normal(0, 1) + normal(1, 3) // returns normal(1, 3.16...)
add(normal(0, 1), normal(1, 3)) // returns normal(1, 3.16...)
```
### sum
**TODO: Not yet implemented for distributions**
```
sum: (list<distributionLike>) => distribution
```
**Examples**
```javascript
sum([normal(0, 1), normal(1, 3), uniform(1, 10)])
```
### multiply
```
multiply: (distributionLike, distributionLike) => distribution
```
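**Examples**

A minimal sketch, assuming `distributionLike` covers plain numbers as well as distributions:

```javascript
normal(10, 2) * 3
multiply(normal(10, 2), uniform(1, 3))
```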
### product
```
product: (list<distributionLike>) => distribution
```
### subtract
```
subtract: (distributionLike, distributionLike) => distribution
```
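**Examples**

A minimal sketch. The comments assume the two inputs are treated as independent, in which case the means subtract and the variances add:

```javascript
normal(5, 2) - normal(1, 3) // expected to be equivalent to normal(4, 3.60...)
subtract(normal(5, 2), normal(1, 3)) // expected to be equivalent to normal(4, 3.60...)
```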
### divide
```
divide: (distributionLike, distributionLike) => distribution
```
### pow
```
pow: (distributionLike, distributionLike) => distribution
```
### exp
```
exp: (distributionLike) => distribution
```
### log
```
log: (distributionLike, distributionLike) => distribution
```
### log10
```
log10: (distributionLike) => distribution
```
### unaryMinus
```
unaryMinus: (distribution) => distribution
```
**Examples**
```javascript
-normal(5, 2) // same as normal(-5, 2)
unaryMinus(normal(5, 2)) // same as normal(-5, 2)
```
## Pointwise Arithmetic Operations
### dotAdd
```
dotAdd: (distributionLike, distributionLike) => distribution
```
### dotMultiply
```
dotMultiply: (distributionLike, distributionLike) => distribution
```
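
A minimal sketch of a pointwise operation. The pointwise product of two densities does not in general integrate to 1, so `integralSum` is used here to inspect the result. The variable name is illustrative:

```javascript
combined = dotMultiply(normal(5, 2), normal(6, 2))
integralSum(combined) // equals 1 only if the result is normalized
```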
### dotSubtract
```
dotSubtract: (distributionLike, distributionLike) => distribution
```
### dotDivide
```
dotDivide: (distributionLike, distributionLike) => distribution
```
### dotPow
```
dotPow: (distributionLike, distributionLike) => distribution
```
### dotExp
```
dotExp: (distributionLike, distributionLike) => distribution
```
## Scale Arithmetic Operations
### scaleMultiply
```
scaleMultiply: (distributionLike, number) => distribution
```
### scalePow
```
scalePow: (distributionLike, number) => distribution
```
### scaleExp
```
scaleExp: (distributionLike, number) => distribution
```
### scaleLog
```
scaleLog: (distributionLike, number) => distribution
```
### scaleLog10
```
scaleLog10: (distributionLike, number) => distribution
```
## Special
### Declaration (Continuous Functions)
Adds metadata about input ranges to a function. Currently works for numeric and date inputs. This is useful when making predictions: it lets you limit the domain over which your prediction will be used and scored.
```
declareFn: (dict<{fn: lambda, inputs: array<dict<{min: number, max: number}>>}>) => declaration
```
**Examples**
```javascript
declareFn({
  fn: {|a,b| a },
  inputs: [
    {min: 0, max: 100},
    {min: 30, max: 50}
  ]
})
```
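
A further sketch with a single input. The lambda and its parameter name are illustrative, and it assumes the declared function may return a distribution:

```javascript
declareFn({
  fn: {|t| normal(t, 2) },
  inputs: [
    {min: 0, max: 100}
  ]
})
```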