second pass; one CR comment

Quinn Dougherty 2022-04-20 13:41:22 -04:00
parent 15ccf876a6
commit 9f35039b60
2 changed files with 104 additions and 60 deletions


@ -1,5 +1,5 @@
--- ---
title: "Functions reference" title: "Functions Reference"
sidebar_position: 7 sidebar_position: 7
--- ---
@ -7,33 +7,33 @@ import { SquiggleEditor } from "../../src/components/SquiggleEditor";
_The source of truth for this document is [this file of code](https://github.com/quantified-uncertainty/squiggle/blob/develop/packages/squiggle-lang/src/rescript/ReducerInterface/ReducerInterface_GenericDistribution.res)_ _The source of truth for this document is [this file of code](https://github.com/quantified-uncertainty/squiggle/blob/develop/packages/squiggle-lang/src/rescript/ReducerInterface/ReducerInterface_GenericDistribution.res)_
# Inventory distributions ## Inventory distributions
We provide starter distributions, computed symbolically. We provide starter distributions, computed symbolically.
## Normal distribution ### Normal distribution
The `normal(mean, sd)` function creates a normal distribution with the given mean The `normal(mean, sd)` function creates a normal distribution with the given mean
and standard deviation. and standard deviation.
<SquiggleEditor initialSquiggleString="normal(5, 1)" /> <SquiggleEditor initialSquiggleString="normal(5, 1)" />
### Validity #### Validity
- `sd > 0` - `sd > 0`
## Uniform distribution ### Uniform distribution
The `uniform(low, high)` function creates a uniform distribution between the The `uniform(low, high)` function creates a uniform distribution between the
two given numbers. two given numbers.
<SquiggleEditor initialSquiggleString="uniform(3, 7)" /> <SquiggleEditor initialSquiggleString="uniform(3, 7)" />
### Validity #### Validity
- `low < high` - `low < high`
## Lognormal distribution ### Lognormal distribution
The `lognormal(mu, sigma)` function creates a lognormal distribution with parameters The `lognormal(mu, sigma)` function creates a lognormal distribution with parameters
`mu` and `sigma`. The log of `lognormal(mu, sigma)` is a normal distribution with mean `mu` and standard deviation `sigma`. `mu` and `sigma`. The log of `lognormal(mu, sigma)` is a normal distribution with mean `mu` and standard deviation `sigma`.
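The relationship between `lognormal(mu, sigma)` and the underlying normal can be checked numerically. The sketch below uses Python's standard library as a stand-in, not Squiggle; the helper name `log_of_lognormal_samples` and the chosen parameters are assumptions made for illustration.

```python
import math
import random
import statistics

def log_of_lognormal_samples(mu, sigma, n, seed=0):
    """Draw n lognormal samples and return their natural logs."""
    rng = random.Random(seed)
    return [math.log(rng.lognormvariate(mu, sigma)) for _ in range(n)]

logs = log_of_lognormal_samples(mu=1.0, sigma=0.5, n=20_000)
log_mean = statistics.fmean(logs)  # should be close to mu
log_sd = statistics.stdev(logs)    # should be close to sigma
```

If the logs of the samples have mean near `mu` and standard deviation near `sigma`, the samples behave as described above.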
@ -46,183 +46,227 @@ this convinience as lognormal distributions are commonly used in practice.
<SquiggleEditor initialSquiggleString="2 to 10" /> <SquiggleEditor initialSquiggleString="2 to 10" />
### Future feature: #### Future feature:
Furthermore, it's also possible to create a lognormal from its actual mean Furthermore, it's also possible to create a lognormal from its actual mean
and standard deviation, using `lognormalFromMeanAndStdDev`. and standard deviation, using `lognormalFromMeanAndStdDev`.
<SquiggleEditor initialSquiggleString="lognormalFromMeanAndStdDev(20, 10)" /> <SquiggleEditor initialSquiggleString="lognormalFromMeanAndStdDev(20, 10)" />
### Validity #### Validity
- `sigma > 0` - `sigma > 0`
- In `x to y` notation, `x < y` - In `x to y` notation, `x < y`
## Beta distribution ### Beta distribution
The `beta(a, b)` function creates a beta distribution with parameters `a` and `b`: The `beta(a, b)` function creates a beta distribution with parameters `a` and `b`:
<SquiggleEditor initialSquiggleString="beta(1e1, 2e1)" /> <SquiggleEditor initialSquiggleString="beta(1e1, 2e1)" />
### Validity #### Validity
- `a > 0` - `a > 0`
- `b > 0` - `b > 0`
- Empirically, we have noticed that numerical instability arises when `a < 1` or `b < 1` - Empirically, we have noticed that numerical instability arises when `a < 1` or `b < 1`
## Exponential distribution ### Exponential distribution
The `exponential(rate)` function creates an exponential distribution with the given The `exponential(rate)` function creates an exponential distribution with the given
rate. rate.
<SquiggleEditor initialSquiggleString="exponential(1.11e0)" /> <SquiggleEditor initialSquiggleString="exponential(1.11e0)" />
### Validity #### Validity
- `rate > 0` - `rate > 0`
## Triangular distribution ### Triangular distribution
The `triangular(a,b,c)` function creates a triangular distribution with lower The `triangular(a,b,c)` function creates a triangular distribution with lower
bound `a`, mode `b` and upper bound `c`. bound `a`, mode `b` and upper bound `c`.
### Validity #### Validity
- `a < b < c` - `a < b < c`
<SquiggleEditor initialSquiggleString="triangular(1, 2, 4)" /> <SquiggleEditor initialSquiggleString="triangular(1, 2, 4)" />
## Scalar (constant dist) ### Scalar (constant dist)
Squiggle, when the context is right, automatically casts a float to a constant distribution. Squiggle, when the context is right, automatically casts a float to a constant distribution.
# Operating on distributions ## Operating on distributions
Here are the ways we combine distributions. Here are the ways we combine distributions.
## Mixture of distributions ### Mixture of distributions
The `mx` function combines 2 or more other distributions to create a weighted The `mixture` function combines 2 or more other distributions to create a weighted
combination of them. The first positional arguments represent the distributions combination of them. The first positional arguments represent the distributions
to be combined, and the last argument is how much to weigh every distribution in the to be combined, and the last argument is how much to weigh every distribution in the
combination. combination.
<SquiggleEditor initialSquiggleString="mx(uniform(0,1), normal(1,1), [0.5, 0.5])" /> <SquiggleEditor initialSquiggleString="mixture(uniform(0,1), normal(1,1), [0.5, 0.5])" />
It's possible to create discrete distributions using this method. It's possible to create discrete distributions using this method.
<SquiggleEditor initialSquiggleString="mx(0, 1, [0.2,0.8])" /> <SquiggleEditor initialSquiggleString="mixture(0, 1, [0.2,0.8])" />
As well as mixed distributions: As well as mixed distributions:
<SquiggleEditor initialSquiggleString="mx(3, 8, 1 to 10, [0.2, 0.3, 0.5])" /> <SquiggleEditor initialSquiggleString="mixture(3, 8, 1 to 10, [0.2, 0.3, 0.5])" />
An alias of `mx` is `mixture` An alias of `mixture` is `mx`
### Validity #### Validity
Using JavaScript's variable arguments notation, consider `mx(...dists, weights)`: Using JavaScript's variable arguments notation, consider `mx(...dists, weights)`:
- `dists.length == weights.length` - `dists.length == weights.length`
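One way to picture a weighted mixture is as a two-step sampling procedure: pick a component according to the weights, then sample from it. The sketch below is a Python illustration of that idea under the validity rule above; it is not how Squiggle computes mixtures internally, and `sample_mixture` is a hypothetical helper.

```python
import random

def sample_mixture(samplers, weights, rng):
    """Draw one value: pick a component by weight, then sample from it."""
    if len(samplers) != len(weights):
        raise ValueError("need exactly one weight per distribution")
    component = rng.choices(samplers, weights=weights, k=1)[0]
    return component(rng)

rng = random.Random(0)
# Discrete analogue of mixture(0, 1, [0.2, 0.8]): two constant components.
constants = [lambda r: 0.0, lambda r: 1.0]
draws = [sample_mixture(constants, [0.2, 0.8], rng) for _ in range(20_000)]
share_of_ones = sum(draws) / len(draws)  # should be close to 0.8
```

Passing a weight list whose length differs from the number of components raises `ValueError`, mirroring the `dists.length == weights.length` requirement.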
## Addition (horizontal right shift) ### Addition (horizontal right shift)
<SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dist1 + dist2" /> <SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dist1 + dist2" />
## Subtraction (horizontal left shift) ### Subtraction (horizontal left shift)
<SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dist1 - dist2" /> <SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dist1 - dist2" />
## Multiplication (??) ### Multiplication (??)
<SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dist1 * dist2" /> <SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dist1 * dist2" />
## Division (???) We also provide concatenation of two distributions as a syntax sugar for `*`
<SquiggleEditor initialSquiggleString="(1e-1 to 1e0) triangular(1,2,3)" />
### Division (???)
<SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dist1 / dist2" /> <SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dist1 / dist2" />
## Taking the base `e` exponential ### Exponentiation (???)
<SquiggleEditor initialSquiggleString="(1e-1 to 1e0) ^ cauchy(1e0, 1e0)" />
### Taking the base `e` exponential
<SquiggleEditor initialSquiggleString="dist = triangular(1,2,3); exp(dist)" /> <SquiggleEditor initialSquiggleString="dist = triangular(1,2,3); exp(dist)" />
## Taking the base `e` and base `10` logarithm ### Taking logarithms
<SquiggleEditor initialSquiggleString="dist = triangular(1,2,3); log(dist)" /> <SquiggleEditor initialSquiggleString="dist = triangular(1,2,3); log(dist)" />
<SquiggleEditor initialSquiggleString="dist = beta(1,2); log10(dist)" /> <SquiggleEditor initialSquiggleString="dist = beta(1,2); log10(dist)" />
### Validity Base `x`
<SquiggleEditor initialSquiggleString="x = 2; dist = cauchy(1e0,1e0); log(dist, x)" />
#### Validity
- `x` must be a scalar
- See [the current discourse](https://github.com/quantified-uncertainty/squiggle/issues/304) - See [the current discourse](https://github.com/quantified-uncertainty/squiggle/issues/304)
# Standard functions on distributions ### Pointwise addition
## Probability density function **Pointwise operations are done with `PointSetDist` internals rather than `SampleSetDist` internals**.
TODO: this isn't in the new interpreter/parser yet.
<SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dist1 .+ dist2" />
### Pointwise subtraction
TODO: this isn't in the new interpreter/parser yet.
<SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dist1 .- dist2" />
### Pointwise multiplication
<SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dist1 .* dist2" />
### Pointwise division
<SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dist1 ./ dist2" />
### Pointwise exponentiation
<SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dist1 .^ dist2" />
### Pointwise logarithm
TODO: write about the semantics and the case handling re scalar vs. dist and log base.
<SquiggleEditor initialSquiggleString="dist1 = 1 to 10; dist2 = triangular(1,2,3); dotLog(dist1, dist2)" />
## Standard functions on distributions
### Probability density function
The `pdf(dist, x)` function returns the density of a distribution at the The `pdf(dist, x)` function returns the density of a distribution at the
given point x. given point x.
<SquiggleEditor initialSquiggleString="pdf(normal(0,1),0)" /> <SquiggleEditor initialSquiggleString="pdf(normal(0,1),0)" />
### Validity #### Validity
- `x` must be a scalar - `x` must be a scalar
- `dist` must be a distribution - `dist` must be a distribution
## Cumulative density function ### Cumulative density function
The `cdf(dist, x)` gives the cumulative probability of the distribution The `cdf(dist, x)` gives the cumulative probability of the distribution
for all values lower than x. It is the inverse of `inv`. for all values lower than x. It is the inverse of `inv`.
<SquiggleEditor initialSquiggleString="cdf(normal(0,1),0)" /> <SquiggleEditor initialSquiggleString="cdf(normal(0,1),0)" />
### Validity #### Validity
- `x` must be a scalar - `x` must be a scalar
- `dist` must be a distribution - `dist` must be a distribution
## Inverse CDF ### Inverse CDF
The `inv(dist, prob)` gives the value x for which the probability for all values The `inv(dist, prob)` gives the value x for which the probability for all values
lower than x is equal to prob. It is the inverse of `cdf`. lower than x is equal to prob. It is the inverse of `cdf`.
<SquiggleEditor initialSquiggleString="inv(normal(0,1),0.5)" /> <SquiggleEditor initialSquiggleString="inv(normal(0,1),0.5)" />
### Validity #### Validity
- `prob` must be a scalar (please only put it in `(0,1)`) - `prob` must be a scalar (please only put it in `(0,1)`)
- `dist` must be a distribution - `dist` must be a distribution
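For readers who want to sanity-check the three examples above outside Squiggle, Python's `statistics.NormalDist` offers analogous methods. This is only a stand-in: the mapping of `pdf`, `cdf` and `inv` onto `pdf`, `cdf` and `inv_cdf` is an assumption about equivalent semantics, and `inv_rejects` is a hypothetical helper.

```python
from statistics import NormalDist, StatisticsError

standard_normal = NormalDist(mu=0, sigma=1)
density_at_zero = standard_normal.pdf(0)   # analogue of pdf(normal(0,1), 0)
prob_below_zero = standard_normal.cdf(0)   # analogue of cdf(normal(0,1), 0)
median = standard_normal.inv_cdf(0.5)      # analogue of inv(normal(0,1), 0.5)

def inv_rejects(prob):
    """Return True if the inverse CDF refuses this probability."""
    try:
        standard_normal.inv_cdf(prob)
    except StatisticsError:
        return True
    return False
```

In Python the inverse CDF raises an error for probabilities outside the open interval `(0,1)`, which is one possible way to enforce the restriction on `prob` noted above.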
## Mean ### Mean
The `mean(distribution)` function gives the mean (expected value) of a distribution. The `mean(distribution)` function gives the mean (expected value) of a distribution.
<SquiggleEditor initialSquiggleString="mean(normal(5, 10))" /> <SquiggleEditor initialSquiggleString="mean(normal(5, 10))" />
## Sampling a distribution ### Sampling a distribution
The `sample(distribution)` samples a given distribution. The `sample(distribution)` samples a given distribution.
<SquiggleEditor initialSquiggleString="sample(normal(0, 10))" /> <SquiggleEditor initialSquiggleString="sample(normal(0, 10))" />
# Normalization ## Normalization
Some distribution operations (like horizontal shift) return an unnormalized distribution. Some distribution operations (like horizontal shift) return an unnormalized distribution.
We provide a `normalize` function We provide a `normalize` function
<SquiggleEditor initialSquiggleString="normalize((1e-1 to 1e0) + triangular(1e-1, 1e0, 1e1))" /> <SquiggleEditor initialSquiggleString="normalize((1e-1 to 1e0) + triangular(1e-1, 1e0, 1e1))" />
### Valdity - Input to `normalize` must be a dist #### Validity - Input to `normalize` must be a dist
We provide a predicate `isNormalized`, for when we have simple control flow We provide a predicate `isNormalized`, for when we have simple control flow
<SquiggleEditor initialSquiggleString="isNormalized((1e-1 to 1e0) * triangular(1e-1, 1e0, 1e1))" /> <SquiggleEditor initialSquiggleString="isNormalized((1e-1 to 1e0) * triangular(1e-1, 1e0, 1e1))" />
### Validity #### Validity
- Input to `isNormalized` must be a dist - Input to `isNormalized` must be a dist
# Convert any distribution to a sample set distribution ## Convert any distribution to a sample set distribution
`toSampleSet` has two signatures `toSampleSet` has two signatures
@ -234,7 +278,7 @@ And binary when you provide a number of samples (floored)
<SquiggleEditor initialSquiggleString="toSampleSet(1e-1 to 1e0, 1e2)" /> <SquiggleEditor initialSquiggleString="toSampleSet(1e-1 to 1e0, 1e2)" />
# `inspect` ## `inspect`
You may like to debug by right clicking your browser and using the _inspect_ functionality on the webpage, and viewing the _console_ tab. Then, wrap your squiggle output with `inspect` to log an internal representation. You may like to debug by right clicking your browser and using the _inspect_ functionality on the webpage, and viewing the _console_ tab. Then, wrap your squiggle output with `inspect` to log an internal representation.
@ -242,7 +286,7 @@ You may like to debug by right clicking your browser and using the _inspect_ fun
Save for a logging side effect, `inspect` does nothing to input and returns it. Save for a logging side effect, `inspect` does nothing to input and returns it.
# Truncate ## Truncate
You can cut off from the left You can cut off from the left


@ -11,13 +11,13 @@ Invariants to check with property tests.
_This document right now is normative and aspirational, not a description of the testing that's currently done_. _This document right now is normative and aspirational, not a description of the testing that's currently done_.
# Algebraic combinations ## Algebraic combinations
The academic keyword to search for in relation to this document is "[algebra of random variables](https://wikiless.org/wiki/Algebra_of_random_variables?lang=en)". Squiggle doesn't yet support getting the standard deviation, denoted by $\sigma$, but such support could yet be added. The academic keyword to search for in relation to this document is "[algebra of random variables](https://wikiless.org/wiki/Algebra_of_random_variables?lang=en)". Squiggle doesn't yet support getting the standard deviation, denoted by $\sigma$, but such support could yet be added.
## Means and standard deviations ### Means and standard deviations
### Sums #### Sums
$$ $$
mean(f+g) = mean(f) + mean(g) mean(f+g) = mean(f) + mean(g)
@ -33,7 +33,7 @@ $$
mean(normal(a,b) + normal(c,d)) = mean(normal(a+c, \sqrt{b^2 + d^2})) mean(normal(a,b) + normal(c,d)) = mean(normal(a+c, \sqrt{b^2 + d^2}))
$$ $$
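The closed form for a sum of independent normals can be illustrated with Python's `statistics.NormalDist`, which supports adding two instances. This is a sketch in a stand-in library, not a Squiggle property test; the parameter values are arbitrary.

```python
from math import hypot
from statistics import NormalDist

a, b = 5.0, 1.0  # mean and sd of the first normal
c, d = 2.0, 3.0  # mean and sd of the second normal

total = NormalDist(a, b) + NormalDist(c, d)  # sum of independent normals
expected = NormalDist(a + c, hypot(b, d))    # normal(a+c, sqrt(b^2 + d^2))
```

A property test in this spirit would compare `total.mean` and `total.stdev` against `expected` for many randomly generated `a`, `b`, `c`, `d`.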
### Subtractions #### Subtractions
$$ $$
mean(f-g) = mean(f) - mean(g) mean(f-g) = mean(f) - mean(g)
@ -43,7 +43,7 @@ $$
\sigma(f-g) = \sqrt{\sigma(f)^2 + \sigma(g)^2} \sigma(f-g) = \sqrt{\sigma(f)^2 + \sigma(g)^2}
$$ $$
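The subtraction invariants can also be exercised by sampling when no closed form is at hand. The following Monte Carlo sketch assumes independent inputs and uses Python's `random` module in place of Squiggle; the distributions, sample size and seed are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(0)
n = 50_000
f = [rng.uniform(3, 7) for _ in range(n)]      # like uniform(3, 7)
g = [rng.expovariate(1.11) for _ in range(n)]  # like exponential(1.11)
diff = [x - y for x, y in zip(f, g)]

mean_lhs = statistics.fmean(diff)
mean_rhs = statistics.fmean(f) - statistics.fmean(g)
sd_lhs = statistics.pstdev(diff)
sd_rhs = math.sqrt(statistics.pvariance(f) + statistics.pvariance(g))
```

The means agree up to floating point rounding, while the standard deviations agree only approximately because the sample covariance of `f` and `g` is small but not exactly zero.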
### Multiplications #### Multiplications
$$ $$
mean(f \cdot g) = mean(f) \cdot mean(g) mean(f \cdot g) = mean(f) \cdot mean(g)
@ -53,15 +53,15 @@ $$
\sigma(f \cdot g) = \sqrt{ (\sigma(f)^2 + mean(f)^2) \cdot (\sigma(g)^2 + mean(g)^2) - (mean(f) \cdot mean(g))^2} \sigma(f \cdot g) = \sqrt{ (\sigma(f)^2 + mean(f)^2) \cdot (\sigma(g)^2 + mean(g)^2) - (mean(f) \cdot mean(g))^2}
$$ $$
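For independent $f$ and $g$, the standard identity is $\sigma(f \cdot g)^2 = (\sigma(f)^2 + mean(f)^2) \cdot (\sigma(g)^2 + mean(g)^2) - (mean(f) \cdot mean(g))^2$, since $E[f^2] = \sigma(f)^2 + mean(f)^2$. The sketch below checks it exactly on two small discrete distributions using rational arithmetic. The distributions and helper names are made up for illustration.

```python
from fractions import Fraction as F
from itertools import product

def mean(dist):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(p * x for x, p in dist.items())

def variance(dist):
    m = mean(dist)
    return sum(p * (x - m) ** 2 for x, p in dist.items())

def product_dist(f, g):
    """Exact distribution of f*g for independent discrete f and g."""
    out = {}
    for (x, p), (y, q) in product(f.items(), g.items()):
        out[x * y] = out.get(x * y, F(0)) + p * q
    return out

f = {F(1): F(1, 2), F(2): F(1, 3), F(3): F(1, 6)}
g = {F(-1): F(1, 4), F(4): F(3, 4)}
fg = product_dist(f, g)

var_lhs = variance(fg)
var_rhs = (variance(f) + mean(f) ** 2) * (variance(g) + mean(g) ** 2) - (mean(f) * mean(g)) ** 2
```

Because everything is a `Fraction`, the two sides can be compared with exact equality rather than a tolerance.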
### Divisions #### Divisions
Divisions are tricky, and in general we don't have good expressions to characterize properties of ratios. In particular, the ratio of two independent zero-mean normals is a Cauchy distribution, which has no defined mean. Divisions are tricky, and in general we don't have good expressions to characterize properties of ratios. In particular, the ratio of two independent zero-mean normals is a Cauchy distribution, which has no defined mean.
## Probability density functions (pdfs) ### Probability density functions (pdfs)
Specifying the pdf of the sum/multiplication/... of distributions as a function of the pdfs of the individual arguments can still be done. But it requires integration. My sense is that this is still doable, and I (Nuño) provide some _pseudocode_ to do this. Specifying the pdf of the sum/multiplication/... of distributions as a function of the pdfs of the individual arguments can still be done. But it requires integration. My sense is that this is still doable, and I (Nuño) provide some _pseudocode_ to do this.
### Sums #### Sums
Let $f, g$ be two independently distributed functions. Then, the pdf of their sum, evaluated at a point $z$, expressed as $(f + g)(z)$, is given by: Let $f, g$ be two independently distributed functions. Then, the pdf of their sum, evaluated at a point $z$, expressed as $(f + g)(z)$, is given by:
@ -114,31 +114,31 @@ let pdfOfSum = (pdf1, pdf2, cdf1, cdf2, z) => {
}; };
``` ```
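As a complement to the pseudocode, here is a minimal runnable sketch of the same convolution idea in Python, $(f + g)(z) = \int f(x) \, g(z - x) \, dx$, approximated with the trapezoid rule. It is not a translation of `pdfOfSum`: the explicit integration bounds `lo` and `hi` and the step count are assumptions, and the result is compared against the known closed form for two normals.

```python
from statistics import NormalDist

def pdf_of_sum(pdf_f, pdf_g, z, lo, hi, steps=2000):
    """Approximate the pdf of f+g at z by integrating f(x) * g(z - x) over [lo, hi]."""
    h = (hi - lo) / steps
    # Trapezoid rule: endpoints get half weight, interior points full weight.
    total = 0.5 * (pdf_f(lo) * pdf_g(z - lo) + pdf_f(hi) * pdf_g(z - hi))
    for i in range(1, steps):
        x = lo + i * h
        total += pdf_f(x) * pdf_g(z - x)
    return total * h

f = NormalDist(5, 1)
g = NormalDist(2, 2)
approx = pdf_of_sum(f.pdf, g.pdf, 7.5, lo=-10, hi=20)
exact = (f + g).pdf(7.5)  # closed form for independent normals
```

The bounds must cover the region where `f` has non-negligible mass; choosing them automatically (for example from quantiles) is left out of this sketch.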
## Cumulative density functions ### Cumulative density functions
TODO TODO
## Inverse cumulative density functions ### Inverse cumulative density functions
TODO TODO
# `pdf`, `cdf`, and `inv` ## `pdf`, `cdf`, and `inv`
With $\forall dist, pdf := x \mapsto \texttt{pdf}(dist, x) \land cdf := x \mapsto \texttt{cdf}(dist, x) \land inv := p \mapsto \texttt{inv}(dist, p)$, With $\forall dist, pdf := x \mapsto \texttt{pdf}(dist, x) \land cdf := x \mapsto \texttt{cdf}(dist, x) \land inv := p \mapsto \texttt{inv}(dist, p)$,
## `cdf` and `inv` are inverses ### `cdf` and `inv` are inverses
$$ $$
\forall x \in (0,1), cdf(inv(x)) = x \land \forall x \in \texttt{dom}(cdf), x = inv(cdf(x)) \forall x \in (0,1), cdf(inv(x)) = x \land \forall x \in \texttt{dom}(cdf), x = inv(cdf(x))
$$ $$
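A property test for this invariant might look like the following Python sketch, which uses `statistics.NormalDist` as a stand-in for a Squiggle distribution and a fixed grid instead of randomly generated inputs. The grids and the choice of distribution are assumptions.

```python
from statistics import NormalDist

dist = NormalDist(0, 1)

# cdf(inv(p)) should return p for probabilities in the open interval (0, 1).
probs = [i / 100 for i in range(1, 100)]
max_err_p = max(abs(dist.cdf(dist.inv_cdf(p)) - p) for p in probs)

# inv(cdf(x)) should return x for points in the domain of cdf.
xs = [i / 10 for i in range(-30, 31)]
max_err_x = max(abs(dist.inv_cdf(dist.cdf(x)) - x) for x in xs)
```

Both round trips are only exact up to floating point error, so a real test would compare against a small tolerance rather than demand equality.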
## The codomain of `cdf` equals the open interval `(0,1)` equals the codomain of `pdf` ### The codomain of `cdf` equals the open interval `(0,1)` equals the codomain of `pdf`
$$ $$
\texttt{cod}(cdf) = (0,1) = \texttt{cod}(pdf) \texttt{cod}(cdf) = (0,1) = \texttt{cod}(pdf)
$$ $$
# To do: ## To do:
- Provide sources or derivations, useful as this document becomes more complicated - Provide sources or derivations, useful as this document becomes more complicated
- Provide definitions for the probability density function, exponential, inverse, log, etc. - Provide definitions for the probability density function, exponential, inverse, log, etc.