---
title: "Creating Distributions"
sidebar_position: 8
---

import TOCInline from "@theme/TOCInline";
import { SquiggleEditor } from "../../src/components/SquiggleEditor";
import Admonition from "@theme/Admonition";
import Tabs from "@theme/Tabs";
import TabItem from "@theme/TabItem";

<TOCInline toc={toc} maxHeadingLevel={2} />

## To

`(5thPercentile: float) to (95thPercentile: float)`
`to(5thPercentile: float, 95thPercentile: float)`

The `to` function is an easy way to generate simple distributions using predicted _5th_ and _95th_ percentiles.

If both values are above zero, a `lognormal` distribution is used. If not, a `normal` distribution is used.
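
The arithmetic behind this rule can be sketched in Python. This is an illustration only: it assumes the distribution is fitted so that its 5th and 95th percentiles land exactly on the two inputs. `to_params` is a hypothetical helper, not part of Squiggle, and Squiggle's own implementation may differ.

```python
from math import log
from statistics import NormalDist
from typing import Tuple

# z-score of the 95th percentile of a standard normal (about 1.645).
Z95 = NormalDist().inv_cdf(0.95)


def to_params(low: float, high: float) -> Tuple[str, float, float]:
    """Hypothetical helper: fit a distribution to 5th/95th percentiles.

    Returns (family, a, b), where (a, b) is (mu, sigma) for a lognormal
    and (mean, standard deviation) for a normal.
    """
    if not low < high:
        raise ValueError("the 5th percentile must be below the 95th")
    if low > 0:
        # Both bounds are positive: fit a normal to the logs of the bounds.
        mu = (log(low) + log(high)) / 2
        sigma = (log(high) - log(low)) / (2 * Z95)
        return ("lognormal", mu, sigma)
    # Otherwise fit a normal directly to the bounds.
    mean = (low + high) / 2
    stdev = (high - low) / (2 * Z95)
    return ("normal", mean, stdev)


print(to_params(5, 10))
print(to_params(-5, 5))
```

The 90% interval spans `2 * Z95` standard deviations, which is why the width of the interval (in log space for the lognormal case) is divided by that factor.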

<Tabs>
  <TabItem value="ex1" label="5 to 10" default>
    When `5 to 10` is entered, both numbers are positive, so it generates a
    lognormal distribution with 5th and 95th percentiles at 5 and 10.
    <SquiggleEditor initialSquiggleString="5 to 10" />
  </TabItem>
  <TabItem value="ex3" label="to(5,10)">
    `5 to 10` does the same thing as `to(5,10)`.
    <SquiggleEditor initialSquiggleString="to(5,10)" />
  </TabItem>
  <TabItem value="ex2" label="-5 to 5">
    When `-5 to 5` is entered, one of the values is negative, so it generates a
    normal distribution with 5th and 95th percentiles at -5 and 5.
    <SquiggleEditor initialSquiggleString="-5 to 5" />
  </TabItem>
  <TabItem value="ex4" label="1 to 10000">
    It's very easy to generate distributions with very long tails. If this
    happens, you can click the "log x scale" box to view this using a log scale.
    <SquiggleEditor initialSquiggleString="1 to 10000" />
  </TabItem>
</Tabs>

### Arguments

- `5thPercentile`: Float
- `95thPercentile`: Float, greater than `5thPercentile`

<Admonition type="tip" title="Tip">
  <p>
    "<b>To</b>" is a great way to generate probability distributions very
    quickly from your intuitions. It's easy to write and easy to read. It's
    often a good place to begin an estimate.
  </p>
</Admonition>

<Admonition type="caution" title="Caution">
  <p>
    If you haven't tried{" "}
    <a href="https://www.lesswrong.com/posts/LdFbx9oqtKAAwtKF3/list-of-probability-calibration-exercises">
      calibration training
    </a>
    , you're likely to be overconfident. We recommend doing calibration training
    to get a feel for what a 90 percent confidence interval feels like.
  </p>
</Admonition>

## Mixture

`mixture(...distributions: Distribution[], weights?: float[])`
`mx(...distributions: Distribution[], weights?: float[])`

The `mixture` function combines multiple distributions into a single mixture distribution. You can optionally pass in a list of proportional weights.

<Tabs>
  <TabItem value="ex1" label="Simple" default>
    <SquiggleEditor initialSquiggleString="mixture(1 to 2, 5 to 8, 9 to 10)" />
  </TabItem>
  <TabItem value="ex2" label="With Weights">
    <SquiggleEditor initialSquiggleString="mixture(1 to 2, 5 to 8, 9 to 10, [0.1, 0.1, 0.8])" />
  </TabItem>
  <TabItem value="ex3" label="With Continuous and Discrete Inputs">
    <SquiggleEditor initialSquiggleString="mixture(1 to 5, 8 to 10, 1, 3, 20)" />
  </TabItem>
</Tabs>

### Arguments

- `distributions`: A set of distributions or floats, each passed as a parameter. Floats will be converted into Delta distributions.
- `weights`: An optional array of floats, each representing the weight of its corresponding distribution. The weights will be re-scaled to add to `1.0`. If a weights array is provided, it must be the same length as the distribution parameters.
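
How the weights and plain floats behave can be sketched in Python. This is a minimal illustration, not Squiggle's implementation: `normalize_weights` and `sample_mixture` are hypothetical helpers, each distribution is represented by a zero-argument sampling function, and equal weights are assumed when none are given.

```python
import random
from typing import Callable, List, Optional, Sequence, Union

Component = Union[float, Callable[[], float]]


def normalize_weights(weights: Sequence[float]) -> List[float]:
    """Rescale proportional weights so they add up to 1.0."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def sample_mixture(
    components: Sequence[Component],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Draw one sample from a mixture of samplers and point values."""
    if weights is None:
        # Assumption: components are weighted equally by default.
        weights = [1.0] * len(components)
    if len(weights) != len(components):
        raise ValueError("weights must match the number of components")
    probs = normalize_weights(weights)
    # Pick one component in proportion to its weight, then sample from it.
    chosen = random.choices(components, weights=probs, k=1)[0]
    # A plain number acts as a point mass (delta distribution) at that value.
    return chosen() if callable(chosen) else float(chosen)


print(normalize_weights([1, 1, 8]))
print(sample_mixture([lambda: random.gauss(5, 1), 0], [0.8, 0.2]))
```

Because the weights are proportional, `[1, 1, 8]` and `[0.1, 0.1, 0.8]` describe the same mixture.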

### Aliases

- `mx`

### Special Use Cases of Mixtures

<details>
  <summary>🕐 Zero or Continuous</summary>
  <p>
    One common reason to have mixtures of continuous and discrete distributions is to handle the special case of 0.
    Say I want to model the time I will spend on some upcoming assignment. I think I have an 80% chance of doing it.
  </p>

  <p>
    In this case, I have a 20% chance of spending 0 time on it. I might estimate my hours with:
  </p>
  <SquiggleEditor
    initialSquiggleString={`hours_the_project_will_take = 5 to 20
chance_of_doing_anything = 0.8
mx(hours_the_project_will_take, 0, [chance_of_doing_anything, 1 - chance_of_doing_anything])`}
  />
</details>

<details>
  <summary>🔒 Model Uncertainty Safeguarding</summary>
  <p>
  One technique several <a href="https://www.foretold.io/">Foretold.io</a> users used is to combine their main guess with a
  "just-in-case distribution". This latter distribution would have very low weight, but would be
  very wide, just in case they were dramatically off for some weird reason.
  </p>
<SquiggleEditor
  initialSquiggleString={`forecast = 3 to 30
chance_completely_wrong = 0.05
forecast_if_completely_wrong = -100 to 200
mx(forecast, forecast_if_completely_wrong, [1-chance_completely_wrong, chance_completely_wrong])`}
/>

</details>

## Normal

`normal(mean:float, standardDeviation:float)`

Creates a [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution) with the given mean and standard deviation.

<Tabs>
  <TabItem value="ex1" label="normal(5,1)" default>
    <SquiggleEditor initialSquiggleString="normal(5, 1)" />
  </TabItem>
  <TabItem value="ex2" label="normal(100000000000, 100000000000)">
    <SquiggleEditor initialSquiggleString="normal(100000000000, 100000000000)" />
  </TabItem>
</Tabs>

### Arguments

- `mean`: Float
- `standardDeviation`: Float greater than zero

[Wikipedia](https://en.wikipedia.org/wiki/Normal_distribution)

## Log-normal

`lognormal(mu: float, sigma: float)`

Creates a [log-normal distribution](https://en.wikipedia.org/wiki/Log-normal_distribution) with the given mu and sigma.

<SquiggleEditor initialSquiggleString="lognormal(0, 0.7)" />

### Arguments

- `mu`: Float
- `sigma`: Float greater than zero

[Wikipedia](https://en.wikipedia.org/wiki/Log-normal_distribution)

### Argument Alternatives

`Mu` and `sigma` can be difficult to reason about directly. Because of this, we typically recommend using the <a href="#to">to</a> syntax instead.

<details>
  <summary>❓ Understanding <b>mu</b> and <b>sigma</b></summary>
  <p>
    The log of `lognormal(mu, sigma)` is a normal distribution with mean `mu` and standard deviation `sigma`. For example, these two distributions are identical:
  </p>
<SquiggleEditor
  initialSquiggleString={`normalMean = 10
normalStdDev = 2
logOfLognormal = log(lognormal(normalMean, normalStdDev))
[logOfLognormal, normal(normalMean, normalStdDev)]`}
/>
</details>
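
For readers who want the numbers behind `mu` and `sigma`, here is a short Python sketch using only standard log-normal identities. `lognormal_summary` is a hypothetical helper for illustration, not a Squiggle function, and the sampling check relies on Python's own `random.lognormvariate` rather than on Squiggle.

```python
import random
from math import exp, log
from statistics import NormalDist, fmean, stdev

# z-score of the 95th percentile of a standard normal (about 1.645).
Z95 = NormalDist().inv_cdf(0.95)


def lognormal_summary(mu: float, sigma: float) -> dict:
    """Median, mean, and 5th/95th percentiles of lognormal(mu, sigma)."""
    return {
        "median": exp(mu),
        "mean": exp(mu + sigma**2 / 2),
        "p5": exp(mu - Z95 * sigma),
        "p95": exp(mu + Z95 * sigma),
    }


print(lognormal_summary(0, 0.7))

# Empirical check: the logs of lognormal(10, 2) samples should have
# mean close to 10 and standard deviation close to 2.
rng = random.Random(0)
logs = [log(rng.lognormvariate(10, 2)) for _ in range(20_000)]
print(fmean(logs), stdev(logs))
```

Note that the mean sits above the median because of the right skew, which is one reason `mu` and `sigma` are hard to pick by intuition.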

## Uniform

`uniform(low:float, high:float)`

Creates a [uniform distribution](https://en.wikipedia.org/wiki/Uniform_distribution_(continuous)) with the given low and high values.

<SquiggleEditor initialSquiggleString="uniform(3,7)" />

### Arguments

- `low`: Float
- `high`: Float greater than `low`

<Admonition type="caution" title="Caution">
  <p>
    While uniform distributions are very simple to understand, we find it rare to find uncertainties that actually look like this. Before using a uniform distribution, think hard about whether you are really 100% confident that the parameter will not wind up being just outside the stated boundaries.
  </p>

  <p>
    One good example of a uniform distribution uncertainty would be clear physical limitations. You might have complete uncertainty about what time of day an event will occur, but can say with 100% confidence it will happen between the hours of 0:00 and 24:00.
  </p>
</Admonition>

## Beta

`beta(alpha:float, beta:float)`

Creates a [beta distribution](https://en.wikipedia.org/wiki/Beta_distribution) with the given `alpha` and `beta` values. For a good summary of the beta distribution, see [this explanation](https://stats.stackexchange.com/a/47782) on Cross Validated (Stack Exchange).

<Tabs>
  <TabItem value="ex1" label="beta(10, 20)" default>
    <SquiggleEditor initialSquiggleString="beta(10,20)" />
  </TabItem>
  <TabItem value="ex2" label="beta(1000, 2000)" >
    <SquiggleEditor initialSquiggleString="beta(1000, 2000)" />
  </TabItem>
  <TabItem value="ex3" label="beta(1, 10)" >
    <SquiggleEditor initialSquiggleString="beta(1, 10)" />
  </TabItem>
  <TabItem value="ex4" label="beta(10, 1)" >
    <SquiggleEditor initialSquiggleString="beta(10, 1)" />
  </TabItem>
  <TabItem value="ex5" label="beta(0.8, 0.8)" >
    <SquiggleEditor initialSquiggleString="beta(0.8, 0.8)" />
  </TabItem>
</Tabs>

### Arguments

- `alpha`: Float greater than zero
- `beta`: Float greater than zero

<Admonition type="caution" title="Caution with small numbers">
  <p>
    Squiggle struggles to show beta distributions when either alpha or beta is below 1.0. This is because the density becomes very high near 0.0 or 1.0. Using a log scale for the y-axis helps here.
  </p>
<details>
  <summary>Examples</summary>
<Tabs>
  <TabItem value="ex1" label="beta(0.3, 0.3)" default>
    <SquiggleEditor initialSquiggleString="beta(0.3, 0.3)" />
  </TabItem>
  <TabItem value="ex2" label="beta(0.5, 0.5)">
    <SquiggleEditor initialSquiggleString="beta(0.5, 0.5)" />
  </TabItem>
  <TabItem value="ex3" label="beta(0.8, 0.8)">
    <SquiggleEditor initialSquiggleString="beta(.8,.8)" />
  </TabItem>
  <TabItem value="ex4" label="beta(0.9, 0.9)">
    <SquiggleEditor initialSquiggleString="beta(.9,.9)" />
  </TabItem>
</Tabs>
</details>
</Admonition>
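
To see why small parameters are hard to plot, the beta density can be evaluated directly. The Python sketch below uses the standard beta density formula; `beta_pdf` is a hypothetical helper for illustration, not a Squiggle function.

```python
from math import exp, lgamma, log


def beta_pdf(x: float, alpha: float, beta: float) -> float:
    """Density of beta(alpha, beta) at x, for 0 < x < 1."""
    if not 0 < x < 1:
        raise ValueError("x must be strictly between 0 and 1")
    # Work in log space to avoid overflow for large alpha or beta.
    log_norm = lgamma(alpha + beta) - lgamma(alpha) - lgamma(beta)
    return exp(log_norm + (alpha - 1) * log(x) + (beta - 1) * log(1 - x))


# With alpha = beta = 0.3 the density keeps growing as x approaches 0.
for x in (0.5, 1e-2, 1e-4, 1e-6):
    print(x, beta_pdf(x, 0.3, 0.3))
```

When `alpha` is below 1, the exponent `alpha - 1` is negative, so the density grows without bound as `x` approaches 0. The same happens near 1 when `beta` is below 1.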

## Exponential

`exponential(rate:float)`

Creates an [exponential distribution](https://en.wikipedia.org/wiki/Exponential_distribution) with the given rate.

<SquiggleEditor initialSquiggleString="exponential(4)" />

### Arguments

- `rate`: Float greater than zero

## Triangular distribution

`triangular(low:float, mode:float, high:float)`

Creates a [triangular distribution](https://en.wikipedia.org/wiki/Triangular_distribution) with the given low, mode, and high values.

### Arguments

- `low`: Float
- `mode`: Float greater than `low`
- `high`: Float greater than `mode`

<SquiggleEditor initialSquiggleString="triangular(1, 2, 4)" />

## FromSamples

Creates a sample set distribution using an array of samples.

<SquiggleEditor initialSquiggleString="fromSamples([1,2,3,4,6,5,5,5])" />

### Validity

For `fromSamples(xs)`:

- `xs.length > 5`
- Every element of `xs` must be a number.
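
The two conditions above could be checked like this in Python. `validate_samples` is a hypothetical helper written for illustration, not part of Squiggle, and it assumes booleans should not count as numbers.

```python
from numbers import Real
from typing import List, Sequence


def validate_samples(xs: Sequence[object]) -> List[float]:
    """Return xs as floats if it satisfies the validity conditions."""
    if len(xs) <= 5:
        raise ValueError("need more than 5 samples")
    for x in xs:
        # Assumption: bool is rejected even though Python treats it as an int.
        if isinstance(x, bool) or not isinstance(x, Real):
            raise TypeError(f"not a number: {x!r}")
    return [float(x) for x in xs]


print(validate_samples([1, 2, 3, 4, 6, 5, 5, 5]))
```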