tweak: some tweaks to documentation, part 1/2

NunoSempere 2022-05-03 17:22:08 -04:00
parent 94a1155264
commit 526ee921b5
5 changed files with 47 additions and 27 deletions

View File

@ -11,7 +11,7 @@ _Symbolic_ formats are just the math equations. `normal(5,3)` is the symbolic re
When you sample distributions (usually starting with symbolic formats), you get lists of samples. Monte Carlo techniques return lists of samples. Let's call this the “_Sample Set_” format. When you sample distributions (usually starting with symbolic formats), you get lists of samples. Monte Carlo techniques return lists of samples. Let's call this the “_Sample Set_” format.
Lastly is what I'll refer to as the _Graph_ format. It describes the coordinates, or the shape, of the distribution. You can save these formats in JSON, for instance, like, `{xs: [1, 2, 3, 4…], ys: [.0001, .0003, .002, …]}`. Lastly is what I'll refer to as the _Graph_ format. It describes the coordinates, or the shape, of the distribution. You can save these formats in JSON, for instance, like, `{xs: [1, 2, 3, 4, …], ys: [.0001, .0003, .002, …]}`.
Symbolic, Sample Set, and Graph formats all have very different advantages and disadvantages. Symbolic, Sample Set, and Graph formats all have very different advantages and disadvantages.
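To make the three formats concrete, here is a minimal Python sketch of `normal(5,3)` in each of them. It is an illustration only and not how Squiggle stores distributions; the variable names, the seed, and the grid spacing are assumptions chosen for the example.

```python
import random
from statistics import NormalDist

# Symbolic: just the name and parameters of the distribution.
symbolic = ("normal", 5, 3)

# Sample Set: a list of random draws (seeded so the run is repeatable).
rng = random.Random(0)
sample_set = [rng.gauss(5, 3) for _ in range(1000)]

# Graph: x coordinates plus the pdf evaluated at each of them.
dist = NormalDist(mu=5, sigma=3)
xs = [i * 0.5 for i in range(-8, 29)]  # -4.0 .. 14.0, i.e. mean +/- 3 sigma
graph = {"xs": xs, "ys": [dist.pdf(x) for x in xs]}

print(symbolic, len(sample_set), len(graph["xs"]))
```

Note how the graph version stops at the edges of the grid, so anything outside that range is simply not represented.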
@ -19,7 +19,7 @@ Note that the name "Symbolic" is fairly standard, but I haven't found common nam
## Symbolic Formats ## Symbolic Formats
**TLDR** **TL;DR**
Mathematical representations. Require analytic solutions. These are often ideal where they can be applied, but apply to very few actual functions. Typically used sparsely, except for the starting distributions (before any computation is performed). Mathematical representations. Require analytic solutions. These are often ideal where they can be applied, but apply to very few actual functions. Typically used sparsely, except for the starting distributions (before any computation is performed).
**Examples** **Examples**
@ -29,9 +29,6 @@ Mathematical representations. Require analytic solutions. These are often ideal
**How to Do Computation** **How to Do Computation**
To perform calculations of symbolic systems, you need to find analytical solutions. For example, there are equations to find the pdf or cdf of most distribution shapes at any point. There are also lots of simplifications that could be done in particular situations. For example, there's an analytical solution for combining normal distributions. To perform calculations of symbolic systems, you need to find analytical solutions. For example, there are equations to find the pdf or cdf of most distribution shapes at any point. There are also lots of simplifications that could be done in particular situations. For example, there's an analytical solution for combining normal distributions.
**Special: The Metalog Distribution**
The Metalog distribution seems like it can represent almost any reasonable distribution. It's symbolic. This is great for storage, but it's not clear if it helps with calculation. My impression is that we don't have symbolic ways of doing most functions (addition, multiplication, etc.) on metalog distributions. Also, note that it can take a fair bit of computation to fit a shape to the Metalog distribution.
**Advantages** **Advantages**
- Maximally compressed; i.e. very easy to store. - Maximally compressed; i.e. very easy to store.
@ -54,10 +51,14 @@ The Metalog distribution seems like it can represent almost any reasonable distr
**How to Visualize** **How to Visualize**
Convert to graph, then display that. (Optionally, you can also convert to samples, then display those using a histogram, but this is often worse if you have both options.) Convert to graph, then display that. (Optionally, you can also convert to samples, then display those using a histogram, but this is often worse if you have both options.)
**Bonus: The Metalog Distribution**
The Metalog distribution seems like it can represent almost any reasonable distribution. It's symbolic. This is great for storage, but it's not clear if it helps with calculation. My impression is that we don't have symbolic ways of doing most functions (addition, multiplication, etc.) on metalog distributions. Also, note that it can take a fair bit of computation to fit a shape to the Metalog distribution.
## Graph Formats ## Graph Formats
**TLDR** **TL;DR**
Lists of the x-y coordinates of the shape of a distribution. (Usually the pdf, which is more compressed than the cdf). Some key functions (like pdf, cdf) and manipulations can work on almost any graphally-described distribution. Lists of the x-y coordinates of the shape of a distribution. (Usually the pdf, which is more compressed than the cdf). Some key functions (like pdf, cdf) and manipulations can work on almost any graphically-described distribution.
**Alternative Names:** **Alternative Names:**
Grid, Mesh, Graph, Vector, Pdf, PdfCoords/PdfPoints, Discretised, Bezier, Curve Grid, Mesh, Graph, Vector, Pdf, PdfCoords/PdfPoints, Discretised, Bezier, Curve
@ -77,7 +78,7 @@ Use graph techniques. These can be fairly computationally-intensive (particularl
**Disadvantages** **Disadvantages**
- Most calculations are infeasible/impossible to perform graphally. In these cases, you need to use sampling. - Most calculations are infeasible/impossible to perform graphically. In these cases, you need to use sampling.
- Not as accurate or fast as symbolic methods, where the symbolic methods are applicable. - Not as accurate or fast as symbolic methods, where the symbolic methods are applicable.
- The tails get cut off, which is subideal. It's assumed that the value of the pdf outside of the bounded range is exactly 0, which is not correct. (Note: If you have ideas on how to store graph formats that don't cut off tails, let me know) - The tails get cut off, which is subideal. It's assumed that the value of the pdf outside of the bounded range is exactly 0, which is not correct. (Note: If you have ideas on how to store graph formats that don't cut off tails, let me know)
@ -108,7 +109,7 @@ Use graph techniques. These can be fairly computationally-intensive (particularl
## Sample Set Formats ## Sample Set Formats
**TLDR** **TL;DR**
Random samples. Use Monte Carlo simulation to perform calculations. This is the predominant technique using Monte Carlo methods; in these cases, most nodes are essentially represented as sample sets. [Guesstimate](https://www.getguesstimate.com/) works this way. Random samples. Use Monte Carlo simulation to perform calculations. This is the predominant technique using Monte Carlo methods; in these cases, most nodes are essentially represented as sample sets. [Guesstimate](https://www.getguesstimate.com/) works this way.
**How to Do Computation** **How to Do Computation**

View File

@ -159,7 +159,9 @@ Creates a [normal distribution](https://en.wikipedia.org/wiki/Normal_distributio
Creates a [log-normal distribution](https://en.wikipedia.org/wiki/Log-normal_distribution) with the given mu and sigma. Creates a [log-normal distribution](https://en.wikipedia.org/wiki/Log-normal_distribution) with the given mu and sigma.
`Mu` and `sigma` can be difficult to directly reason about. Because of this complexity, we recommend typically using the <a href="#to">to</a> syntax instead of estimating `mu` and `sigma` directly. `Mu` and `sigma` represent the mean and standard deviation of the normal which results when
you take the log of our lognormal distribution. They can be difficult to directly reason about.
Because of this complexity, we recommend typically using the <a href="#to">to</a> syntax instead of estimating `mu` and `sigma` directly.
<SquiggleEditor initialSquiggleString="lognormal(0, 0.7)" /> <SquiggleEditor initialSquiggleString="lognormal(0, 0.7)" />
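The relationship between `mu`, `sigma` and the underlying normal can be checked numerically. The sketch below uses Python's standard library rather than Squiggle; the seed, sample count, and variable names are assumptions made for the example.

```python
import math
import random
from statistics import mean, stdev

rng = random.Random(42)
mu, sigma = 0.0, 0.7

# Draw from a lognormal, then take logs to recover the underlying normal.
samples = [rng.lognormvariate(mu, sigma) for _ in range(20_000)]
logs = [math.log(s) for s in samples]

log_mean = mean(logs)   # should land close to mu
log_sd = stdev(logs)    # should land close to sigma
print(round(log_mean, 2), round(log_sd, 2))
```

The mean and standard deviation of the logged samples approximate `mu` and `sigma`, which is why those parameters describe the log of the quantity rather than the quantity itself.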

View File

@ -11,7 +11,9 @@ Here are the ways we combine distributions.
### Addition ### Addition
A horizontal right shift A horizontal right shift. The addition operation represents the distribution of the sum of
the value of one random sample chosen from the first distribution and the value of one random sample
chosen from the second distribution.
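The sample-based reading of addition can be sketched in Python as follows. This is an illustration of the semantics, not Squiggle's implementation; the two normal distributions, the seed, and the sample count are assumptions picked so the result is easy to check against the known closed form for independent normals (means add, variances add).

```python
import random
from statistics import mean, pstdev

rng = random.Random(1)
n = 50_000

# One independent draw from each distribution per position.
first = [rng.gauss(1, 1) for _ in range(n)]
second = [rng.gauss(2, 2) for _ in range(n)]

# The sum distribution is the distribution of these pairwise sums.
total = [a + b for a, b in zip(first, second)]

sum_mean = mean(total)   # theory: 1 + 2 = 3
sum_sd = pstdev(total)   # theory: sqrt(1**2 + 2**2), about 2.236
print(round(sum_mean, 2), round(sum_sd, 2))
```

The same pairing idea carries over to subtraction, multiplication, and division by swapping the `+` for the corresponding operator.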
<SquiggleEditor <SquiggleEditor
initialSquiggleString={`dist1 = 1 to 10 initialSquiggleString={`dist1 = 1 to 10
@ -21,7 +23,9 @@ dist1 + dist2`}
### Subtraction ### Subtraction
A horizontal left shift A horizontal left shift. The subtraction operation represents
the distribution of the value of one random sample chosen from the first distribution minus
the value of one random sample chosen from the second distribution.
<SquiggleEditor <SquiggleEditor
initialSquiggleString={`dist1 = 1 to 10 initialSquiggleString={`dist1 = 1 to 10
@ -31,7 +35,9 @@ dist1 - dist2`}
### Multiplication ### Multiplication
TODO: provide intuition pump for the semantics A proportional scaling. The multiplication operation represents the distribution of the product of
the value of one random sample chosen from the first distribution and the value of one random sample
chosen from the second distribution.
<SquiggleEditor <SquiggleEditor
initialSquiggleString={`dist1 = 1 to 10 initialSquiggleString={`dist1 = 1 to 10
@ -45,7 +51,11 @@ We also provide concatenation of two distributions as a syntax sugar for `*`
### Division ### Division
TODO: provide intuition pump for the semantics A proportional scaling (normally a shrinking if the second distribution has values higher than 1).
The division operation represents the distribution of the quotient of
the value of one random sample chosen from the first distribution over the value of one random sample
chosen from the second distribution. If the second distribution has some values near zero, the result
tends to be particularly unstable.
<SquiggleEditor <SquiggleEditor
initialSquiggleString={`dist1 = 1 to 10 initialSquiggleString={`dist1 = 1 to 10
@ -55,7 +65,9 @@ dist1 / dist2`}
### Exponentiation ### Exponentiation
TODO: provide intuition pump for the semantics A projection over a contracted x-axis. The exponentiation operation represents the distribution of
the exponentiation of the value of one random sample chosen from the first distribution to the power of
the value of one random sample chosen from the second distribution.
<SquiggleEditor initialSquiggleString={`(0.1 to 1) ^ beta(2, 3)`} /> <SquiggleEditor initialSquiggleString={`(0.1 to 1) ^ beta(2, 3)`} />
@ -68,6 +80,8 @@ exp(dist)`}
### Taking logarithms ### Taking logarithms
A projection over a stretched x-axis.
<SquiggleEditor <SquiggleEditor
initialSquiggleString={`dist = triangular(1,2,3) initialSquiggleString={`dist = triangular(1,2,3)
log(dist)`} log(dist)`}
@ -93,6 +107,8 @@ log(dist, x)`}
### Pointwise addition ### Pointwise addition
For every point on the x-axis, apply the operation to the corresponding points on the y-axis of the pdfs.
**Pointwise operations are done with `PointSetDist` internals rather than `SampleSetDist` internals**. **Pointwise operations are done with `PointSetDist` internals rather than `SampleSetDist` internals**.
TODO: this isn't in the new interpreter/parser yet. TODO: this isn't in the new interpreter/parser yet.
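As a rough illustration of what a pointwise operation means, the Python sketch below adds two pdfs evaluated on a shared grid. It does not use or reproduce Squiggle's `PointSetDist` internals; the grid, the two normal distributions, and the variable names are assumptions for the example.

```python
from statistics import NormalDist

# Shared x grid from -5.0 to 5.0 in steps of 0.1.
dx = 0.1
xs = [i * dx for i in range(-50, 51)]

# y-values of each pdf at every grid point.
pdf_a = [NormalDist(-1, 1).pdf(x) for x in xs]
pdf_b = [NormalDist(1, 1).pdf(x) for x in xs]

# Pointwise addition: add the y-values that share an x.
pointwise_sum = [ya + yb for ya, yb in zip(pdf_a, pdf_b)]

# Riemann-sum estimate of the area under the combined curve.
area = sum(pointwise_sum) * dx
print(round(area, 2))
```

The area under the summed curve is about 2 rather than 1, so the result of pointwise addition is not itself a normalized pdf until it is rescaled.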
@ -166,7 +182,8 @@ or all values lower than x. It is the inverse of `inv`.
### Inverse CDF ### Inverse CDF
The `inv(dist, prob)` gives the value x for which the probability for all values The `inv(dist, prob)` gives the value x for which the probability for all values
lower than x is equal to prob. It is the inverse of `cdf`. lower than x is equal to prob. It is the inverse of `cdf`. In the literature, it
is also known as the quantile function.
<SquiggleEditor initialSquiggleString="inv(normal(0,1),0.5)" /> <SquiggleEditor initialSquiggleString="inv(normal(0,1),0.5)" />
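The inverse relationship between `inv` and `cdf` can be checked with Python's standard library, whose `NormalDist.inv_cdf` plays the role of the quantile function here. This is a sketch outside Squiggle; the probability 0.9 is an arbitrary choice for the example.

```python
from statistics import NormalDist

dist = NormalDist(0, 1)

# The quantile at 0.5 is the median, which is 0 for a standard normal.
median = dist.inv_cdf(0.5)

# Applying cdf to the result of inv_cdf recovers the original probability.
x = dist.inv_cdf(0.9)
round_trip = dist.cdf(x)

print(median, round(x, 3), round(round_trip, 3))
```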
@ -201,7 +218,7 @@ Or `PointSet` format
Above, we saw the unary `toSampleSet`, which uses an internal hardcoded number of samples. If you'd like to provide the number of samples, it has a binary signature as well (floored) Above, we saw the unary `toSampleSet`, which uses an internal hardcoded number of samples. If you'd like to provide the number of samples, it has a binary signature as well (floored)
<SquiggleEditor initialSquiggleString="toSampleSet(0.1 to 1, 100.1)" /> <SquiggleEditor initialSquiggleString="[toSampleSet(0.1 to 1, 100.1), toSampleSet(0.1 to 1, 5000), toSampleSet(0.1 to 1, 20000)]" />
#### Validity #### Validity
@ -241,7 +258,7 @@ You can cut off from the left
You can cut off from the right You can cut off from the right
<SquiggleEditor initialSquiggleString="truncateRight(0.1 to 1, 10)" /> <SquiggleEditor initialSquiggleString="truncateRight(0.1 to 1, 0.5)" />
You can cut off from both sides You can cut off from both sides

View File

@ -7,21 +7,21 @@ import { SquiggleEditor } from "../../src/components/SquiggleEditor";
## Expressions ## Expressions
A distribution ### Distributions
<SquiggleEditor initialSquiggleString={`mixture(1 to 2, 3, [0.3, 0.7])`} /> <SquiggleEditor initialSquiggleString={`mixture(1 to 2, 3, [0.3, 0.7])`} />
A number ### Numbers
<SquiggleEditor initialSquiggleString="4.321e-3" /> <SquiggleEditor initialSquiggleString="4.32" />
Arrays ### Arrays
<SquiggleEditor <SquiggleEditor
initialSquiggleString={`[beta(1,10), 4, isNormalized(toSampleSet(1 to 2))]`} initialSquiggleString={`[beta(1,10), 4, isNormalized(toSampleSet(1 to 2))]`}
/> />
Records ### Records
<SquiggleEditor <SquiggleEditor
initialSquiggleString={`d = {dist: triangular(0, 1, 2), weight: 0.25} initialSquiggleString={`d = {dist: triangular(0, 1, 2), weight: 0.25}
@ -42,9 +42,9 @@ A statement assigns expressions to names. It looks like `<symbol> = <expression>
We can define functions We can define functions
<SquiggleEditor <SquiggleEditor
initialSquiggleString={`ozzie_estimate(t) = lognormal(1, t ^ 1.01) initialSquiggleString={`ozzie_estimate(t) = lognormal(t^(1.1), 0.5)
nuño_estimate(t, m) = mixture(0.5 to 2, normal(m, t ^ 1.25)) nuno_estimate(t, m) = mixture(normal(-5, 1), lognormal(m, t / 1.25))
ozzie_estimate(5) * nuño_estimate(5.01, 1)`} ozzie_estimate(1) * nuno_estimate(1, 1)`}
/> />
## See more ## See more

View File

@ -30,7 +30,7 @@ this library to help navigate the return type.
The `@quri/squiggle-components` package offers several components and utilities The `@quri/squiggle-components` package offers several components and utilities
for people who want to embed Squiggle components into websites. This documentation for people who want to embed Squiggle components into websites. This documentation
relies on `@quri/squiggle-components` frequently. uses `@quri/squiggle-components` frequently.
We host [a storybook](https://squiggle-components.netlify.app/) with details We host [a storybook](https://squiggle-components.netlify.app/) with details
and usage of each of the components made available. and usage of each of the components made available.