# A calculator for distributions, for Fermi estimation
This project is a minimalist, calculator-style DSL for Fermi estimation. It can multiply, divide, add and subtract scalars, lognormals and beta distributions, and it supports variables.
## Motivation
Sometimes, [Squiggle](https://github.com/quantified-uncertainty/squiggle), [simple squiggle](https://git.nunosempere.com/quantified.uncertainty/simple-squiggle) or [squiggle.c](https://git.nunosempere.com/personal/squiggle.c) are still too complicated and un-unix-like. In particular, their startup is not instant.
## Installation
```
make build
sudo make install
fermi
```
## Usage
```
$ fermi
5000000 12000000
=> 5.0M 12.0M
* beta 1 200
=> 1.9K 123.1K
* 30 180
=> 122.9K 11.7M
/ 48 52
=> 2.5K 234.6K
/ 5 6
=> 448.8 43.0K
/ 6 8
=> 64.5 6.2K
/ 60
=> 1.1 103.7
```
Perhaps this example is more understandable with comments and better units:
```
$ fermi
5M 12M        # number of people living in Chicago
beta 1 200    # fraction of people that have a piano
30 180        # minutes it takes to tune a piano, including travel time
/ 48 52       # weeks a year that piano tuners work for
/ 5 6         # days a week in which piano tuners work
/ 6 8         # hours a day in which piano tuners work
/ 60          # minutes to an hour
=: piano_tuners
```
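For intuition about what happens when two lognormals are multiplied, here is a minimal Go sketch. It assumes that a `low high` pair is read as the 5th and 95th percentiles of a lognormal (a 90% confidence interval), which this README doesn't state explicitly. The names `Lognormal`, `FromInterval` and `Multiply` are hypothetical and are not taken from f.go:

```go
package main

import (
	"fmt"
	"math"
)

// z90 is the z-score that bounds a two-sided 90% interval of a standard normal.
const z90 = 1.6448536269514722

// Lognormal is described by the mean and standard deviation of its logarithm.
type Lognormal struct {
	Mu, Sigma float64
}

// FromInterval builds a lognormal whose 5th and 95th percentiles are low and high.
// Assumption: "low high" denotes a 90% confidence interval.
func FromInterval(low, high float64) Lognormal {
	logLow, logHigh := math.Log(low), math.Log(high)
	return Lognormal{
		Mu:    (logLow + logHigh) / 2,
		Sigma: (logHigh - logLow) / (2 * z90),
	}
}

// Interval returns the 5th and 95th percentiles of the lognormal.
func (l Lognormal) Interval() (float64, float64) {
	return math.Exp(l.Mu - z90*l.Sigma), math.Exp(l.Mu + z90*l.Sigma)
}

// Multiply returns the product of two independent lognormals:
// the log-means add, and so do the log-variances.
func Multiply(a, b Lognormal) Lognormal {
	return Lognormal{
		Mu:    a.Mu + b.Mu,
		Sigma: math.Sqrt(a.Sigma*a.Sigma + b.Sigma*b.Sigma),
	}
}

func main() {
	product := Multiply(FromInterval(1, 10), FromInterval(1, 10))
	low, high := product.Interval()
	fmt.Printf("=> %.2f %.2f\n", low, high)
}
```

Because the product of independent lognormals is itself lognormal, this case needs no sampling. Division works the same way, with the log-means subtracted instead of added.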
If you type `help` (or run `fermi -h`), you can see a small grammar and some optional command flags:
```
$ fermi
1. Grammar:
Operation | Variable assignment | Special
Operation: operator operand
operator: (empty) | * | / | + | -
operand: scalar | lognormal | beta | variable
lognormal: low high
beta: beta alpha beta
Variable assignment: =: variable_name
Variable assignment and clear stack: =. variable_name
Suffixes: %, K, M, B, T
Special commands:
Comment: # this is a comment
Summary stats: stats
Clear stack: clear | c | .
Print debug info: debug | d
Print help message: help | h
Start additional stack: operator (
Return from additional stack )
Exit: exit | e
Examples:
+ 2
/ 2.5
* 1 10 (interpreted as lognormal)
+ 1 10
* beta 1 10
1 10 (multiplication taken as default operation)
=: x
.
1 100
+ x
# this is a comment
* 1 12 # this is an operation followed by a comment
* (
1 10
+ beta 1 100
)
exit
Command flags:
-echo
Specifies whether inputs should be echoed back. Useful if reading from a file.
-f string
Specifies a file with a model to run. Sets the echo command to true by default.
-n int
Specifies the number of samples to draw when using samples (default 100000)
-h Shows help message
```
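The K/M/B/T suffixes that appear in the grammar and in the output can be handled with a small lookup table. The following Go sketch is an illustration, not the code in f.go: `ParseNumber` and `FormatNumber` are hypothetical names, and treating `%` as "divide by 100" is an assumption:

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// multipliers maps each suffix to the factor it stands for.
// Assumption: "%" means "divide by 100".
var multipliers = map[string]float64{
	"%": 0.01, "K": 1e3, "M": 1e6, "B": 1e9, "T": 1e12,
}

// ParseNumber turns inputs such as "5M" or "2.5K" into a float64.
func ParseNumber(s string) (float64, error) {
	multiplier := 1.0
	if n := len(s); n > 0 {
		// Strip the last character if it is a known suffix.
		if m, ok := multipliers[s[n-1:]]; ok {
			multiplier = m
			s = s[:n-1]
		}
	}
	value, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, err
	}
	return value * multiplier, nil
}

// FormatNumber prints one decimal place and the largest suffix that fits.
func FormatNumber(x float64) string {
	abs := math.Abs(x)
	switch {
	case abs >= 1e12:
		return fmt.Sprintf("%.1fT", x/1e12)
	case abs >= 1e9:
		return fmt.Sprintf("%.1fB", x/1e9)
	case abs >= 1e6:
		return fmt.Sprintf("%.1fM", x/1e6)
	case abs >= 1e3:
		return fmt.Sprintf("%.1fK", x/1e3)
	default:
		return fmt.Sprintf("%.1f", x)
	}
}

func main() {
	for _, input := range []string{"5M", "12M", "2.5K", "60"} {
		value, err := ParseNumber(input)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(input, "=>", FormatNumber(value))
	}
}
```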
You can see real-life examples [here](https://x.com/NunoSempere/status/1831106442721452312), [here](https://x.com/NunoSempere/status/1829525844169248912), [here](https://x.com/NunoSempere/status/1818810770932568308), [here](https://x.com/NunoSempere/status/1816605190415401100), [here](https://x.com/NunoSempere/status/1816604386703081894) and [here](https://x.com/NunoSempere/status/1815169781907042504).
## Tips & tricks
- It's conceptually clearer to have all the multiplications first and then all the divisions
- For things between 0 and 1, consider using a beta distribution
### Command line options
You can specify the number of samples to draw when algebraic manipulations are not sufficient:
```
$ fermi -n 1000000
$ fermi -n 1_000_000
```
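When an operation has no closed form (the sum of two lognormals, for instance, is not itself a lognormal), one option is to fall back to samples, which is where the number of samples matters. The following Go sketch illustrates the idea. It assumes that a `low high` pair is a 90% confidence interval, and `sampleLognormal` and `SumInterval` are hypothetical names, not fermi's own functions:

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// z90 is the z-score that bounds a two-sided 90% interval of a standard normal.
const z90 = 1.6448536269514722

// sampleLognormal draws one sample from the lognormal whose 5th and 95th
// percentiles are low and high (assumption: "low high" is a 90% interval).
func sampleLognormal(low, high float64, rng *rand.Rand) float64 {
	mu := (math.Log(low) + math.Log(high)) / 2
	sigma := (math.Log(high) - math.Log(low)) / (2 * z90)
	return math.Exp(mu + sigma*rng.NormFloat64())
}

// SumInterval estimates the 5th and 95th percentiles of the sum of two
// independent "1 10" lognormals from n samples (n must be at least 1).
func SumInterval(n int, seed int64) (float64, float64) {
	rng := rand.New(rand.NewSource(seed))
	samples := make([]float64, n)
	for i := range samples {
		samples[i] = sampleLognormal(1, 10, rng) + sampleLognormal(1, 10, rng)
	}
	sort.Float64s(samples)
	return samples[n*5/100], samples[n*95/100]
}

func main() {
	low, high := SumInterval(100000, 1)
	fmt.Printf("=> %.1f %.1f\n", low, high)
}
```

More samples give more stable percentiles, at the cost of more time and memory.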
You can also run a file with the `-f` option:
```
$ fermi -f more/piano-tuners.fermi
```
### Integrations with Linux utilities
Because fermi reads from standard input, you can pipe a model into it:
```
$ cat more/piano-tuners.fermi | fermi
```
In that case, you will probably want to use the `-echo` flag as well:
```
$ cat more/piano-tuners-commented.fermi | fermi -echo
```
You can make a model an executable file by running `chmod +x model.fermi` and then adding the following line at the top (adjust the path if `fermi` is installed elsewhere; `which fermi` shows where it is):
```
#!/usr/bin/fermi -f
```
You can save a session to a logfile with tee:
```
fermi | tee -a fermi.log
```
## Different levels of complexity
The top level f.go file (420 lines) has a bunch of complexity: variables, parentheses, samples, beta distributions, number of samples, etc. In the simple/ folder:
- f_simple.go (370 lines) strips variables and parentheses, but keeps beta distributions, samples, and addition and subtraction
- f_minimal.go (140 lines) strips everything that isn't lognormal and scalar multiplication and addition, plus a few debug options.
## Roadmap
Done:
- [x] Write README
- [x] Add division?
- [x] Read from file?
- [x] Save to file?
- [x] Allow comments?
- [x] Use a sed filter?
- [x] Add proper comment processing
- [x] Add show more info version
- [x] Scalar multiplication and division
- [x] Think how to integrate with squiggle.c to draw samples
- [x] Copy the time to botec go code
- [x] Define samplers
- [x] Call those samplers when operating on distributions that can't be operated on algebraically
- [x] Display output more nicely, with K/M/B/T
- [x] Consider the following: make this into a stack-based DSL, with:
  - [x] Variables that can be saved to and then displayed
  - [x] Other types of distributions, particularly beta distributions? => But then this requires moving to bags of samples. It could still be ~instantaneous though.
- [x] Added bags of samples to support addition and multiplication of betas and lognormals
- [x] Figure out Go syntax for
  - Maps
  - Joint types
  - Enums
- [x] Fix correlation problem, by spinning up a new randomness thing every time some serial computation is done.
- [x] Clean up error code. Right now only needed for division
- [x] Maintain *both* a more complex thing that's more featureful *and* the more simple multiplication of lognormals thing.
- [x] Allow input with K/M/T
- [x] Document parenthesis syntax
- [x] Specify number of samples as a command line option
- [x] Figure out how to make models executable, by adding a #!/bin/bash-style command at the top?
- [x] Make -n flag work
- [x] Add flag to repeat input lines (useful when reading from files)
- [x] Add percentages
- [x] Consider adding an understanding of percentages

To (possibly) do:

- [ ] Consider implications of sampling strategy for operating variables in this case.
- [ ] Document mixture distributions
- [ ] Fix lognormal multiplication and division by 0 or < 0
- [ ] With the -f command line option, the program doesn't read from stdin after finishing reading the file
- [ ] Add functions. Now easier to do with an explicit representation of the stack
- [ ] Think about how to draw a histogram from samples
- [ ] Dump samples to file
- [ ] Represent samples/statistics in some other way
- [ ] Perhaps use qsort rather than full sorting
- [ ] Program into a small device, like a calculator?
- [ ] Units?

Discarded:

- [ ] ~~Think of some way of calling bc~~