# A minimalist calculator for Fermi estimation

This project is a minimalist, stack-based DSL for Fermi estimation. It can multiply and divide scalars, lognormals and beta distributions.
## Motivation
Sometimes, [Squiggle](https://github.com/quantified-uncertainty/squiggle), [simple squiggle](https://git.nunosempere.com/quantified.uncertainty/simple-squiggle) or [squiggle.c](https://git.nunosempere.com/personal/squiggle.c) are still too complicated and un-unix-like.
## Usage
Here is an example:
```
$ go run f.go
5000000 12000000
=> 5.0M 12.0M
* beta 1 200
1.9K 123.1K
* 30 180
122.9K 11.7M
/ 48 52
2.5K 234.6K
/ 5 6
448.8 43.0K
/ 6 8
64.5 6.2K
/ 60
1.1 103.7
```
Perhaps this example is more understandable with comments and better units:
```
$ sed -u "s|#.*||" | sed -u 's|M|000000|g' | go run f.go
5M 12M # number of people living in Chicago
=> 5.0M 12.0M
* beta 1 200 # fraction of people that have a piano
1.9K 123.1K
30 180 # minutes it takes to tune a piano, including travel time
122.9K 11.7M
/ 48 52 # weeks a year piano tuners work for
2.5K 234.6K
/ 6 8 # hours a day
353.9 34.1K
/ 60 # minutes to an hour
5.9 568.3
=: piano_tuners_in_Chicago
piano_tuners_in_Chicago => 5.9 568.3
```
Here, instead, is an example using beta distributions and variables:
```
1 2
=> 1.0 2.0
* 1_000_000_000
=> 1000.0M 2.0B
=: x # assign to variable
x => 1000.0M 2.0B
. # clear the stack, i.e., make it be 1
beta 1 2
=> beta 1.0 2.0
beta 12 300
=> beta 13.0 302.0
=. y # assign to variable and clear the stack (return it to 1)
y => beta 13.0 302.0
x
=> 1000.0M 2.0B
* y
=> samples 31.3M 98.2M
```
The difference between `=: x` and `=. y` is that `=.` clears the stack after the assignment.
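The stack semantics can be sketched in a few lines of Go. This is a scalar-only illustration, not the code in `f.go`; the `Calculator` type and its methods are hypothetical names. The only point it makes is that the "empty" stack is the multiplicative identity 1, and that `=.` is `=:` followed by `.`.

```go
package main

import "fmt"

// Calculator is a hypothetical, scalar-only stand-in for the tool's state.
type Calculator struct {
	Stack float64            // running value; 1 plays the role of an empty stack
	Vars  map[string]float64 // values saved with "=:" or "=."
}

// NewCalculator starts with the stack at 1 and no variables.
func NewCalculator() *Calculator {
	return &Calculator{Stack: 1, Vars: map[string]float64{}}
}

// Multiply folds a new operand into the running value.
func (c *Calculator) Multiply(x float64) { c.Stack *= x }

// Assign mimics "=: name": save the stack and keep it as is.
func (c *Calculator) Assign(name string) { c.Vars[name] = c.Stack }

// Clear mimics ".": reset the stack to 1.
func (c *Calculator) Clear() { c.Stack = 1 }

// AssignAndClear mimics "=. name": save the stack, then reset it.
func (c *Calculator) AssignAndClear(name string) {
	c.Assign(name)
	c.Clear()
}

func main() {
	c := NewCalculator()
	c.Multiply(3)
	c.Assign("x") // the stack still holds 3
	c.Multiply(2)
	c.AssignAndClear("y") // y holds 6, the stack is back to 1
	fmt.Println(c.Vars["x"], c.Vars["y"], c.Stack)
}
```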
If you type "help", you can see a small grammar:
```
help
Operation | Variable assignment | Special
Operation: operator operand
operator: (empty) | * | / | + | -
operand: scalar | lognormal | beta | variable
lognormal: low high
beta: beta alpha beta
Variable assignment: =: variable_name
Clear stack: . | c | clear
Variable assignment and clear stack: =. variable_name
Other special operations: help | debug | exit
Examples:
+ 2
/ 2.5
* 1 10 (interpreted as lognormal)
+ 1 10
* beta 1 10
1 10 (multiplication taken as default operation)
=: x
.
1 100
+ x
exit
```
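As a rough illustration of the `Operation: operator operand` rule in this grammar, here is a small Go parser sketch. It is an assumption about how such a line could be classified, not the parser in `f.go`; `Parsed` and `ParseOperation` are hypothetical names, and variable assignment and the special commands are left out.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Parsed is a hypothetical result type: an operator plus a classified operand.
type Parsed struct {
	Op   string    // "*", "/", "+" or "-"
	Kind string    // "scalar", "lognormal", "beta" or "variable"
	Args []float64 // numeric arguments, if any
	Name string    // variable name when Kind == "variable"
}

// parseFloats converts every word to a float64, failing on the first bad one.
func parseFloats(words []string) ([]float64, error) {
	out := make([]float64, 0, len(words))
	for _, w := range words {
		x, err := strconv.ParseFloat(w, 64)
		if err != nil {
			return nil, err
		}
		out = append(out, x)
	}
	return out, nil
}

// ParseOperation handles only the "Operation: operator operand" rule.
func ParseOperation(line string) (Parsed, error) {
	words := strings.Fields(line)
	p := Parsed{Op: "*"} // an empty operator means multiplication
	if len(words) > 0 {
		switch words[0] {
		case "*", "/", "+", "-":
			p.Op, words = words[0], words[1:]
		}
	}
	switch {
	case len(words) == 3 && words[0] == "beta":
		args, err := parseFloats(words[1:])
		if err != nil {
			return p, err
		}
		p.Kind, p.Args = "beta", args
	case len(words) == 2:
		args, err := parseFloats(words)
		if err != nil {
			return p, err
		}
		p.Kind, p.Args = "lognormal", args
	case len(words) == 1:
		// A single word is a scalar if it parses as a number, else a variable.
		if x, err := strconv.ParseFloat(words[0], 64); err == nil {
			p.Kind, p.Args = "scalar", []float64{x}
		} else {
			p.Kind, p.Name = "variable", words[0]
		}
	default:
		return p, fmt.Errorf("cannot parse %q", line)
	}
	return p, nil
}

func main() {
	for _, line := range []string{"+ 2", "* 1 10", "* beta 1 10", "1 10", "+ x"} {
		p, err := ParseOperation(line)
		fmt.Printf("%q -> %+v, err=%v\n", line, p, err)
	}
}
```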
## Installation
```
make build
sudo make install
f # rather than the previous go run f.go
```
Why use make instead of the built-in go commands? Because the point of make is to be able to share command-line recipes.
## Usage together with standard Linux utilities
```bash
f
sed -u "s|#.*||" | sed -u 's|M|000000|g' | f
cat more/piano-tuners.f | f
cat more/piano-tuners-commented.f | sed -u "s|#.*||" | sed -u 's|M|000000|g' | f

tee -a input.log | go run f.go | tee -a output.log
tee -a io.log | go run f.go | tee -a io.log

function f(){
  sed -u "s|#.*||" |
  sed -u "s|//.*||" |
  sed -u 's|K|000|g' |
  sed -u 's|M|000000|g' |
  sed -u 's|B|000000000|g' |
  /usr/bin/f
}
```
Note that these sed commands are just hacks and won't parse, e.g., `3.5K` correctly: the textual substitution turns it into `3.5000`.
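A numeric treatment of suffixes avoids that problem. The following Go sketch shows one way to do it; it is not part of `f.go`, and `ParseSuffixed` and the `suffixes` table are hypothetical names. It multiplies the parsed number by the suffix's value instead of appending zeros to the text.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// suffixes maps the K/M/B/T shorthand to its multiplier.
var suffixes = map[string]float64{"K": 1e3, "M": 1e6, "B": 1e9, "T": 1e12}

// ParseSuffixed parses tokens such as "3.5K" or "12M" numerically,
// rather than appending zeros textually.
func ParseSuffixed(token string) (float64, error) {
	multiplier := 1.0
	if n := len(token); n > 1 {
		// Strip a trailing suffix only if something is left to parse.
		if m, ok := suffixes[token[n-1:]]; ok {
			multiplier, token = m, token[:n-1]
		}
	}
	x, err := strconv.ParseFloat(token, 64)
	if err != nil {
		return 0, err
	}
	return x * multiplier, nil
}

func main() {
	for _, tok := range strings.Fields("3.5K 12M 48") {
		x, err := ParseSuffixed(tok)
		fmt.Println(tok, x, err)
	}
}
```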
## Tips & tricks
- It's conceptually clearer to have all the multiplications first and then all the divisions
- Sums and subtractions are now also supported
- For things between 0 and 1, consider using a beta distribution
## Different levels of complexity
The top-level f.go file (400 lines) has a bunch of complexity: variables, parentheses, samples, beta distributions. In the simple/ folder:
- f_simple.go (370 lines) strips variables and parentheses, but keeps beta distributions, samples, and addition and subtraction
- f_minimal.go (140 lines) strips everything that isn't lognormal and scalar multiplication and addition, plus a few debug options.
## Roadmap

Done:
- [x] Write README
- [x] Add division?
- [x] Read from file?
- [x] Save to file?
- [x] Allow comments?
- [x] Use a sed filter?
- [x] Add show more info version
- [x] Scalar multiplication and division
- [x] Think how to integrate with squiggle.c to draw samples
- [x] Copy the time to botec go code
- [x] Define samplers
- [x] Call those samplers when operating on distributions that can't be operated on algebraically
- [x] Display output more nicely, with K/M/B/T
- [x] Consider the following: make this into a stack-based DSL, with:
  - [x] Variables that can be saved to and then displayed
  - [x] Other types of distributions, particularly beta distributions? => But then this requires moving to bags of samples. It could still be ~instantaneous though.
- [x] Figure out go syntax for
  - Maps
  - Joint types
  - Enums
- [x] Fix correlation problem, by spinning up a new randomness thing every time some serial computation is done.
- [x] Clean up error code. Right now only needed for division
- [x] Maintain *both* a more complex thing that's more featureful *and* the more simple multiplication of lognormals thing.

To (possibly) do:

- [ ] Document parentheses syntax
- [ ] Allow input with K/M/T
- [ ] Add functions. Now easier to do with an explicit representation of the stack
- [ ] Think about how to draw a histogram from samples
- [ ] Dump samples to file
- [ ] Represent samples/statistics in some other way
- [ ] Perhaps use qsort rather than full sorting
- [ ] Program into a small device, like a calculator?

Discarded:

- [ ] ~~Think of some way of calling bc~~