clean up roadmap

NunoSempere 2023-12-03 18:56:10 +00:00
parent f7754a142e
commit d56d1732a3

@ -323,6 +323,10 @@ But for more complicated use cases, my recommendation would be to not use parall
- If you run the `sampler_parallel` function on two different inputs, their outputs will be correlated. E.g., if you run two lognormals, indices which have higher samples in one will tend to have higher samples in the other one. Why?
- For a small number of samples, if you run the `sampler_parallel` function, you will get better spread-out random numbers than if you run things serially. Why?
#### Extra: Algebraic manipulations
`squiggle_more.c` has some functions for simple algebraic manipulations: sums of normals and products of lognormals. You can see some example usage [here](examples/more/07_algebra/example.c) and [here](examples/more/08_algebra_and_conversion/example.c).
#### Extra: cdf auxiliary functions
I provide some auxiliary functions that take a cdf, and return a sample from the distribution produced by that cdf. This might make it easier to program models, at the cost of a 20x to 60x slowdown in the parts of the code that use them.
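
To give a sense of what sampling from a cdf involves, here is a minimal sketch of inverse transform sampling by bisection. It is an assumption about the general technique, not a copy of the library's implementation: `sample_from_cdf_sketch` and `cdf_example` are hypothetical names, and the caller is assumed to supply a uniform draw `p` in (0, 1) plus bounds that bracket the answer.

```c
/* Hypothetical helper, not the library's API: find x such that cdf(x) = p
   by bisection on [lo, hi]. Assumes cdf is non-decreasing and that
   cdf(lo) <= p <= cdf(hi). */
double sample_from_cdf_sketch(double (*cdf)(double), double p, double lo, double hi)
{
    /* Each iteration halves the bracket; 100 iterations is more than
       enough to exhaust double precision. */
    for (int i = 0; i < 100; i++) {
        double mid = lo + (hi - lo) / 2.0;
        if (cdf(mid) < p) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return lo + (hi - lo) / 2.0;
}

/* Example cdf for the sketch: F(x) = x^2 on [0, 1]. */
double cdf_example(double x)
{
    if (x <= 0.0) return 0.0;
    if (x >= 1.0) return 1.0;
    return x * x;
}
```

With this approach every sample costs many cdf evaluations rather than a handful of arithmetic operations, which is one plausible reason for a slowdown of the kind described above.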
@ -350,6 +354,10 @@ Behaviour on error can be toggled by the `EXIT_ON_ERROR` variable. This library
Overall, I'd describe the error handling capabilities of this library as pretty rudimentary. For example, this program might fail in surprising ways if you ask for a lognormal with negative standard deviation, because I haven't added error checking for that case yet.
### Other gotchas
- Even though the C standard is ambiguous about this, this code assumes that doubles have 64-bit precision (otherwise the xorshift should be different).
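
One way to make that assumption explicit is a compile-time check next to the random number generator. The sketch below is illustrative only: it uses a textbook 64-bit xorshift and a common 53-bit conversion to [0, 1), both with hypothetical names, and the generator in this library may be written differently.

```c
#include <limits.h>
#include <stdint.h>

/* Refuse to compile on platforms where double is not 64 bits wide (C11). */
_Static_assert(sizeof(double) * CHAR_BIT == 64, "this sketch assumes 64-bit doubles");

/* Textbook xorshift64 step (shift triple 13, 7, 17).
   The state must be seeded with a nonzero value. */
uint64_t xorshift64_sketch(uint64_t* state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/* Keep the top 53 bits (the width of a 64-bit double's significand)
   and scale by 2^-53 to get a value in [0, 1). */
double to_unit_interval_sketch(uint64_t x)
{
    return (double)(x >> 11) * (1.0 / 9007199254740992.0);
}
```

The 53-bit conversion is where the width of `double` matters: with a different floating point format, both the shift and the scaling constant would have to change.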
## Related projects
- [Squiggle](https://www.squiggle-language.com/)
@ -358,35 +366,34 @@ Overall, I'd describe the error handling capabilities of this library as pretty
- [time to BOTEC](https://github.com/NunoSempere/time-to-botec)
- [Find a beta distribution that fits your desired confidence interval](https://nunosempere.com/blog/2023/03/15/fit-beta/)
## To do list
## Roadmap
- [ ] Think through seed initialization
- [ ] Point out that, even though the C standard is ambiguous about this, this code assumes that doubles are 64 bit precision (otherwise the xorshift should be different).
- [ ] Document rudimentary algebra manipulations for normal/lognormal
- [ ] Think through whether to delete cdf => samples function
- [ ] Think through whether to:
### To do
- [ ] Drive in a few more real-life applications
### Done
- [x] Document rudimentary algebra manipulations for normal/lognormal
- [x] Think through whether to delete cdf => samples function => not for now
- [x] Think through whether to:
- simplify and just abort on error
- complexify and use boxes for everything
- leave as is
- [ ] Systematize references
- [ ] Support all distribution functions in <https://www.squiggle-language.com/docs/Api/Dist>
- [ ] do so efficiently
- [ ] Add more functions to do algebra and get the 90% c.i. of normals, lognormals, betas, etc.
- [x] Offer both options
- [x] Add more functions to do algebra and get the 90% c.i. of normals, lognormals, betas, etc.
- Think through which of these make sense.
- [ ] Disambiguate sample_laplace--successes vs failures || successes vs total trials as two distinct and differently named functions
## Done
- [x] Systematize references
- [x] Think through seed initialization
- [x] Document parallelism
- [x] Document confidence intervals
- [x] Add example for only one sample
- [x] Add example for many samples
- [ ] ~~Add a custom preprocessor to allow simple nested functions that don't rely on local scope?~~
- [x] Use gcc extension to define functions nested inside main.
- [x] Chain various `sample_mixture` functions
- [x] Add beta distribution
- See <https://stats.stackexchange.com/questions/502146/how-does-numpy-generate-samples-from-a-beta-distribution> for a faster method.
- [ ] ~~Use OpenMP for acceleration~~
- [x] Use OpenMP for acceleration
- [x] Add function to get sample when given a cdf
- [x] Don't have a single header file.
- [x] Structure project a bit better
@ -403,7 +410,6 @@ Overall, I'd describe the error handling capabilities of this library as pretty
- [x] Add sampling from a gamma distribution
- https://dl.acm.org/doi/pdf/10.1145/358407.358414
- [x] Explain correlated samples
- [ ] ~~Add tests in Stan?~~
- [x] Test summary statistics for each of the distributions.
- [x] For uniform
- [x] For normal
@ -431,13 +437,20 @@ Overall, I'd describe the error handling capabilities of this library as pretty
- [x] Consider ergonomics of using ci instead of c_i
- [x] use named struct instead
- [x] demonstrate and document feeding a struct directly to a function; my_function((struct c_i){.low = 1, .high = 2});
- [ ] Consider desirability of defining shortcuts for those functions. Adds a level of magic, though.
- [ ] Test results
- [x] Move to own file? Or signpost in file? => signposted in file.
- [x] Write twitter thread: now [here](https://twitter.com/NunoSempere/status/1707041153210564959); retweets appreciated.
- [ ] ~~Think about whether to write a simple version of this for [uxn](https://100r.co/site/uxn.html), a minimalist portable programming stack which, sadly, doesn't have doubles (64 bit floats)~~
- [x] Write better confidence interval code that:
- Gets number of samples as an input
- Gets either a sampler function or a list of samples
- is O(n), not O(nlog(n))
- Parallelizes stuff
### Discarded
- [ ] ~~Disambiguate sample_laplace--successes vs failures || successes vs total trials as two distinct and differently named functions~~
- [ ] ~~Support all distribution functions in <https://www.squiggle-language.com/docs/Api/Dist>~~
- [ ] ~~Add a custom preprocessor to allow simple nested functions that don't rely on local scope?~~
- [ ] ~~Add tests in Stan?~~
- [ ] ~~Test results for lognormal manipulations~~
- [ ] ~~Consider desirability of defining shortcuts for algebra functions. Adds a level of magic, though.~~
- [ ] ~~Think about whether to write a simple version of this for [uxn](https://100r.co/site/uxn.html), a minimalist portable programming stack which, sadly, doesn't have doubles (64 bit floats)~~