modify makefiles to make reorg work, change README

NunoSempere 2023-11-18 21:00:02 +00:00
parent dd6cbb2ec2
commit 0de4132080
34 changed files with 180 additions and 125 deletions

@ -46,7 +46,6 @@ This library provides some basic building blocks. The recommended strategy is to
### Cdf auxiliary functions
To help with the above core strategy, this library provides convenience functions, which take a cdf, and return a sample from the distribution produced by that cdf. This might make it easier to program models, at the cost of a 20x to 60x slowdown in the parts of the code that use it.
### Nested functions and compilation with tcc.
@ -60,29 +59,6 @@ GCC has an extension which allows a program to define a function inside another
My recommendation would be to use tcc while drawing a small number of samples for fast iteration, and then using gcc for the final version with lots of samples, and possibly with nested functions for ease of reading by others.
### Error propagation vs exiting on error
The process of taking a cdf and returning a sample might fail, e.g., it's a Newton method which might fail to converge because of cdf artifacts. The cdf itself might also fail, e.g., if a distribution only accepts a range of parameters, but is fed parameters outside that range.
This library provides two approaches:
1. Print the line and function in which the error occurred, then exit on error
2. In situations where there might be an error, return a struct containing either the correct value or an error message:
```C
struct box {
int empty;
double content;
char* error_msg;
};
```
The first approach produces terser programs but might not scale. The second approach seems like it could lead to more robust programs, but is more verbose.
Behaviour on error can be toggled by the `EXIT_ON_ERROR` macro. This library also provides a convenient macro, `PROCESS_ERROR`, to make error handling in either case much terser; see the usage in example 4 in the examples/ folder.
Overall, I'd describe the error handling capabilities of this library as pretty rudimentary. For example, this program might fail in surprising ways if you ask for a lognormal with negative standard deviation, because I haven't added error checking for that case yet.
### Guarantees and licensing
- I offer no guarantees about stability, correctness, performance, etc. I might, for instance, abandon the version in C and rewrite it in Zig, Nim or Rust.
@ -272,6 +248,49 @@ make tidy
It emits one warning about something I already took care of, so by default I've suppressed it. I think this is good news in terms of making me more confident that this simple library is correct :).
### Division between core functions and extraneous expansions
This library differentiates between core functions, which are pretty tightly scoped, and expansions and convenience functions, which are more meandering. Expansions are in `extra.c` and `extra.h`. To use them, take care to link them:
```C
// In your C source file
#include "extra.h"
```
```sh
# When compiling:
gcc -std=c99 -Wall -O3 example.c squiggle.c extra.c -lm -o ./example
```
#### Extra: Cdf auxiliary functions
I provide some auxiliary functions that take a cdf and return a sample from the distribution produced by that cdf. This might make it easier to program models, at the cost of a 20x to 60x slowdown in the parts of the code that use them.
#### Extra: Error propagation vs exiting on error
The process of taking a cdf and returning a sample might fail, e.g., it's a Newton method which might fail to converge because of cdf artifacts. The cdf itself might also fail, e.g., if a distribution only accepts a range of parameters, but is fed parameters outside that range.
This library provides two approaches:
1. Print the line and function in which the error occurred, then exit on error
2. In situations where there might be an error, return a struct containing either the correct value or an error message:
```C
struct box {
int empty;
double content;
char* error_msg;
};
```
The first approach produces terser programs but might not scale. The second approach seems like it could lead to more robust programs, but is more verbose.
Behaviour on error can be toggled by the `EXIT_ON_ERROR` macro. This library also provides a convenient macro, `PROCESS_ERROR`, to make error handling in either case much terser; see the usage in example 4 in the examples/ folder.
Overall, I'd describe the error handling capabilities of this library as pretty rudimentary. For example, this program might fail in surprising ways if you ask for a lognormal with negative standard deviation, because I haven't added error checking for that case yet.
## Related projects
- [Squiggle](https://www.squiggle-language.com/)

Binary file not shown.

@ -1,4 +1,5 @@
#include "../../squiggle.h"
#include "../../extra.h"
#include <math.h>
#include <stdint.h>
#include <stdio.h>

@ -9,7 +9,7 @@ CC=gcc # required for nested functions
# CC=tcc # <= faster compilation
# Main file
SRC=example.c ../../squiggle.c
SRC=example.c ../../squiggle.c ../../extra.c
OUTPUT=./example
## Dependencies

@ -1,4 +1,5 @@
#include "../../squiggle.h"
#include "../../extra.h"
#include <math.h>
#include <stdint.h>
#include <stdio.h>

@ -9,7 +9,7 @@
# CC=tcc # <= faster compilation
# Main file
SRC=example.c ../../squiggle.c
SRC=example.c ../../squiggle.c ../../extra.c
OUTPUT=./example
## Dependencies

Binary file not shown.

Binary file not shown.

@ -1,4 +1,5 @@
#include "../../squiggle.h"
#include "../../extra.h"
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

@ -9,7 +9,7 @@ CC=gcc
# CC=tcc # <= faster compilation
# Main file
SRC=example.c ../../squiggle.c
SRC=example.c ../../squiggle.c ../../extra.c
OUTPUT=example
## Dependencies

@ -1,4 +1,5 @@
#include "../../squiggle.h"
#include "../../extra.h"
#include <math.h>
#include <stdint.h>
#include <stdio.h>

@ -9,7 +9,7 @@ CC=gcc
# CC=tcc # <= faster compilation
# Main file
SRC=example.c ../../squiggle.c
SRC=example.c ../../squiggle.c ../../extra.c
OUTPUT=example
## Dependencies

@ -1,4 +1,5 @@
#include "../../squiggle.h"
#include "../../extra.h"
#include <math.h>
#include <stdint.h>
#include <stdio.h>

@ -5,22 +5,25 @@
# make run
# Compiler
CC=gcc
CC=gcc # required for nested functions
# CC=tcc # <= faster compilation
# Main file
SRC=example.c ../../squiggle.c
OUTPUT=example
SRC=example.c ../../squiggle.c ../../extra.c
OUTPUT=./example
## Dependencies
MATH=-lm
DEPENDENCIES=$(MATH)
# OPENMP=-fopenmp
## Flags
DEBUG= #'-g'
STANDARD=-std=c99
STANDARD=-std=c99 ## gnu99 allows for nested functions.
EXTENSIONS= #-fnested-functions
WARNINGS=-Wall
OPTIMIZED=-O3#-Ofast
# OPENMP=-fopenmp
CFLAGS=$(DEBUG) $(STANDARD) $(EXTENSIONS) $(WARNINGS) $(OPTIMIZED)
## Formatter
STYLE_BLUEPRINT=webkit
@ -28,13 +31,14 @@ FORMATTER=clang-format -i -style=$(STYLE_BLUEPRINT)
## make build
build: $(SRC)
$(CC) $(OPTIMIZED) $(DEBUG) $(SRC) $(MATH) -o $(OUTPUT)
# gcc -std=gnu99 example.c -lm -o example
$(CC) $(CFLAGS) $(SRC) $(DEPENDENCIES) -o $(OUTPUT)
format: $(SRC)
$(FORMATTER) $(SRC)
run: $(SRC) $(OUTPUT)
OMP_NUM_THREADS=1 ./$(OUTPUT) && echo
./$(OUTPUT) && echo
time-linux:
@echo "Requires /bin/time, found on GNU/Linux systems" && echo

@ -1,4 +1,5 @@
#include "../../squiggle.h"
#include "../../extra.h"
#include <math.h>
#include <stdint.h>
#include <stdio.h>

@ -5,22 +5,25 @@
# make run
# Compiler
CC=gcc
CC=gcc # required for nested functions
# CC=tcc # <= faster compilation
# Main file
SRC=example.c ../../squiggle.c
OUTPUT=example
SRC=example.c ../../squiggle.c ../../extra.c
OUTPUT=./example
## Dependencies
MATH=-lm
DEPENDENCIES=$(MATH)
# OPENMP=-fopenmp
## Flags
DEBUG= #'-g'
STANDARD=-std=c99
STANDARD=-std=c99 ## gnu99 allows for nested functions.
EXTENSIONS= #-fnested-functions
WARNINGS=-Wall
OPTIMIZED=-O3#-Ofast
# OPENMP=-fopenmp
CFLAGS=$(DEBUG) $(STANDARD) $(EXTENSIONS) $(WARNINGS) $(OPTIMIZED)
## Formatter
STYLE_BLUEPRINT=webkit
@ -28,13 +31,14 @@ FORMATTER=clang-format -i -style=$(STYLE_BLUEPRINT)
## make build
build: $(SRC)
$(CC) $(OPTIMIZED) $(DEBUG) $(SRC) $(MATH) -o $(OUTPUT)
# gcc -std=gnu99 example.c -lm -o example
$(CC) $(CFLAGS) $(SRC) $(DEPENDENCIES) -o $(OUTPUT)
format: $(SRC)
$(FORMATTER) $(SRC)
run: $(SRC) $(OUTPUT)
OMP_NUM_THREADS=1 ./$(OUTPUT) && echo
./$(OUTPUT) && echo
time-linux:
@echo "Requires /bin/time, found on GNU/Linux systems" && echo

@ -1,4 +1,5 @@
#include "../../squiggle.h"
#include "../../extra.h"
#include <math.h>
#include <stdint.h>
#include <stdio.h>

@ -5,22 +5,25 @@
# make run
# Compiler
CC=gcc
CC=gcc # required for nested functions
# CC=tcc # <= faster compilation
# Main file
SRC=example.c ../../squiggle.c
OUTPUT=example
SRC=example.c ../../squiggle.c ../../extra.c
OUTPUT=./example
## Dependencies
MATH=-lm
DEPENDENCIES=$(MATH)
# OPENMP=-fopenmp
## Flags
DEBUG= #'-g'
STANDARD=-std=c99
STANDARD=-std=c99 ## gnu99 allows for nested functions.
EXTENSIONS= #-fnested-functions
WARNINGS=-Wall
OPTIMIZED=-O3#-Ofast
# OPENMP=-fopenmp
CFLAGS=$(DEBUG) $(STANDARD) $(EXTENSIONS) $(WARNINGS) $(OPTIMIZED)
## Formatter
STYLE_BLUEPRINT=webkit
@ -28,13 +31,14 @@ FORMATTER=clang-format -i -style=$(STYLE_BLUEPRINT)
## make build
build: $(SRC)
$(CC) $(OPTIMIZED) $(DEBUG) $(SRC) $(MATH) -o $(OUTPUT)
# gcc -std=gnu99 example.c -lm -o example
$(CC) $(CFLAGS) $(SRC) $(DEPENDENCIES) -o $(OUTPUT)
format: $(SRC)
$(FORMATTER) $(SRC)
run: $(SRC) $(OUTPUT)
OMP_NUM_THREADS=1 ./$(OUTPUT) && echo
./$(OUTPUT) && echo
time-linux:
@echo "Requires /bin/time, found on GNU/Linux systems" && echo

@ -1,4 +1,5 @@
#include "../../squiggle.h"
#include "../../extra.h"
#include <math.h>
#include <stdint.h>
#include <stdio.h>

@ -5,22 +5,25 @@
# make run
# Compiler
CC=gcc
CC=gcc # required for nested functions
# CC=tcc # <= faster compilation
# Main file
SRC=example.c ../../squiggle.c
OUTPUT=example
SRC=example.c ../../squiggle.c ../../extra.c
OUTPUT=./example
## Dependencies
MATH=-lm
DEPENDENCIES=$(MATH)
# OPENMP=-fopenmp
## Flags
DEBUG= #'-g'
STANDARD=-std=c99
STANDARD=-std=c99 ## gnu99 allows for nested functions.
EXTENSIONS= #-fnested-functions
WARNINGS=-Wall
OPTIMIZED=-O3#-Ofast
# OPENMP=-fopenmp
CFLAGS=$(DEBUG) $(STANDARD) $(EXTENSIONS) $(WARNINGS) $(OPTIMIZED)
## Formatter
STYLE_BLUEPRINT=webkit
@ -28,13 +31,14 @@ FORMATTER=clang-format -i -style=$(STYLE_BLUEPRINT)
## make build
build: $(SRC)
$(CC) $(OPTIMIZED) $(DEBUG) $(SRC) $(MATH) -o $(OUTPUT)
# gcc -std=gnu99 example.c -lm -o example
$(CC) $(CFLAGS) $(SRC) $(DEPENDENCIES) -o $(OUTPUT)
format: $(SRC)
$(FORMATTER) $(SRC)
run: $(SRC) $(OUTPUT)
OMP_NUM_THREADS=1 ./$(OUTPUT) && echo
./$(OUTPUT) && echo
time-linux:
@echo "Requires /bin/time, found on GNU/Linux systems" && echo

@ -1,4 +1,5 @@
#include "../../squiggle.h"
#include "../../extra.h"
#include <math.h>
#include <stdint.h>
#include <stdio.h>

@ -5,22 +5,25 @@
# make run
# Compiler
CC=gcc
CC=gcc # required for nested functions
# CC=tcc # <= faster compilation
# Main file
SRC=example.c ../../squiggle.c
OUTPUT=example
SRC=example.c ../../squiggle.c ../../extra.c
OUTPUT=./example
## Dependencies
MATH=-lm
DEPENDENCIES=$(MATH)
# OPENMP=-fopenmp
## Flags
DEBUG= #'-g'
STANDARD=-std=c99
STANDARD=-std=c99 ## gnu99 allows for nested functions.
EXTENSIONS= #-fnested-functions
WARNINGS=-Wall
OPTIMIZED=-O3#-Ofast
# OPENMP=-fopenmp
CFLAGS=$(DEBUG) $(STANDARD) $(EXTENSIONS) $(WARNINGS) $(OPTIMIZED)
## Formatter
STYLE_BLUEPRINT=webkit
@ -28,13 +31,14 @@ FORMATTER=clang-format -i -style=$(STYLE_BLUEPRINT)
## make build
build: $(SRC)
$(CC) $(OPTIMIZED) $(DEBUG) $(SRC) $(MATH) -o $(OUTPUT)
# gcc -std=gnu99 example.c -lm -o example
$(CC) $(CFLAGS) $(SRC) $(DEPENDENCIES) -o $(OUTPUT)
format: $(SRC)
$(FORMATTER) $(SRC)
run: $(SRC) $(OUTPUT)
OMP_NUM_THREADS=1 ./$(OUTPUT) && echo
./$(OUTPUT) && echo
time-linux:
@echo "Requires /bin/time, found on GNU/Linux systems" && echo

@ -1,4 +1,5 @@
#include "../../squiggle.h"
#include "../../extra.h"
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

@ -5,22 +5,25 @@
# make run
# Compiler
CC=gcc
CC=gcc # required for nested functions
# CC=tcc # <= faster compilation
# Main file
SRC=example.c ../../squiggle.c
OUTPUT=example
SRC=example.c ../../squiggle.c ../../extra.c
OUTPUT=./example
## Dependencies
MATH=-lm
DEPENDENCIES=$(MATH)
# OPENMP=-fopenmp
## Flags
DEBUG= #'-g'
STANDARD=-std=c99
STANDARD=-std=c99 ## gnu99 allows for nested functions.
EXTENSIONS= #-fnested-functions
WARNINGS=-Wall
OPTIMIZED=-O3#-Ofast
# OPENMP=-fopenmp
CFLAGS=$(DEBUG) $(STANDARD) $(EXTENSIONS) $(WARNINGS) $(OPTIMIZED)
## Formatter
STYLE_BLUEPRINT=webkit
@ -28,7 +31,8 @@ FORMATTER=clang-format -i -style=$(STYLE_BLUEPRINT)
## make build
build: $(SRC)
$(CC) $(OPTIMIZED) $(DEBUG) $(SRC) $(MATH) -o $(OUTPUT)
# gcc -std=gnu99 example.c -lm -o example
$(CC) $(CFLAGS) $(SRC) $(DEPENDENCIES) -o $(OUTPUT)
format: $(SRC)
$(FORMATTER) $(SRC)
@ -40,7 +44,7 @@ time-linux:
@echo "Requires /bin/time, found on GNU/Linux systems" && echo
@echo "Running 100x and taking avg time $(OUTPUT)"
@t=$$(/usr/bin/time -f "%e" -p bash -c 'for i in {1..100}; do ./$(OUTPUT); done' 2>&1 >/dev/null | grep real | awk '{print $$2}' ); echo "scale=2; 1000 * $$t / 100" | bc | sed "s|^|Time using 1 thread: |" | sed 's|$$|ms|' && echo
@t=$$(/usr/bin/time -f "%e" -p bash -c 'for i in {1..100}; do $(OUTPUT); done' 2>&1 >/dev/null | grep real | awk '{print $$2}' ); echo "scale=2; 1000 * $$t / 100" | bc | sed "s|^|Time using 1 thread: |" | sed 's|$$|ms|' && echo
## Profiling

88
extra.c

@ -6,15 +6,58 @@
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#include "squiggle.h"
// math constants
#define PI 3.14159265358979323846 // M_PI in gcc gnu99
#define NORMAL90CONFIDENCE 1.6448536269514727
// Some error niceties; these won't be used until later
#define MAX_ERROR_LENGTH 500
#define EXIT_ON_ERROR 0
#define PROCESS_ERROR(error_msg) process_error(error_msg, EXIT_ON_ERROR, __FILE__, __LINE__)
// # More cool stuff
// This is no longer necessary to do basic estimation,
// but is still cool
// Get confidence intervals, given a sampler
// Not in core yet because I'm not sure how much I like the interface,
// to do: add n to function parameters and document
typedef struct ci_t {
float low;
float high;
} ci;
int compare_doubles(const void* p, const void* q)
{
// https://wikiless.esmailelbob.xyz/wiki/Qsort?lang=en
double x = *(const double*)p;
double y = *(const double*)q;
/* Avoid return x - y, which can cause undefined behaviour
because of signed integer overflow. */
if (x < y)
return -1; // Return -1 if you want ascending, 1 if you want descending order.
else if (x > y)
return 1; // Return 1 if you want ascending, -1 if you want descending order.
return 0;
}
ci get_90_confidence_interval(double (*sampler)(uint64_t*), uint64_t* seed)
{
int n = 100 * 1000;
double* samples_array = malloc(n * sizeof(double));
for (int i = 0; i < n; i++) {
samples_array[i] = sampler(seed);
}
qsort(samples_array, n, sizeof(double), compare_doubles);
ci result = {
.low = samples_array[5000],
.high = samples_array[94999],
};
free(samples_array);
return result;
}
// ## Sample from an arbitrary cdf
struct box {
@ -210,45 +253,6 @@ double sampler_danger(struct box cdf(double), uint64_t* seed)
}
*/
// Get confidence intervals, given a sampler
typedef struct ci_t {
float low;
float high;
} ci;
int compare_doubles(const void* p, const void* q)
{
// https://wikiless.esmailelbob.xyz/wiki/Qsort?lang=en
double x = *(const double*)p;
double y = *(const double*)q;
/* Avoid return x - y, which can cause undefined behaviour
because of signed integer overflow. */
if (x < y)
return -1; // Return -1 if you want ascending, 1 if you want descending order.
else if (x > y)
return 1; // Return 1 if you want ascending, -1 if you want descending order.
return 0;
}
ci get_90_confidence_interval(double (*sampler)(uint64_t*), uint64_t* seed)
{
int n = 100 * 1000;
double* samples_array = malloc(n * sizeof(double));
for (int i = 0; i < n; i++) {
samples_array[i] = sampler(seed);
}
qsort(samples_array, n, sizeof(double), compare_doubles);
ci result = {
.low = samples_array[5000],
.high = samples_array[94999],
};
free(samples_array);
return result;
}
// # Small algebra manipulations
// here I discover named structs,

@ -43,10 +43,7 @@ typedef struct lognormal_params_t {
} lognormal_params;
lognormal_params algebra_product_lognormals(lognormal_params a, lognormal_params b);
lognormal_params convert_ci_to_lognormal_params(ci x);
ci convert_lognormal_params_to_ci(lognormal_params y);
#endif

Binary file not shown.

@ -7,13 +7,14 @@
#include <sys/types.h>
#include <time.h>
#define PI 3.14159265358979323846 // M_PI in gcc gnu99
#define NORMAL90CONFIDENCE 1.6448536269514727
// # Key functionality
// Define the minimum number of functions needed to do simple estimation
// Starts here, ends until the end of the mixture function
// math constants
#define PI 3.14159265358979323846 // M_PI in gcc gnu99
#define NORMAL90CONFIDENCE 1.6448536269514727
// Pseudo Random number generator
uint64_t xorshift32(uint32_t* seed)
{

BIN
test/test

Binary file not shown.