C-Optimized

An optimized version of the original C implementation.

The main changes are:

  • an optimization of the mixture function (it now takes the sampling functions instead of whole arrays of samples, which greatly reduces memory usage and computation time) and
  • the implementation of multi-threading with OpenMP.

Performance

The mean execution time is 6 ms, with the following distribution:

[Figure: histogram of execution times]

The hardware used was an AMD Ryzen 7 5800X3D with 16 GB of DDR4-3200 RAM.

The timing data was collected by running the body of the main() function 1000 times in a for loop inside a single process, not by launching the program 1000 times.

Multithreading

Note that multi-threading introduces some dispersion in the execution time because of the overhead of creating and destroying threads.

On Nuño's machine, multithreading actually causes a noticeable slowdown.

To do

  • Use a proper profiling tool to capture timing with 1M samples.
  • Update the Performance section above with the correct timing.
  • Add Windows/PowerShell time-measuring commands.
  • Add CUDA?
  • See if the program can be reworked to use multithreading effectively, e.g., so that speed gains are proportional to the number of threads used.