... | @@ -272,4 +272,4 @@ In all parallel codes, the maximum speedup that can be achieved is given by Amda |
The first is the writing of checkpoints, which at the time of testing occurred every 10 minutes. This is wasteful, as it gives users no real advantage over a longer checkpointing interval. In a recent update, the checkpoint interval became an adjustable parameter, with the default increased to 1 hour. This is expected to improve run time by 5-10%. As a further optimisation, the serial bottleneck of checkpointing can be eliminated entirely by offloading the I/O to one worker task while the other tasks continue with the next iteration.
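
The idea of taking checkpoint I/O off the critical path can be sketched as follows. This is a minimal illustration under stated assumptions, not the code's actual implementation: it swaps the dedicated worker task for a single background thread, and the names `checkpoint_async` and `_write_checkpoint` are hypothetical.

```python
import copy
import os
import pickle
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical helper names; not part of any sampler's API.
# Assumption: one background thread stands in for the dedicated worker task.
_writer = ThreadPoolExecutor(max_workers=1)


def _write_checkpoint(state, path):
    """Pickle `state` to a temporary file, then atomically move it to `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(state, handle)
    # os.replace is atomic on the same filesystem, so an interrupted write
    # never leaves a half-written checkpoint at `path`.
    os.replace(tmp_path, path)
    return path


def checkpoint_async(state, path):
    """Snapshot `state` and schedule the write; returns a Future immediately."""
    # The copy is taken up front so later iterations can mutate `state` freely.
    snapshot = copy.deepcopy(state)
    return _writer.submit(_write_checkpoint, snapshot, path)
```

The main loop would call `checkpoint_async(state, path)` and carry on with the next iteration, calling `.result()` on the returned future only before exiting. The in-memory copy is still serial in this sketch, but it is typically much cheaper than the disk write it replaces.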
|
The second serial portion is the processing and updating of points at the end of each iteration. Since the next iteration depends on these points, it cannot begin until this processing is complete. An optimisation requiring minimal effort is to disable the calculation of `n_effective` in the dynesty library, since it is not used for the stopping criteria. This will provide a speedup of roughly 3%. Although this seems small, it is a reduction of the serial portion, which sets the ceiling on parallel speedup. For example, reducing the serial portion of the code from 4% to 1% raises the theoretical maximum parallel speedup from 25x to 100x.
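
The effect of shrinking the serial portion can be checked directly from Amdahl's law. The sketch below assumes the standard form of the law, with `serial_fraction` the fraction of run time that cannot be parallelised and `n_workers` the number of parallel tasks. Both function names are illustrative.

```python
def amdahl_speedup(serial_fraction, n_workers):
    """Speedup on `n_workers` tasks for a given serial fraction (Amdahl's law)."""
    if not 0.0 < serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in (0, 1]")
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


def max_speedup(serial_fraction):
    """Limit of the speedup as the number of workers grows without bound."""
    if not 0.0 < serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in (0, 1]")
    return 1.0 / serial_fraction


if __name__ == "__main__":
    for s in (0.04, 0.01):
        print(f"serial={s:.0%}: limit={max_speedup(s):.0f}x, "
              f"100 workers={amdahl_speedup(s, 100):.1f}x")
```

The limits are the 25x and 100x quoted above. On a finite machine the gain is smaller but still substantial: with 100 workers, the same reduction takes the speedup from roughly 20x to roughly 50x.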