@@ -258,3 +258,35 @@ Here is the speed-up relative to the 1-core job
Notes
* This demonstrates the linear scaling
* This demonstrates that if the code is instructed to use more cores than are actually available, the speed-up reaches a plateau
## Update: 04/12/2020
After it was pointed out that the scaling was linear, but that the gradient was not close to one, I studied the behaviour in a little more detail. First, here is a re-run of the data above showing different gradients:
Here, it looks like the gradient is ~0.5. This is at odds with [the pbilby paper](https://arxiv.org/pdf/1909.11873.pdf), which demonstrated a speed-up close to the theoretically expected behaviour (Eq. 10):
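As a sanity check on what "close to linear with gradient ~1" should look like, here is a sketch of the expected behaviour. The functional form `n_live * ln(1 + n_cores / n_live)` is my reading of the theoretical speed-up for pool-parallelised nested sampling; the value `n_live=1000` is a hypothetical live-point count, not the setting used in these runs.

```python
import math

def expected_speedup(n_cores, n_live=1000):
    """Theoretical speed-up for nested sampling with a pool of n_cores
    workers. Assumes the form S(n) = n_live * ln(1 + n / n_live);
    n_live=1000 is a placeholder, not the value used in these runs."""
    return n_live * math.log(1 + n_cores / n_live)

# While n_cores << n_live, the speed-up is nearly linear with gradient ~1
for n in (1, 4, 8, 16):
    print(n, round(expected_speedup(n), 2))
```

For `n_cores` much smaller than `n_live` this is indistinguishable from a gradient-one line, which is why a measured gradient of ~0.5 stood out.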
After digging in, I realized that the single-core job used about half as many likelihood evaluations as the parallelized version. Here is a table of the number of evaluations:
| n cores | # likelihood evaluations [millions] |
| ------ | ------ |
| 1 | 0.59 |
| 4 | 1.5 |
| 8 | 1.5 |
| 12 | 1.5 |
| 16 | 1.6 |
So, this offers two ways to calculate the speed-up: the usual "total time" method, or a "per-likelihood" method. On a per-likelihood basis, things look much better!
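The two methods can be sketched as follows. The evaluation counts come from the table above; the wall times are hypothetical placeholders (the measured times are in the plot, not reproduced here), so only the structure of the calculation is meaningful.

```python
# Likelihood evaluations (millions), from the table above
evals = {1: 0.59, 4: 1.5, 8: 1.5, 12: 1.5, 16: 1.6}

# Hypothetical wall times in hours -- placeholders, not measured values
wall_time = {1: 10.0, 4: 5.0, 8: 2.6, 12: 1.8, 16: 1.4}

def speedup_total_time(n):
    """Usual speed-up: ratio of serial to parallel wall time."""
    return wall_time[1] / wall_time[n]

def speedup_per_likelihood(n):
    """Speed-up in likelihood evaluations per unit time. This corrects
    for the parallel algorithm needing ~2-3x more evaluations overall."""
    rate_serial = evals[1] / wall_time[1]      # evaluations per hour
    rate_parallel = evals[n] / wall_time[n]
    return rate_parallel / rate_serial

for n in (4, 8, 16):
    print(n, speedup_total_time(n), round(speedup_per_likelihood(n), 2))
```

Because the parallel runs do roughly 2.5x more evaluations, the per-likelihood speed-up sits well above the total-time speed-up by about that factor.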
Of course, what we really care about is "total time". So, some conclusions:
1. The parallel algorithm is different from the serial algorithm.
2. The parallel algorithm is about 2-3 times less efficient than the serial algorithm.
3. This explains the difference in speed-ups (the pbilby speed-ups were measured per-likelihood).
4. It is worth stating: while it is less efficient, the parallel algorithm does let you scale!
5. This suggests the parallel algorithm could be improved, yielding up to a factor of 3 in speed gains.
Note: for the first run of the update, the ratio of likelihood evaluations between the serial and parallel jobs was ~2.8, while for the second run it was ~2.7.