Lazy version of `generate_all_posterior_samples`
The `Read` objects, which represent GW analyses and their posterior samples, have a method `generate_all_posterior_samples`, which takes the sampling parameters and converts them to any missing parameters which can be derived from them (e.g., $\mathcal{M}_c$ from $m_1, m_2$). This can be a slow process, as all samples must be converted for all coordinate options. There is an option to disable the conversions, but often one or more converted coordinates are needed (and I believe all are needed when generating summary pages).
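For concreteness, the chirp-mass conversion mentioned above can be sketched with NumPy. This only illustrates the kind of per-sample work involved; it is not PESummary's own implementation, and the function name `chirp_mass` and the sample values are made up for the example.

```python
import numpy as np


def chirp_mass(mass_1, mass_2):
    """Chirp mass (m1 * m2)^(3/5) / (m1 + m2)^(1/5), evaluated element-wise."""
    mass_1 = np.asarray(mass_1, dtype=float)
    mass_2 = np.asarray(mass_2, dtype=float)
    return (mass_1 * mass_2) ** 0.6 / (mass_1 + mass_2) ** 0.2


# Hypothetical posterior samples for the two component masses (solar masses).
mass_1_samples = np.array([36.0, 30.0, 1.4])
mass_2_samples = np.array([29.0, 30.0, 1.4])

# One derived value per sample; doing this for every derivable parameter
# is what makes converting everything up front slow.
chirp_mass_samples = chirp_mass(mass_1_samples, mass_2_samples)
```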
As a way to speed up evaluation in the scenario where at least one conversion is needed (but not all), the list of converted parameter samples could be replaced with a lazy sequence. Rather than storing the converted samples in the list, we initially store only a list of conversion functions. When `__getitem__` is called for an element for the first time, we call the corresponding function and cache the result for future use.
I've written a generic version of a lazy sequence that should work just fine.

    from collections.abc import Sequence


    class LazySequence(Sequence):
        """A sequence whose elements are computed on first access and then cached."""

        # Sentinel marking elements that have not been computed yet.
        NOT_COMPUTED = object()

        def __init__(self, *functions):
            # Each function takes no arguments and returns one element.
            self._fns = functions
            self._cached_values = [LazySequence.NOT_COMPUTED] * len(functions)
            # A list of the indices 0, ..., length - 1, used to help with slicing.
            self._indices = list(range(len(functions)))

        def __len__(self):
            return len(self._cached_values)

        def __getitem__(self, key):
            if isinstance(key, int):
                output = self._compute_or_load_from_cache(key)
            elif isinstance(key, slice):
                output = [
                    self._compute_or_load_from_cache(i)
                    for i in self._indices[key]
                ]
            else:
                raise TypeError(
                    "LazySequence indices must be integers or slices, not {}".format(
                        type(key).__name__
                    )
                )
            return output

        def _compute_or_load_from_cache(self, index):
            if self._cached_values[index] is LazySequence.NOT_COMPUTED:
                output = self._fns[index]()
                self._cached_values[index] = output
            else:
                output = self._cached_values[index]
            return output

Rather than building the list with the elements, you build the list with a series of functions that will each generate a single element. For example:

    l = LazySequence(lambda: 0, lambda: 1, lambda: 2)
    l[0]   # calls the first function, caches its output, and returns it (0)
    l[0]   # loads the cached value (0)
    l[:2]  # uses the first cached value, and computes the second value ([0, 1])
    l[:2]  # uses all cached values ([0, 1])
    l[:]   # uses the first two cached values, and computes the third ([0, 1, 2])
    l[:]   # uses all cached values ([0, 1, 2])

In PESummary, each of these `lambda` functions would be replaced with a function that computes a coordinate conversion. I'm not well versed in how this is done internally (it seems to happen inside `pesummary.gw.conversions.convert`, which in turn uses `pesummary.gw.conversion._Conversion`), so I'll need some help integrating this if it's wanted.
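As a rough illustration of that wiring, the sketch below binds a samples dictionary to two stand-in conversion functions with `functools.partial`, so each conversion runs only when its element is first requested. The functions `total_mass` and `mass_ratio` and the sample values are hypothetical and do not come from PESummary; the class is a trimmed, integer-index-only copy of `LazySequence` above so that the snippet runs on its own.

```python
from functools import partial

import numpy as np


class LazySequence:
    """Trimmed copy of the class above (integer indices only)."""

    _NOT_COMPUTED = object()

    def __init__(self, *functions):
        self._fns = functions
        self._cache = [self._NOT_COMPUTED] * len(functions)

    def __len__(self):
        return len(self._cache)

    def __getitem__(self, index):
        if self._cache[index] is self._NOT_COMPUTED:
            self._cache[index] = self._fns[index]()
        return self._cache[index]


# Hypothetical conversion functions; PESummary's real ones are not used here.
def total_mass(samples):
    return samples["mass_1"] + samples["mass_2"]


def mass_ratio(samples):
    return samples["mass_2"] / samples["mass_1"]


samples = {"mass_1": np.array([36.0, 30.0]), "mass_2": np.array([29.0, 30.0])}

# partial binds the samples now but defers the arithmetic until first access.
converted = LazySequence(partial(total_mass, samples), partial(mass_ratio, samples))

total = converted[0]  # total mass is computed here and cached
# The mass ratio (index 1) has not been evaluated at this point.
```

One drawback of this arrangement is that callers must know which index corresponds to which parameter, which is part of why a design keyed by parameter name may be preferable.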
An even better approach would not use this generic class, but instead one that is tightly integrated with the `_Conversion` class, so we do not need to pass around lots of functions. I'm happy to help write this and integrate it with PESummary, but again, I'll need some help from a PESummary expert to do it right.
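One possible shape for such a class, offered only as a sketch, is a dictionary keyed by parameter name that derives a missing parameter on first lookup via `dict.__missing__`. Everything here is an assumption for illustration: the class name `LazyConversions`, the `converters` argument, and the parameter names are hypothetical, and the code does not reflect how `_Conversion` actually works.

```python
import numpy as np


class LazyConversions(dict):
    """Dict of samples that derives missing parameters on first lookup.

    Hypothetical sketch: the class name, the `converters` argument, and the
    parameter names are assumptions, not part of PESummary.
    """

    def __init__(self, samples, converters):
        super().__init__(samples)
        self._converters = converters  # name -> function(mapping) -> array

    def __missing__(self, name):
        # dict.__getitem__ calls this only when `name` is not stored yet.
        if name not in self._converters:
            raise KeyError(name)
        value = self._converters[name](self)
        self[name] = value  # cache, so later lookups bypass __missing__
        return value


converters = {
    "total_mass": lambda s: s["mass_1"] + s["mass_2"],
    # Depends on another derived parameter, which is resolved lazily too.
    "symmetric_mass_ratio": lambda s: s["mass_1"] * s["mass_2"] / s["total_mass"] ** 2,
}

posterior = LazyConversions(
    {"mass_1": np.array([36.0, 30.0]), "mass_2": np.array([29.0, 30.0])},
    converters,
)

eta = posterior["symmetric_mass_ratio"]  # also triggers and caches total_mass
```

Keying by name means a conversion that depends on another derived parameter pulls it in automatically, and only the parameters actually requested are ever computed.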