Performance improvement for SamplesDict and reading
Hello,
I have been using this package for reading the O4a online PE hdf5 files and found that reading them (using the pesummary.io.read
function) was fairly slow. Using profiling tools, I found that half of the time was spent within lalinference (see this issue for more details lalsuite#738 (closed)), and half was spent dealing with the specific structures from this package (List
, Dict
and so on). Those incur a quite significant overhead.
This PR aims to improve the performance of SamplesDict
by adding some caching; indeed, each time an element is accessed, the keys
are retrieved (this behavior is inherited for Dict
, which does so because it checks whether the keys that were passed to it are actual keys in order to raise a warning (which does not seem like a very desirable behavior, but anyway)); however, for SamplesDict
in particular, retrieving the keys is not free, as by default it filters the keys starting by an underscore. With this PR, the filtered keys are cached, and the cache is reset when an element of the dict is set or deleted.
Combined with the changes to lalinference, I am able to obtain something like a 4-fold improvement in performance.
EDIT : Was not a good idea in the end. See comment below.
Caveats:
- If the dict is modified in some arcane way (unsure of whether this is possible), the keys may not be correct anymore.
Possible (untested) alternatives:
Somehow callingkeys
with theremove_debug=False
argument. Not trivial as this would break for other objects thanSamplesDict
. This could be achieved by making False the default value, but this may break something else.Grouping calls tokeys
: could also be done now; this probably divides the number of calls by 2 in total, which I think is less significant than the current improvement.- Removing this behavior of
Dict
altogether, allowing it to err if any of the keys passed to it is not in the dictionary. I do not know whether this behavior is used somewhere but it seems very error prone. - Figure out some way to get the same behavior without
keys
; probably using theget
method.
In any case, no matter the solution that gets chosen, I think this is an issue to be aware of as it may affect anyone using this to load PE results.