Commit 8aa8546f authored by Moritz Huebner's avatar Moritz Huebner

Merge remote-tracking branch 'origin/master'

parents 99bb6a62 c7893edf
Pipeline #20612 passed with stages
in 12 minutes and 41 seconds
@@ -9,6 +9,7 @@ Welcome to tupak's documentation!
:caption: Contents:
basics-of-parameter-estimation
tupak-output
likelihood
samplers
writing-documentation
......
============
Tupak output
============
In this document, we will describe what :code:`tupak` outputs, where it is stored,
and how you can access it.
When you call :code:`run_sampler`, there are two arguments, :code:`outdir` and :code:`label`, which
are used in generating all the file names for saved data. In the rest of this
document, we'll assume the defaults were used (which are :code:`outdir` and
:code:`label`).
The raw data
------------
First off, the primary data dump of :code:`tupak` goes into :code:`outdir/label_result.h5`.
This is a binary file, so it isn't human readable, but you can inspect it with the
command-line tool :code:`ddls` [#f1]_ (you have this installed already if you have installed
the :code:`tupak` requirements). To have a look at what is inside, run
.. code-block:: console

   $ ddls outdir/label_result.h5
   /fixed_parameter_keys list
   /fixed_parameter_keys/i0 'dec' (3) [ascii]
   /fixed_parameter_keys/i1 'psi' (3) [ascii]
   /fixed_parameter_keys/i2 'a_2' (3) [ascii]
   /fixed_parameter_keys/i3 'a_1' (3) [ascii]
   /fixed_parameter_keys/i4 'geocent_time' (12) [ascii]
   /fixed_parameter_keys/i5 'phi_jl' (6) [ascii]
   /fixed_parameter_keys/i6 'ra' (2) [ascii]
   /fixed_parameter_keys/i7 'phase' (5) [ascii]
   /fixed_parameter_keys/i8 'phi_12' (6) [ascii]
   /fixed_parameter_keys/i9 'tilt_2' (6) [ascii]
   /fixed_parameter_keys/i10 'tilt_1' (6) [ascii]
   /injection_parameters dict
   /injection_parameters/a_1 0.4 [float64]
   /injection_parameters/a_2 0.3 [float64]
   /injection_parameters/dec -1.2108 [float64]
   /injection_parameters/geocent_time 1126259642.413 [float64]
   /injection_parameters/iota 0.4 [float64]
   /injection_parameters/luminosity_distance 4000.0 [float64]
   /injection_parameters/mass_1 36.0 [float64]
   /injection_parameters/mass_2 29.0 [float64]
   /injection_parameters/phase 1.3 [float64]
   /injection_parameters/phi_12 1.7 [float64]
   /injection_parameters/phi_jl 0.3 [float64]
   /injection_parameters/psi 2.659 [float64]
   /injection_parameters/ra 1.375 [float64]
   /injection_parameters/reference_frequency 50.0 [float64]
   /injection_parameters/tilt_1 0.5 [float64]
   /injection_parameters/tilt_2 1.0 [float64]
   /injection_parameters/waveform_approximant 'IMRPhenomPv2' (12) [ascii]
   /kwargs dict
   /kwargs/bound 'multi' (5) [ascii]
   /kwargs/dlogz 0.1 [float64]
   /kwargs/nlive 1000 [int64]
   /kwargs/sample 'rwalk' (5) [ascii]
   /kwargs/update_interval 600 [int64]
   /kwargs/verbose True [bool]
   /kwargs/walks 20 [int64]
   /label 'basic_tutorial' (14) [ascii]
   /log_bayes_factor 29.570224130853056 [float64]
   /logz -12042.875089354413 [float64]
   /logzerr 0.040985337772488764 [float64]
   /noise_logz -12072.445313485267 [float64]
   /outdir 'outdir' (6) [ascii]
   /parameter_labels list
   /parameter_labels/i0 '$d_L$' (5) [ascii]
   /parameter_labels/i1 '$m_2$' (5) [ascii]
   /parameter_labels/i2 '$m_1$' (5) [ascii]
   /parameter_labels/i3 '$\\iota$' (7) [ascii]
   /posterior DataFrame (4, 15073)
   /search_parameter_keys list
   /search_parameter_keys/i0 'luminosity_distance' (19) [ascii]
   /search_parameter_keys/i1 'mass_2' (6) [ascii]
   /search_parameter_keys/i2 'mass_1' (6) [ascii]
   /search_parameter_keys/i3 'iota' (4) [ascii]

This shows the different data that is stored in the :code:`h5` file. You can think of
the file like a Python dictionary - it's a bag with lots of different kinds of
data which can be accessed via a :code:`key` (a string). We use `deepdish
<http://deepdish.io/>`_ to handle the saving of :code:`h5` files in :code:`tupak`. In Python,
you can load any :code:`h5` file and access its contents like a dictionary::
>>> import deepdish
>>> output = deepdish.io.load('outdir/label_result.h5')
>>> print(output['logz'])
-12042.875089354413
Here we printed the stored data for the :code:`'logz'` key, but you could equally get
the :code:`posterior` or any other attribute. For a full list, print
:code:`output.keys()`.
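The stored :code:`log_bayes_factor` shown in the :code:`ddls` listing is simply the difference between the signal and noise log evidences (:code:`logz` and :code:`noise_logz`); a quick sanity check using the values from the listing above:

```python
# Values copied from the example result file listed above
logz = -12042.875089354413
noise_logz = -12072.445313485267

# The log Bayes factor compares the signal hypothesis against the
# noise-only hypothesis; positive values favour the signal model
log_bayes_factor = logz - noise_logz
print(log_bayes_factor)  # matches the stored /log_bayes_factor entry
```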
Reading in a result file
------------------------
Rather than reading in the raw :code:`h5` file, you may find it more convenient to
instead load a :code:`*_result.h5` file as a :code:`tupak` :code:`result` object (i.e., like the output
of :code:`run_sampler`). To do this::
>>> import tupak
>>> result = tupak.result.read_in_result(outdir=outdir, label=label)
>>> result = tupak.result.read_in_result(filename='outdir/label_result.h5')
Note, these two lines are equivalent, but show two different ways to read in
the data. Using this method, :code:`result` is now a :code:`tupak.result.Result` instance
with all the associated methods and attributes. So, for example, you could call
:code:`result.plot_corner()` to generate a corner plot.
Accessing samples directly
--------------------------
To get the samples for a particular parameter, use::
>>> result.posterior[key]
where :code:`key` is a string of the parameter name you are interested in. The
:code:`posterior` attribute is a :code:`pandas.DataFrame`, so if you want to return just
a :code:`numpy` array, use::
>>> result.posterior[key].values
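Once you have the samples as a :code:`numpy` array, standard summary statistics apply directly. A minimal sketch, using synthetic samples in place of a real :code:`result.posterior[key].values` array (the numbers here are made up):

```python
import numpy as np

# Synthetic stand-in for result.posterior['mass_1'].values
# (a real array would come from the loaded result object)
samples = np.random.default_rng(42).normal(36.0, 1.5, size=5000)

# Posterior median and a 90% credible interval from the 5th/95th percentiles
median = np.median(samples)
lower, upper = np.percentile(samples, [5, 95])
print("median = {:.2f}, 90% CI = [{:.2f}, {:.2f}]".format(median, lower, upper))
```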
Saving to ASCII text
--------------------
You may wish to get hold of the samples in a simple text file; this can be
done via::
result.save_posterior_samples()
which will generate a file :code:`outdir/label_posterior_samples.txt`.
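Since the samples are written with :code:`pandas`' :code:`to_csv` defaults, the file can be read straight back into a :code:`DataFrame`. A sketch, writing a small stand-in file rather than a real run output:

```python
import os
import tempfile

import pandas as pd

# A tiny stand-in for the file written by result.save_posterior_samples()
# (hypothetical values; a real file comes from an actual run)
tmpdir = tempfile.mkdtemp()
filename = os.path.join(tmpdir, 'label_posterior_samples.txt')
pd.DataFrame({'mass_1': [35.8, 36.2], 'mass_2': [28.9, 29.1]}).to_csv(
    filename, index=False, header=True)

# Read the samples back as a DataFrame
samples = pd.read_csv(filename)
print(samples['mass_1'].values)
```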
.. rubric:: Footnotes
.. [#f1] :code:`ddls` is an acronym for deep-dish ls
@@ -29,10 +29,10 @@ injection_parameters = dict(mass_1=36., mass_2=29., a_1=0.4, a_2=0.3, tilt_1=0.5
waveform_approximant='IMRPhenomPv2', reference_frequency=50., ra=1.375, dec=-1.2108)
# Create the waveform_generator using a LAL BinaryBlackHole source function
waveform_generator = tupak.waveform_generator.WaveformGenerator(time_duration=time_duration,
sampling_frequency=sampling_frequency,
frequency_domain_source_model=tupak.source.lal_binary_black_hole,
parameters=injection_parameters)
waveform_generator = tupak.WaveformGenerator(time_duration=time_duration,
sampling_frequency=sampling_frequency,
frequency_domain_source_model=tupak.source.lal_binary_black_hole,
parameters=injection_parameters)
hf_signal = waveform_generator.frequency_domain_strain()
# Set up interferometers. In this case we'll use three interferometers (LIGO-Hanford (H1), LIGO-Livingston (L1),
@@ -55,11 +55,11 @@ for key in ['a_1', 'a_2', 'tilt_1', 'tilt_2', 'phi_12', 'phi_jl', 'phase', 'psi'
priors['luminosity_distance'] = tupak.prior.create_default_prior(name='luminosity_distance')
# Initialise the likelihood by passing in the interferometer data (IFOs) and the waveform generator
likelihood = tupak.likelihood.GravitationalWaveTransient(interferometers=IFOs, waveform_generator=waveform_generator)
likelihood = tupak.GravitationalWaveTransient(interferometers=IFOs, waveform_generator=waveform_generator)
# Run sampler. In this case we're going to use the `dynesty` sampler
result = tupak.sampler.run_sampler(likelihood=likelihood, priors=priors, sampler='dynesty', npoints=1000,
injection_parameters=injection_parameters, outdir=outdir, label=label)
result = tupak.run_sampler(likelihood=likelihood, priors=priors, sampler='dynesty', npoints=1000,
injection_parameters=injection_parameters, outdir=outdir, label=label)
# make some plots of the outputs
result.plot_corner()
......
@@ -44,20 +44,19 @@ prior['luminosity_distance'] = tupak.prior.PowerLaw(
# creates the frequency-domain strain. In this instance, we are using the
# `lal_binary_black_hole model` source model. We also pass other parameters:
# the waveform approximant and reference frequency.
waveform_generator = tupak.waveform_generator.WaveformGenerator(time_duration=interferometers[0].duration,
sampling_frequency=interferometers[
0].sampling_frequency,
frequency_domain_source_model=tupak.source.lal_binary_black_hole,
parameters={'waveform_approximant': 'IMRPhenomPv2',
'reference_frequency': 50})
waveform_generator = tupak.WaveformGenerator(time_duration=interferometers[0].duration,
sampling_frequency=interferometers[0].sampling_frequency,
frequency_domain_source_model=tupak.source.lal_binary_black_hole,
parameters={'waveform_approximant': 'IMRPhenomPv2',
'reference_frequency': 50})
# In this step, we define the likelihood. Here we use the standard likelihood
# function, passing it the data and the waveform generator.
likelihood = tupak.likelihood.GravitationalWaveTransient(interferometers, waveform_generator)
likelihood = tupak.GravitationalWaveTransient(interferometers, waveform_generator)
# Finally, we run the sampler. This function takes the likelihood and prior
# along with some options for how to do the sampling and how to save the data
result = tupak.sampler.run_sampler(likelihood, prior, sampler='dynesty',
outdir=outdir, label=label)
result = tupak.run_sampler(likelihood, prior, sampler='dynesty',
outdir=outdir, label=label)
result.plot_corner()
print(result)
@@ -10,5 +10,5 @@ t0 = tupak.utils.get_event_time("GW150914")
prior = dict(geocent_time=tupak.prior.Uniform(t0-0.1, t0+0.1, name='geocent_time'))
interferometers = tupak.detector.get_event_data("GW150914")
likelihood = tupak.likelihood.get_binary_black_hole_likelihood(interferometers)
result = tupak.sampler.run_sampler(likelihood, prior, label='GW150914')
result = tupak.run_sampler(likelihood, prior, label='GW150914')
result.plot_corner()
@@ -55,7 +55,7 @@ fig.savefig('{}/{}_data.png'.format(outdir, label))
# our model.
class GaussianLikelihood(tupak.likelihood.Likelihood):
class GaussianLikelihood(tupak.Likelihood):
def __init__(self, x, y, sigma, waveform_generator):
"""
@@ -91,7 +91,7 @@ class GaussianLikelihood(tupak.likelihood.Likelihood):
# can generate a signal. We give it information on how to make the time series
# and the model() we wrote earlier.
waveform_generator = tupak.waveform_generator.WaveformGenerator(
waveform_generator = tupak.WaveformGenerator(
time_duration=time_duration, sampling_frequency=sampling_frequency,
time_domain_source_model=model)
@@ -106,7 +106,7 @@ priors['m'] = tupak.prior.Uniform(0, 5, 'm')
priors['c'] = tupak.prior.Uniform(-2, 2, 'c')
# And run sampler
result = tupak.sampler.run_sampler(
result = tupak.run_sampler(
likelihood=likelihood, priors=priors, sampler='dynesty', npoints=500,
walks=10, injection_parameters=injection_parameters, outdir=outdir,
label=label, plot=True)
......
@@ -21,3 +21,8 @@ from . import waveform_generator
from . import result
from . import sampler
from . import conversion
# import a few oft-used functions and classes to simplify scripts
from .likelihood import Likelihood, GravitationalWaveTransient
from .waveform_generator import WaveformGenerator
from .sampler import run_sampler
import tupak
import numpy as np
import lalsimulation as lalsim
import pandas as pd
import logging
try:
import lalsimulation as lalsim
except ImportError:
logging.warning("You do not have lalsuite installed currently. You will "
"not be able to use some of the prebuilt functions.")
def convert_to_lal_binary_black_hole_parameters(parameters, search_keys, remove=True):
"""
......
@@ -557,7 +557,7 @@ def test_redundancy(key, prior):
return redundant
def write_priors_to_file(priors, outdir):
def write_priors_to_file(priors, outdir, label):
"""
Write the prior distribution to file.
@@ -565,13 +565,12 @@ def write_priors_to_file(priors, outdir):
----------
priors: dict
priors used
outdir: str
output directory
outdir, label: str
output directory and label
"""
if outdir[-1] != "/":
outdir += "/"
prior_file = outdir + "prior.txt"
logging.info("Writing priors to {}".format(prior_file))
prior_file = os.path.join(outdir, "{}_prior.txt".format(label))
logging.debug("Writing priors to {}".format(prior_file))
with open(prior_file, "w") as outfile:
for key in priors:
outfile.write("prior['{}'] = {}\n".format(key, priors[key]))
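The change above from manual trailing-slash handling to :code:`os.path.join` means the caller no longer has to worry about whether :code:`outdir` ends in a separator. A minimal illustration (the joined paths shown assume a POSIX system):

```python
import os

# os.path.join inserts exactly one separator between components...
a = os.path.join('outdir', 'GW150914_prior.txt')
# ...and does not double it if the directory already ends with one
b = os.path.join('outdir/', 'GW150914_prior.txt')
print(a)
print(b)
```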
@@ -39,7 +39,8 @@ class Result(dict):
def __init__(self, dictionary=None):
if type(dictionary) is dict:
for key in dictionary:
setattr(self, key, self._decode_object(dictionary[key]))
val = self._standardise_strings(dictionary[key], key)
setattr(self, key, val)
def __getattr__(self, name):
try:
@@ -62,30 +63,27 @@ class Result(dict):
else:
return ''
def _decode_object(self, item):
""" When reading in data, ensure all bytes are decoded to strings """
try:
def _standardise_a_string(self, item):
""" When reading in data, ensure all strings are decoded correctly """
if type(item) in [bytes]:
return item.decode()
except AttributeError:
pass
try:
return [i.decode() for i in item]
except (AttributeError, TypeError):
pass
else:
return item
logging.debug("Unable to decode item")
def _standardise_strings(self, item, name=None):
if type(item) in [list]:
item = [self._standardise_a_string(i) for i in item]
#logging.debug("Unable to decode item {}".format(name))
return item
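The refactor above splits the decoding into a per-item step (:code:`_standardise_a_string`) and a list-mapping step (:code:`_standardise_strings`). The core pattern can be sketched standalone (these module-level names are illustrative, not the class methods themselves):

```python
def standardise_a_string(item):
    # h5 files read string data back as bytes; decode those, pass others through
    if isinstance(item, bytes):
        return item.decode()
    return item


def standardise_strings(item):
    # Lists (e.g. parameter-key lists) are decoded element-wise
    if isinstance(item, list):
        return [standardise_a_string(i) for i in item]
    return standardise_a_string(item)


print(standardise_strings([b'mass_1', b'mass_2', 'iota']))
```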
def get_result_dictionary(self):
return dict(self)
def save_to_file(self, outdir, label):
def save_to_file(self):
""" Writes the Result to a deepdish h5 file """
file_name = result_file_name(outdir, label)
if os.path.isdir(outdir) is False:
os.makedirs(outdir)
file_name = result_file_name(self.outdir, self.label)
if os.path.isdir(self.outdir) is False:
os.makedirs(self.outdir)
if os.path.isfile(file_name):
logging.info(
'Renaming existing file {} to {}.old'.format(file_name,
@@ -100,6 +98,10 @@ class Result(dict):
"\n\n Saving the data has failed with the following message:\n {} \n\n"
.format(e))
def save_posterior_samples(self):
filename = '{}/{}_posterior_samples.txt'.format(self.outdir, self.label)
self.posterior.to_csv(filename, index=False, header=True)
def get_latex_labels_from_parameter_keys(self, keys):
return_list = []
for k in keys:
@@ -186,20 +188,10 @@ class Result(dict):
"""
logging.warning("plot_distributions deprecated")
def write_prior_to_file(self, outdir):
"""
Write the prior distribution to file.
:return:
"""
outfile = outdir + '.prior'
with open(outfile, "w") as prior_file:
for key in self.prior:
prior_file.write(self.prior[key])
def samples_to_data_frame(self, likelihood=None, priors=None, conversion_function=None):
def samples_to_posterior(self, likelihood=None, priors=None,
conversion_function=None):
"""
Convert array of samples to data frame.
Convert array of samples to posterior (a Pandas data frame).
Parameters
----------
@@ -211,7 +203,8 @@ class Result(dict):
Function which adds in extra parameters to the data frame,
should take the data_frame, likelihood and prior as arguments.
"""
data_frame = pd.DataFrame(self.samples, columns=self.search_parameter_keys)
data_frame = pd.DataFrame(
self.samples, columns=self.search_parameter_keys)
if conversion_function is not None:
conversion_function(data_frame, likelihood, priors)
self.posterior = data_frame
@@ -235,7 +228,7 @@ class Result(dict):
(4 * self.posterior.q + 3) / (3 * self.posterior.q + 4) * self.posterior.q
* self.posterior.a_2 * np.sin(self.posterior.tilt_2))
def check_attribute_match_to_other_object(self, name, other_object):
def _check_attribute_match_to_other_object(self, name, other_object):
""" Check attribute name exists in other_object and is the same """
A = getattr(self, name, False)
B = getattr(other_object, name, False)
......
@@ -51,18 +51,18 @@ class Sampler(object):
self.__search_parameter_keys = []
self.__fixed_parameter_keys = []
self.initialise_parameters()
self.verify_parameters()
self._initialise_parameters()
self._verify_parameters()
self.kwargs = kwargs
self.check_cached_result()
self._check_cached_result()
self.log_summary_for_sampler()
self._log_summary_for_sampler()
if os.path.isdir(outdir) is False:
os.makedirs(outdir)
self.result = self.initialise_result()
self.result = self._initialise_result()
@property
def search_parameter_keys(self):
@@ -102,7 +102,7 @@ class Sampler(object):
raise TypeError('sampler must either be a string referring to built in sampler or a custom made class that '
'inherits from sampler')
def verify_kwargs_against_external_sampler_function(self):
def _verify_kwargs_against_external_sampler_function(self):
args = inspect.getargspec(self.external_sampler_function).args
bad_keys = []
for user_input in self.kwargs.keys():
@@ -114,7 +114,7 @@ class Sampler(object):
for key in bad_keys:
self.kwargs.pop(key)
def initialise_parameters(self):
def _initialise_parameters(self):
for key in self.priors:
if isinstance(self.priors[key], Prior) is True \
@@ -132,7 +132,7 @@ class Sampler(object):
for key in self.__fixed_parameter_keys:
logging.info(' {} = {}'.format(key, self.priors[key].peak))
def initialise_result(self):
def _initialise_result(self):
result = Result()
result.search_parameter_keys = self.__search_parameter_keys
result.fixed_parameter_keys = self.__fixed_parameter_keys
@@ -144,7 +144,7 @@ class Sampler(object):
result.kwargs = self.kwargs
return result
def verify_parameters(self):
def _verify_parameters(self):
for key in self.priors:
try:
self.likelihood.parameters[key] = self.priors[key].sample()
@@ -192,13 +192,13 @@ class Sampler(object):
logging.info('Prior draw {} has inf prior'.format(draw))
return draw
def run_sampler(self):
def _run_external_sampler(self):
pass
def _run_test(self):
raise ValueError("Method not yet implemented")
def check_cached_result(self):
def _check_cached_result(self):
""" Check if the cached data file exists and can be used """
if utils.command_line_args.clean:
@@ -221,14 +221,14 @@ class Sampler(object):
'kwargs']
use_cache = True
for key in check_keys:
if self.cached_result.check_attribute_match_to_other_object(
if self.cached_result._check_attribute_match_to_other_object(
key, self) is False:
logging.debug("Cached value {} is unmatched".format(key))
use_cache = False
if use_cache is False:
self.cached_result = None
def log_summary_for_sampler(self):
def _log_summary_for_sampler(self):
if self.cached_result is None:
logging.info("Using sampler {} with kwargs {}".format(
self.__class__.__name__, self.kwargs))
@@ -250,14 +250,14 @@ class Nestle(Sampler):
if equiv in self.__kwargs:
self.__kwargs['npoints'] = self.__kwargs.pop(equiv)
def run_sampler(self):
def _run_external_sampler(self):
nestle = self.external_sampler
self.external_sampler_function = nestle.sample
if 'verbose' in self.kwargs:
if self.kwargs['verbose']:
self.kwargs['callback'] = nestle.print_progress
self.kwargs.pop('verbose')
self.verify_kwargs_against_external_sampler_function()
self._verify_kwargs_against_external_sampler_function()
out = self.external_sampler_function(
loglikelihood=self.log_likelihood,
@@ -304,7 +304,40 @@ class Dynesty(Sampler):
if 'update_interval' not in self.__kwargs:
self.__kwargs['update_interval'] = int(0.6 * self.__kwargs['nlive'])
def run_sampler(self):
def _print_func(self, results, niter, ncall, dlogz, *args, **kwargs):
""" Replacing status update for dynesty.result.print_func """
# Extract results at the current iteration.
(worst, ustar, vstar, loglstar, logvol, logwt,
logz, logzvar, h, nc, worst_it, boundidx, bounditer,
eff, delta_logz) = results
# Adjusting outputs for printing.
if delta_logz > 1e6:
delta_logz = np.inf
if logzvar >= 0. and logzvar <= 1e6:
logzerr = np.sqrt(logzvar)
else:
logzerr = np.nan
if logz <= -1e6:
logz = -np.inf
if loglstar <= -1e6:
loglstar = -np.inf
if self.use_ratio:
key = 'logz ratio'
else:
key = 'logz'
# Constructing output.
print_str = "\r {}| {}={:6.3f} +/- {:6.3f} | dlogz: {:6.3f} > {:6.3f}".format(
niter, key, logz, logzerr, delta_logz, dlogz)
# Printing.
sys.stderr.write(print_str)
sys.stderr.flush()
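The replacement status line built in :code:`_print_func` is a single carriage-return-prefixed string, so each update overwrites the previous one in the terminal. The formatting itself can be checked in isolation (the values below are hypothetical):

```python
# Hypothetical values standing in for one dynesty iteration
niter, key, logz, logzerr, delta_logz, dlogz = 100, 'logz', -12042.875, 0.041, 0.5, 0.1

# Same format string as _print_func above; '\r' returns the cursor to the
# start of the line so successive writes overwrite each other
print_str = "\r {}| {}={:6.3f} +/- {:6.3f} | dlogz: {:6.3f} > {:6.3f}".format(
    niter, key, logz, logzerr, delta_logz, dlogz)
print(repr(print_str))
```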
def _run_external_sampler(self):
dynesty = self.external_sampler
if self.kwargs.get('dynamic', False) is False:
@@ -314,7 +347,8 @@ class Dynesty(Sampler):
ndim=self.ndim, **self.kwargs)
nested_sampler.run_nested(
dlogz=self.kwargs['dlogz'],
print_progress=self.kwargs['verbose'])
print_progress=self.kwargs['verbose'],
print_func=self._print_func)
else:
nested_sampler = dynesty.DynamicNestedSampler(
loglikelihood=self.log_likelihood,
@@ -383,10 +417,10 @@ class Pymultinest(Sampler):
if equiv in self.__kwargs:
self.__kwargs['n_live_points'] = self.__kwargs.pop(equiv)
def run_sampler(self):
def _run_external_sampler(self):
pymultinest = self.external_sampler
self.external_sampler_function = pymultinest.run
self.verify_kwargs_against_external_sampler_function()
self._verify_kwargs_against_external_sampler_function()
# Note: pymultinest.solve adds some extra steps, but underneath
# we are calling pymultinest.run - hence why it is used in checking
# the arguments.
@@ -404,7 +438,7 @@ class Pymultinest(Sampler):
class Ptemcee(Sampler):
def run_sampler(self):
def _run_external_sampler(self):
ntemps = self.kwargs.pop('ntemps', 2)
nwalkers = self.kwargs.pop('nwalkers', 100)
nsteps = self.kwargs.pop('nsteps', 100)
@@ -495,7 +529,7 @@ def run_sampler(likelihood, priors=None, label='label', outdir='outdir',
if priors is None:
priors = dict()
priors = fill_priors(priors, likelihood)
tupak.prior.write_priors_to_file(priors, outdir)
tupak.prior.write_priors_to_file(priors, outdir, label)
if implemented_samplers.__contains__(sampler.title()):
sampler_class = globals()[sampler.title()]
@@ -510,7 +544,7 @@ def run_sampler(likelihood, priors=None, label='label', outdir='outdir',
if utils.command_line_args.test:
result = sampler._run_test()
else:
result = sampler.run_sampler()
result = sampler._run_external_sampler()
result.noise_logz = likelihood.noise_log_likelihood()
if use_ratio:
@@ -524,9 +558,10 @@ def run_sampler(likelihood, priors=None, label='label', outdir='outdir',
conversion_function(result.injection_parameters)
result.fixed_parameter_keys = sampler.fixed_parameter_keys
# result.prior = prior # Removed as this breaks the saving of the data
result.samples_to_data_frame(likelihood=likelihood, priors=priors, conversion_function=conversion_function)
result.samples_to_posterior(likelihood=likelihood, priors=priors,
conversion_function=conversion_function)
result.kwargs = sampler.kwargs
result.save_to_file(outdir=outdir, label=label)
result.save_to_file()
if plot:
result.plot_corner()
return result
......
@@ -5,8 +5,8 @@ import logging
try:
import lalsimulation as lalsim
except ImportError:
logging.warning("You do not have lalsuite installed currently. You will not be able to use some of the "
"prebuilt functions.")
logging.warning("You do not have lalsuite installed currently. You will "
"not be able to use some of the prebuilt functions.")