Commit ed68fc54 authored by Charlie Hoy's avatar Charlie Hoy
Browse files

Merge branch 'classification_module' into 'master'

Combine `pesummary.gw.pepredicates` and `pesummary.gw.p_astro` modules into a single `pesummary.gw.classification` module

See merge request !647
parents da8b25d3 9324b5c7
Pipeline #474233 failed with stages
in 131 minutes and 2 seconds
......@@ -42,4 +42,5 @@ classification probabilities with,
Where the `default` classification is generated by using the samples directly,
and the `population` classification is generated by reweighing the samples to a
population based prior.
population based prior. For more details see the
`pesummary.gw.classification <./classification.html>`_ module.
==================================
pesummary.gw.classification module
==================================
The :code:`pesummary.gw.classification` module allows for classification
probabilities to be generated directly from a set of posterior samples. We
calculate probability that the source is a binary black hole (BBH),
binary neutron star (BNS), neutron star black hole (NSBH) and/or lies
within the lower mass gap (MassGap) by interacting with the
`PEPredicates <https://git.ligo.org/will-farr/pepredicates>`_ package. We also
calculate the probability that the binary contains at least one neutron star
(HasNS) and the probability that the binary has a visible remnant
(HasRemnant) by interacting with the
`p-astro <https://git.ligo.org/lscsoft/p-astro>`_ package. :code:`pesummary`
provides 3 classes for calculating classification probabilities:
:code:`pesummary.gw.classification.PEPredicates`,
:code:`pesummary.gw.classification.PAstro` and
:code:`pesummary.gw.classification.Classify` as well as a helper
:code:`pesummary.gw.classification.classify` function. We discuss each of them
below.
`pesummary.gw.classification.PEPredicates`
------------------------------------------
We may calculate the probability that the binary is a BBH, BNS, NSBH and/or
lies within the lower mass gap by passing a set of posterior samples to the
:code:`pesummary.gw.classification.PEPredicates` class,
.. code-block:: python
>>> from pesummary.gw.classification import PEPredicates
>>> x = PEPredicates(
... {
... "mass_1_source": [20, 30], "mass_2_source": [10, 20],
... "a_1": [0.5, 0.2], "a_2": [0.3, 0.1], "redshift": [0.4, 0.2]
... }
... )
>>> print(x.classification())
{'BNS': 0.0, 'NSBH': 0.0, 'BBH': 1.0, 'MassGap': 0.0}
We may also see how these probabilities change if we reweigh the posterior
samples to a population inferred prior by passing the
:code:`population=True` kwarg,
.. code-block:: python
>>> print(x.classification(population=True))
{'BNS': 0.0, 'NSBH': 0.0, 'BBH': 1.0, 'MassGap': 0.0}
If we wish to calculate the probabilities for both the raw samples and the
reweighted posterior samples in a single command, we can use the
:code:`dual_classification()` method,
.. code-block:: python
>>> print(x.dual_classification())
{'default': {'BNS': 0.0, 'NSBH': 0.0, 'BBH': 1.0, 'MassGap': 0.0}, 'population': {'BNS': 0.0, 'NSBH': 0.0, 'BBH': 1.0, 'MassGap': 0.0}}
.. autoclass:: pesummary.gw.classification.PEPredicates
:members:
`pesummary.gw.classification.PAstro`
------------------------------------
Similar to the :code:`pesummary.gw.classification.PEPredicates` class
the :code:`pesummary.gw.classification.PAstro` class can be used to calculate
the probability that the source has a neutron star and visible remnant by
passing a set of posterior samples,
.. code-block:: python
>>> from pesummary.gw.classification import PAstro
>>> x = PAstro(
... {
... "mass_1_source": [20, 30], "mass_2_source": [10, 20],
... "a_1": [0.5, 0.2], "a_2": [0.3, 0.1], "redshift": [0.4, 0.2]
... }
... )
>>> print(x.classification())
{'HasNS': 0.0, 'HasRemnant': 0.0}
We may again calculate the probabilities with samples reweighted to a population
prior with,
.. code-block:: python
>>> print(x.classification(population=True))
{'HasNS': 0.0, 'HasRemnant': 0.0}
and the combination can be printed with,
.. code-block:: python
>>> print(x.dual_classification())
{'default': {'HasNS': 0.0, 'HasRemnant': 0.0}, 'population': {'HasNS': 0.0, 'HasRemnant': 0.0}}
.. autoclass:: pesummary.gw.classification.PAstro
:members:
`pesummary.gw.classification.Classify`
--------------------------------------
The :code:`pesummary.gw.classification.Classify` class combines the
:code:`pesummary.gw.classification.PEPredicates` and
:code:`pesummary.gw.classification.PAstro` classes into one and returns the
probability that the binary is a BBH, BNS, NSBH and/or lies within the lower
mass gap as well as probability that the source has a neutron star and visible
remnant. For example,
.. code-block:: python
>>> from pesummary.gw.classification import Classify
>>> x = Classify(
... {
... "mass_1_source": [20, 30], "mass_2_source": [10, 20],
... "a_1": [0.5, 0.2], "a_2": [0.3, 0.1], "redshift": [0.4, 0.2]
... }
... )
>>> print(x.classification())
{'BNS': 0.0, 'NSBH': 0.0, 'BBH': 1.0, 'MassGap': 0.0, 'HasNS': 0.0, 'HasRemnant': 0.0}
>>> print(x.classification(population=True))
{'BNS': 0.0, 'NSBH': 0.0, 'BBH': 1.0, 'MassGap': 0.0, 'HasNS': 0.0, 'HasRemnant': 0.0}
>>> print(x.dual_classification())
{'default': {'BNS': 0.0, 'NSBH': 0.0, 'BBH': 1.0, 'MassGap': 0.0, 'HasNS': 0.0, 'HasRemnant': 0.0}, 'population': {'BNS': 0.0, 'NSBH': 0.0, 'BBH': 1.0, 'MassGap': 0.0, 'HasNS': 0.0, 'HasRemnant': 0.0}}
.. autoclass:: pesummary.gw.classification.Classify
:members:
`pesummary.gw.classification.classify`
--------------------------------------
The :code:`pesummary.gw.classification.classify` function provides an easy-to-use
interface to the :code:`classification` method provides by the
:code:`pesummary.gw.classification.Classify` class. For example,
. code-block:: python
>>> from pesummary.gw.classification import classify
>>> posterior = {
... "mass_1_source": [20, 30], "mass_2_source": [10, 20],
... "a_1": [0.5, 0.2], "a_2": [0.3, 0.1], "redshift": [0.4, 0.2]
... }
>>> print(classify(posterior))
{'BNS': 0.0, 'NSBH': 0.0, 'BBH': 1.0, 'MassGap': 0.0, 'HasNS': 0.0, 'HasRemnant': 0.0}
>>> print(classify(posterior, population=True))
{'BNS': 0.0, 'NSBH': 0.0, 'BBH': 1.0, 'MassGap': 0.0, 'HasNS': 0.0, 'HasRemnant': 0.0}
.. autofunction:: pesummary.gw.classification.classify
:members:
......@@ -4,9 +4,9 @@
import os
import pesummary
from pesummary.core.cli.inputs import _Input
from pesummary.gw.file.read import read as GWRead
from pesummary.gw.pepredicates import PEPredicates
from pesummary.gw.p_astro import PAstro
from pesummary.gw.classification import PEPredicates, PAstro
from pesummary.utils.utils import make_dir, logger
from pesummary.utils.exceptions import InputError
from pesummary.core.cli.parser import ArgumentParser as _ArgumentParser
......@@ -53,19 +53,16 @@ def generate_probabilities(result_files, prior="both"):
for num, i in enumerate(result_files):
mydict = {}
f = GWRead(i)
if not isinstance(f, pesummary.gw.file.formats.pesummary.PESummary):
f.generate_all_posterior_samples()
mydict["default"], mydict["population"] = \
PEPredicates.classifications(f.samples, f.parameters)
em_bright = PAstro.classifications(f.samples_dict)
if not _Input.is_pesummary_metafile(i):
mydict = PEPredicates.dual_classification_from_file(i)
em_bright = PAstro.dual_classification_from_file(i)
else:
f = GWRead(i)
label = f.labels[0]
mydict["default"], mydict["population"] = \
PEPredicates.classifications(f.samples[0], f.parameters[0])
em_bright = PAstro.classifications(f.samples_dict[label])
mydict["default"].update(em_bright[0])
mydict["population"].update(em_bright[1])
mydict = PEPredicates(f.samples_dict[label]).dual_classification()
em_bright = PAstro(f.samples_dict[label]).dual_classification()
mydict["default"].update(em_bright["default"])
mydict["population"].update(em_bright["population"])
classifications.append(mydict)
if prior == "both":
return classifications
......
This diff is collapsed.
......@@ -651,11 +651,20 @@ class _GWInput(pesummary.core.cli.inputs._Input):
@pepredicates_probs.setter
def pepredicates_probs(self, pepredicates_probs):
from pesummary.gw.pepredicates import get_classifications
from pesummary.gw.classification import PEPredicates
classifications = {}
for num, i in enumerate(list(self.samples.keys())):
classifications[i] = get_classifications(self.samples[i])
try:
classifications[i] = PEPredicates(
self.samples[i]
).dual_classification()
except Exception as e:
logger.warning(
"Failed to generate source classification probabilities "
"because {}".format(e)
)
classifications[i] = None
if self.mcmc_samples:
if any(_probs is None for _probs in classifications.values()):
classifications[self.labels[0]] = None
......@@ -684,16 +693,17 @@ class _GWInput(pesummary.core.cli.inputs._Input):
@pastro_probs.setter
def pastro_probs(self, pastro_probs):
from pesummary.gw.p_astro import get_probabilities
from pesummary.gw.classification import PAstro
probabilities = {}
for num, i in enumerate(list(self.samples.keys())):
em_bright = get_probabilities(self.samples[i])
if em_bright is not None:
probabilities[i] = {
"default": em_bright[0], "population": em_bright[1]
}
else:
try:
probabilities[i] = PAstro(self.samples[i]).dual_classification()
except Exception as e:
logger.warning(
"Failed to generate em_bright probabilities because "
"{}".format(e)
)
probabilities[i] = None
if self.mcmc_samples:
if any(_probs is None for _probs in probabilities.values()):
......
# Licensed under an MIT style license -- see LICENSE.md
LIGO_EM_BRIGHT = True
try: # ligo.em-bright
from ligo.em_bright.computeDiskMass import computeDiskMass
except ImportError:
try: # p_astro
from ligo.computeDiskMass import computeDiskMass
except ImportError:
LIGO_EM_BRIGHT = False
import numpy as np
from pesummary.utils.utils import logger
__author__ = ["Charlie Hoy <charlie.hoy@ligo.org>"]
def get_probabilities(samples):
"""Return `HasNS` and `HasRemnant` probabilities from a dictionary of
samples
Parameters
----------
samples: dict
dictionary of samples
"""
from pesummary.utils.utils import RedirectLogger
default_error = (
"Failed to generate `HasNS` and `HasRemnant` probabilities because {}"
)
try:
with RedirectLogger("p_astro", level="DEBUG") as redirector:
data = PAstro.classifications(samples)
except ImportError:
logger.warning(default_error.format("'p_astro' is not installed"))
data = None
except Exception as e:
logger.warning(default_error.format("%s" % (e)))
data = None
return data
class PAstro(object):
"""Class to handle the p_astro package
"""
@staticmethod
def check_for_install():
"""Check that p_astro is installed
"""
if not LIGO_EM_BRIGHT:
raise ImportError(
"Failed to import 'ligo.em_bright' packages and therefore unable to "
"calculate `HasNS` and `HasRemnant` probabilities"
)
@staticmethod
def _classifications(samples):
"""
"""
required_params = ["mass_1_source", "mass_2_source", "a_1", "a_2"]
parameters = list(samples.keys())
if not all(i in parameters for i in required_params):
raise Exception(
"Failed to generate `HasNS` and `HasRemnant` probabilities "
"because not all required parameters have been provided."
)
M_rem = computeDiskMass(
samples["mass_1_source"].to_numpy(),
samples["mass_2_source"].to_numpy(),
samples["a_1"].to_numpy(),
samples["a_2"].to_numpy()
)
prediction_ns = float(
np.sum(samples["mass_2_source"] <= 3.0) / len(samples["mass_2_source"])
)
prediction_em = float(np.sum(M_rem > 0) / len(M_rem))
return {
"HasNS": np.round(prediction_ns, 5),
"HasRemnant": np.round(prediction_em, 5)
}
@staticmethod
def default_classification(samples):
"""
"""
return PAstro._classifications(samples)
@staticmethod
def population_classification(samples):
"""
"""
import pandas as pd
import copy
from pepredicates import rewt_approx_massdist_redshift
p_astro_samples = copy.deepcopy(samples)
mapping = {"mass_1_source": "m1_source",
"mass_2_source": "m2_source",
"luminosity_distance": "dist",
"redshift": "redshift",
"a_1": "a1",
"a_2": "a2"}
reverse_mapping = dict((value, key) for key, value in mapping.items())
keys = list(p_astro_samples.keys())
for key in keys:
if key in list(mapping.keys()):
p_astro_samples[mapping[key]] = p_astro_samples[key]
p_astro_samples = rewt_approx_massdist_redshift(pd.DataFrame.from_dict(
p_astro_samples
))
for key, item in p_astro_samples.items():
if key in list(reverse_mapping.keys()):
p_astro_samples[reverse_mapping[key]] = item
try:
return PAstro._classifications(p_astro_samples)
except KeyError:
logger.warning(
"Failed to generate 'em_bright' probabilities after "
"reweighting to a population prior because there were no "
"samples after reweighting"
)
return {"HasNS": "-", "HasRemnant": "-"}
@staticmethod
def classifications(samples):
"""Return the source classification probabilities using both the default
prior used in the analysis and the population prior
"""
pop = PAstro.population_classification(samples)
default = PAstro.default_classification(samples)
return default, pop
# Licensed under an MIT style license -- see LICENSE.md
try:
import pepredicates as pep
PEP = True
except ImportError:
PEP = False
import numpy as np
import pandas as pd
from pesummary.core.plots.figure import ExistingFigure
from pesummary.utils.utils import logger
__author__ = ["Charlie Hoy <charlie.hoy@ligo.org>"]
def get_classifications(samples):
"""Return the classifications from a dictionary of samples
Parameters
----------
samples: dict
dictionary of samples
"""
from pesummary.utils.utils import RedirectLogger
from pesummary.utils.samples_dict import SamplesDict
default_error = (
"Failed to generate source classification probabilities because {}"
)
try:
with RedirectLogger("PEPredicates", level="DEBUG") as redirector:
if isinstance(samples, SamplesDict):
parameters = samples.parameters
samples = samples.samples.T.tolist()
else:
parameters = list(samples.keys())
samples = np.array(list(samples.values())).T.tolist()
data = PEPredicates.classifications(samples, parameters)
classifications = {
"default": data[0], "population": data[1]
}
except ImportError:
logger.warning(
default_error.format("'PEPredicates' is not installed")
)
classifications = None
except Exception as e:
logger.warning(default_error.format("%s" % (e)))
classifications = None
return classifications
class PEPredicates(object):
"""Class to handle the PEPredicates package
"""
@staticmethod
def default_predicates():
"""Set the default possibilities
"""
default = {
'BNS': pep.BNS_p,
'NSBH': pep.NSBH_p,
'BBH': pep.BBH_p,
'MassGap': pep.MG_p}
return default
@staticmethod
def check_for_install():
"""Check that predicates is installed
"""
if not PEP:
raise ImportError(
"Failed to import 'predicates' and therefore unable to "
"calculate astro/terrestrial probabilities")
@staticmethod
def convert_to_PEPredicate_data_frame(samples, parameters):
"""Convert the inputs to a pandas data frame compatible with
PEPredicated
Parameters
----------
samples: list
list of samples for a specific result file
parameters: list
list of parameters for a specific result file
"""
PEPredicates.check_for_install()
psamps = pd.DataFrame()
mapping = {"mass_1_source": "m1_source",
"mass_2_source": "m2_source",
"luminosity_distance": "dist",
"redshift": "redshift",
"a_1": "a1",
"a_2": "a2"}
if not all(i in parameters for i in list(mapping.keys())):
raise Exception(
"Failed to generate classification probabilities because not "
"all required parameters have been provided.")
_samples = np.array(samples).T
for num, i in enumerate(list(mapping.keys())):
psamps[mapping[i]] = _samples[parameters.index(i)]
return psamps
@staticmethod
def resample_to_population(samples):
"""Return samples that have been resampled to a sensibile population
Parameters
----------
samples: list
list of samples for a specific result file
"""
PEPredicates.check_for_install()
return pep.rewt_approx_massdist_redshift(samples)
@staticmethod
def check_for_dataframe(samples=None, parameters=None, dataframe=None):
"""Return dataframe if dataframe is not None else make a PEPredicate
dataframe from samples and parameters.
Parameters
----------
samples: list
list of samples for a specific result file
parameters: list
list of parameters corresponding to samples
dataframe: pandas.DataFrame
pandas DataFrame containing samples for specific result file.
dataframe must have entries m1_source, m2_source, dist, redshift,
a1, a2
"""
if dataframe is None:
if (samples is None) and (parameters is None):
raise ValueError(
"Please provide list of samples and parameters or "
"PEPredicate DataFrame"
)
dataframe = PEPredicates.convert_to_PEPredicate_data_frame(
samples, parameters
)
return dataframe
@staticmethod
def default_classification(
samples=None, parameters=None, predicate_dataframe=None
):
"""Return the source classification probabilities using the default
prior used
Parameters
----------
samples: list
list of samples for a specific result file. Used only if
predicate_dataframe is None
parameters: list
list of parameters corresponding to samples. Used only if
predicate_dataframe is None
predicate_dataframe: pandas.DataFrame
pandas DataFrame containing samples for specific result file.
predicate_dataframe must have entries m1_source, m2_source, dist,
redshift, a1, a2.
"""
PEPredicates.check_for_install()
predicate_dataframe = PEPredicates.check_for_dataframe(
samples=samples, parameters=parameters, dataframe=predicate_dataframe
)
ptable = pep.predicate_table(
PEPredicates.default_predicates(), predicate_dataframe
)
for key, value in ptable.items():
ptable[key] = np.round(value, 5)
return ptable
@staticmethod
def population_classification(
samples=None, parameters=None, predicate_dataframe=None
):
"""Return the source classification probabilities using a population
prior
Parameters
----------
samples: list
list of samples for a specific result file. Used only if
predicate_dataframe is None
parameters: list
list of parameters corresponding to samples. Used only if
predicate_dataframe is None
predicate_dataframe: pandas.DataFrame
pandas DataFrame containing samples for specific result file.
predicate_dataframe must have entries m1_source, m2_source, dist,
redshift, a1, a2.
"""
PEPredicates.check_for_install()
predicate_dataframe = PEPredicates.check_for_dataframe(
samples=samples, parameters=parameters, dataframe=predicate_dataframe
)
psamps_resamples = PEPredicates.resample_to_population(
predicate_dataframe
)
ptable = pep.predicate_table(
PEPredicates.default_predicates(), psamps_resamples
)
for key, value in ptable.items():
ptable[key] = np.round(value, 5)
return ptable
@staticmethod
def classifications(samples, parameters):
"""Return the source classification probabilities using both the default
prior used in the analysis and the population prior
"""
df = PEPredicates.convert_to_PEPredicate_data_frame(samples, parameters)
pop = PEPredicates.population_classification(predicate_dataframe=df)
default = PEPredicates.default_classification(predicate_dataframe=df)
return default, pop
@staticmethod
def plot(samples, parameters, population_prior=True):
"""Make a plot of the samples classified by type
Parameters
----------
samples: list
list of samples for a specific result file
"""