%% Cell type:markdown id: tags:
# Fitting a model to data with both x and y errors with `Bilby`
Usually when we fit a model to data with a Gaussian likelihood, we assume that we know the x values exactly. This is almost never the case. Here we show how to fit a model with errors in both x and y.
Since we are using a very simple model, we will use the `nestle` sampler.
This can be installed using
```console
$ conda install -c conda-forge nestle
```
or
```console
$ pip install nestle
```
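For reference, the standard Gaussian likelihood used in the first two fits below (as implemented by `bilby.core.likelihood.GaussianLikelihood`) takes the familiar form

$$\ln \mathcal{L}(m, c) = -\frac{1}{2}\sum_i \left[\frac{\left(y_i - f(x_i; m, c)\right)^2}{\sigma_y^2} + \ln\left(2\pi\sigma_y^2\right)\right],$$

where $f$ is the model and $\sigma_y$ is the known y error.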
%% Cell type:code id: tags:
```
import bilby
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
```
%% Cell type:markdown id: tags:
### Simulate data
First we create the data and plot it
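The observed points are the true line plus independent Gaussian scatter in each coordinate,

$$x_i^{\rm obs} = x_i^{\rm true} + \delta_i, \quad \delta_i \sim N(0, \sigma_x^2), \qquad y_i^{\rm obs} = m\, x_i^{\rm true} + c + \epsilon_i, \quad \epsilon_i \sim N(0, \sigma_y^2),$$

with $\sigma_x = \sigma_y = 5$ in the call below.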
%% Cell type:code id: tags:
```
# define our model, a line
def model(x, m, c, **kwargs):
    y = m * x + c
    return y


# make a function to create and plot our data
def make_data(points, m, c, xerr, yerr, seed):
    np.random.seed(int(seed))
    xtrue = np.linspace(0, 100, points)
    ytrue = model(x=xtrue, m=m, c=c)

    # draw Gaussian offsets for the observed points, keeping the error
    # magnitudes (xerr, yerr) separate from the noise realisations
    xerr_vals = xerr * np.random.randn(points)
    yerr_vals = yerr * np.random.randn(points)
    xobs = xtrue + xerr_vals
    yobs = ytrue + yerr_vals

    plt.errorbar(xobs, yobs, xerr=xerr, yerr=yerr, fmt="x")
    plt.errorbar(xtrue, ytrue, yerr=yerr, color="black", alpha=0.5)
    plt.xlim(0, 100)
    plt.show()
    plt.close()

    data = {
        "xtrue": xtrue,
        "ytrue": ytrue,
        "xobs": xobs,
        "yobs": yobs,
        "xerr": xerr,
        "yerr": yerr,
    }
    return data


data = make_data(points=30, m=5, c=10, xerr=5, yerr=5, seed=123)
```
%% Cell type:markdown id: tags:
### Define our prior and sampler settings
Now let's set up the priors and the `bilby` output directory/sampler settings
%% Cell type:code id: tags:
```
# set up the bilby priors
priors = dict(
    m=bilby.core.prior.Uniform(0, 30, "m"),
    c=bilby.core.prior.Uniform(0, 30, "c"),
)

sampler_kwargs = dict(
    priors=priors,
    sampler="nestle",
    nlive=1000,
    outdir="outdir",
    verbose=False,
)
```
%% Cell type:markdown id: tags:
### Fit with exactly known x-values
Our first step is to recover the straight line using a simple Gaussian likelihood that only takes into account the y errors, under the assumption that we know x exactly. In this case, we pass in `xtrue` for x.
%% Cell type:code id: tags:
```
known_x = bilby.core.likelihood.GaussianLikelihood(
    x=data["xtrue"], y=data["yobs"], func=model, sigma=data["yerr"]
)

result_known_x = bilby.run_sampler(
    likelihood=known_x,
    label="known_x",
    **sampler_kwargs,
)
```
%% Cell type:code id: tags:
```
_ = result_known_x.plot_corner(truth=dict(m=5, c=10), titles=True, save=False)
plt.show()
plt.close()
```
%% Cell type:markdown id: tags:
### Fit with unmodelled uncertainty in the x-values
As expected, the true parameters are easy to recover and the sampler does a good job. However, this was made too easy: we passed in the 'true' values of x. Let's see what happens when we pass in the observed values of x.
%% Cell type:code id: tags:
```
incorrect_x = bilby.core.likelihood.GaussianLikelihood(
    x=data["xobs"], y=data["yobs"], func=model, sigma=data["yerr"]
)

result_incorrect_x = bilby.run_sampler(
    likelihood=incorrect_x,
    label="incorrect_x",
    **sampler_kwargs,
)
```
%% Cell type:code id: tags:
```
_ = result_incorrect_x.plot_corner(truth=dict(m=5, c=10), titles=True, save=False)
plt.show()
plt.close()
```
%% Cell type:markdown id: tags:
### Fit with modelled uncertainty in the x-values
This is not good: there is unmodelled uncertainty in our `x` values.
Getting around this requires marginalising over the true x values, or sampling over them; see the discussion in section 7 of https://arxiv.org/pdf/1008.4686.pdf.
For this, we have to define a new likelihood class.
By subclassing the base `bilby.core.likelihood.Likelihood` class we can do this fairly simply.
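For a linear model there is a simple shortcut: an error $\sigma_x$ in x propagates through the slope into an effective y variance, $\sigma_{\rm eff}^2 = m^2\sigma_x^2 + \sigma_y^2$, so that (up to an additive constant)

$$\ln \mathcal{L}(m, c) = -\frac{1}{2}\sum_i \left[\frac{\left(y_i^{\rm obs} - m\, x_i^{\rm obs} - c\right)^2}{m^2\sigma_x^2 + \sigma_y^2} + \ln\left(m^2\sigma_x^2 + \sigma_y^2\right)\right].$$

This is the likelihood implemented by the class below.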
%% Cell type:code id: tags:
```
class GaussianLikelihoodUncertainX(bilby.core.likelihood.Likelihood):
    def __init__(self, xobs, yobs, xerr, yerr, function):
        """
        A Gaussian likelihood that folds the x error into an
        effective y variance via the slope of the model.

        Parameters
        ----------
        xobs, yobs: array_like
            The data to analyse
        xerr, yerr: array_like
            The standard deviation of the noise
        function:
            The python function to fit to the data
        """
        super().__init__(dict())
        self.xobs = xobs
        self.yobs = yobs
        self.yerr = yerr
        self.xerr = xerr
        self.function = function

    def log_likelihood(self):
        # effective variance: x error propagated through the slope m
        variance = (self.xerr * self.parameters["m"]) ** 2 + self.yerr**2
        model_y = self.function(self.xobs, **self.parameters)
        residual = self.yobs - model_y
        return -0.5 * np.sum(residual**2 / variance + np.log(variance))
```
%% Cell type:code id: tags:
```
gaussian_unknown_x = GaussianLikelihoodUncertainX(
    xobs=data["xobs"],
    yobs=data["yobs"],
    xerr=data["xerr"],
    yerr=data["yerr"],
    function=model,
)

result_unknown_x = bilby.run_sampler(
    likelihood=gaussian_unknown_x,
    label="unknown_x",
    **sampler_kwargs,
)
```
%% Cell type:code id: tags:
```
_ = result_unknown_x.plot_corner(truth=dict(m=5, c=10), titles=True, save=False)
plt.show()
plt.close()
```
%% Cell type:markdown id: tags:
Success! The inferred posterior is consistent with the true values.
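As a final check, we can overlay the three posteriors on a single corner plot. This is a minimal sketch using `bilby`'s `plot_multiple` utility; depending on your `bilby` version the keyword arguments may differ slightly, and separate `plot_corner` calls work just as well.
%% Cell type:code id: tags:
```
from bilby.core.result import plot_multiple

# overlay the three posteriors; the labels are only used for the legend
fig = plot_multiple(
    [result_known_x, result_incorrect_x, result_unknown_x],
    labels=["known x", "incorrect x", "unknown x"],
    save=False,
)
plt.show()
plt.close()
```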