============================== Basics of parameter estimation ============================== In this example, we'll go into some of the basics of parameter estimation and how they are implemented in :code:`bilby`. Firstly, consider a situation where you have discrete data :math:`\{y_0, y_1,\ldots, y_n\}` taken at a set of times :math:`\{t_0, t_1, \ldots, t_n\}`. Further, we know that this data is generated by a process which can be modelled by a linear function of the form :math:`y(t) = m t + c`. We will refer to this model as :math:`H`. Given a set of data, how can we figure out the coefficients :math:`m` and :math:`c`? This is a well studied problem known as `linear regression `_ and there already exists many ways to estimate the coefficients (you may already be familiar with a least squares estimator for example). Here, we will describe a Bayesian approach using `nested sampling `_ which might feel like overkill for this problem. However, it is a useful way to introduce some of the basic features of :code:`bilby` before seeing them in more complicated settings. The maths --------- Given the data, the posterior distribution for the model parameters is given by .. math:: P(m, c| \{y_i, t_i\}, H) \propto P(\{y_i, t_i\}| m, c, H) \times P(m, c| H)\,. where the first term on the right-hand-side is the *likelihood* while the second is the *prior*. In the model :math:`H`, the likelihood of the data point :math:`y_i, t_i`, given a value for the coefficients we will define to be Gaussian distributed as such: .. math:: P(y_i, t_i| m, c, H) = \frac{1}{\sqrt{2\pi\sigma^2}} \mathrm{exp}\left(\frac{-(y_i - (t_i m + c))^2}{2\sigma^2}\right) \,. Next, we assume that all data points are independent. As such, .. math:: P(\{y_i, t_i\}| m, c, H) = \prod_{i=1}^n P(y_i, t_i| m, c, H) \,. When solving problems on a computer, it is often convenient to work with the log-likelihood. Indeed, a :code:`bilby` *likelihood* must have a `log_likelihood()` method. For the normal distribution, the log-likelihood for *n* data points is .. math:: \log(P(\{y_i, t_i\}| m, c, H)) = -\frac{1}{2}\left[ \sum_{i=1}^n \left(\frac{(y_i - (t_i x + c))}{\sigma}\right)^2 + n\log\left(2\pi\sigma^2\right)\right] \,. Finally, we need to specify a *prior*. In this case we will use uncorrelated uniform priors .. math:: P(m, c| H) = P(m| H) \times P(c| H) = \textrm{Unif}(0, 5) \times \textrm{Unif}(-2, 2)\,. The choice of prior in general should be guided by physical knowledge about the system and not the data in question. The key point to take away here is that the **likelihood** and **prior** are the inputs to figuring out the **posterior**. There are many ways to go about this, we will now show how to do so in :code:`bilby`. In this case, we explicitly show how to write the `GaussianLikelihood` so that one can see how the maths above gets implemented. For the prior, this is done implicitly by the naming of the priors. The code -------- In the following example, also available under `examples/core_examples/linear_regression.py `_ we will step through the process of generating some simulated data, writing a likelihood and prior, and running nested sampling using `bilby`. .. literalinclude:: /../examples/core_examples/linear_regression.py :language: python :linenos: Running the script above will make a few images. Firstly, the plot of the data: .. image:: images/linear-regression_data.png The dashed red line here shows the simulated signal. Secondly, because we used the `plot=True` argument in `run_sampler` we generate a corner plot .. image:: images/linear-regression_corner.png The solid lines indicate the simulated values which are recovered quite easily. Note, you can also make a corner plot with `result.plot_corner()`. Final thoughts -------------- While this example is somewhat trivial, hopefully you can see how this script can be easily modified to perform parameter estimation for almost any time-domain data where you can model the background noise as Gaussian and write the signal model as a python function (i.e., replacing `model`).