LALBurst 2.0.7.1-ea7c608
lalburst.snglcoinc.LnLikelihoodRatioMixin Class Reference

Detailed Description

Mixin to assist in implementing a log likelihood ratio ranking statistic class.

The ranking statistic class that inherits from this must: (i) define a callable .numerator attribute that returns ln P(*args, **kwargs | signal); (ii) define a callable .denominator attribute that returns ln P(*args, **kwargs | noise).

Inheriting from this will:

  1. Add a .__call__() method that returns the natural logarithm of the likelihood ratio

ln P(*args, **kwargs | signal) - ln P(*args, **kwargs | noise)

The implementation handles various special cases sensibly, such as when either or both of the logarithms of the numerator and denominator diverge.

  2. Add a .ln_lr_samples() method that makes use of the .numerator and .denominator attributes, together with the .__call__() method, to transform a sequence of (*args, **kwargs) into a sequence of log likelihood ratios and their respective relative frequencies. This can be used to construct histograms of P(ln L | signal) and P(ln L | noise). These distributions are required for tasks such as signal rate estimation and false-alarm probability estimation.
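As an illustration of the contract described above, the following sketch defines a toy ranking statistic with callable .numerator and .denominator attributes. To remain runnable without LALBurst installed it uses a stand-in mixin that only subtracts the two logarithms; it is not the real LnLikelihoodRatioMixin and has none of its special-case handling. The names LnGaussian, ToyLnLikelihoodRatioMixin and ToyRankingStat, and the unit-variance Gaussian models, are assumptions made for the example.

```python
import math


class LnGaussian:
    """Callable returning ln of a unit-variance Gaussian density (hypothetical model)."""

    def __init__(self, mean):
        self.mean = mean

    def __call__(self, x):
        return -0.5 * (x - self.mean) ** 2 - 0.5 * math.log(2.0 * math.pi)


class ToyLnLikelihoodRatioMixin:
    """Stand-in for the real mixin: plain subtraction, no special-case handling."""

    def __call__(self, *args, **kwargs):
        return self.numerator(*args, **kwargs) - self.denominator(*args, **kwargs)


class ToyRankingStat(ToyLnLikelihoodRatioMixin):
    def __init__(self):
        # (i) callable returning ln P(x | signal)
        self.numerator = LnGaussian(mean=1.0)
        # (ii) callable returning ln P(x | noise)
        self.denominator = LnGaussian(mean=0.0)


rankingstat = ToyRankingStat()
ln_lr = rankingstat(2.0)  # ln P(2 | signal) - ln P(2 | noise)
```

With LALBurst available, a class of this shape would presumably inherit from lalburst.snglcoinc.LnLikelihoodRatioMixin instead of the stand-in, and would then also gain .ln_lr_samples().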

Why is the likelihood ratio useful? Starting from Bayes' theorem, and using the fact that "signal" and "noise" are the only two choices:

\[
\begin{aligned}
P(\mathrm{signal} \mid \mathrm{data})
  &= \frac{P(\mathrm{data} \mid \mathrm{signal}) \, P(\mathrm{signal})}{P(\mathrm{data})} \\
  &= \frac{P(\mathrm{data} \mid \mathrm{signal}) \, P(\mathrm{signal})}{P(\mathrm{data} \mid \mathrm{noise}) \, P(\mathrm{noise}) + P(\mathrm{data} \mid \mathrm{signal}) \, P(\mathrm{signal})} \\
  &= \frac{[P(\mathrm{data} \mid \mathrm{signal}) / P(\mathrm{data} \mid \mathrm{noise})] \, P(\mathrm{signal})}{1 - P(\mathrm{signal}) + [P(\mathrm{data} \mid \mathrm{signal}) / P(\mathrm{data} \mid \mathrm{noise})] \, P(\mathrm{signal})} \\
  &= \frac{\Lambda \, P(\mathrm{signal})}{1 + (\Lambda - 1) \, P(\mathrm{signal})}
\end{aligned}
\]

where Lambda is the likelihood ratio,

\[
\Lambda = \frac{P(\mathrm{data} \mid \mathrm{signal})}{P(\mathrm{data} \mid \mathrm{noise})}
\]

Differentiating P(signal | data) with respect to Lambda gives P(signal) * [1 - P(signal)] / [1 + (Lambda - 1) * P(signal)]^2, which is positive whenever 0 < P(signal) < 1, so the probability that a candidate is the result of a gravitational wave is a monotonically increasing function of the likelihood ratio. Ranking events from "most likely to be a genuine signal" to "least likely to be a genuine signal" can therefore be performed by sorting candidates by likelihood ratio. Alternatively, if one wants to set a threshold on P(signal | data) to select a subset of candidates with a given purity (a fixed fraction of them are real), that is equivalent to selecting candidates using a likelihood ratio threshold.

These are expressions of the Neyman-Pearson lemma, which tells us that thresholding on Lambda creates a detector that maximizes the detection efficiency at fixed false-alarm rate.
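The relationship between Lambda and P(signal | data) derived above can be checked numerically. The sketch below is only an illustration: the function name posterior_signal and the prior value of 0.01 are assumptions, not part of LALBurst.

```python
def posterior_signal(lam, p_signal):
    """P(signal | data) = Lambda P(signal) / (1 + (Lambda - 1) P(signal))."""
    return lam * p_signal / (1.0 + (lam - 1.0) * p_signal)


p_signal = 0.01  # assumed prior probability of a signal, for illustration only
lambdas = [0.0, 0.1, 1.0, 10.0, 100.0, 1e4]
posteriors = [posterior_signal(lam, p_signal) for lam in lambdas]

# The posterior rises strictly with Lambda, so sorting candidates by either
# quantity produces the same ordering.
is_increasing = all(a < b for a, b in zip(posteriors, posteriors[1:]))

# Lambda = 1 means the data do not discriminate: the posterior equals the prior.
uninformative = posterior_signal(1.0, p_signal)
```

Because the mapping is monotonic, a threshold on P(signal | data) corresponds to exactly one threshold on Lambda for a given prior.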

Definition at line 2485 of file snglcoinc.py.

Inherits object.

Inherited by lalburst.burca_tailor.BurcaCoincParamsDistributions, and lalburst.stringutils.StringCoincParamsDistributions.

Public Member Functions

def __call__ (self, *args, **kwargs)
 Return the natural logarithm of the likelihood ratio for the given parameters.
 
def ln_lr_samples (self, random_params_seq, signal_noise_pdfs=None)
 Generator that transforms a sequence of candidate parameter samples into a sequence of log likelihood ratio samples.
 

Member Function Documentation

◆ __call__()

def lalburst.snglcoinc.LnLikelihoodRatioMixin.__call__ (   self,
  *args,
  **kwargs 
)

Return the natural logarithm of the likelihood ratio for the given parameters:

ln P(*args, **kwargs | signal) - ln P(*args, **kwargs | noise)

The arguments are passed verbatim to the .__call__() methods of the .numerator and .denominator attributes of self and the return value is computed from the results.

NOTE: subclasses may override this method, possibly chaining to it if they wish. The .ln_lr_samples() mechanism does not require this method to return exactly the natural logarithm of the .numerator/.denominator ratio: it does not assume that the ranking statistic is the ratio of the numerator to the denominator, and it evaluates all three separately. For this reason, the .__call__() method that implements the ranking statistic is free to include other logic, such as hierarchical cuts or bail-outs, that is not strictly equivalent to the ratio of the numerator and denominator.

Special cases:

.numerator/.denominator = 0/0 (both logarithms are -inf) is mapped to ln Lambda = -inf, meaning P(signal | data) = 0. This condition is nonsensical, because the data is claimed to be inconsistent with both noise and signal, the only two things it can be. Nevertheless, our task here is to compute something that is monotonic in the probability that the data is the result of a signal, and since this data cannot be from a signal the correct answer is -inf.

.numerator/.denominator = +inf/+inf is mapped to ln Lambda = NaN. This is sufficiently nonsensical that there is no correct interpretation. A warning will be displayed when this is encountered.
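The special cases above can be sketched as a small stand-alone function. This is a minimal illustration of the behaviour described here, not a copy of the LALBurst source; the function name ln_likelihood_ratio and the warning text are assumptions.

```python
import math
import warnings


def ln_likelihood_ratio(ln_num, ln_den):
    """Combine ln P(. | signal) and ln P(. | noise) into ln Lambda."""
    if math.isinf(ln_num) and math.isinf(ln_den):
        if ln_num < 0 and ln_den < 0:
            # 0/0: the data cannot have come from a signal, so rank it lowest
            return -math.inf
        if ln_num > 0 and ln_den > 0:
            # +inf/+inf: no sensible interpretation, warn and return NaN
            warnings.warn("inf/inf encountered")
            return math.nan
    # every other combination is handled correctly by plain subtraction
    return ln_num - ln_den


zero_over_zero = ln_likelihood_ratio(-math.inf, -math.inf)

# Capture the warning so the example runs quietly.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    inf_over_inf = ln_likelihood_ratio(math.inf, math.inf)
```

All remaining combinations, such as a finite numerator with a -inf denominator, fall through to ordinary floating-point subtraction, which already yields the expected +inf or -inf.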

Reimplemented in lalburst.stringutils.StringCoincParamsDistributions.

Definition at line 2525 of file snglcoinc.py.

◆ ln_lr_samples()

def lalburst.snglcoinc.LnLikelihoodRatioMixin.ln_lr_samples (   self,
  random_params_seq,
  signal_noise_pdfs = None 
)

Generator that transforms a sequence of candidate parameter samples into a sequence of log likelihood ratio samples.

random_params_seq is a sequence (a generator is OK) yielding 3-element tuples. The first two elements of each tuple provide the *args and **kwargs values to be passed to the .numerator and .denominator functions. The third element is the natural logarithm of the probability density from which the (*args, **kwargs) parameters have been drawn, evaluated at those parameters.

The output of this generator is a sequence of 3-element tuples, whose elements are:

  1. a value of the natural logarithm of the likelihood ratio,
  2. the natural logarithm of the relative frequency of occurrence of that likelihood ratio in the signal population, corrected for the relative frequency at which the random_params_seq sampler is causing that value to be returned, and
  3. the natural logarithm of the relative frequency of occurrence of that likelihood ratio in the noise population, similarly corrected for the relative frequency at which the random_params_seq sampler is causing that value to be returned.

The intention is for the first element of each tuple to be added to histograms using the two relative frequencies as weights, i.e., the two relative frequencies give the number of times one should consider this one draw of log likelihood ratio to have occurred in the two populations.

On each iteration, the *args and **kwargs values yielded by random_params_seq are passed to our own .__call__() method to evaluate the log likelihood ratio at that choice of parameter values. The parameters are also passed to the .__call__() methods of our own .numerator and .denominator attributes to obtain the signal and noise population densities at those parameters.

If signal_noise_pdfs is not None then, instead of using our own .numerator and .denominator attributes, the parameters are passed to the .__call__() methods of its .numerator and .denominator attributes to obtain those densities. This allows the distribution of ranking statistic values obtained from alternative signal and noise populations to be modelled. This is sometimes useful for diagnostic purposes.
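The data flow described above can be sketched as a free-standing generator. This is an illustration written from the description on this page, not the LALBurst source: the function takes the ranking statistic as an explicit first argument, and ToyRankingStat with its unnormalized Gaussian log densities is a hypothetical stand-in.

```python
def ln_lr_samples(rankingstat, random_params_seq, signal_noise_pdfs=None):
    """Yield (ln L, ln signal weight, ln noise weight) for each parameter sample."""
    # population densities come from the ranking statistic unless overridden
    pdfs = rankingstat if signal_noise_pdfs is None else signal_noise_pdfs
    for args, kwargs, ln_p_sample in random_params_seq:
        ln_p_signal = pdfs.numerator(*args, **kwargs)
        ln_p_noise = pdfs.denominator(*args, **kwargs)
        # the ranking statistic itself always comes from rankingstat
        yield (rankingstat(*args, **kwargs),
               ln_p_signal - ln_p_sample,
               ln_p_noise - ln_p_sample)


class ToyRankingStat:
    """Hypothetical ranking statistic with unnormalized Gaussian log densities."""

    def __init__(self, signal_mean):
        self.numerator = lambda x: -0.5 * (x - signal_mean) ** 2
        self.denominator = lambda x: -0.5 * x ** 2

    def __call__(self, x):
        return self.numerator(x) - self.denominator(x)


# Each element is (args, kwargs, ln of the sampling density); a flat sampler
# with ln density 0 is assumed here.
params = [((x,), {}, 0.0) for x in (0.0, 1.0, 2.0)]

stat = ToyRankingStat(signal_mean=1.0)
samples = list(ln_lr_samples(stat, params))

# Same ranking statistic, but weights taken from an alternative population.
alt = ToyRankingStat(signal_mean=2.0)
alt_samples = list(ln_lr_samples(stat, params, signal_noise_pdfs=alt))
```

In the second call the first element of each tuple is unchanged, because the ranking statistic is still evaluated by stat; only the two weights follow the alternative population.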

Normalizations:

Within the context of the intended application, it is sufficient for all of the probabilities involved (the .numerator and .denominator probability densities, and the probability density supplied by the random_params_seq generator) to be correct up to unknown normalization constants, i.e., for the natural logarithms of the probabilities to be correct up to unknown additive constants. That is why the two probability densities yielded by each iteration of this generator are described as relative frequencies: the ratios among their values are meaningful, but not their absolute values.

If all of the supplied probabilities are in fact properly normalized, then the relative frequencies returned by this generator are also correctly normalized probability densities.
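The normalization statement can be checked with a small Monte Carlo sketch that fills weighted histograms of ln L. Everything here is an assumption made for illustration: the normalized unit-variance Gaussian signal and noise models, the uniform sampler on [-10, 10], the bin width, and the dictionary-based histograms. The weights are computed inline in the same way the generator output is described above.

```python
import math
import random


def ln_gaussian(x, mean):
    """ln of a properly normalized unit-variance Gaussian density."""
    return -0.5 * (x - mean) ** 2 - 0.5 * math.log(2.0 * math.pi)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
lo, hi, n = -10.0, 10.0, 20000
ln_p_sample = -math.log(hi - lo)  # normalized uniform sampling density

bin_width = 0.5
signal_hist = {}  # bin index -> accumulated signal weight
noise_hist = {}   # bin index -> accumulated noise weight
for _ in range(n):
    x = rng.uniform(lo, hi)
    ln_p_signal = ln_gaussian(x, 1.0)
    ln_p_noise = ln_gaussian(x, 0.0)
    ln_lr = ln_p_signal - ln_p_noise
    b = math.floor(ln_lr / bin_width)
    # weight = population density / sampling density
    signal_hist[b] = signal_hist.get(b, 0.0) + math.exp(ln_p_signal - ln_p_sample)
    noise_hist[b] = noise_hist.get(b, 0.0) + math.exp(ln_p_noise - ln_p_sample)

# With normalized inputs, the mean weight per draw approaches 1 for both populations.
signal_total = sum(signal_hist.values()) / n
noise_total = sum(noise_hist.values()) / n
```

Dividing each bin by n * bin_width would then give estimates of P(ln L | signal) and P(ln L | noise). If any input density were only known up to a constant, the totals would differ from 1 by that constant while the histogram shapes would be unchanged.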

Definition at line 2627 of file snglcoinc.py.