LALPulsar 7.1.1.1-b246709

Header file for the helper functions for the parameter estimation code for known pulsar searches using the nested sampling algorithm. More...

Prototypes

void compute_variance (LALInferenceIFOData *data, LALInferenceIFOModel *model)
 Compute the noise variance for each data segment. More...
 
COMPLEX16Vector * subtract_running_median (COMPLEX16Vector *data)
 Subtract the running median from complex data. More...
 
UINT4Vector * get_chunk_lengths (LALInferenceIFOModel *ifo, UINT4 chunkMax)
 Split the data into segments. More...
 
UINT4Vector * chop_n_merge (LALInferenceIFOData *data, UINT4 chunkMin, UINT4 chunkMax, UINT4 outputchunks)
 Chops and remerges data into stationary segments. More...
 
UINT4Vector * chop_data (gsl_vector_complex *data, UINT4 chunkMin)
 Chops the data into stationary segments based on Bayesian change point analysis. More...
 
UINT4 find_change_point (gsl_vector_complex *data, REAL8 *logodds, UINT4 chunkMin)
 Find a change point in complex data. More...
 
void rechop_data (UINT4Vector **segs, UINT4 chunkMax, UINT4 chunkMin)
 Chop up the data into chunks smaller than the maximum allowed length. More...
 
void merge_data (COMPLEX16Vector *data, UINT4Vector **segs)
 Merge adjacent segments. More...
 
INT4 count_csv (CHAR *csvline)
 Counts the number of comma separated values in a string. More...
 
void check_and_add_fixed_variable (LALInferenceVariables *vars, const char *name, void *value, LALInferenceVariableType type)
 Add a variable, checking first if it already exists and is of type LALINFERENCE_PARAM_FIXED and if so removing it before re-adding it. More...
 
TimeCorrectionType XLALAutoSetEphemerisFiles (CHAR **efile, CHAR **sfile, CHAR **tfile, PulsarParameters *pulsar, INT4 gpsstart, INT4 gpsend)
 Automatically set the solar system ephemeris file based on environment variables and data time span. More...
 

Detailed Description

Header file for the helper functions for the parameter estimation code for known pulsar searches using the nested sampling algorithm.

Author
Matthew Pitkin, John Veitch, Colin Gill

Definition in file ppe_utils.h.


Function Documentation

◆ compute_variance()

void compute_variance ( LALInferenceIFOData *  data,
LALInferenceIFOModel *  model 
)

Compute the noise variance for each data segment.

Once the data has been split into segments calculate the noise variance (using both the real and imaginary parts) in each segment and fill in the associated noise vector. To calculate the noise the running median should first be subtracted from the data.

Parameters
data[in] the LALInferenceIFOData variable
model[in] the LALInferenceIFOModel variable

Definition at line 38 of file ppe_utils.c.
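
As an illustration only (this is not the LAL implementation), the variance of a zero-mean complex segment can be computed by pooling the real and imaginary parts as 2N independent real samples; the running median is assumed to have already been subtracted, so the mean is taken to be zero:

```c
#include <complex.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical sketch: variance of a zero-mean complex segment, using
 * both the real and imaginary parts (2N real values in N complex
 * samples). Assumes the running median has already been subtracted. */
double segment_variance(const double complex *seg, size_t n)
{
    double sumsq = 0.0;
    for (size_t i = 0; i < n; i++) {
        sumsq += creal(seg[i]) * creal(seg[i]) + cimag(seg[i]) * cimag(seg[i]);
    }
    return sumsq / (2.0 * n);
}
```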

◆ subtract_running_median()

COMPLEX16Vector * subtract_running_median ( COMPLEX16Vector *  data )

Subtract the running median from complex data.

This function uses gsl_stats_median_from_sorted_data to subtract a running median, calculated from the 30 consecutive points around a given point, from the data. At the start of the data the running median is calculated from 30-15+(i-1) points, and at the end it is calculated from 15+(N-i) points, where i is the point index and N is the total number of data points.

Parameters
data[in] A complex data vector
Returns
A complex vector containing data with the running median removed

Definition at line 268 of file ppe_utils.c.

◆ get_chunk_lengths()

UINT4Vector * get_chunk_lengths ( LALInferenceIFOModel *  ifo,
UINT4  chunkMax 
)

Split the data into segments.

This function is deprecated in favour of chop_n_merge, but retains the functionality of the old code.

It cuts the data into as many contiguous segments of length chunkMax as possible, where contiguous means consecutive points within 180 seconds of each other. The lengths of segments that do not fill a full chunkMax are also included.

Parameters
ifo[in] the LALInferenceIFOModel variable
chunkMax[in] the maximum length of a data chunk/segment
Returns
A vector of chunk/segment lengths

Definition at line 98 of file ppe_utils.c.
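
The segmentation rule described above can be sketched as follows (a hypothetical illustration, not the LAL implementation): walk the timestamps, which are assumed sorted, and start a new segment whenever the gap to the previous sample exceeds 180 seconds or the current segment reaches chunkMax points:

```c
#include <stddef.h>

/* Hypothetical sketch of the segmentation rule: split at gaps larger
 * than 180 s or when a segment reaches chunkMax points. Writes segment
 * lengths into lens and returns the number of segments. */
size_t chunk_lengths_sketch(const double *times, size_t n,
                            size_t chunkMax, size_t *lens)
{
    size_t nsegs = 0, cur = 0;
    for (size_t i = 0; i < n; i++) {
        if (cur > 0 && (times[i] - times[i - 1] > 180.0 || cur == chunkMax)) {
            lens[nsegs++] = cur; /* close the current segment */
            cur = 0;
        }
        cur++;
    }
    if (cur > 0) lens[nsegs++] = cur; /* final partial segment */
    return nsegs;
}
```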

◆ chop_n_merge()

UINT4Vector * chop_n_merge ( LALInferenceIFOData *  data,
UINT4  chunkMin,
UINT4  chunkMax,
UINT4  outputchunks 
)

Chops and remerges data into stationary segments.

This function finds segments of data that appear to be stationary (have the same standard deviation).

The function first attempts to chop up the data into as many stationary segments as possible. The splitting may not be optimal, so it then tries remerging consecutive segments to see if the merged segments show more evidence of stationarity. [NOTE: Remerging is currently turned off and will make very little difference to the algorithm]. It then, if necessary, chops the segments again to make sure there are none greater than the required chunkMax. The default chunkMax is 0, so this rechopping will not normally happen.

This is all performed on data that has had a running median subtracted, to try to remove any underlying trends in the data (e.g. those caused by a strong signal), which might affect the calculations (which assume the data is Gaussian with zero mean).

If outputchunks is non-zero then a list of the segments will be written to a file called data_segment_list.txt, prefixed with the detector name.

Parameters
data[in] A data structure
chunkMin[in] The minimum length of a segment
chunkMax[in] The maximum length of a segment
outputchunks[in] A flag to check whether to output the segments
Returns
A vector of segment/chunk lengths
See also
subtract_running_median
chop_data
merge_data
rechop_data

Definition at line 177 of file ppe_utils.c.

◆ chop_data()

UINT4Vector * chop_data ( gsl_vector_complex *  data,
UINT4  chunkMin 
)

Chops the data into stationary segments based on Bayesian change point analysis.

This function splits data into two (and recursively runs on those two segments) if it is found that the odds ratio for them being from two independent Gaussian distributions is greater than a certain threshold.

The threshold for the natural logarithm of the odds ratio is empirically set to be

\[ T = 4.07 + 1.33\log_{10}{N}, \]

where \( N \) is the length in samples of the dataset. This is based on Monte Carlo simulations of many realisations of Gaussian noise for data of different lengths. The threshold comes from a linear fit to the log odds ratios required to give a 1% chance of splitting Gaussian data (drawn from a single distribution) for data of various lengths. Note, however, that this relation is not good for stretches of data with lengths of less than about 30 points, and in fact is rather conservative for such short stretches, i.e. such short stretches of data will require relatively larger odds ratios for splitting than longer stretches.

Parameters
data[in] A complex data vector
chunkMin[in] The minimum allowed segment length
Returns
A vector of segment lengths
See also
find_change_point

Definition at line 352 of file ppe_utils.c.

◆ find_change_point()

UINT4 find_change_point ( gsl_vector_complex *  data,
REAL8 *  logodds,
UINT4  minlength 
)

Find a change point in complex data.

This function is based on the Bayesian Blocks algorithm of [23], which finds "change points" in data: points at which the statistics of the data change. It works by calculating evidence, or odds, ratios. The function first computes the marginal likelihood (or evidence) that the whole of the data is described by a single Gaussian (with zero mean). This comes from taking a Gaussian likelihood function and analytically marginalising over the standard deviation (using a \( 1/\sigma \) prior on the standard deviation), giving (see [8]) a Student's t-distribution. The data is then split into two segments (each with length greater than, or equal to, the minimum chunk length) at every possible point, and for each split the joint evidence that the two segments are independent Gaussians is calculated (essentially the product of the above evidence evaluated for each segment separately), with the split point recorded. However, the quantity to compare with the whole-data evidence, to give the odds ratio, is the evidence that having any split is better than having no split, so the individual split evidences are summed incoherently to give the total evidence for a split. The index at which the evidence for a single split is maximised (i.e. the most favoured split point) is the one returned.

Parameters
data[in] a complex data vector
logodds[in] a pointer to return the natural logarithm of the odds ratio/Bayes factor
minlength[in] the minimum chunk length
Returns
The position of the change point

Definition at line 428 of file ppe_utils.c.

◆ rechop_data()

void rechop_data ( UINT4Vector **  chunkIndex,
UINT4  chunkMax,
UINT4  chunkMin 
)

Chop up the data into chunks smaller than the maximum allowed length.

This function chops any chunks that are greater than chunkMax into chunks smaller than, or equal to chunkMax, and greater than chunkMin. On some occasions this might result in a segment smaller than chunkMin, but these are ignored in the likelihood calculation anyway.

Parameters
chunkIndex[in] a vector of segment split positions
chunkMax[in] the maximum allowed segment/chunk length
chunkMin[in] the minimum allowed segment/chunk length

Definition at line 518 of file ppe_utils.c.
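
One way to picture the rechopping rule (a hypothetical sketch, not the LAL code): divide an over-long chunk into the smallest number of pieces that are all at most chunkMax long, keeping the pieces as equal as possible:

```c
#include <stddef.h>

/* Hypothetical sketch of rechopping: split a chunk of length len into
 * ceil(len/chunkMax) near-equal pieces, each <= chunkMax. Writes the
 * piece lengths into pieces and returns the number of pieces. */
size_t rechop_sketch(size_t len, size_t chunkMax, size_t *pieces)
{
    size_t n = (len + chunkMax - 1) / chunkMax; /* ceil(len / chunkMax) */
    for (size_t i = 0; i < n; i++) {
        /* distribute the remainder one point at a time */
        pieces[i] = len / n + (i < len % n ? 1 : 0);
    }
    return n;
}
```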

◆ merge_data()

void merge_data ( COMPLEX16Vector *  data,
UINT4Vector **  segments 
)

Merge adjacent segments.

This function will attempt to remerge adjacent segments if statistically favourable (as calculated by the odds ratio). For each pair of adjacent segments the joint likelihood of them being from two independent distributions is compared to the likelihood that combined they are from one distribution. If the likelihood is highest for the combined segments they are merged.

Parameters
data[in] A complex data vector
segments[in] A vector of split segment indexes

Definition at line 591 of file ppe_utils.c.

◆ count_csv()

INT4 count_csv ( CHAR *  csvline )

Counts the number of comma separated values in a string.

This function counts the number of comma separated values in a given input string.

Parameters
csvline[in] Any string
Returns
The number of comma separated values in the input string

Definition at line 674 of file ppe_utils.c.
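
As a minimal illustration (a hypothetical sketch, not the LAL implementation), the count is one more than the number of commas in the line:

```c
#include <stddef.h>

/* Hypothetical sketch: count comma separated values by counting commas;
 * a non-empty string with no commas contains a single value. */
int count_csv_sketch(const char *line)
{
    int count = 1;
    for (const char *p = line; *p; p++) {
        if (*p == ',') count++;
    }
    return count;
}
```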

◆ check_and_add_fixed_variable()

void check_and_add_fixed_variable ( LALInferenceVariables *  vars,
const char *  name,
void *  value,
LALInferenceVariableType  type 
)

Add a variable, checking first if it already exists and is of type LALINFERENCE_PARAM_FIXED and if so removing it before re-adding it.

This function is for use as an alternative to LALInferenceAddVariable, which does not allow LALINFERENCE_PARAM_FIXED variables to be changed. If the variable already exists and is of type LALINFERENCE_PARAM_FIXED, it will first be removed and then re-added.

Definition at line 778 of file ppe_utils.c.

◆ XLALAutoSetEphemerisFiles()

TimeCorrectionType XLALAutoSetEphemerisFiles ( CHAR **  efile,
CHAR **  sfile,
CHAR **  tfile,
PulsarParameters *  pulsar,
INT4  gpsstart,
INT4  gpsend 
)

Automatically set the solar system ephemeris file based on environment variables and data time span.

This function will attempt to construct the file name for Sun, Earth and time correction ephemeris files based on the ephemeris used for the equivalent TEMPO(2) pulsar timing information. It assumes that the ephemeris files are those constructed between 2000 and 2020. The path to the file is not required as this will be found in XLALInitBarycenter.

Parameters
efile[in] a string that will return the Earth ephemeris file
sfile[in] a string that will return the Sun ephemeris file
tfile[in] a string that will return the time correction file
pulsar[in] the pulsar parameters read from a .par file
gpsstart[in] the GPS time of the start of the data
gpsend[in] the GPS time of the end of the data
Returns
The TimeCorrectionType e.g. TDB or TCB

Definition at line 717 of file ppe_utils.c.