Header file for the helper functions for the parameter estimation code for known pulsar searches using the nested sampling algorithm.
Prototypes

| Return type | Prototype | Description |
| --- | --- | --- |
| void | compute_variance(LALInferenceIFOData *data, LALInferenceIFOModel *model) | Compute the noise variance for each data segment. |
| COMPLEX16Vector * | subtract_running_median(COMPLEX16Vector *data) | Subtract the running median from complex data. |
| UINT4Vector * | get_chunk_lengths(LALInferenceIFOModel *ifo, UINT4 chunkMax) | Split the data into segments. |
| UINT4Vector * | chop_n_merge(LALInferenceIFOData *data, UINT4 chunkMin, UINT4 chunkMax, UINT4 outputchunks) | Chops and remerges data into stationary segments. |
| UINT4Vector * | chop_data(gsl_vector_complex *data, UINT4 chunkMin) | Chops the data into stationary segments based on Bayesian change point analysis. |
| UINT4 | find_change_point(gsl_vector_complex *data, REAL8 *logodds, UINT4 chunkMin) | Find a change point in complex data. |
| void | rechop_data(UINT4Vector **segs, UINT4 chunkMax, UINT4 chunkMin) | Chop up the data into chunks smaller than the maximum allowed length. |
| void | merge_data(COMPLEX16Vector *data, UINT4Vector **segs) | Merge adjacent segments. |
| INT4 | count_csv(CHAR *csvline) | Counts the number of comma separated values in a string. |
| void | check_and_add_fixed_variable(LALInferenceVariables *vars, const char *name, void *value, LALInferenceVariableType type) | Add a variable, checking first if it already exists and is of type LALINFERENCE_PARAM_FIXED and if so removing it before re-adding it. |
| TimeCorrectionType | XLALAutoSetEphemerisFiles(CHAR **efile, CHAR **sfile, CHAR **tfile, PulsarParameters *pulsar, INT4 gpsstart, INT4 gpsend) | Automatically set the solar system ephemeris file based on environment variables and data time span. |
Definition in file ppe_utils.h.
void compute_variance(LALInferenceIFOData *data, LALInferenceIFOModel *model)
Compute the noise variance for each data segment.
Once the data has been split into segments, calculate the noise variance (using both the real and imaginary parts) in each segment and fill in the associated noise vector. To calculate the noise, the running median should first be subtracted from the data.
data | [in] the LALInferenceIFOData variable |
model | [in] the LALInferenceIFOModel variable |
Definition at line 38 of file ppe_utils.c.
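As an illustration of the calculation described above, the following standalone sketch (not the LAL implementation: plain C arrays and the hypothetical compute_segment_variances name stand in for the LALInferenceIFOData/LALInferenceIFOModel structures) computes a per-segment variance from the real and imaginary parts of running-median-subtracted data, assuming the mean of each part can be taken as zero.

```c
#include <complex.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: compute the noise variance of each data segment,
 * using both the real and imaginary parts of the (running-median-subtracted)
 * data and assuming a zero mean. seglens[k] holds the length of segment k. */
void compute_segment_variances( const double complex *data, const size_t *seglens,
                                size_t nsegs, double *vars ) {
  size_t start = 0;
  for ( size_t k = 0; k < nsegs; k++ ) {
    double sumsq = 0.;
    for ( size_t i = start; i < start + seglens[k]; i++ ) {
      sumsq += creal( data[i] ) * creal( data[i] ) + cimag( data[i] ) * cimag( data[i] );
    }
    /* 2*seglens[k] real-valued samples contribute to the estimate */
    vars[k] = sumsq / (double)( 2 * seglens[k] );
    start += seglens[k];
  }
}

int main( void ) {
  const double complex data[6] = { 1. + 2.*I, -1. - 1.*I, 0.5 + 0.*I,
                                   2. - 2.*I, -0.5 + 1.*I, 1. + 0.*I };
  const size_t seglens[2] = { 3, 3 };
  double vars[2];
  compute_segment_variances( data, seglens, 2, vars );
  printf( "segment variances: %g %g\n", vars[0], vars[1] );
  return 0;
}
```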
COMPLEX16Vector *subtract_running_median(COMPLEX16Vector *data)
Subtract the running median from complex data.
This function uses gsl_stats_median_from_sorted_data to subtract a running median, calculated from the 30 consecutive points around each point, from the data. At the start of the data the running median is calculated from 15+(i-1) points, and at the end from 15+(N-i) points, where i is the point index and N is the total number of data points.
data | [in] A complex data vector |
Definition at line 268 of file ppe_utils.c.
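The windowing logic might look something like the following standalone sketch. This is an illustration only, not the LAL code: it uses plain C arrays rather than COMPLEX16Vector, the subtract_running_median_sketch name is hypothetical, and the exact handling of the truncated windows at the ends of the data may differ slightly from the real implementation. The GSL median routine requires sorted input, so each window is copied and sorted first.

```c
#include <complex.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <gsl/gsl_sort.h>
#include <gsl/gsl_statistics.h>

#define RUNMED_WINDOW 30  /* number of consecutive points used for the median */

/* Hypothetical sketch: subtract a running median (computed separately for the
 * real and imaginary parts) from a complex time series. The window is
 * truncated at the start and end of the data. */
void subtract_running_median_sketch( double complex *data, size_t n ) {
  double complex *out = malloc( n * sizeof( *out ) );
  double re[RUNMED_WINDOW], im[RUNMED_WINDOW];

  for ( size_t i = 0; i < n; i++ ) {
    size_t half = RUNMED_WINDOW / 2;
    size_t start = ( i > half ) ? i - half : 0;
    size_t end = ( i + half < n ) ? i + half : n;
    size_t len = end - start;

    for ( size_t j = 0; j < len; j++ ) {
      re[j] = creal( data[start + j] );
      im[j] = cimag( data[start + j] );
    }
    gsl_sort( re, 1, len );
    gsl_sort( im, 1, len );
    out[i] = data[i] - ( gsl_stats_median_from_sorted_data( re, 1, len )
                         + I * gsl_stats_median_from_sorted_data( im, 1, len ) );
  }

  memcpy( data, out, n * sizeof( *out ) );
  free( out );
}

int main( void ) {
  enum { N = 100 };
  double complex data[N];
  for ( size_t i = 0; i < N; i++ ) {
    data[i] = sin( 0.1 * (double)i ) + I * cos( 0.1 * (double)i );  /* slowly varying trend */
  }
  subtract_running_median_sketch( data, N );
  printf( "first de-trended point: %g + %gi\n", creal( data[0] ), cimag( data[0] ) );
  return 0;
}
```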
UINT4Vector *get_chunk_lengths(LALInferenceIFOModel *ifo, UINT4 chunkMax)
Split the data into segments.
This function is deprecated in favour of chop_n_merge, but retains the functionality of the old code. It cuts the data into as many contiguous segments of length chunkMax as possible, where contiguous is defined as consecutive points being within 180 seconds of each other. The lengths of segments that do not fit into a chunkMax length are also included.
ifo | [in] the LALInferenceIFOModel variable |
chunkMax | [in] the maximum length of a data chunk/segment |
Definition at line 98 of file ppe_utils.c.
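The segmentation logic might be sketched as follows (an illustration only: the get_chunk_lengths_sketch name and the plain array of GPS time stamps are hypothetical stand-ins for the LALInferenceIFOModel structure).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: split a time series into segments of at most chunkMax
 * points, starting a new segment whenever consecutive time stamps are more
 * than 180 seconds apart. Returns the number of segments; seglens must have
 * room for up to n entries. */
size_t get_chunk_lengths_sketch( const double *times, size_t n,
                                 size_t chunkMax, size_t *seglens ) {
  size_t nsegs = 0, count = 0;

  for ( size_t i = 0; i < n; i++ ) {
    count++;
    int gap = ( i + 1 < n ) && ( times[i + 1] - times[i] > 180. );
    if ( count == chunkMax || gap || i + 1 == n ) {
      seglens[nsegs++] = count;  /* close the current segment */
      count = 0;
    }
  }
  return nsegs;
}

int main( void ) {
  /* 60 s sampled data with a gap after the 4th point */
  const double times[8] = { 0., 60., 120., 180., 1000., 1060., 1120., 1180. };
  size_t seglens[8];
  size_t nsegs = get_chunk_lengths_sketch( times, 8, 3, seglens );
  for ( size_t k = 0; k < nsegs; k++ ) { printf( "segment %zu: %zu points\n", k, seglens[k] ); }
  return 0;
}
```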
UINT4Vector *chop_n_merge(LALInferenceIFOData *data, UINT4 chunkMin, UINT4 chunkMax, UINT4 outputchunks)
Chops and remerges data into stationary segments.
This function finds segments of data that appear to be stationary (have the same standard deviation).
The function first attempts to chop up the data into as many stationary segments as possible. The splitting may not be optimal, so it then tries remerging consecutive segments to see if the merged segments show more evidence of stationarity. [NOTE: remerging is currently turned off and will make very little difference to the algorithm]. It then, if necessary, chops the segments again to make sure none are greater than the required chunkMax. The default chunkMax is 0, so this rechopping will not normally happen.
This is all performed on data that has had a running median subtracted, to try and remove any underlying trends in the data (e.g. those caused by a strong signal) that might affect the calculations, which assume the data is Gaussian with zero mean.
If outputchunks is non-zero then a list of the segments will be output to a file called data_segment_list.txt, prefixed with the detector name.
data | [in] A data structure |
chunkMin | [in] The minimum length of a segment |
chunkMax | [in] The maximum length of a segment |
outputchunks | [in] A flag to check whether to output the segments |
Definition at line 177 of file ppe_utils.c.
UINT4Vector *chop_data(gsl_vector_complex *data, UINT4 chunkMin)
Chops the data into stationary segments based on Bayesian change point analysis.
This function splits data into two (and recursively runs on those two segments) if it is found that the odds ratio for them being from two independent Gaussian distributions is greater than a certain threshold.
The threshold for the natural logarithm of the odds ratio is empirically set to be
\[ T = 4.07 + 1.33\log_{10}{N}, \]
where \( N \) is the length in samples of the dataset. This is based on Monte Carlo simulations of many realisations of Gaussian noise for data of different lengths. The threshold comes from a linear fit to the log odds ratios required to give a 1% chance of splitting Gaussian data (drawn from a single distribution) for data of various lengths. Note, however, that this relation is not good for stretches of data with lengths of less than about 30 points, and in fact is rather conservative for such short stretches, i.e. such short stretches of data will require relatively larger odds ratios for splitting than longer stretches.
data | [in] A complex data vector |
chunkMin | [in] The minimum allowed segment length |
Definition at line 352 of file ppe_utils.c.
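The recursion and thresholding might be sketched as follows. This is an illustration only, not the LAL implementation: plain C arrays replace gsl_vector_complex, all of the *_sketch names are hypothetical, and the evidence formula and incoherent summation follow the find_change_point description below, so the normalisation may differ from the actual code.

```c
#include <complex.h>
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: log evidence that the complex points data[start..end)
 * are drawn from a single zero-mean Gaussian, analytically marginalised over
 * the unknown standard deviation with a 1/sigma prior. n complex points give
 * 2n real samples with summed squared magnitude S, so
 * ln Z = -ln2 - n*ln(pi) + lnGamma(n) - n*ln(S). */
static double ln_evidence( const double complex *data, size_t start, size_t end ) {
  double S = 0.;
  size_t n = end - start;
  for ( size_t i = start; i < end; i++ ) {
    S += creal( data[i] ) * creal( data[i] ) + cimag( data[i] ) * cimag( data[i] );
  }
  return -M_LN2 - (double)n * log( M_PI ) + lgamma( (double)n ) - (double)n * log( S );
}

/* Find the most favoured change point in data[start..end), returning its index
 * and the natural log odds of "some split" versus "no split". */
static size_t find_change_point_sketch( const double complex *data, size_t start,
                                        size_t end, size_t chunkMin, double *logodds ) {
  double lnsingle = ln_evidence( data, start, end );
  double lntotal = -INFINITY, lnbest = -INFINITY;
  size_t best = start + chunkMin;

  for ( size_t i = start + chunkMin; i + chunkMin <= end; i++ ) {
    double lnsplit = ln_evidence( data, start, i ) + ln_evidence( data, i, end );
    if ( lnsplit > lnbest ) { lnbest = lnsplit; best = i; }
    /* add the split evidences incoherently (log-sum-exp) */
    lntotal = ( lntotal > lnsplit )
              ? lntotal + log1p( exp( lnsplit - lntotal ) )
              : lnsplit + log1p( exp( lntotal - lnsplit ) );
  }
  *logodds = lntotal - lnsingle;
  return best;
}

/* Recursively chop data[start..end) at favoured change points whenever the log
 * odds exceed the empirical threshold T = 4.07 + 1.33*log10(N). Split
 * boundaries are appended to bounds. */
static void chop_data_sketch( const double complex *data, size_t start, size_t end,
                              size_t chunkMin, size_t *bounds, size_t *nbounds ) {
  size_t n = end - start;
  if ( n < 2 * chunkMin ) { return; }

  double logodds = 0.;
  size_t cp = find_change_point_sketch( data, start, end, chunkMin, &logodds );
  double threshold = 4.07 + 1.33 * log10( (double)n );

  if ( logodds > threshold ) {
    bounds[( *nbounds )++] = cp;
    chop_data_sketch( data, start, cp, chunkMin, bounds, nbounds );
    chop_data_sketch( data, cp, end, chunkMin, bounds, nbounds );
  }
}

int main( void ) {
  /* toy data: quiet first half, noisier second half */
  enum { N = 64 };
  double complex data[N];
  srand( 1 );
  for ( size_t i = 0; i < N; i++ ) {
    double scale = ( i < N / 2 ) ? 0.1 : 5.0;
    data[i] = scale * ( ( rand() / (double)RAND_MAX - 0.5 )
                        + I * ( rand() / (double)RAND_MAX - 0.5 ) );
  }
  size_t bounds[N], nbounds = 0;
  chop_data_sketch( data, 0, N, 5, bounds, &nbounds );
  for ( size_t k = 0; k < nbounds; k++ ) { printf( "change point at index %zu\n", bounds[k] ); }
  return 0;
}
```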
UINT4 find_change_point(gsl_vector_complex *data, REAL8 *logodds, UINT4 chunkMin)
Find a change point in complex data.
This function is based on the Bayesian Blocks algorithm of [23], which finds "change points" in data, i.e. points at which the statistics of the data change. It is based on calculating evidence, or odds, ratios. The function first computes the marginal likelihood (or evidence) that the whole of the data is described by a single Gaussian (with mean of zero). This comes from taking a Gaussian likelihood function and analytically marginalising over the standard deviation (using a prior on the standard deviation of \( 1/\sigma \) ), giving (see [8]) a Student's t distribution. Following this, the data is split into two segments (with lengths greater than, or equal to, the minimum chunk length) for all possible split points, and the joint evidence that the two segments consist of independent Gaussians (essentially multiplying the above evidence calculated for each segment separately) is computed and the split point recorded. However, the value required for comparison with that for the whole data set, to give the odds ratio, is the evidence that having any split is better than having no split, so the individual split evidences need to be added incoherently to give the total evidence for a split. The index at which the evidence for a single split is maximised (i.e. the most favoured split point) is the one returned.
data | [in] a complex data vector |
logodds | [in] a pointer to return the natural logarithm of the odds ratio/Bayes factor |
minlength | [in] the minimum chunk length |
Definition at line 428 of file ppe_utils.c.
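To make the marginalisation above concrete, here is one way it can be written out (a sketch under the stated assumptions, not necessarily the exact normalisation used in the code). For a segment of \( n \) complex samples, i.e. \( 2n \) real values with summed squared magnitude \( S = \sum_i |d_i|^2 \), a zero-mean Gaussian likelihood marginalised over the unknown standard deviation \( \sigma \) with a \( 1/\sigma \) prior gives

\[ Z = \int_0^{\infty} \frac{1}{\sigma} (2\pi\sigma^2)^{-n} \exp\left( -\frac{S}{2\sigma^2} \right) {\rm d}\sigma = \frac{\Gamma(n)}{2\pi^{n}S^{n}}, \]

so that

\[ \ln Z = -\ln 2 - n\ln\pi + \ln\Gamma(n) - n\ln S. \]

The odds ratio is then formed by comparing the incoherently summed evidences over all allowed split positions with the single-segment evidence computed from the whole data stretch.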
void rechop_data(UINT4Vector **chunkIndex, UINT4 chunkMax, UINT4 chunkMin)
Chop up the data into chunks smaller than the maximum allowed length.
This function chops any chunks that are greater than chunkMax into chunks smaller than, or equal to, chunkMax and greater than chunkMin. On some occasions this might result in a segment smaller than chunkMin, but these are ignored in the likelihood calculation anyway.
chunkIndex | [in] a vector of segment split positions |
chunkMax | [in] the maximum allowed segment/chunk length |
chunkMin | [in] the minimum allowed segment/chunk length |
Definition at line 518 of file ppe_utils.c.
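A minimal sketch of this rechopping, working on a plain array of segment lengths (the rechop_sketch name is hypothetical, and the real function may distribute the split points differently, e.g. more evenly):

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch: split any segment longer than chunkMax into pieces of
 * at most chunkMax points. The final piece of a split segment may fall below
 * chunkMin, which (as described above) is simply ignored later on. */
size_t rechop_sketch( const size_t *seglens, size_t nsegs, size_t chunkMax,
                      size_t *newlens ) {
  size_t nnew = 0;
  for ( size_t k = 0; k < nsegs; k++ ) {
    size_t remaining = seglens[k];
    while ( remaining > chunkMax ) {
      newlens[nnew++] = chunkMax;
      remaining -= chunkMax;
    }
    if ( remaining > 0 ) { newlens[nnew++] = remaining; }
  }
  return nnew;
}

int main( void ) {
  const size_t seglens[3] = { 10, 75, 30 };
  size_t newlens[16];
  size_t nnew = rechop_sketch( seglens, 3, 30, newlens );
  for ( size_t k = 0; k < nnew; k++ ) { printf( "%zu ", newlens[k] ); }
  printf( "\n" );  /* prints: 10 30 30 15 30 */
  return 0;
}
```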
void merge_data(COMPLEX16Vector *data, UINT4Vector **segments)
Merge adjacent segments.
This function will attempt to remerge adjacent segments if statistically favourable (as calculated by the odds ratio). For each pair of adjacent segments the joint likelihood of them being from two independent distributions is compared to the likelihood that combined they are from one distribution. If the likelihood is highest for the combined segments they are merged.
data | [in] A complex data vector |
segments | [in] A vector of split segment indexes |
Definition at line 591 of file ppe_utils.c.
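In terms of the per-segment marginalised evidence \( Z(\cdot) \) sketched above for find_change_point, the comparison described here amounts to the following criterion (an illustrative statement of the logic, not necessarily the exact thresholding used in the code): adjacent segments \( A \) and \( B \) are merged when

\[ \ln Z(A \cup B) > \ln Z(A) + \ln Z(B), \]

i.e. when the combined data are better described by a single zero-mean Gaussian distribution than by two independent ones.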
INT4 count_csv(CHAR *csvline)
Counts the number of comma separated values in a string.
This function counts the number of comma separated values in a given input string.
csvline | [in] Any string |
Definition at line 674 of file ppe_utils.c.
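A minimal sketch of the counting logic (the count_csv_sketch name is hypothetical, and plain C types are used in place of the LAL CHAR and INT4 types):

```c
#include <stdio.h>

/* Count comma-separated values: one more value than there are commas. */
int count_csv_sketch( const char *csvline ) {
  int count = 1;
  for ( const char *c = csvline; *c != '\0'; c++ ) {
    if ( *c == ',' ) { count++; }
  }
  return count;
}

int main( void ) {
  printf( "%d\n", count_csv_sketch( "J0534+2200,J0835-4510,J1939+2134" ) );  /* prints 3 */
  return 0;
}
```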
void check_and_add_fixed_variable(LALInferenceVariables *vars, const char *name, void *value, LALInferenceVariableType type)
Add a variable, checking first if it already exists and is of type LALINFERENCE_PARAM_FIXED, and if so removing it before re-adding it.
This function is for use as an alternative to LALInferenceAddVariable, which does not allow LALINFERENCE_PARAM_FIXED variables to be changed. If the variable already exists and is of type LALINFERENCE_PARAM_FIXED, then it will first be removed and then re-added.
Definition at line 778 of file ppe_utils.c.
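A sketch of the behaviour described above might look as follows. This is an assumption-laden illustration rather than the actual source: it assumes the standard LALInference accessor functions (LALInferenceCheckVariable, LALInferenceGetVariableVaryType, LALInferenceRemoveVariable, LALInferenceAddVariable) with their usual signatures, and the _sketch name is hypothetical.

```c
#include <lal/LALInference.h>

/* Hedged sketch of the behaviour described above (not the actual LAL source):
 * if the variable already exists as LALINFERENCE_PARAM_FIXED, remove it before
 * re-adding it, since fixed parameters cannot otherwise be changed. */
void check_and_add_fixed_variable_sketch( LALInferenceVariables *vars, const char *name,
                                          void *value, LALInferenceVariableType type ) {
  if ( LALInferenceCheckVariable( vars, name ) &&
       LALInferenceGetVariableVaryType( vars, name ) == LALINFERENCE_PARAM_FIXED ) {
    LALInferenceRemoveVariable( vars, name );
  }
  LALInferenceAddVariable( vars, name, value, type, LALINFERENCE_PARAM_FIXED );
}

/* Example use (hypothetical parameter name): update a fixed REAL8 value.
 *   REAL8 h0 = 1e-24;
 *   check_and_add_fixed_variable_sketch( vars, "H0", &h0, LALINFERENCE_REAL8_t );
 */
```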
TimeCorrectionType XLALAutoSetEphemerisFiles(CHAR **efile, CHAR **sfile, CHAR **tfile, PulsarParameters *pulsar, INT4 gpsstart, INT4 gpsend)
Automatically set the solar system ephemeris file based on environment variables and data time span.
This function will attempt to construct the file names for the Sun, Earth and time correction ephemeris files based on the ephemeris used for the equivalent TEMPO(2) pulsar timing information. It assumes that the ephemeris files are those constructed between 2000 and 2020. The path to the files is not required as this will be found by XLALInitBarycenter.
efile | [in] a string that will return the Earth ephemeris file |
sfile | [in] a string that will return the Sun ephemeris file |
tfile | [in] a string that will return the time correction file |
pulsar | [in] the pulsar parameters read from a .par file |
gpsstart | [in] the GPS time of the start of the data |
gpsend | [in] the GPS time of the end of the data |
Definition at line 717 of file ppe_utils.c.