# Running BayesWave: How-Tos
This page describes how to generate a BayesWave analysis through a few examples.

To see specific examples of how to use BayesWave for different analyses, see
[here](https://git.ligo.org/lscsoft/bayeswave/blob/master/doc/EXAMPLES.md)


Note:
 * LDG=LIGO Data Grid: comprises the "usual" LIGO clusters like CIT, LHO, LLO, Nemo, Atlas, ...
 * OSG=Open Science Grid: international network of shared compute resources, outside of direct LIGO control.

The first examples below use a canonical GW150914 analysis to demonstrate how each
run option works.

**Table of contents**
 1. [Run BayesWave From A Local Installation](#run-bayeswave-using-a-local-installation) on a LIGO
    cluster like CIT
 1. [Run BayesWave On The OSG](#run-bayeswave-on-the-osg) 


## Run BayesWave Using A Local Installation

**WARNING**: we're about to set up an analysis of GW150914 using a MEANINGLESS
MCMC configuration.  The objective is to demonstrate the workflow, NOT a
scientific result.  Please adjust the configuration file accordingly if you
desire science.

**Example directory:** This example lives in the repository [here](https://git.ligo.org/lscsoft/bayeswave/tree/master/BayesWaveUtils/bayeswave_pipe_examples/LDG-GW150914)

This is the setup most LIGO users will be familiar with: clone a repository,
build and install software, execute an analysis.

Here, we assume you have compiled and installed an appropriate branch of
BayesWave and BayesWavePipe. See
[here](https://git.ligo.org/lscsoft/bayeswave/blob/master/doc/INSTALLATION.md)
for installation information.

 1. Copy
    [this](https://git.ligo.org/lscsoft/bayeswave/blob/master/BayesWaveUtils/bayeswave_pipe_examples/LDG-GW150914/LDG-GW150914.ini)
    configuration file to your working directory.
 1. Modify paths in the `[engine]` section to point at the desired version of
    the BayesWave executables and libraries.
 1. Run the pipeline to set up an analysis of a single trigger time (if you
    installed BayesWavePipe with e.g. `--user`, it should already be in your path):

 ```
 bayeswave_pipe LDG-GW150914.ini \
    --trigger-time 1126259462.420000076 \
    --workdir LDG-GW150914
 ```

 This sets up a workflow for a BayesWave analysis of a single trigger time: that
 of GW150914, of course.  

The configuration file specifies the various BayesWave commandline options, as
well as things like condor memory requests, accounting tags etc.

 For convenience, this command is provided in [makework-LDG-GW150914.sh](https://git.ligo.org/lscsoft/bayeswave/blob/master/BayesWaveUtils/bayeswave_pipe_examples/LDG-GW150914/makework-LDG-GW150914.sh).
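
Before running the pipeline it can help to confirm that the `[engine]` paths edited in step 2 actually exist on disk. The following is a minimal sketch and is not part of BayesWavePipe: `missing_engine_paths` is a hypothetical helper, and the option names it checks are assumed to match those shown in the container example later on this page. Adjust them to whatever your configuration file actually contains.

```python
import configparser
import os

# Assumed option names, copied from the container [engine] example on this page.
ENGINE_OPTIONS = (
    "bayeswave",
    "bayeswave_post",
    "megaplot",
    "megasky",
    "postprocess",
    "utils",
)


def missing_engine_paths(ini_path):
    """Return {option: problem} for [engine] options that are unset or
    point at a path that does not exist."""
    # interpolation=None keeps any '%' characters in the file literal.
    parser = configparser.ConfigParser(interpolation=None)
    with open(ini_path) as handle:
        parser.read_file(handle)

    problems = {}
    for option in ENGINE_OPTIONS:
        value = parser.get("engine", option, fallback=None)
        if value is None:
            problems[option] = "not set"
        elif not os.path.exists(os.path.expanduser(value)):
            problems[option] = value
    return problems
```

Calling `missing_engine_paths("LDG-GW150914.ini")` returns an empty dictionary when every listed option resolves to an existing file or directory.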

This sets up four condor jobs:
 1. `BayesWave`: main MCMC sampling.
 1. `BayesWavePost`: combine samples to compute waveform reconstructions and
    moments.
 1. `megaplot.py`: plot waveforms, moment distributions & generate web output.
 1. `megasky.py`: compute and plot posterior probability density for source
    sky-location ("skymap").

Workflow files (e.g., .dag, .sub, ...) are written to the directory specified
with `--workdir`.  That then contains a single output directory for each
BayesWave analysis time specified (in this case 1) with all the usual BayesWave
analysis products, including webpage and plots.

That's it!  To start the analysis, simply follow the on-screen prompt:

```
    To submit:
        cd LDG-GW150914
        condor_submit_dag bayeswave_LDG-GW150914.dag
```
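
If you want to drive the setup step from a script, for example to prepare one work directory per trigger, the `bayeswave_pipe` call shown earlier can be assembled as an argument list. This is only a sketch: `pipe_command` is a hypothetical helper, and it uses just the configuration file argument and the `--trigger-time` and `--workdir` options that appear in the example above.

```python
import shlex


def pipe_command(config, trigger_time, workdir):
    """Build the argument list for one bayeswave_pipe setup call.

    trigger_time should be passed as a string so that its digits are
    not altered by floating-point conversion.
    """
    return [
        "bayeswave_pipe",
        config,
        "--trigger-time", str(trigger_time),
        "--workdir", workdir,
    ]


# Map of work directory -> GPS trigger time (values from the example above).
triggers = {"LDG-GW150914": "1126259462.420000076"}

for workdir, gps in triggers.items():
    # Print the command for inspection instead of running it.
    print(shlex.join(pipe_command("LDG-GW150914.ini", gps, workdir)))
```

To execute rather than print, each list can be handed to `subprocess.run(cmd, check=True)`.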

## Run BayesWave On The OSG

**Example directory:** This example lives in the repository [here](https://git.ligo.org/lscsoft/bayeswave/tree/master/BayesWaveUtils/bayeswave_pipe_examples/OSG-O2background)

In this example we compute the signal evidence for 100 CWB time-slide
background triggers, read from a CWB trigger file.  Again, the job configuration
is designed for minimal run times; the results should *not* be considered
scientifically valid.

The Open Science Grid (OSG) offers a multitude of additional resources which
are ideal for offline injection and background analyses.  BayesWave's OSG
deployment relies on singularity containers.  Briefly:
 * *Container image*: a lightweight, stand-alone, executable package of a
 piece of software that includes everything needed to run it: code, runtime,
 system tools, system libraries, settings.
 * *Container*: instantiation of an image.
 * *Docker*: popular software for creating and running containers.
 * *Singularity*: slightly less popular software for creating and running
 containers but favored by admins of scientific clusters for security reasons.
 * *Registry*: a service which manages images.  Sort of like a repository.
 * *CVMFS*: a scalable, reliable and low-maintenance software distribution
 service (the `/cvmfs` directory hosts singularity images, software and, in some
 sense, frame data).


From the user's perspective, the procedure for running from a container is nearly
identical to the one above (minus installing anything): we just add the path to the
container in the configuration file and point at the correct executables:

 ```
bayeswave_pipe \
    --workdir O2background \
    --cwb-trigger-list 100_cwb_triggers.dat \
    --osg-jobs \
    --glide-in \
    --skip-post \
    --skip-megapy
 ```

where the `[engine]` section of
[LDG-GW150914-singularity.ini](https://git.ligo.org/lscsoft/bayeswave/blob/master/BayesWaveUtils/bayeswave_pipe_examples/LDG-GW150914-singularity/LDG-GW150914-singularity.ini)
now contains the path to the desired
singularity image:

```
singularity="/cvmfs/ligo-containers.opensciencegrid.org/lscsoft/bayeswave:master"
```

**Important Note**
BayesWave and all post-processing codes are baked into the container in
`/opt/bayeswave`.  To use the bayeswave executables in the container, the
`[engine]` section *must* read:

```
bayeswave=/opt/bayeswave/bin/BayesWave
bayeswave_post=/opt/bayeswave/bin/BayesWavePost
megaplot=/opt/bayeswave/postprocess/megaplot.py
megasky=/opt/bayeswave/postprocess/skymap/megasky.py
postprocess=/opt/bayeswave/postprocess
utils=/opt/bayeswave/utils
```
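
Because these values must match exactly, a quick comparison against the list above can catch a stale local path before any jobs are submitted. The sketch below is illustrative only: `engine_mismatches` is a hypothetical helper, not something BayesWavePipe provides, and it assumes the configuration file parses with Python's standard `configparser`.

```python
import configparser

# Expected values, copied from the [engine] listing above.
CONTAINER_ENGINE = {
    "bayeswave": "/opt/bayeswave/bin/BayesWave",
    "bayeswave_post": "/opt/bayeswave/bin/BayesWavePost",
    "megaplot": "/opt/bayeswave/postprocess/megaplot.py",
    "megasky": "/opt/bayeswave/postprocess/skymap/megasky.py",
    "postprocess": "/opt/bayeswave/postprocess",
    "utils": "/opt/bayeswave/utils",
}


def engine_mismatches(ini_text):
    """Return {option: found_value} for [engine] options that differ from
    the container paths; found_value is None when the option is absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    found = dict(parser["engine"]) if parser.has_section("engine") else {}
    return {
        option: found.get(option)
        for option, expected in CONTAINER_ENGINE.items()
        if found.get(option) != expected
    }
```

An empty result means every executable and directory listed above points into `/opt/bayeswave` as required.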

**Other features to note**
 * Python code is also installed to `/opt/bayeswave` in the container (contrast
     with the `src` location when running from your own build)
 * The container can see your `/home`: you are free to point to your own
  versions of the bayeswave executables for e.g., code development. 
 * No bayeswave installation required (You do still need BayesWavePipe, though)
 * All dependencies are baked into the image
 * You are guaranteed to find exactly the same image on all clusters (with
     CVMFS) when you use that image path:  no need to maintain multiple
 BayesWave installations at different sites!

### Power Users
Power users may wish to reproduce, at the command line, the exact command
that a condor job runs.  In that case singularity must be executed with the
`--writable` and `--bind` options so that the job can write to your `/home` and
read frame data.  To run a singularity job which reads frames at CIT
(which live in `/hdfs`), you need to run:

```
singularity exec \
    --writable \
    --bind /hdfs \
    /cvmfs/ligo-containers.opensciencegrid.org/lscsoft/bayeswave:master \
    /opt/bayeswave/bin/BayesWave "$@"
```
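
The same invocation can be assembled programmatically when more than one directory has to be bound. This is a sketch under assumptions: `singularity_command` is a hypothetical helper, and it uses only the `exec`, `--writable` and `--bind` pieces of the command shown above, repeating `--bind` once per path.

```python
import shlex

IMAGE = "/cvmfs/ligo-containers.opensciencegrid.org/lscsoft/bayeswave:master"


def singularity_command(executable, args=(), bind_paths=("/hdfs",), image=IMAGE):
    """Build the argument list for running `executable` inside the image."""
    cmd = ["singularity", "exec", "--writable"]
    for path in bind_paths:
        # One --bind option per directory that must be visible in the container.
        cmd += ["--bind", path]
    cmd += [image, executable, *args]
    return cmd


# Print the command for inspection; BayesWave arguments would go in `args`.
print(shlex.join(singularity_command("/opt/bayeswave/bin/BayesWave")))
```

With the default arguments, the printed command matches the one above apart from the trailing `"$@"`, which stands for whatever BayesWave arguments the condor job passes through.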


### Power Users: What The Pipeline Handles
There are a host of practical differences and additional options available
which the general user might not care about and which are handled by the pipeline:
 * OSG workflows require file transfers: input files must be transferred with
 the jobs and the job output must be shipped back to the submission site.  This
 is handled by submission file directives like `should_transfer_files`, which are
 set up by BayesWavePipe.
 * Frame data is distributed using the CernVM file system (CVMFS).
 Consequently, the datafind command must query a specific server
 (`datafind.ligo.org:443`) which returns frame locations in CVMFS; these
 locations are common to all sites.  This removes the need for data discovery at
 specific sites and we don't have to deal with Pegasus.  This server is used
 whenever `--osg-jobs` is passed.
 * At some OSG and LDG sites (e.g., CIT), the CVMFS directories for frames are
 really symlinks.  The underlying parent directory for CVMFS frame data must be
 bound into the singularity container.  That is, the image must contain
 directories like `/cvmfs`, `/hdfs`, `/hadoop` and more as we get more sites.
 * Parts of our CVMFS-based container images contain the `@` symbol.
 Singularity v2.2 and earlier cannot handle this symbol.
 The submission file contains a `regexp` requirement which ensures that the
 `OSG_SINGULARITY_VERSION` attribute is later than 2.2.
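
To make the version constraint in the last point concrete, the check can be expressed in a few lines of Python. This is not the `regexp` expression that BayesWavePipe writes into the submission file; it is a hypothetical illustration of the same rule, and it assumes that patch releases such as 2.2.1 count as version 2.2.

```python
import re


def singularity_is_supported(version):
    """Return True if `version` (e.g. "2.3.1") is later than 2.2.

    Only the major and minor numbers are compared, so "2.2.1" is
    treated as 2.2 and therefore rejected (an assumption of this sketch).
    """
    match = re.match(r"\s*v?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuple comparison orders versions numerically, so 2.10 > 2.2.
    return (major, minor) > (2, 2)
```

Comparing integer tuples rather than strings avoids the pitfall where `"2.10"` sorts before `"2.2"` lexically.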