.. _cbc-analysis:

CBC Analysis (Offline)
========================

To start an offline CBC analysis, you'll need a configuration file that points at the start/end times to analyze, the input data products (e.g. template bank, mass model), and other workflow-related configuration.

All the steps below assume a Singularity container with the GstLAL software stack installed. Other methods of installation follow a similar procedure, with one caveat: workflows will not work on the Open Science Grid (OSG). For a DAG on the OSG IGWN grid, you must use a Singularity container on cvmfs, set the ``profile`` in ``config.yml`` to ``osg``, and make sure to submit the DAG from an OSG node. Otherwise the workflow is the same.

When running without a Singularity container, the commands below should be modified accordingly, e.g. run ``gstlal_inspiral_workflow init -c config.yml`` instead of ``singularity exec  gstlal_inspiral_workflow init -c config.yml``.

For ICDS gstlalcbc shared accounts, the ``env.sh`` contents must be changed and, instead of running ``$ X509_USER_PROXY=/path/to/x509_proxy ligo-proxy-init -p albert.einstein``, run ``source env.sh``. (Details are below.)

Running Workflows
^^^^^^^^^^^^^^^^^^

1. Build Singularity image (optional)
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

NOTE: If you are using a reference Singularity container (suitable in most cases), you can skip this step. The ```` throughout this doc refers to the ``singularity-image`` specified in the ``condor`` section of your configuration.

If not using the reference Singularity container, say for local development, you can specify a path to a local container and use that for the (non-OSG) workflow. To pull a container with gstlal installed, run:

.. code:: bash

   $ singularity build --sandbox --fix-perms  docker://containers.ligo.org/lscsoft/gstlal:master

To use a branch other than master, replace ``master`` in the above command with the name of the desired branch.
To use a custom build instead, gstlal will need to be installed into the container from your modified source code. For installation instructions, see the `installation page `_.

2. Set up workflow
""""""""""""""""""""

First, create a new analysis directory and switch to it:

.. code:: bash

   $ mkdir
   $ cd
   $ mkdir bank mass_model idq dtdphi

Default configuration files and environment (``env.sh``) for a variety of different banks are contained in the `offline-configuration `_ repository. One can run the commands below to grab the configuration files, or clone the repository and copy the files as needed into the analysis directory. To download data files (mass model, template banks) that may be needed for offline runs, see the `README `_ in the offline-configuration repo. Move the template bank(s) into ``bank`` and the mass model into ``mass_model``.

For example, to grab all the relevant files for a small BNS dag:

.. code:: bash

   $ curl -O https://git.ligo.org/gstlal/offline-configuration/-/raw/main/configs/bns-small/config.yml
   $ curl -O https://git.ligo.org/gstlal/offline-configuration/-/raw/main/env.sh
   $ source /cvmfs/oasis.opensciencegrid.org/ligo/sw/conda/etc/profile.d/conda.sh
   $ conda activate igwn
   $ dcc archive --archive-dir=. --files -i T2200318-v2
   $ conda deactivate

Then move the template bank, mass model, idq file, and dtdphi file into their corresponding directories.

When running an analysis on the ICDS cluster in the gstlalcbc shared account, the contents of ``env.sh`` must be changed to what is given below. In addition, where the tutorial says to run ``ligo-proxy-init -p``, instead run ``source env.sh`` on the modified ``env.sh``. When running on non-gstlalcbc shared accounts on ICDS, or when running on other clusters, ``env.sh`` does not need to be modified, and ``ligo-proxy-init -p`` can be run as in the tutorial.
.. code-block:: bash

   export PYTHONUNBUFFERED=1
   unset X509_USER_PROXY
   export X509_USER_CERT=/ligo/home/ligo.org/gstlalcbc/.cert/gstlalcbc_icds_robot.key.pem
   export X509_USER_KEY=/ligo/home/ligo.org/gstlalcbc/.cert/gstlalcbc_icds_robot.key.pem
   export GSTLAL_FIR_WHITEN=0

Now, we'll need to modify the configuration as needed to run the analysis. At the very least, set the start/end times and the instruments to run over:

.. code-block:: yaml

   start: 1187000000
   stop: 1187100000
   instruments: H1L1

Ensure the template bank, mass model, idq file, and dtdphi file are pointed to in the configuration:

.. code-block:: yaml

   data:
     template-bank: bank/gstlal_bank_small.xml.gz

.. code-block:: yaml

   prior:
     mass-model: bank/mass_model_small.h5
     idq-timeseries: idq/H1L1-IDQ_TIMESERIES-1239641219-692847.h5
     dtdphi: dtdphi/inspiral_dtdphi_pdf.h5

If you're creating a summary page for results, you'll need to point at a location where they are web-viewable:

.. code-block:: yaml

   summary:
     webdir: ~/public_html/

If you're running on LIGO compute resources and your username doesn't match your albert.einstein username, you'll additionally need to specify the accounting group user for condor to track accounting information:

.. code-block:: yaml

   condor:
     accounting-group-user: albert.einstein

In addition, update the ``singularity-image`` in the ``condor`` section of your configuration if needed:

.. code-block:: yaml

   condor:
     singularity-image: /cvmfs/singularity.opensciencegrid.org/lscsoft/gstlal:master

If not using a reference Singularity image, you can replace this with the full path to a local singularity container ````. For more detailed configuration options, take a look at the :ref:`configuration section ` below.

If you haven't installed site-specific profiles yet (per-user), you can run:

.. code:: bash

   $ singularity exec  gstlal_grid_profile install

which will install configurations that are site-specific, i.e. ``ldas`` and ``icds``.
You can select which profile to use in the ``condor`` section:

.. code-block:: yaml

   condor:
     profile: ldas

For an OSG IGWN grid run, use ``osg``. To view which profiles are available, you can run:

.. code:: bash

   $ singularity exec  gstlal_grid_profile list

Note, you can install :ref:`custom profiles ` as well.

Once you have the configuration, data products, and grid profiles installed, you can set up the Makefile using the configuration, which we'll then use for everything else, including the data files needed for the workflow, the workflow itself, the summary page, etc.

.. code:: bash

   $ singularity exec  gstlal_inspiral_workflow init -c config.yml

By default, this will generate the full workflow. If you want to only run the filtering step, a rerank, or an injection-only workflow, you can instead specify the workflow as well, e.g.

.. code:: bash

   $ singularity exec  gstlal_inspiral_workflow init -c config.yml -w injection

for an injection-only workflow. If you already have a Makefile and need to update it based on an updated configuration, run ``gstlal_inspiral_workflow`` with ``--force``.

Next, if you are accessing non-public (non-GWOSC) data, you'll need to set up your proxy to ensure you can get access to LIGO data:

.. code:: bash

   $ X509_USER_PROXY=/path/to/x509_proxy ligo-proxy-init -p albert.einstein

Note that we are running this step outside of Singularity. This is because ``ligo-proxy-init`` is not currently installed within the image.

If you are running on the ICDS gstlalcbc shared account, do not run the command above. Instead, run:

.. code:: bash

   $ source env.sh

Also update the configuration accordingly (if needed):

.. code-block:: yaml

   source:
     x509-proxy: /path/to/x509_proxy

Finally, set up the rest of the workflow including the DAG for submission:

.. code:: bash

   $ singularity exec -B $TMPDIR  make dag

If running on the OSG IGWN grid, make sure to submit the DAGs from the OSG node. This should create condor DAGs for the workflow.
Mounting a temporary directory is important as some of the steps will leverage a temporary space to generate files.

If one desires to see detailed error messages, add ```` to ``environment`` in the submit (``*.sub``) files by running:

.. code:: bash

   $ sed -i '/^environment = / s/\"$/ PYTHONUNBUFFERED=1\"/' *.sub

3. Launch workflows
"""""""""""""""""""""""""

.. code:: bash

   $ source env.sh
   $ make launch

This is simply a thin wrapper around ``condor_submit_dag``, launching the DAG in question. You can monitor the DAG with Condor CLI tools such as ``condor_q`` and ``tail -f full_inspiral_dag.dag.dagman.out``.

4. Generate Summary Page
"""""""""""""""""""""""""

After the DAG has completed, you can generate the summary page for the analysis:

.. code:: bash

   $ singularity exec  make summary

To make an open-box page after this, run:

.. code:: bash

   $ make unlock

.. _analysis-configuration:

Configuration
^^^^^^^^^^^^^^

The top-level configuration consists of the analysis times and detector configuration:

.. code-block:: yaml

   start: 1187000000
   stop: 1187100000
   instruments: H1L1
   min-instruments: 1

These set the start and stop GPS times of the analysis, plus the detectors to use (H1 = Hanford, L1 = Livingston, V1 = Virgo). There is a convenient online converter for GPS times at https://www.gw-openscience.org/gps/, and you can also use the ``gpstime`` program. Note that these start and stop times have no knowledge of science-quality data; the actual science-quality data that are analyzed are typically a subset of the total time. Information about which detectors were on at different times is available at https://www.gw-openscience.org/data/.

``min-instruments`` sets the minimum number of instruments we will allow to form an event, e.g. setting it to 1 means the analysis will consider single-detector events, while 2 means we will only consider events that are coincident across at least 2 detectors.

Section: Data
""""""""""""""
.. code-block:: yaml

   data:
     template-bank: bank/gstlal_bank_small.xml.gz
     analysis-dir: /path/to/analysis/dir

The ``template-bank`` option points to the template bank file. These are xml files that follow the LIGOLW (LIGO light weight) schema. The template bank in particular contains a table that lists the parameters of all of the templates; it does not contain the actual waveforms themselves. Metadata such as the waveform approximant and the frequency cutoffs are also listed in this file.

The ``analysis-dir`` option is used if the user wishes to point to an existing analysis to perform a rerank or an injection-only workflow. This grabs existing files from that directory to seed the rerank/injection workflows.

One can use multiple sub template banks. In this case, the configuration might look like:

.. code-block:: yaml

   data:
     template-bank:
       bns: bank/sub_bank/bns.xml.gz
       nsbh: bank/sub_bank/nsbh.xml.gz
       bbh_1: bank/sub_bank/bbh_low_q.xml.gz
       bbh_2: bank/sub_bank/other_bbh.xml.gz
       imbh: bank/sub_bank/imbh_low_q.xml.gz

Section: Source
""""""""""""""""

.. code-block:: yaml

   source:
     data-source: frames
     data-find-server: datafind.gw-openscience.org
     frame-type:
       H1: H1_GWOSC_O2_16KHZ_R1
       L1: L1_GWOSC_O2_16KHZ_R1
     channel-name:
       H1: GWOSC-16KHZ_R1_STRAIN
       L1: GWOSC-16KHZ_R1_STRAIN
     sample-rate: 4096
     frame-segments-file: segments.xml.gz
     frame-segments-name: datasegments
     x509-proxy: x509_proxy

The ``data-find-server`` option points to a server that is queried to find the location of frame files. The address shown above is a publicly available server that will return the locations of public frame files on cvmfs. Each frame file has a type that describes the contents of the frame file, and may contain multiple channels of data, hence the channel names must also be specified. ``frame-segments-file`` points to a LIGOLW xml file that describes the actual times to analyze, i.e. it lists the times that science-quality data are available.
These files are generalized enough that they could describe different types of data, so ``frame-segments-name`` is used to specify which segment list to consider. In practice, the segments file we produce will only contain the segments we want. Users will typically not change any of these options once they are set for a given instrument and observing run. ``x509-proxy`` is the path to your x509 proxy.

Section: Segments
""""""""""""""""""

The ``segments`` section specifies how to generate segments and vetoes for the workflow. There are two backends that determine where to query segments and vetoes from: ``gwosc`` (public) and ``dqsegdb`` (authenticated).

An example of configuration with the ``gwosc`` backend looks like:

.. code-block:: yaml

   segments:
     backend: gwosc
     vetoes:
       category: CAT1

Here, the ``backend`` is set to ``gwosc``, so both segments and vetoes are determined by querying the GWOSC server. There is no additional configuration needed to query segments, but for vetoes, we also need to specify the ``category`` used for vetoes. This can be one of ``CAT1``, ``CAT2``, or ``CAT3``. By default, segments are generated by applying ``CAT1`` vetoes as recommended by the Detector Characterization group.

An example of configuration with the ``dqsegdb`` backend looks like:

.. code-block:: yaml

   segments:
     backend: dqsegdb
     science:
       H1: DCS-ANALYSIS_READY_C01:1
       L1: DCS-ANALYSIS_READY_C01:1
       V1: ITF_SCIENCE:2
     vetoes:
       category: CAT1
       veto-definer:
         file: H1L1V1-HOFT_C01_V1ONLINE_O3_CBC.xml
         version: O3b_CBC_H1L1V1_C01_v1.2
         epoch: O3

Here, the ``backend`` is set to ``dqsegdb``, so both segments and vetoes are determined by querying the DQSEGDB server. To query segments, one needs to specify the flag used per instrument to query segments from. For vetoes, we need to specify the ``category`` used for vetoes, as with the ``gwosc`` backend. Additionally, a veto definer file is used to determine which flags are used for which veto categories.
The file itself need not be provided; the ``file``, ``version`` and ``epoch`` fully specify how to access the veto definer file used for generating vetoes.

Section: PSD
""""""""""""""

.. code-block:: yaml

   psd:
     fft-length: 8
     sample-rate: 4096

The PSD estimation method used by GstLAL is a modified median-Welch method that is described in detail in Section IIB of Ref. [1]. The FFT length sets the length in seconds of each segment that is Fourier transformed. The default whitener will use zero-padding of one-fourth the FFT length on either side and will overlap Fourier-transformed segments by one-fourth the FFT length. For example, an ``fft-length`` of 8 means that each Fourier-transformed segment used in the PSD estimation (and consequently the whitener) will contain 4 seconds of data with 2 seconds of zero padding on either side, and will overlap the next segment by 2 seconds (i.e. the last two seconds of data in one segment will be the first two seconds of data in the following segment).

Section: SVD
""""""""""""""

.. code-block:: yaml

   svd:
     f-low: 20.0
     num-chi-bins: 1
     sort-by: mchirp
     approximant:
       - 0:1.73:TaylorF2
       - 1.73:1000:SEOBNRv4_ROM
     tolerance: 0.9999
     max-f-final: 1024.0
     num-split-templates: 200
     overlap: 30
     num-banks: 5
     samples-min: 2048
     samples-max-64: 2048
     samples-max-256: 2048
     samples-max: 4096
     autocorrelation-length: 701
     max-duration: 128
     manifest: svd_manifest.json

``f-low`` sets the lower frequency cutoff for the analysis in Hz. ``num-chi-bins`` is a tunable parameter related to the template bank binning procedure; specifically, it sets the number of effective-spin parameter bins to use in the chirp-mass / effective-spin binning procedure described in Sec. IID and Fig. 6 of Ref. [1]. ``sort-by`` selects the template sort column. This controls how to bin the bank into sub-banks suitable for the SVD decomposition. It can be ``mchirp`` (sorts by chirp mass), ``mu`` (sorts by the mu1 and mu2 coordinates), or ``template_duration`` (sorts by template duration).
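The quarter-FFT-length segment arithmetic described in the PSD section above can be sketched as a small helper. This is a plain illustration of the stated rule, not code from the GstLAL library:

```python
def whitener_layout(fft_length):
    """Return (data_seconds, zero_pad_seconds, overlap_seconds) for the
    default whitener, following the quarter-FFT-length rule described
    in the PSD section. Illustrative only; not GstLAL library code."""
    zero_pad = fft_length // 4          # zero padding on either side
    data = fft_length - 2 * zero_pad    # seconds of actual data per segment
    overlap = fft_length // 4           # overlap with the next segment
    return data, zero_pad, overlap

# fft-length of 8 -> 4 s of data, 2 s of padding, 2 s of overlap
print(whitener_layout(8))  # -> (4, 2, 2)
```

With the default ``fft-length`` of 8 this reproduces the 4/2/2-second layout worked through in the text.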
``approximant`` specifies the waveform approximant that should be used, along with the chirp-mass bounds in which to use that approximant. ``0:1000:TaylorF2`` means use the TaylorF2 approximant for waveforms from systems with chirp masses between 0 and 1000 solar masses. Multiple waveforms and chirp-mass bounds can be provided.

``tolerance`` is a tunable parameter related to the truncation of SVD basis vectors. A tolerance of 0.9999 means the targeted matched-filter inner product of the original waveform and the waveform reconstructed from the SVD is 0.9999. ``max-f-final`` sets the maximum frequency of the template.

``num-split-templates``, ``overlap``, and ``num-banks`` are tunable parameters related to the SVD process. ``num-split-templates`` sets the number of templates to decompose at a time; ``overlap`` sets the number of templates from adjacent template bank regions to pad the region being considered with in order to actually compute the SVD (this helps the performance of the SVD, and these pad templates are not reconstructed); ``num-banks`` sets the number of sets of decomposed templates to include in a given bin for the analysis. For example, a ``num-split-templates`` of 200, ``overlap`` of 30, and ``num-banks`` of 5 means that each SVD bank file will contain 5 decomposed sets of 200 templates, where the SVD was computed using an additional 15 templates on either side of the 200 (as defined by the binning procedure).

``samples-min``, ``samples-max-64``, ``samples-max-256``, and ``samples-max`` are tunable parameters related to the template time-slicing procedure used by GstLAL (described in Sec. IID and Fig. 7 of Ref. [1], and references therein). Templates are sliced in time before the SVD is applied, and each time slice is sampled only at the rate necessary for the highest frequency in that slice (rounded up to a power of 2). For example, the low-frequency part of a waveform may only be sampled at 32 Hz, while the high-frequency part may be sampled at 2048 Hz (depending on user settings).
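The round-up-to-a-power-of-two rule above can be illustrated with a short sketch. It assumes the "rate necessary" for a slice is the usual Nyquist criterion of twice the slice's highest frequency; this is an illustration, not GstLAL library code:

```python
def slice_sample_rate(f_high):
    """Smallest power-of-two sample rate (Hz) satisfying the Nyquist
    criterion for a time slice whose highest frequency is f_high (Hz).
    Illustrative sketch of the rounding rule described in the text."""
    rate = 1
    while rate < 2 * f_high:
        rate *= 2
    return rate

# a low-frequency slice vs. a high-frequency slice
print(slice_sample_rate(14))    # -> 32
print(slice_sample_rate(1000))  # -> 2048
```

This is why early, low-frequency portions of a long template can be handled at 32 Hz while the late inspiral requires 2048 Hz or more.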
``samples-min`` sets the minimum number of samples to use in any time slice. ``samples-max`` sets the maximum number of samples to use in any time slice with a sample rate below 64 Hz; ``samples-max-64`` sets the maximum number of samples to use in any time slice with sample rates between 64 Hz and 256 Hz; ``samples-max-256`` sets the maximum number of samples to use in any time slice with a sample rate greater than 256 Hz.

``autocorrelation-length`` sets the number of samples to use when computing the autocorrelation-based test statistic, described in Sec. IIIC of Ref. [1]. ``max-duration`` sets the maximum template duration in seconds; one can choose not to use ``max-duration``. ``manifest`` sets the name of a file that will contain metadata about the template bank bins.

If one uses multiple sub template banks, SVD configurations can be specified for each sub template bank. Reference the `mario config `_.

Users will typically not change these options.

Section: Filter
""""""""""""""""

.. code-block:: yaml

   filter:
     fir-stride: 1
     min-instruments: 1
     coincidence-threshold: 0.01
     ht-gate-threshold: 0.8:15.0-45.0:100.0
     veto-segments-file: vetoes.xml.gz
     time-slide-file: tisi.xml
     injection-time-slide-file: inj_tisi.xml
     time-slides:
       H1: 0:0:0
       L1: 0.62831:0.62831:0.62831
     injections:
       bns:
         file: bns_injections.xml
         range: 0.01:1000.0

``fir-stride`` is a tunable parameter related to the matched-filter procedure, setting the length in seconds of the output of the matched-filter element. ``coincidence-threshold`` is the time in seconds to add to the light-travel time when searching for coincidences between detectors. ``ht-gate-threshold`` sets the h(t) gate threshold as a function of chirp mass. The h(t) gate threshold is a value over which the output of the whitener plus some padding will be set to zero (as described in Sec. IIC of Ref. [1]).
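The chirp-mass-dependent ``ht-gate-threshold`` range syntax (``mchirp1:threshold1-threshold2:mchirp2``) amounts to a linear interpolation between two (chirp mass, threshold) points. A hedged sketch of that mapping follows; the clamping behavior at the endpoints is an assumption for illustration, not taken from the GstLAL source:

```python
def ht_gate_threshold(mchirp, spec="0.8:15.0-45.0:100.0"):
    """Linearly interpolate the h(t) gate threshold for a template-bank
    bin's maximum chirp mass, from a spec 'mc1:t1-t2:mc2'.
    Endpoint clamping here is an illustrative assumption."""
    low, high = spec.split("-")
    mc1, t1 = (float(x) for x in low.split(":"))
    t2, mc2 = (float(x) for x in high.split(":"))
    if mchirp <= mc1:
        return t1
    if mchirp >= mc2:
        return t2
    # linear function between (mc1, t1) and (mc2, t2)
    return t1 + (t2 - t1) * (mchirp - mc1) / (mc2 - mc1)

print(ht_gate_threshold(0.8))    # -> 15.0
print(ht_gate_threshold(100.0))  # -> 45.0
```

So a bin whose heaviest template has chirp mass 0.8 solar masses gates at 15, one at 100 solar masses gates at 45, and intermediate bins fall on the line between.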
``0.8:15.0-45.0:100.0`` means that a template bank bin that has a maximum chirp-mass template of 0.8 solar masses will use a gate threshold of 15, a bank bin with a maximum chirp mass of 100 will use a threshold of 45, and all other thresholds are described by a linear function between those two points.

``veto-segments-file`` sets the name of a LIGOLW xml file that contains any vetoes used for the analysis, even if there are no vetoes. ``time-slide-file`` and ``injection-time-slide-file`` are LIGOLW xml files that describe any time slides used in the analysis. A typical analysis will only analyze injections with the zerolag "time slide" (i.e. the data are not slid in time), and will consider the zerolag and one other time slide for the non-injection analysis. The time slide is used to perform a blind sanity check of the noise model.

``injections`` lists a set of injections, each with their own label. In this example, there is only one injection set, and it is labeled ``bns``. ``file`` is a relative path to the injection file (a LIGOLW xml file that contains the parameters of the injections, but not the actual waveforms themselves). ``range`` sets the chirp-mass range that should be considered when searching for this particular set of injections. Multiple injection files can be provided, each with their own label, file, and range. The only option here that a user will normally interact with is the ``injections`` option. When using multiple sub template banks, replace ``bns:`` under ``injections:`` with ``inj:``.

Section: Injections
""""""""""""""""""""
.. code-block:: yaml

   injections:
     sets:
       expected-snr:
         f-low: 15.0
       bns:
         f-low: 14.0
         seed: 72338
         time:
           step: 32
           interval: 1
           shift: 0
         waveform: SpinTaylorT4threePointFivePN
         mass-distr: componentMass
         mass1:
           min: 1.1
           max: 2.8
         mass2:
           min: 1.1
           max: 2.8
         spin1:
           min: 0
           max: 0.05
         spin2:
           min: 0
           max: 0.05
         distance:
           min: 10000
           max: 80000
         spin-aligned: True
         file: bns_injections.xml

The ``sets`` subsection is used to create injection sets to be used within the analysis, referenced by name in the ``filter`` section. In ``sets``, the injections are grouped by key; in this case, one ``bns`` injection set, which creates the ``bns_injections.xml`` file used in the ``injections`` option of the ``filter`` section. For multiple injections, the chunk for ``bns:`` should be repeated for each injection set. Reference the `mario config `_.

Besides creating injection sets, the ``expected-snr`` subsection is used for the expected SNR jobs. These settings are used to override defaults as needed. ``spin-aligned`` specifies whether the injections should have (mis)aligned spins (if ``spin-aligned: True``) or precessing spins (if ``spin-aligned: False``).

In the case of multiple injection sets that need to be combined, one can add a few options to create a combined file and reference that within the filter jobs. This can be useful for large banks with a large set of templates. To do this, one can add the following:

.. code-block:: yaml

   injections:
     combine: true
     combined-file: combined_injections.xml

The injections created are generated with the ``lalapps_inspinj`` program, with the following mapping between configuration and command-line options:

* ``f-low``: ``--f-lower``
* ``seed``: ``--seed``
* ``time`` section: ``--time-step``, ``--time-interval``. ``shift`` adjusts the start time appropriately.
* ``waveform``: ``--waveform``
* ``mass-distr``: ``--m-distr``
* ``mass/spin/distance`` sections: map to options like ``--min-mass1``

Section: Prior
""""""""""""""""
.. code-block:: yaml

   prior:
     mass-model: mass_model/mass_model_small.h5

``mass-model`` is a relative path to the file that contains the mass model. This model is used to weight templates appropriately when assigning ranking statistics, based on our understanding of the astrophysical distribution of signals. Users will not typically change this option.

An optional ``dtdphi-file`` and ``idq-timeseries`` can be provided here. If not given, a default model (included in the standard installation) will be used. The dtdphi file specifies a probability distribution function for the probability of measuring a given time shift and phase shift in multiple-detector observations; it enters into the ranking statistics. The idq file gives information about the data quality around the time of coalescence. If specifying idq and dtdphi files, create a directory for each in the ````, and put the idq and dtdphi files in their respective directories. Reference the `mario config `_.

Section: Rank
""""""""""""""""

.. code-block:: yaml

   rank:
     ranking-stat-samples: 4194304

``ranking-stat-samples`` sets the number of samples to draw from the noise model when computing the distribution of log likelihood ratios (the ranking statistic) under the noise hypothesis. Users will not typically change this option.

Section: Summary
""""""""""""""""""

.. code-block:: yaml

   summary:
     webdir: /path/to/public_html/folder

``webdir`` sets the path of the output results webpages produced by the analysis. Users will typically change this option for each analysis.

Section: Condor
""""""""""""""""""

.. code-block:: yaml

   condor:
     profile: osg-public
     accounting-group: ligo.dev.o3.cbc.uber.gstlaloffline
     accounting-group-user:
     singularity-image:

``profile`` sets a base level of configuration options for condor. ``accounting-group`` sets accounting group details on LDG resources.
Currently the machinery to produce an analysis DAG requires this option, but the option is not actually used by analyses running on non-LDG resources. ``singularity-image`` sets the path of the container on cvmfs that the analysis should use. Users will not typically change this option (use ``/cvmfs/singularity.opensciencegrid.org/lscsoft/gstlal:master``).

.. _install-custom-profiles:

Installing Custom Site Profiles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can define a site profile as YAML. As an example, we can create a file called ``custom.yml``:

.. code-block:: yaml

   scheduler: condor
   requirements:
     - "(IS_GLIDEIN=?=True)"

Both the directives and requirements sections are optional. To install one so it's available for use, run:

.. code:: bash

   $ singularity exec  gstlal_grid_profile install custom.yml