Validating datacards

This section covers the main features of the datacard validation tool, which helps you spot potential problems with your datacards at an early stage. The tool is implemented in the CombineHarvester/CombineTools subpackage. See the combineTool section of the documentation for instructions on checking out the full tool, which is needed for this task.

The datacard validation tool contains a number of checks. It is possible to call subsets of these checks when creating datacards within CombineHarvester. However, for now we will only describe the usage of the validation tool on already existing datacards. If you create your datacards with CombineHarvester and would like to include the checks at the datacard creation stage, please contact us via https://cms-talk.web.cern.ch/c/physics/cat/cat-stats/279.

How to use the tool

The basic syntax is:

ValidateDatacards.py datacard.txt

This will write the results of the checks to a json file (default: validation.json), and will print a summary to the screen, for example:

================================
=======Validation results=======
================================
>>>There were  7800 warnings of type  'up/down templates vary the yield in the same direction'
>>>There were  5323 warnings of type  'up/down templates are identical'
>>>There were no warnings of type  'At least one of the up/down systematic uncertainty templates is empty'
>>>There were  4406 warnings of type  'Uncertainty has normalisation effect of more than 10.0%'
>>>There were  8371 warnings of type  'Uncertainty probably has no genuine shape effect'
>>>There were no warnings of type 'Empty process'
>>>There were no warnings of type 'Bins of the template empty in background'
>>>INFO: there were  169  alerts of type  'Small signal process'

The meaning of each of these warnings/alerts is discussed below.

The following arguments are possible:

usage: ValidateDatacards.py [-h] [--printLevel PRINTLEVEL] [--readOnly]
                            [--checkUncertOver CHECKUNCERTOVER]
                            [--reportSigUnder REPORTSIGUNDER]
                            [--jsonFile JSONFILE] [--mass MASS]
                            cards

positional arguments:
  cards                 Specifies the full path to the datacards to check

optional arguments:
  -h, --help            show this help message and exit
  --printLevel PRINTLEVEL, -p PRINTLEVEL
                        Specify the level of info printing (0-3, default:1)
  --readOnly            If this is enabled, skip validation and only read the
                        output json
  --checkUncertOver CHECKUNCERTOVER, -c CHECKUNCERTOVER
                        Report uncertainties which have a normalization effect
                        larger than this fraction (default:0.1)
  --reportSigUnder REPORTSIGUNDER, -s REPORTSIGUNDER
                        Report signals contributing less than this fraction of
                        the total in a channel (default:0.001)
  --jsonFile JSONFILE   Path to the json file to read/write results from
                        (default:validation.json)
  --mass MASS           Signal mass to use (default:*)

printLevel adjusts how much information is printed to the screen. When set to 0, the results are only written to the json file, not to the screen. When set to 1 (the default), the number of warnings/alerts of each type is printed to the screen. Setting this option to 2 prints the same information as level 1, and additionally lists which uncertainties are affected (for checks related to uncertainties) or which processes are affected (for checks related only to processes). When printLevel is set to 3, the information from level 2 is printed and, for checks related to uncertainties, the affected processes are also listed.

To print information to the screen, the script parses the json file that contains the results of the validation checks. Therefore, if you have already run the validation tool and produced this json file, you can change the amount of information printed by re-running the tool with a different --printLevel value and the --readOnly option enabled.
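
For example, to re-print the results from an existing validation.json at the most detailed level, without re-running the checks:

ValidateDatacards.py datacard.txt --readOnly --printLevel 3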

The options --checkUncertOver and --reportSigUnder will be described in more detail in the section that discusses the checks for which they are relevant.

Note: the --mass argument should only be set if you normally use it when running Combine, otherwise you can leave it at the default.

The datacard validation tool is primarily intended for shape (histogram) based analyses. However, when running on a parametric model or counting experiment the checks for small signal processes, empty processes, and uncertainties with large normalization effects can still be performed.

Details on checks

Uncertainties with large normalization effect

This check highlights nuisance parameters that have a normalization effect larger than the fraction set by the option --checkUncertOver. The default value is 0.1, meaning that any uncertainties with a normalization effect larger than 10% are flagged up.
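
For example, to only report uncertainties with a normalization effect larger than 30%:

ValidateDatacards.py datacard.txt --checkUncertOver 0.3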

The output file contains the following information for this check:

largeNormEff: {
  <Uncertainty name>: {
    <analysis category>: {
      <process>: {
        "value_d":<value>
        "value_u":<value>
      } 
    }
  }
}

Where value_u and value_d are the values of the 'up' and 'down' normalization effects.
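
Since the results are written as json, they can also be processed programmatically. Below is a minimal sketch, assuming the default validation.json file in the current directory, that walks the structure shown above and prints the flagged uncertainties:

import json

# Load the results written by ValidateDatacards.py (default output file name)
with open("validation.json") as f:
    results = json.load(f)

# Structure: uncertainty name -> analysis category -> process -> {value_u, value_d}
for uncert, categories in results.get("largeNormEff", {}).items():
    for category, processes in categories.items():
        for process, values in processes.items():
            print(f"{uncert} on {process} in {category}: "
                  f"up {values['value_u']}, down {values['value_d']}")

The same pattern can be used for the other per-uncertainty checks described below (emptySystematicShape, uncertTemplSame, uncertVarySameDirect, smallShapeEff), which share this nested structure.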

At least one of the Up/Down systematic templates is empty

For shape uncertainties, this check reports all cases where the up and/or down template(s) are empty, when the nominal template is not.

The output file contains the following information for this check:

emptySystematicShape: {
  <Uncertainty name>: {
    <analysis category>: {
      <process>: {
        "value_d":<value>
        "value_u":<value>
      } 
    }
  }
}

Where value_u and value_d are the values of the 'up' and 'down' normalization effects.

Identical Up/Down templates

This check applies to shape uncertainties only, and will highlight cases where the shape uncertainties have identical Up and Down templates (identical in shape and in normalization).

The information given in the output file for this check is:

uncertTemplSame: {
  <Uncertainty name>: {
    <analysis category>: {
      <process>: {
        "value_d":<value>
        "value_u":<value>
      } 
    }
  }
}

Where value_u and value_d are the values of the 'up' and 'down' normalization effects.

Up and Down templates vary the yield in the same direction

Again, this check only applies to shape uncertainties - it highlights cases where the 'Up' template and the 'Down' template both have the effect of increasing or decreasing the normalization of a process.

The information given in the output file for this check is:

uncertVarySameDirect: {
  <Uncertainty name>: {
    <analysis category>: {
      <process>: {
        "value_d":<value>
        "value_u":<value>
      } 
    }
  }
}

Where value_u and value_d are the values of the 'up' and 'down' normalization effects.

Uncertainty probably has no genuine shape effect

In this check, which applies only to shape uncertainties, the normalized nominal template is compared with the normalized templates for the 'up' and 'down' systematic variations. The script calculates \(\Sigma_i \frac{2|\text{up}(i) - \text{nominal}(i)|}{|\text{up}(i)| + |\text{nominal}(i)|}\) and \(\Sigma_i \frac{2|\text{down}(i) - \text{nominal}(i)|}{|\text{down}(i)| + |\text{nominal}(i)|}\),

where the sums run over all bins in the histograms, and 'nominal', 'up', and 'down' are the central template and the up and down varied templates, all normalized.

If both sums are smaller than 0.001, the uncertainty is flagged up as probably not having a genuine shape effect. This means a 0.1% variation in a single bin is enough to avoid being reported, and many smaller per-bin variations can also add up to exceed the threshold. Note that the chosen threshold is somewhat arbitrary: if an uncertainty is flagged up as probably having no genuine shape effect, take this as a starting point for investigation.
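
To make the comparison concrete, the sketch below reproduces the two sums for templates given as NumPy arrays; the bin contents are hypothetical, and the templates are normalized before the comparison, as described above:

import numpy as np

def shape_diff(varied, nominal):
    """Sum over bins of 2|varied(i) - nominal(i)| / (|varied(i)| + |nominal(i)|),
    evaluated on normalized templates."""
    varied = varied / varied.sum()
    nominal = nominal / nominal.sum()
    denom = np.abs(varied) + np.abs(nominal)
    mask = denom > 0  # skip bins that are empty in both templates
    return np.sum(2.0 * np.abs(varied - nominal)[mask] / denom[mask])

# Hypothetical bin contents for the nominal, up, and down templates
nominal = np.array([10.0, 20.0, 30.0, 40.0])
up = np.array([10.1, 20.0, 29.9, 40.0])
down = np.array([9.9, 20.0, 30.1, 40.0])

diff_u = shape_diff(up, nominal)
diff_d = shape_diff(down, nominal)
# The check flags the uncertainty when both sums are below 0.001
print(diff_u, diff_d, diff_u < 0.001 and diff_d < 0.001)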

The information given in the output file for this check is:

smallShapeEff: {
  <Uncertainty name>: {
    <analysis category>: {
      <process>: {
        "diff_d":<value>
        "diff_u":<value>
      } 
    }
  }
}

Where diff_d and diff_u are the values of the sums described above for the 'down' variation and the 'up' variation.

Empty process

If a process is listed in the datacard, but the yield is 0, it is flagged up by this check.

The information given in the output file for this check is:

emptyProcessShape: {
  <analysis category>: {
    <process1>,
    <process2>,
    <process3>
  }
}

Bins that have signal but no background

For shape-based analyses, this checks whether there are any bins in the nominal templates that have signal contributions, but no background contributions.

The information given in the output file for this check is:

emptyBkgBin: {
  <analysis category>: {
    <bin_nr1>,
    <bin_nr2>,
    <bin_nr3>
  }
}

Small signal process

This reports signal processes that contribute less than the fraction specified by --reportSigUnder (default 0.001 = 0.1%) of the total signal in a given category. This produces an alert, not a warning, as it does not hint at a potential problem. However, in analyses with many signal contributions and with long fitting times, it can be helpful to remove signals from a category in which they do not contribute a significant amount.
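
For example, to report signal processes contributing less than 1% of the total signal in a category:

ValidateDatacards.py datacard.txt --reportSigUnder 0.01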

The information given in the output file for this check is:

smallSignalProc: {
  <analysis category>: {
    <process>: {
      "sigrate_tot":<value>
      "procrate":<value>
    } 
  }
}

Where sigrate_tot is the total signal yield in the analysis category and procrate is the yield of signal process <process>.
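
As with the other checks, this block can be read back from the json file. A minimal sketch, again assuming the default validation.json, that prints the fraction each flagged process contributes to the total signal in its category:

import json

with open("validation.json") as f:
    results = json.load(f)

# Structure: analysis category -> process -> {sigrate_tot, procrate}
for category, processes in results.get("smallSignalProc", {}).items():
    for process, values in processes.items():
        fraction = values["procrate"] / values["sigrate_tot"]
        print(f"{process} in {category}: {fraction:.4%} of the total signal")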

What to do in case of a warning

These checks are mostly a tool to help you investigate your datacards: a warning does not necessarily mean there is a mistake in your datacard, but you should use it as a starting point for investigation. Empty processes and empty shape-uncertainty templates associated with non-empty processes are most likely unintended. The same holds for cases where the 'up' and 'down' shape templates are identical. If there are bins that contain signal but no background contributions, this should be corrected. See the FAQ for more information on that point.

For other checks it depends on the situation whether there is a problem or not. Some examples:

  • An analysis-specific nonclosure uncertainty could be larger than 10%. A theoretical uncertainty in the ttbar normalization probably not.
  • In an analysis with a selection that requires the presence of exactly 1 jet, 'up' and 'down' variations in the jet energy uncertainty could both change the process normalization in the same direction. (But they do not have to!)

As always: think about whether you expect a check to yield a warning for your analysis, and if not, investigate to make sure there are no issues.