This page documents the CombineHarvester framework for the production and analysis of datacards for use with the CMS combine tool. The central part of this framework is the CombineHarvester class, which provides a representation of the text-format datacards and the associated shape input.
The production of new datacards typically requires several steps, for example:
All of these operations are performed either directly by methods of the CombineHarvester class, or by using higher-level helper functions (see High-level tools below). By design all of the input required for these steps can be specified directly in the code. This makes it possible to quickly build a datacard in a single, self-contained file, without the use of any external scripts or configuration files.
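To make the idea of a self-contained, in-code card definition concrete, here is a minimal sketch in plain Python. It does not use the CombineHarvester API; it only illustrates the kind of one-bin counting datacard text that the combine tool reads. The function name `make_counting_card`, the process names, the yields and the `lumi` uncertainty are all made up for illustration.

```python
# Illustrative only: plain Python, not the CombineHarvester API.
# All process names, yields and uncertainties below are invented.

def make_counting_card(bin_name, observed, processes, systematics):
    """Return the text of a one-bin counting datacard.

    processes:   list of (name, expected_yield, is_signal) tuples
    systematics: list of (name, type, {process_name: value}) tuples
    """
    # Signal processes get IDs <= 0, background processes get IDs > 0
    signals = [p for p in processes if p[2]]
    backgrounds = [p for p in processes if not p[2]]
    ordered = signals + backgrounds
    ids = [i - len(signals) + 1 for i in range(len(ordered))]

    lines = [
        "imax 1 number of bins",
        "jmax %d number of processes minus 1" % (len(ordered) - 1),
        "kmax %d number of nuisance parameters" % len(systematics),
        "-" * 40,
        "bin %s" % bin_name,
        "observation %g" % observed,
        "-" * 40,
        "bin " + " ".join(bin_name for _ in ordered),
        "process " + " ".join(p[0] for p in ordered),
        "process " + " ".join(str(i) for i in ids),
        "rate " + " ".join("%g" % p[1] for p in ordered),
        "-" * 40,
    ]
    # One row per uncertainty; "-" marks processes it does not affect
    for name, kind, effects in systematics:
        values = " ".join(
            "%g" % effects[p[0]] if p[0] in effects else "-" for p in ordered)
        lines.append("%s %s %s" % (name, kind, values))
    return "\n".join(lines) + "\n"


card = make_counting_card(
    bin_name="mt_0jet",
    observed=120,
    processes=[("ZTT", 100.0, False), ("ggH", 5.0, True), ("QCD", 20.0, False)],
    systematics=[("lumi", "lnN", {"ggH": 1.026, "ZTT": 1.026})],
)
print(card)
```

Everything the card needs is stated in the one file, which is the property described above. A real analysis would normally let the CombineHarvester class handle this bookkeeping, including the shape inputs.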
Other functions include extracting information about the fit model:
As well as histogram-based templates, the production of datacards with arbitrary RooFit PDFs and datasets is also supported.
This repository is a "top-level" CMSSW package, i.e. it should be located at $CMSSW_BASE/src/CombineHarvester. It currently provides two sub-packages:
The CMSSW version that should be used with CombineHarvester is driven by the recommendation for the HiggsAnalysis/CombinedLimit package, which is also required. The latest instructions can be found here. The CombineHarvester framework is compatible with the CMSSW 14_1_X and 11_3_X series releases. A new release area can be set up and compiled in the following steps:
    cmsrel CMSSW_14_1_0_pre4
    cd CMSSW_14_1_0_pre4/src
    cmsenv
    git clone https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit.git HiggsAnalysis/CombinedLimit
    # IMPORTANT: Checkout the recommended tag on the link above
    git clone https://github.com/cms-analysis/CombineHarvester.git CombineHarvester
    # Check out the tag inside the CombineHarvester clone, staying in src for the build
    git -C CombineHarvester checkout v3.0.0-pre1
    scram b
If you are using this framework for the first time we recommend taking a look through some of the examples below which demonstrate the main features:
combineTool.py script to build workspaces, compute asymptotic limits and plot the output.
$CMSSW_BASE/src/auxiliaries and up-to-date:
    git clone https://github.com/roger-wolf/HiggsAnalysis-HiggsToTauTau-auxiliaries.git auxiliaries
The input root files will be sourced from here.
More realistic, though less well documented, examples can be found in the following files:
CombineTools/bin/PostFitShapesFromWorkspace.cpp (source code) - see separate page here
CombineTools/bin/SMLegacyExample.cpp (source code) - produces a complete set of htt datacards for the legacy Run I SM analysis (HIG-13-004). The same workflow is also possible in python, see CombineTools/scripts/SMLegacyExample.py
CombineTools/bin/MSSMYieldTable.cpp (source code) - produces the LaTeX yield tables for the MSSM htt analysis (HIG-13-021). Run via the script CombineTools/scripts/yield_tables_mssm_example.sh. You will first need to copy the input datacards:
    cd CombineTools; cp -r /afs/cern.ch/work/a/agilbert/public/CombineTools/data/mssm-paper-cmb ./input/
A number of high-level tools have been developed to provide a more convenient interface for some common CombineHarvester tasks:
CombineHarvester::WriteDatacard so you don't have to. It can be used to write a set of datacards into the familiar LIMITS directory structure, or any other structure based on simple pattern strings.
combine workspace. May be useful for extracting post-fit yields and shapes for more complex physics models. Should be considered experimental at the moment.
Creating a new package: It is planned that each new analysis will create their own package within the CombineHarvester directory, where all the datacard creation, plotting and other tools specific to the analysis will be stored. This keeps the analysis-specific code self-contained, and ensures different analyses do not disrupt each other's workflow during the development phase. We expect that some tools or functions developed for specific analyses will be of more general use, in which case they will be promoted to the common CombineTools package. Please raise an issue here if you would like a new package to be created for your analysis.
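The "simple pattern strings" mentioned above for laying out datacards on disk can be pictured with a short sketch. This is plain Python and is not the framework's own implementation; the helper `expand_pattern`, the placeholder names (`$TAG`, `$MASS`, `$ANALYSIS`, `$CHANNEL`, `$BINID`, `$ERA`) and the example values are assumptions chosen for illustration.

```python
# Hypothetical sketch of pattern-string expansion; not the framework's code.

def expand_pattern(pattern, **fields):
    """Replace each $NAME placeholder in pattern with the matching field value."""
    result = pattern
    # Substitute longer names first so a short name cannot clobber a longer
    # one that starts with the same letters
    for name in sorted(fields, key=len, reverse=True):
        result = result.replace("$" + name, str(fields[name]))
    return result


# Invented card metadata, one dict per datacard to be written
cards = [
    {"TAG": "cmb", "MASS": "125", "ANALYSIS": "htt",
     "CHANNEL": "mt", "BINID": 1, "ERA": "8TeV"},
    {"TAG": "cmb", "MASS": "125", "ANALYSIS": "htt",
     "CHANNEL": "et", "BINID": 2, "ERA": "7TeV"},
]

pattern = "$TAG/$MASS/$ANALYSIS_$CHANNEL_$BINID_$ERA.txt"
paths = [expand_pattern(pattern, **card) for card in cards]
for path in paths:
    print(path)
```

Plain string replacement is used here instead of `string.Template` because placeholders joined by underscores, such as `$ANALYSIS_$CHANNEL`, would be read by `Template` as a single identifier ending in an underscore.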
Code developments: New features and developments, or even just suggestions, are always welcome - either contact the developers directly or make a pull request.
Python interface: A python interface is also available - see the documentation page here for usage instructions. This is ultimately just a wrapper around the C++ code and most functions can be called in python in exactly the same way as their C++ counterparts. There are some functions however, especially those using template arguments, which have been adapted for python usage and may not provide exactly the same interface. Furthermore, as each C++ function has to be wrapped by hand, the python interface may occasionally lag behind the C++ one.
Error handling: It is quite possible to do things in a CombineHarvester instance that don't make sense, or at least don't allow for the production of a valid datacard - for example objects with missing shape information, negative process yields and categories without either observed data or background processes. The framework has been designed to detect many of these issues at the point at which they prevent a given function from proceeding as intended. However, there is no guarantee that all such issues will be detected. If a problem is encountered, a runtime exception will be thrown indicating the nature of the problem. For example, trying to extract a histogram missing from the input root file will produce this message:
    terminate called after throwing an instance of 'std::runtime_error'
      what():
    *******************************************************************************
    Context: Function ch::GetClonedTH1 at
      CombineHarvester/CombineTools/src/TFileIO.cc:21
    Problem: TH1 eleTau_0jet_medium/ggH not found in CMSSW_7_1_5/src/auxiliaries/shapes/htt_et.inputs-sm-7TeV-hcg.root
    *******************************************************************************
If the cause of such an error message is unclear, or if you believe the error message should not have been produced, please raise an issue here with full details on reproducing the problem: https://github.com/cms-analysis/CombineHarvester/issues/new
Please also raise an issue if you encounter any bugs, unintended behaviour, abrupt errors or segmentation faults - these will be addressed promptly by the developers.
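When reporting a problem, it can help to quote the Context and Problem fields of the error message separately. The sketch below pulls them out of a captured message with regular expressions. It is a plain Python helper written for this page, not part of the framework; the function name `parse_ch_error` is hypothetical, and the layout it assumes is the one in the example message above.

```python
import re

# Hypothetical helper, not part of CombineHarvester. It assumes the message
# layout of the example above: "Context: ..." then "Problem: ...", closed by
# a row of asterisks.

def parse_ch_error(text):
    """Extract the Context and Problem fields from an error message."""
    # Each field runs until the next field label, a row of asterisks,
    # or the end of the text
    context = re.search(
        r"Context:\s*(.*?)\s*(?=Problem:|\*{5,}|$)", text, re.S)
    problem = re.search(
        r"Problem:\s*(.*?)\s*(?=\*{5,}|$)", text, re.S)
    return {
        "context": context.group(1) if context else None,
        "problem": problem.group(1) if problem else None,
    }


message = (
    "terminate called after throwing an instance of 'std::runtime_error' "
    "what(): " + "*" * 79 + " "
    "Context: Function ch::GetClonedTH1 at "
    "CombineHarvester/CombineTools/src/TFileIO.cc:21 "
    "Problem: TH1 eleTau_0jet_medium/ggH not found in "
    "CMSSW_7_1_5/src/auxiliaries/shapes/htt_et.inputs-sm-7TeV-hcg.root "
    + "*" * 79
)

parsed = parse_ch_error(message)
print(parsed["context"])
print(parsed["problem"])
```

If a message does not contain one of the fields, the corresponding entry is `None`, so the helper can be applied to arbitrary log output without raising.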