Introduction
These pages document Combine, the RooStats/RooFit-based software tool used for statistical analysis within the CMS experiment. Note that while this tool was originally developed in the Higgs Physics Analysis Group (PAG), its usage is now widespread within CMS.
Combine provides a command-line interface to many different statistical techniques, available inside RooFit/RooStats, that are used widely inside CMS.
The package exists on GitHub under https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit
For more information about Git, GitHub and its usage in CMS, see http://cms-sw.github.io/cmssw/faq.html
The code can be checked out from GitHub and compiled on top of a CMSSW release that includes a recent RooFit/RooStats, or via standalone compilation without CMSSW dependencies. See the instructions for installation of Combine below.
Installation instructions
Installation instructions and recommended versions can be found below. Since v9.0.0, the versioning follows the semantic versioning 2.0.0 standard. Earlier versions are not guaranteed to follow the standard.
Within CMSSW (recommended for CMS users)
The instructions below are for installation within a CMSSW environment. For end users who do not need to commit or do any development, the following recipes should be sufficient. To choose a release version, you can find the latest releases on GitHub under https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/releases
Combine v10 - recommended version
The nominal installation method is inside CMSSW. The current release targets
the CMSSW 14_1_X series because of the recent switch to el9 on lxplus machines.
Currently, the recommended tag is v10.3.3: see release notes
The git clone command below contains this tag and is optimised to reduce disk usage.
cmsrel CMSSW_14_1_0_pre4
cd CMSSW_14_1_0_pre4/src
cmsenv
git -c advice.detachedHead=false clone --depth 1 --branch v10.3.3 https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit.git HiggsAnalysis/CombinedLimit
cd HiggsAnalysis/CombinedLimit
scramv1 b clean; scramv1 b -j$(nproc --ignore=2) # always make a clean build, with n - 2 cores on the system
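After the build finishes, a quick sanity check is to confirm that the executables are visible in the current shell. The snippet below is a minimal sketch and not part of the official recipe: the helper name `have_tool` is hypothetical, and it relies only on the POSIX `command -v` builtin.

```shell
# Hypothetical helper: succeed if the named executable is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether the Combine executables are visible in this shell.
for tool in combine text2workspace.py; do
    if have_tool "$tool"; then
        echo "found:   $tool -> $(command -v "$tool")"
    else
        echo "missing: $tool (was cmsenv run in this shell, and did the build succeed?)"
    fi
done
```

If both tools are reported as found, running `combine --help` should print the list of available options.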
Legacy versions
Combine v9
The nominal installation method is inside CMSSW. The current release targets
the CMSSW 11_3_X series because this release has both python2 and python3 ROOT
bindings, allowing a more gradual migration of user code to python3. Combine is
fully python3-compatible and, with some adaptations, can also work in 12_X releases.
CMSSW 11_3_X runs on slc7, which can be set up using apptainer (see detailed instructions).
Currently, the recommended tag is v9.2.1: see release notes
The git clone command below contains this tag and is optimised to reduce disk usage.
cmssw-el7
cmsrel CMSSW_11_3_4
cd CMSSW_11_3_4/src
cmsenv
git -c advice.detachedHead=false clone --depth 1 --branch v9.2.1 https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit.git HiggsAnalysis/CombinedLimit
cd HiggsAnalysis/CombinedLimit
scramv1 b clean; scramv1 b -j$(nproc --ignore=2) # always make a clean build, with n - 2 cores on the system
Combine v8: CMSSW_10_2_X release series
Setting up the environment (once) is described below.
Currently, the recommended tag is v8.2.0: see release notes
The git clone command below contains this tag and is optimised to reduce disk usage.
cmssw-el7
cmsrel CMSSW_10_2_13
cd CMSSW_10_2_13/src
cmsenv
git -c advice.detachedHead=false clone --depth 1 --branch v8.2.0 https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit.git HiggsAnalysis/CombinedLimit
cd HiggsAnalysis/CombinedLimit
scramv1 b clean; scramv1 b -j$(nproc --ignore=2) # always make a clean build, with n - 2 cores on the system
SLC6/CC7 release CMSSW_8_1_X
Setting up OS using apptainer (see detailed instructions):
# For CC7:
cmssw-el7
# For SLC6:
cmssw-el6
cmsrel CMSSW_8_1_0
cd CMSSW_8_1_0/src
cmsenv
git clone https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit.git HiggsAnalysis/CombinedLimit
cd HiggsAnalysis/CombinedLimit
git fetch origin
git checkout v7.0.13
scramv1 b clean; scramv1 b -j$(nproc --ignore=2) # always make a clean build, with n - 2 cores on the system
Outside of CMSSW
Standalone compilation
Combine can be built as a CMake project. It has the following required dependencies on the C++ side:
- ROOT, mostly for RooFit
- The boost library, for command line options parsing
- Eigen for linear algebra, used in CMSInterferenceFunc and RooSplineND
There are two more optional dependencies:
- The vdt library for fast vectorized math (can be disabled with the CMake configuration option -DUSE_VDT=FALSE)
- The gtest library for unit tests, if you build with -DBUILD_TESTS=TRUE
Any environment that provides the dependencies can be used to build combine.
To build, run the following commands inside the cloned repository:
mkdir build
cd build
cmake .. # additional CMake configuration options like -DUSE_VDT=FALSE go here
cmake --build . -j8
cd ..
To use your build of Combine, you have to prepend the build directories to the following environment variables:
export PATH=$PWD/build/bin:$PATH
export LD_LIBRARY_PATH=$PWD/build/lib:$LD_LIBRARY_PATH
export PYTHONPATH=$PWD/build/python:$PYTHONPATH
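The plain exports above add the same directories again each time they are sourced, and they leave a trailing colon when a variable was previously empty. As a minimal sketch of one way to avoid both, assuming a POSIX shell, the hypothetical helper `prepend_path` below (not part of Combine) prepends a directory only if it is not already listed.

```shell
# Hypothetical helper: prepend directory $2 to the colon-separated variable
# named $1, unless that directory is already listed there.
prepend_path() {
    eval "_pp_current=\${$1:-}"          # read the current value of the variable
    case ":$_pp_current:" in
        *":$2:"*) ;;                     # already present: nothing to do
        *) eval "export $1=\"\$2\${_pp_current:+:\$_pp_current}\"" ;;
    esac
}

# Same three variables as above, run from the top of the cloned repository.
prepend_path PATH "$PWD/build/bin"
prepend_path LD_LIBRARY_PATH "$PWD/build/lib"
prepend_path PYTHONPATH "$PWD/build/python"
```

Calling the helper repeatedly with the same arguments leaves the variables unchanged, so the lines can be kept in a setup script that is sourced more than once.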
For advanced users or packagers who want to install the build, there are some more relevant options to steer the CMake installation step:
- CMAKE_INSTALL_BINDIR: the binary directory inside the install prefix for combine and Python scripts like text2workspace.py (bin/ by default)
- CMAKE_INSTALL_LIBDIR: shared library directory (lib/ by default)
- CMAKE_INSTALL_PYTHONDIR: python module directory (python/ by default)
- CMAKE_INSTALL_INCLUDEDIR: header file directory (include/ by default)
Standalone compilation with LCG
A typical environment that can be used on lxplus is the LCG software stack.
It can be activated as follows:
LCG_RELEASE=LCG_106 # includes ROOT 6.32, like CMSSW_14_1_0_pre4
# LCG_RELEASE=dev3/latest # includes nightly build of ROOT master, useful for development
LCG_PATH=/cvmfs/sft.cern.ch/lcg/views/$LCG_RELEASE/x86_64-el9-gcc13-opt
source $LCG_PATH/setup.sh
source $LCG_PATH/bin/thisroot.sh
After activating the environment, you can follow the usual CMake build procedure explained above.
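Before configuring the build, it can be useful to confirm that the ROOT picked up from the LCG view is the one you expect. The sketch below is illustrative only: the helper `in_series` and the variable `EXPECTED_SERIES` are hypothetical names, and the value 6.32 is taken from the comment on the LCG_106 line above. It uses `root-config --version`, which ships with ROOT, and accepts both the `6.32.02` and the older `6.30/04` version formats.

```shell
# Hypothetical helper: succeed if version string $1 belongs to the
# major.minor series $2 (for example, 6.32.02 belongs to 6.32).
in_series() {
    case "$1" in
        "$2"|"$2".*|"$2"/*) return 0 ;;
        *) return 1 ;;
    esac
}

EXPECTED_SERIES=6.32   # assumption: the ROOT series quoted for LCG_106 above

if command -v root-config >/dev/null 2>&1; then
    found=$(root-config --version)
    if in_series "$found" "$EXPECTED_SERIES"; then
        echo "ROOT $found matches the expected $EXPECTED_SERIES series"
    else
        echo "ROOT $found found, but the $EXPECTED_SERIES series was expected"
    fi
else
    echo "root-config not found: the LCG view does not appear to be active"
fi
```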
Standalone compilation with conda (CMake-based)
This recipe mirrors the setup used in our GitHub Actions builds and works on both Linux and macOS (Intel or Apple silicon):
git clone https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit.git HiggsAnalysis/CombinedLimit
cd HiggsAnalysis/CombinedLimit
# configure conda-forge as the preferred channel
conda config --set channel_priority strict
conda config --add channels conda-forge
# create and activate the environment
conda create -n combine python=3.12 root=6.34 gsl boost-cpp vdt eigen tbb cmake ninja
conda activate combine
# configure and build with CMake
cmake -S . -B build -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX -DCMAKE_INSTALL_PYTHONDIR=lib/python3.12/site-packages -DUSE_VDT=OFF
cmake --build build -j$(nproc --ignore=2)
cmake --install build
After installation the binaries and Python modules live inside the environment, so a new shell only requires:
conda activate combine
Pre-compiled versions in Docker
Pre-compiled versions of the tool are available as container images from the CMS cloud. These containers can be downloaded and run using Docker. If you have Docker running, you can pull and run the image using:
docker run --name combine -it gitlab-registry.cern.ch/cms-cloud/combine-standalone:<tag>
Replace <tag> with a particular version of the tool. At the moment, the available container versions are v9.2.1 and v9.2.1-slim; both are built with Combine tag v9.2.1, and v9.2.1-slim corresponds to a slim version. If no tag is specified, the latest version of the container will be loaded, which is v9.2.1-slim at the moment. The containers for v10.X.X versions are being developed and are not yet available to users.
You will now have the compiled Combine binary available as well as the complete package of tools.
The container can be re-started using docker start -i combine.
Standalone compilation with CernVM
Combine, either standalone or not, can be compiled via CVMFS using access to /cvmfs/cms.cern.ch/ obtained using a virtual machine, CernVM. To use CernVM you should have access to CERN IT resources. If you are a CERN user you can use your account, otherwise you can request a lightweight account.
If you have a CERN user account, we strongly suggest you simply run one of the other standalone installations, which are simpler and faster than using a VM.
You should have a working VM on your local machine, compatible with CernVM, such as VirtualBox. All the required software can be downloaded here.
At least 2GB of disk space should be reserved on the virtual machine for Combine to work properly and the machine must be contextualized to add the CMS group to CVMFS. A minimal working setup is described below.
- Download the CernVM-launcher for your operating system, following the instructions available here.
- Prepare a CMS context. You can use the CMS open data one already available on GitHub:
  wget https://raw.githubusercontent.com/cernvm/public-contexts/master/cms-opendata-2011.context
- Launch the virtual machine:
  cernvm-launch create --name combine --cpus 2 cms-opendata-2011.context
- In the VM, proceed with an installation of Combine
Installation through CernVM is maintained on a best-effort basis and these instructions may not be up to date.
What has changed between tags?
You can generate a diff of any two tags (e.g. for v9.2.1 and v9.2.0) by using the following URL:
https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/compare/v9.2.0...v9.2.1
Replace the tag names in the URL with any tags you would like to compare.
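As a small illustration of the URL pattern, the sketch below builds the compare link for an arbitrary pair of tags. The function name `compare_url` is hypothetical and simply reproduces the pattern of the example URL given here.

```shell
# Hypothetical helper: print the GitHub compare URL for two tags,
# with the older tag first and the newer tag second.
compare_url() {
    echo "https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/compare/$1...$2"
}

compare_url v9.2.0 v9.2.1
```

In a full (non-shallow) clone of the repository, `git diff v9.2.0 v9.2.1 --stat` should give a comparable summary locally. The `--depth 1` clones used in the installation recipes do not contain the history needed for this.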
For developers
We use the Fork and Pull model for development: each user creates a copy of the repository on GitHub, commits their changes there, and then sends pull requests for the administrators to merge.
Prerequisites
- Register on GitHub, as needed anyway for CMSSW development: http://cms-sw.github.io/cmssw/faq.html
- Register your SSH key on GitHub: https://help.github.com/articles/generating-ssh-keys
- Fork the repository to create your copy of it: https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/fork (more documentation at https://help.github.com/articles/fork-a-repo )
You will now be able to browse your fork of the repository from https://github.com/your-github-user-name/HiggsAnalysis-CombinedLimit
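One possible way to connect an existing local clone to your fork is sketched below. This is an illustration under stated assumptions rather than a prescribed workflow: the helper names `fork_url` and `start_development`, the remote name `fork`, and the branch name used in the usage note are all hypothetical, and a full (non-shallow) clone is assumed. Only standard `git remote add` and `git checkout -b` commands are used.

```shell
REPO=HiggsAnalysis-CombinedLimit

# Hypothetical helper: SSH URL of a user's fork (relies on the SSH key
# registered on GitHub in the prerequisites).
fork_url() {
    echo "git@github.com:$1/$REPO.git"
}

# Hypothetical helper: inside an existing clone, add the fork as a second
# remote called "fork" and create a topic branch for the development.
start_development() {
    user=$1
    branch=$2
    git remote add fork "$(fork_url "$user")" &&
    git checkout -b "$branch"
}

# Show the URL that would be used for the placeholder user name.
fork_url your-github-user-name
```

For example, `start_development your-github-user-name my-feature` followed by `git push -u fork my-feature` would publish the branch to the fork, from which a pull request can be opened on GitHub.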
We strongly encourage you to contribute any developments you make back to the main repository. See contributing.md for details about contributing.
CombineHarvester/CombineTools
CombineHarvester/CombineTools is a package for the creation of datacards/workspaces used with Combine v10 for a number of analyses in CMS. See the CombineHarvester documentation pages for more details on using this tool and additional features available in the full package.
This package also comes with useful features for Combine such as the automated datacard validation (see instructions). The repository can be checked out and compiled using,
git clone https://github.com/cms-analysis/CombineHarvester.git CombineHarvester
scram b -j$(nproc --ignore=2)
See the CombineHarvester documentation for full instructions and recommended versions.
Info
Starting with Combine v10, specific CombineTool functionalities for job submission and parallelization (combineTool.py) as well as many plotting functions have been integrated into the Combine package. For these tasks you no longer have to follow the instructions above.
Citation
If you use Combine, please cite the following CMS publication (arXiv:2404.06614).
BibTeX entry:
@article{CMS:2024onh,
author = "Hayrapetyan, Aram and others",
collaboration = "CMS",
title = "The {CMS} statistical analysis and combination tool: {\textsc{Combine}}",
eprint = "2404.06614",
archivePrefix = "arXiv",
primaryClass = "physics.data-an",
reportNumber = "CMS-CAT-23-001, CERN-EP-2024-078",
year = "2024",
journal = "Comput. Softw. Big Sci.",
doi = "10.1007/s41781-024-00121-4",
volume = "8",
pages = "19"
}