CMSConnect

CMSConnect is a more powerful alternative to condor that allows users to submit jobs to multiple sites on the CMS computing grid.

In particular, this page shows how to add the public SSH key to your account:

When using CMSConnect, one has to specify the “project” name in the submit script. For UF, it is:

+ProjectName="cms.org.ufl"

To submit jobs to SLC7 machines, one has to specify:

+REQUIRED_OS="rhel7"
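
To show where these two lines sit in a complete submit script, here is a minimal sketch in Python that writes one out. The executable name `run_job.sh` and the output/error/log file names are hypothetical placeholders; only the `+ProjectName` and `+REQUIRED_OS` lines come from the notes above, and the rest are ordinary HTCondor submit commands.

```python
def make_submit_text(executable, project="cms.org.ufl", required_os="rhel7"):
    """Return the text of a minimal HTCondor submit script (a sketch)."""
    lines = [
        "universe = vanilla",
        "executable = {0}".format(executable),
        # Hypothetical file names; $(Cluster) and $(Process) are HTCondor macros.
        "output = job.$(Cluster).$(Process).out",
        "error = job.$(Cluster).$(Process).err",
        "log = job.$(Cluster).log",
        # The two CMSConnect-specific attributes from the text above.
        '+ProjectName="{0}"'.format(project),
        '+REQUIRED_OS="{0}"'.format(required_os),
        "queue 1",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # run_job.sh is a hypothetical wrapper script for the actual job.
    print(make_submit_text("run_job.sh"))
```

The printed text can be saved to a file and passed to condor_submit as usual.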

I’m using the CMSConnect “client” on UF HiPerGator, as it is the only way to submit condor jobs from there.


hadd with files on LPC EOS disk

There are a number of “don’ts” when working with the FNAL LPC EOS disk. They are described here. For example, don’t merge ROOT files that are on EOS, because the EOS disk is mounted via FUSE, so heavy I/O can cause trouble. Instead, one should use the dedicated EOS or XRootD commands.

Recently I had to merge ROOT files (using hadd) in multiple directories on EOS, and it turned out to be not so straightforward with the EOS or XRootD commands. So I wrote a short Python script, listed below.

#!/usr/bin/env python

directories = [
  '/eos/uscms/store/group/l1upgrades/L1MuonTrigger/P2_10_4_0/SingleMuon_Overlap_4GeV/ParticleGuns/CRAB3/190125_014345/0000/',
  '/eos/uscms/store/group/l1upgrades/L1MuonTrigger/P2_10_4_0/SingleMuon_Overlap_4GeV/ParticleGuns/CRAB3/190125_014345/0001/',
]

outfile = '/tmp/jiafu/ntuple.root'

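def call_cmd(cmd):
  # Run a shell command and return its stdout as a list of whitespace-separated tokens
  import shlex, subprocess
  # universal_newlines=True returns text instead of bytes, so this also works on Python 3
  p = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE, universal_newlines=True)
  lines = p.communicate()[0].split()
  return lines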

def list_input_files(directories):
  all_lines = []
  for directory in directories:
    cmd = 'xrdfs root://cmseos.fnal.gov ls -u {0}'.format(directory)
    lines = call_cmd(cmd)
    lines = [line for line in lines if line.endswith('.root')]
    all_lines += lines
  return ' '.join(all_lines)

# Main
if __name__ == '__main__':
  infiles = list_input_files(directories)
  cmd = 'hadd -f {0} {1}'.format(outfile, infiles)
  lines = call_cmd(cmd)
  #print('\n'.join(lines))

CRAB: Resubmit without the project directory

The CRAB project directory is the directory that is created when you make a new CRAB project (i.e. when you do crab submit). Sometimes you might remove the project directory too quickly, before realizing that you want to resubmit some of the jobs. But without the project directory, you cannot call crab resubmit.

If you know the “task name”, which looks like YYMMDD_HHMMSS:request_name, then it’s possible to recreate the project directory. The timestamp is the time when you call crab submit, whereas the request_name is config.General.requestName from your crab.py. If you don’t remember the task name, you can always check the Task Monitoring dashboard to find out.
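
As a quick sanity check before recreating the project directory, the task name can be split into its two parts. This is a minimal sketch that assumes the YYMMDD_HHMMSS:request_name layout described above; the helper `parse_task_name` and the example request name `my_request` are hypothetical and not part of CRAB.

```python
from datetime import datetime


def parse_task_name(task_name):
    """Split 'YYMMDD_HHMMSS:request_name' into (datetime, request_name)."""
    timestamp, sep, request_name = task_name.partition(':')
    if not sep or not request_name:
        raise ValueError('expected YYMMDD_HHMMSS:request_name, got %r' % task_name)
    # strptime raises ValueError if the timestamp part is malformed.
    submitted = datetime.strptime(timestamp, '%y%m%d_%H%M%S')
    return submitted, request_name


if __name__ == '__main__':
    # 'my_request' is a made-up request name for illustration.
    submitted, request_name = parse_task_name('190125_014345:my_request')
    print(submitted.isoformat(), request_name)
```

A task name that fails to parse this way is probably mistyped, which is worth catching before it is used as the unique request name.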

First, make an empty directory to be used as the CRAB project directory:

mkdir PROJDIR

Then, do the following in python:

from CRABClient.UserUtilities import config
from CRABClient.ClientUtilities import createCache

requestarea = PROJDIR
uniquerequestname = TASKNAME

host = 'cmsweb.cern.ch'
port = ''
voRole = ''
voGroup = ''
instance = 'prod'
originalConfig = config()
createCache(requestarea, host, port, uniquerequestname, voRole, voGroup, instance, originalConfig)

Please replace PROJDIR and TASKNAME in the above with the project directory and the task name.


Binary cross entropy in TensorFlow

In TensorFlow, the binary cross entropy loss function is implemented in a way that ensures stability and avoids overflow. The formulation can be found in the official doc, but it’s not very easy to follow when it’s written in pseudo-code. So I decided to type it in TeX (replacing the notation $z$ by $y$).

The logistic loss is

\[\begin{align*} \mathcal{L} &= - y \log(p) - (1 - y) \log(1-p) \\ &= - y \log(\operatorname{sigmoid}(x)) - (1 - y) \log(1-\operatorname{sigmoid}(x)) \\ &= - y \log \left(\frac{1}{1+e^{-x}} \right) - (1 - y) \log \left(1-\frac{1}{1+e^{-x}} \right) \\ &= - y \log \left(\frac{1}{1+e^{-x}} \right) - (1 - y) \log \left(\frac{e^{-x}}{1+e^{-x}} \right) \\ &= y \log({1+e^{-x}}) + (1 - y)\left[- \log(e^{-x}) + \log({1+e^{-x}}) \right] \\ &= y \log({1+e^{-x}}) + (1 - y)\left[x + \log({1+e^{-x}}) \right] \\ &= (1 - y)(x) + \log({1+e^{-x}}) \\ &= x - x \times y + \log({1+e^{-x}}) \end{align*}\]

For $x < 0$, $e^{-x}$ can overflow when $|x|$ is large, so we reformulate the above as

\[\begin{align*} \mathcal{L} &= x - x \times y + \log({1+e^{-x}}) \\ &= \log(e^{x}) - x \times y + \log({1+e^{-x}}) \\ &= - x \times y + \log(e^{x} \times ({1+e^{-x}})) \\ &= - x \times y + \log(1 + e^{x}) \end{align*}\]

Hence, to ensure stability and avoid overflow, the implementation uses this equivalent formulation

\[\begin{align*} \mathcal{L} &= \max(x,0) - x \times y + \log({1+e^{-|x|}}) \\ &= \operatorname{ReLU}(x) - x \times y + \log({1+e^{-|x|}}) \end{align*}\]

(To be clear, the last formulation combines $x - x \times y + \log({1+e^{-x}})$ for $x \geq 0$ and $- x \times y + \log(1 + e^{x})$ for $x < 0$ into a single expression.)
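
The equivalence can be checked numerically. The following is a minimal sketch in plain Python using only the `math` module; it is not TensorFlow's own code, and the function names `naive_bce` and `stable_bce` are made up for illustration. It compares the first formulation with the stable one, then shows that the first one overflows for a large negative $x$.

```python
import math


def naive_bce(x, y):
    """x - x*y + log(1 + exp(-x)); exp(-x) overflows for large negative x."""
    return x - x * y + math.log(1.0 + math.exp(-x))


def stable_bce(x, y):
    """max(x, 0) - x*y + log(1 + exp(-|x|)); the exponent is never positive."""
    return max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))


if __name__ == '__main__':
    # For moderate x, the two formulations agree up to rounding error.
    for x in (-30.0, -2.0, 0.0, 2.0, 30.0):
        for y in (0.0, 1.0):
            assert abs(naive_bce(x, y) - stable_bce(x, y)) < 1e-9

    # For a large negative x, the naive form fails while the stable one does not.
    try:
        naive_bce(-1000.0, 1.0)
    except OverflowError:
        print('naive form overflows at x = -1000')
    print(stable_bce(-1000.0, 1.0))
```

In the stable form the argument of the exponential is always $-|x| \leq 0$, so the exponential lies in $(0, 1]$ and can underflow to zero but never overflow.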


Python optimizations

The following links provide very useful tips to help speed up your Python code; some are even useful beyond Python: