Virtualenv issue in CMSSW_9_3_X

I ran into a strange issue with Python virtualenv and pip in CMSSW_9_3_X (Python 2.7.11, virtualenv 15.1.0). Running the following commands causes an error:

virtualenv venv
source venv/bin/activate
pip install -U pip

The error message reads:

Traceback (most recent call last):
  File "/tmp/venv/bin/pip", line 7, in <module>
    from pip._internal import main
ImportError: No module named _internal

Apparently this happens because the $PYTHONPATH environment variable is not set properly: the virtualenv’s site-packages directory is not at the front of it. I fixed it by patching the file venv/bin/activate. Here’s the patch file:

diff --git a/venv/bin/activate b/venv/bin/activate
index 03fa903..c104cf0 100644
--- a/venv/bin/activate
+++ b/venv/bin/activate
@@ -11,6 +11,11 @@ deactivate () {
         export PATH
         unset _OLD_VIRTUAL_PATH
     fi
+    if ! [ -z "${_OLD_PYTHONPATH+_}" ] ; then
+        PYTHONPATH="$_OLD_PYTHONPATH"
+        export PYTHONPATH
+        unset _OLD_PYTHONPATH
+    fi
     if ! [ -z "${_OLD_VIRTUAL_PYTHONHOME+_}" ] ; then
         PYTHONHOME="$_OLD_VIRTUAL_PYTHONHOME"
         export PYTHONHOME
@@ -47,6 +52,10 @@ _OLD_VIRTUAL_PATH="$PATH"
 PATH="$VIRTUAL_ENV/bin:$PATH"
 export PATH
 
+_OLD_PYTHONPATH="$PYTHONPATH"
+PYTHONPATH="$VIRTUAL_ENV/lib/python2.7/site-packages:$PYTHONPATH"
+export PYTHONPATH
+
 # unset PYTHONHOME if set
 if ! [ -z "${PYTHONHOME+_}" ] ; then
     _OLD_VIRTUAL_PYTHONHOME="$PYTHONHOME"

To apply it, save the patch as mypatch.txt in the same directory where virtualenv venv was run. Then do:

patch -p1 < mypatch.txt

Run source venv/bin/activate again so that the patched script takes effect. Now pip install -U pip should work.
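
To check which pip Python would actually import, a small diagnostic along these lines may help. It is only a sketch: the helper name locate_package is made up for illustration, and it looks only for plain package directories and .py files on the search path (it ignores zip archives and .pth files). The first directory it reports is the one that wins, so if that directory is not inside venv/lib/python2.7/site-packages, some other entry is probably shadowing the virtualenv’s pip.

```python
import os
import sys


def locate_package(name, search_path=None):
    """Return, in import order, every directory on search_path that holds `name`.

    Hypothetical helper for illustration only. It checks for a package
    directory or a plain .py module and skips empty or missing entries.
    """
    if search_path is None:
        search_path = sys.path
    hits = []
    for entry in search_path:
        # An empty entry means "current directory"; it is skipped in this sketch.
        if not entry or not os.path.isdir(entry):
            continue
        as_package = os.path.isdir(os.path.join(entry, name))
        as_module = os.path.isfile(os.path.join(entry, name + ".py"))
        if as_package or as_module:
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # Print every location that provides pip; the first one is what gets imported.
    for directory in locate_package("pip"):
        print(directory)
```

Run it with the virtualenv activated, before and after applying the patch, and compare the first line of output.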


Install TensorFlow with GPU support on Red Hat Linux

I had the chance to play with TensorFlow, a high-performance machine learning framework originally developed by Google. These are my installation notes.

I am working on a system running Red Hat Enterprise Linux:

cat /etc/redhat-release
# Output: Red Hat Enterprise Linux Server release 7.4 (Maipo)

The easiest way to install TensorFlow seems to be through Anaconda. I used Miniconda, the more lightweight version of Anaconda. To download and install Miniconda 3:

wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh Miniconda3-latest-Linux-x86_64.sh

Accept the license, enter your preferred install location, then say ‘yes’ to prepend the install location to your $PATH environment variable.

Once conda has been installed, it’s time to install TensorFlow. The instructions are adapted from the TensorFlow installation page, with small changes for my purposes. I simply installed the tensorflow-gpu package provided by Anaconda.

conda update conda
conda create -n tensorflow_conda pip python=2.7
source activate tensorflow_conda
conda install -c anaconda cudatoolkit=9.0
conda install -c anaconda tensorflow-gpu

To validate the installation, try the following in Python:

import tensorflow as tf
print(tf.__version__)
# Output: 1.8.0
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
# Output: Hello, TensorFlow!

When you leave, you can call source deactivate to exit the conda environment. To get back again, call source activate tensorflow_conda.

Note that when the conda environment is activated, <your-install-location>/envs/tensorflow_conda/bin is prepended to $PATH. In some cases, you might also want to prepend <your-install-location>/envs/tensorflow_conda/lib to $LD_LIBRARY_PATH. This helps TensorFlow find and load the CUDA libraries it needs, such as libcudart.so.XYZ, libcublas.so.XYZ and libcudnn.so.XYZ.
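
A quick way to see whether those libraries are reachable is to scan the directories listed in $LD_LIBRARY_PATH. The sketch below assumes the three library names mentioned above and uses a hypothetical helper called find_cuda_libs. It only checks that matching files exist, not that TensorFlow can actually load them.

```python
import glob
import os


def find_cuda_libs(lib_dirs, names=("libcudart", "libcublas", "libcudnn")):
    """Map each library stem to the shared-object files found in lib_dirs.

    Hypothetical helper for illustration; it matches files named <stem>.so*.
    """
    found = {}
    for name in names:
        matches = []
        for directory in lib_dirs:
            pattern = os.path.join(directory, name + ".so*")
            matches.extend(sorted(glob.glob(pattern)))
        found[name] = matches
    return found


if __name__ == "__main__":
    # Split LD_LIBRARY_PATH into its directories, dropping empty entries.
    raw = os.environ.get("LD_LIBRARY_PATH", "")
    dirs = [d for d in raw.split(os.pathsep) if d]
    for name, paths in sorted(find_cuda_libs(dirs).items()):
        print("%s: %s" % (name, ", ".join(paths) or "not found"))
```

If a library shows up as not found while the conda environment is active, prepending the environment’s lib directory as described above is the first thing to try.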

Finally, to also install other machine learning-related libraries:

pip install -U pip
pip install keras scikit-learn matplotlib jupyter

In case you want to remove the environment:

conda remove --name tensorflow_conda --all

CMSSW and Git

If you see this error message:

Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

You are probably trying to download/clone/pull from a repository using an address that looks like git@github.com:<username>/<repo>.git. If you just want read-only access to the repo (not committing any changes back), you can simply change the address to https://github.com/<username>/<repo>.git. And you are done.
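
The address change is mechanical, so it can be scripted. Here is a minimal sketch, assuming the address has exactly the git@github.com:<username>/<repo>.git shape shown above; the function name ssh_to_https is made up for illustration, and any address that does not match is returned unchanged.

```python
import re

# Matches git@github.com:<username>/<repo> with an optional trailing ".git".
_SSH_URL = re.compile(r"^git@github\.com:(?P<user>[^/]+)/(?P<repo>.+?)(?:\.git)?$")


def ssh_to_https(url):
    """Convert a GitHub SSH address to its HTTPS form (illustrative helper)."""
    match = _SSH_URL.match(url)
    if match is None:
        # Not a GitHub SSH address in the expected shape; leave it alone.
        return url
    return "https://github.com/%s/%s.git" % (match.group("user"), match.group("repo"))


if __name__ == "__main__":
    print(ssh_to_https("git@github.com:cms-sw/cmssw.git"))
```

For an existing clone, the resulting address can be applied with git remote set-url origin followed by the new address.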

However, if you want to be a collaborator with read and write access to the repo, you have to use the git@github.com:<username>/<repo>.git address. (Certain CMSSW tools might force you to use it even if you don’t need write access.) In this case, follow the instructions here: http://cms-sw.github.io/faq.html#how-do-i-subscribe-to-github.

In particular, make sure you register your SSH key with GitHub. This means you must do the following:

  1. Create a GitHub account.
  2. Generate an SSH key.
    • Typically the private key is saved as ~/.ssh/id_rsa
  3. Associate the SSH key with your GitHub account.
  4. Load the SSH key every time you connect to GitHub.
    • On Linux, call eval "$(ssh-agent -s)" followed by ssh-add ~/.ssh/id_rsa

CMSSW and CVS

CMSSW moved to GitHub in summer 2013. A read-only copy of the old CVS repository is maintained at http://cmscvs.web.cern.ch/cmscvs/cgi/viewvc.cgi/cvsroot/CMSSW/.


CMSSW GlobalTag

CMS Global Tags for Conditions Data:

To change the global tag in a CMSSW configuration file:

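import FWCore.ParameterSet.Config as cms
process = cms.Process("DEMO")  # placeholder name; skip these two lines if your config already defines process
# Load the standard conditions setup, then override the global tag.
process.load("Configuration.StandardSequences.FrontierConditions_GlobalTag_cff")
from Configuration.AlCa.GlobalTag import GlobalTag
process.GlobalTag = GlobalTag(process.GlobalTag, 'auto:run2_mc')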