Using Xeon Phi on COSMIC

A guide to using the Xeon Phi cards on COSMIC.

Overview

There are currently 25 Intel Xeon Phi 5110P cards on COSMIC. They can be used in offload mode only; symmetric MPI mode is not currently supported.

Compiling your Code for Offload

  • If your code contains offload pragmas then you don’t need to add anything extra to your compile line to make it compile offload code.
  • If you wish to disable offload then you need to add the -no-offload flag to your compilation.
  • You should use the latest version of Intel’s compiler to get the most complete features and best performance, e.g.:
module swap icomp icomp/15.0.3.187

Running in Offload Mode

To run in offload mode you need to request nodes on COSMIC that have Xeon Phis (aka mics). To do this, submit your job to the small2 or large2 queue and request nodes with mics via the PBS resource list. The following will give you 2 mics and 2 processors (16 cores in total):

#PBS -l nodes=2:ppn=8:mics=1

Read this as: “give me 2 nodes that each have 8 cores and 1 mic on them”. (N.B. mics=1 is a bit of a misnomer; think of it as “mics per node”.)
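
To make the "per node" reading concrete, here is a minimal bash sketch that totals the cores and mics implied by a resource request. The function name summarise_request is hypothetical and not part of PBS or of any COSMIC tooling; it only does the arithmetic described above.

```shell
#!/bin/bash
# Hypothetical helper (not a PBS or COSMIC command): totals the resources
# implied by a request of the form nodes=N:ppn=P:mics=M, treating ppn and
# mics as per-node counts.
summarise_request() {
    local request=$1 nodes=0 ppn=0 mics=0 field
    local IFS=:
    for field in $request; do
        case $field in
            nodes=*) nodes=${field#nodes=} ;;
            ppn=*)   ppn=${field#ppn=} ;;
            mics=*)  mics=${field#mics=} ;;
        esac
    done
    echo "nodes=$nodes cores=$((nodes * ppn)) mics=$((nodes * mics))"
}

summarise_request "nodes=2:ppn=8:mics=1"
```

For the request shown above this reports 2 nodes, 16 cores and 2 mics.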

Setting up the Environment

You need to set the following environment variables in your job script so that the mics are used properly. These settings are recommended for most cases:

export MIC_ENV_PREFIX=MIC
export MIC_OMP_NUM_THREADS=236
export MIC_KMP_AFFINITY=compact,granularity=fine
export OMP_NUM_THREADS=1
export OFFLOAD_DEVICES=`set_offload_devices`
  • The first environment variable sets the prefix prepended to subsequent environment variables to designate that they are valid on the Xeon Phi offload side only.
  • The next two set the number of threads and the affinity on the Xeon Phi.
  • The OMP_NUM_THREADS variable is separate and says that the code running on the host will use 1 OMP thread only.
  • The last environment variable, OFFLOAD_DEVICES, is very important because it correctly sets the list of physical mic IDs that exist in the nodes assigned to your job. This ensures mic locality, which is necessary for good performance.
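
This guide does not say where the value 236 comes from. One plausible reading is sketched below. It assumes that a 5110P card has 60 cores with 4 hardware threads each and that one core is left free for the card's own operating system and offload daemon. These hardware figures are assumptions brought in from outside this guide, so check them against your card's specification before relying on them.

```shell
#!/bin/bash
# Assumed hardware figures for the Xeon Phi 5110P (not stated in this guide).
cores_per_card=60
threads_per_core=4
# Assumption: one core is left free for the card's OS and offload daemon.
reserved_cores=1

mic_threads=$(( (cores_per_card - reserved_cores) * threads_per_core ))
echo "MIC_OMP_NUM_THREADS=$mic_threads"
```

Under those assumptions the result matches the recommended setting of 236 threads per card.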

Physical and Logical Mic IDs

It is worth clarifying the different IDs used to refer to mics when running in offload mode.

  • Physical ID: the ID of the physical device on COSMIC. Its value can be 0-24. It is set through the environment variable OFFLOAD_DEVICES.
  • Logical ID: the mic ID used inside your code to refer to the target mic that you wish to offload to. Its value can be 0 to N-1, where N is the number of physical IDs listed in OFFLOAD_DEVICES.
  • Example: if your list of physical IDs is (1,2,4) then the logical IDs used in your code as offload targets will be (0,1,2). Logical IDs also wrap around, so if you reference logical ID 3 it cycles back to logical ID 0 (which is physical device 1).
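
The mapping in the example above can be sketched in bash as follows. The function logical_to_physical is a hypothetical illustration, not part of the offload runtime, and it assumes the physical IDs are given as a comma-separated list in the style of OFFLOAD_DEVICES.

```shell
#!/bin/bash
# Hypothetical helper (not part of the offload runtime): maps a logical mic ID
# to a physical ID, given a comma-separated list of physical IDs.
logical_to_physical() {
    local devices=$1 logical=$2
    local -a ids
    IFS=, read -r -a ids <<< "$devices"
    local count=${#ids[@]}
    # Logical IDs wrap around once they reach the number of listed devices.
    echo "${ids[$((logical % count))]}"
}

logical_to_physical "1,2,4" 0   # prints 1
logical_to_physical "1,2,4" 2   # prints 4
logical_to_physical "1,2,4" 3   # wraps round to logical ID 0, prints 1
```

The modulo step is what reproduces the wrap-around behaviour described in the example.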

Example Job Script

#!/bin/bash

#PBS -q small2
#PBS -N my_mic_job
#PBS -j oe
#PBS -V

#PBS -l nodes=2:ppn=8:mics=1
#PBS -l walltime=04:00:00

export MIC_ENV_PREFIX=MIC
export MIC_OMP_NUM_THREADS=236
export MIC_KMP_AFFINITY=compact,granularity=fine
export OMP_NUM_THREADS=1
export OFFLOAD_DEVICES=`set_offload_devices`

cd $PBS_O_WORKDIR

# The following will run an MPI job with NP ranks
# and place a single MPI process on each numa node. This ensures that each MPI rank will have its own local mic.
NP=$PBS_NP
LASTCPU=`expr $PBS_NP - 1`
mpirun -np $NP dplace -s1 -c 0-$LASTCPU:8 ./my_exe