Graham Tips: Difference between revisions
Line 266: | Line 312: | ||
# Update the job summary with the final time
updateJobSummary $?
</syntaxhighlight>
Line 360: | Line 406: | ||
=== Job details ===
The built-in function <code>seff</code> provides a nice summary of the job resource usage. You should check this often for the amount of memory used and the efficiency of the parallel job.
The following commands give information about a particular job.
* seff (slurm efficiency): job memory and processors used and their efficiencies
* scj (slurm control job): processors and nodes used by a job
* saj (slurm account job): memory and time used by a job (use on completed jobs)
* ssj (slurm status job): job memory (use on running jobs)
* job_summary: print all of the above
<syntaxhighlight lang="bash">
Line 380: | Line 427: | ||
}
function job_summary() {
    seff $1
    scj $1;
    saj $1;
Line 387: | Line 435: | ||
Usage is:
 $ seff <jobID>
=== Getting the Path to a Log File ===
Using the above functions, you can get the log file (i.e. the output of stdout) directly from the jobID.
This lets you check the state of a job without having to move to that directory.
Three functions are given below.
The first two simply get the path to the log/error file when provided a valid jobID.
The third prints the tail of both the log and error files, with a bit of formatting to help readability.
<syntaxhighlight lang="bash">
function get_job_log() {
    # Print the path of the stdout (log) file of job $1
    tmp_str=$(scj "$1" | grep StdOut)
    log_name=${tmp_str#*=}
    echo "${log_name}"
}
function get_job_err() {
    # Print the path of the stderr (error) file of job $1
    tmp_str=$(scj "$1" | grep StdErr)
    log_name=${tmp_str#*=}
    echo "${log_name}"
}
function job_tail() {
    # Define some colours to make output prettier
    COL='\033[1;37m'
    NC='\033[0m' # No Color
    # Get the job name
    tmp_str=$(scj "$1" | grep JobName)
    job_name=${tmp_str##*=}
    # Print a header
    echo -e "${COL}== Summary of job ${1}: (${job_name}) ==${NC}";
    echo ""
    # Print the tail of the log file
    echo -e "${COL}Tail of log for job ${1}${NC}"
    tail "$(get_job_log "${1}")"
    echo ""
    # Print the tail of the error file
    echo -e "${COL}Tail of err for job ${1}${NC}"
    tail "$(get_job_err "${1}")"
}
</syntaxhighlight>
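The functions above rely on two shell parameter expansions: <code>${var#*=}</code> strips everything up to and including the first <code>=</code>, while <code>${var##*=}</code> strips everything up to and including the last <code>=</code>. The following is a minimal sketch of that behaviour. The helper names <code>after_first_eq</code> and <code>after_last_eq</code> and the sample lines are hypothetical; the real output of <code>scj</code> on your system may be laid out differently.

```bash
# Hypothetical helpers illustrating the two expansions used above.
# after_first_eq: drop the shortest prefix ending in "="
after_first_eq() (
    echo "${1#*=}"
)
# after_last_eq: drop the longest prefix ending in "="
after_last_eq() (
    echo "${1##*=}"
)

# Assumed sample lines (not copied from a real job)
after_first_eq "   StdOut=/home/user/run/sim-12345.log"
after_last_eq "JobId=12345 JobName=case1"
```

The first form is the safer choice for paths, since a path may itself contain an <code>=</code>; the second form is what picks out the job name when several <code>key=value</code> pairs share one line.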
=== Move to a Simulation/Job Directory ===
Line 396: | Line 491: | ||
<syntaxhighlight lang="bash">
# Move to job directory
function cdJob() {
    if [ -z "${1+x}" ]; then
        echo "No job ID given."
    else
        # %Z is the job's working directory; sed drops the header line
        pth=$(squeue -o %Z -j "$1" | sed '1d')
        echo "cd-ing to ${pth}";
        cd "${pth}";
    fi
}
export -f cdJob
</syntaxhighlight>
Line 407: | Line 508: | ||
=== List Submitted Jobs ===
sq_hist defaults to showing the jobs that finished in the last three days. As an optional argument, you can specify the number of days, e.g. <syntaxhighlight lang="bash" inline>sq_hist 1</syntaxhighlight> for just jobs that finished in the last day.
Note that this does not display any jobs from salloc / interactive jobs that are not assigned a name.
Line 413: | Line 514: | ||
<syntaxhighlight lang="bash">
function sq_hist() {
    # Default to 3 day history
    # (Optional) input argument
    DEFAULT=3;
    NUM_DAYS=${1:-${DEFAULT}};
    NUM_SECS=$((${NUM_DAYS}*86400))
    TIME=$(perl -e 'use POSIX;print strftime "%Y-%m-%d",localtime time-'${NUM_SECS};)
    COL='\033[1;37m'
    NC='\033[0m' # No Color
    echo -e "${COL}Listing jobs completed (or failed) since ${TIME}${NC}";
    sacct --starttime ${TIME} --state=CA,CD,DL,F,NF,OOM,TO -X --format=jobid%-9,jobname%-24,end,ncpus%5,state,exitcode | grep -v " sh ";
}
export -f sq_hist
</syntaxhighlight>
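The start date that <code>sq_hist</code> passes to <code>sacct</code> is simply "today minus N days" in <code>YYYY-MM-DD</code> form. As a sketch of an alternative to the <code>perl</code> one-liner, the same date can be produced with GNU <code>date</code>. The helper name <code>days_ago</code> is hypothetical, and this assumes GNU coreutils (the <code>-d</code> option is not available in BSD/macOS <code>date</code>).

```bash
# Hypothetical helper: print the date N days ago as YYYY-MM-DD (default N=3).
# Assumes GNU date, which understands relative dates such as "3 days ago".
days_ago() (
    date -d "${1:-3} days ago" +%Y-%m-%d
)

days_ago      # three days ago
days_ago 1    # yesterday
```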
=== Polite Parallel Compiling ===
If you're compiling multi-file programs using <code>make</code>, then you may find the <code>-j#</code> flag useful, where <code>#</code> is some number.
This flag tells <code>make</code> to split the compilation across multiple processors to compile in parallel.
If, however, you're like me, you often forget to do this, or omit the number, which can then swamp the node
(if no number is given, then <code>make</code> will not limit the number of jobs that can run simultaneously).
The following alias, <code>pmake</code>, automatically applies a default number of processors (hard-coded as <code>make_num_cpus</code>), while
avoiding overloading the node (since you're likely on a login node with many other people) by setting an upper bound on cpu load (hard-coded as <code>make_max_load</code>).
With the values hard-coded below, <code>pmake</code> will use up to 10 processors, but will not push the processor load past 20
(so that if the processor load was 15 before you started compiling, only 5 processors would be used to compile).
How you set these values will depend on your system / willingness to consume resources.
Presumably you could leave <code>make_num_cpus</code> blank, which would tell <code>make</code> to use as many as possible up to <code>make_max_load</code>.
Prepending the <code>nice</code> command tells the OS that your compiling isn't high priority.
<syntaxhighlight lang=bash>
# parallel make - because I'm too lazy to add j every time
export make_num_cpus=10 # the target number of processors to use
export make_max_load=20 # make won't push the processor usage past this
# useful for shared resources
alias pmake='time nice make -j${make_num_cpus} --load-average=${make_max_load}'
</syntaxhighlight>
== Python on Graham ==
Graham comes with a variety of python modules available.
For your average user, the basic python module (e.g. <code>python/3.6.3</code>) and packages module (e.g. <code>scipy-stack/2018b</code>) should provide a lot of the functionality that you'll need.
Off-shoot packages can then be installed via <code>pip</code> (e.g. <code>pip install --user cmocean</code>).
Note that <code>--user</code> is necessary to avoid permissions issues.
=== Maintaining your own Python ===
Python on Graham isn't always as up-to-date as you may like it to be, and sometimes it is easier to just maintain your own packages.
One option for this is to define a bash function that switches over to your desired python install.
After including the below code snippet in your <code>~/.bashrc</code>, you can then activate your desired python settings by simply calling the function <code>loadPython3</code> in a terminal.
This can be mildly headache-inducing, since you now have to maintain all of your own packages, but it does give you better control.
Installing through <code>pip</code> is the same, but again make sure to use the <code>--user</code> flag.
<syntaxhighlight lang="bash" line>
function loadPython3() {
    # The system-installed python packages are old
    # so instead just load python and pip
    # and install them manually
    module load mpi4py;
    module unload python35-scipy-stack/2017a
    module load imkl; # for reasons, the above line removes this
    module load python/3.5.4;
}
export -f loadPython3
</syntaxhighlight>
== Data Transferring ==
Compute Canada has the following tool which makes data transfers between their machines and any other machine extremely easy. This tool will automatically restart dropped connections and will test for file integrity.
Go to [https://docs.computecanada.ca/wiki/Globus Globus] for more information.
Latest revision as of 12:57, 10 August 2021
The following is a list of scripts, functions and aliases that will make your life on the Sharcnet Graham cluster much easier. Section 1, Interactive Jobs, provides information on running interactive jobs on compute nodes. Section 2, Submitting Jobs, provides submission scripts for your jobs. Section 3, ~/.bashrc, lists other useful commands for checking job information (such as memory usage, expected start time, etc.) and for changing directory to that of a running job. Section 4, Visualization, provides some information on running visualization software on Graham.
Interactive Jobs
Graham does not have any nodes set aside specifically for interactive work. Instead, users can request time on the regular compute nodes for interactive jobs. This can be done using salloc (Slurm allocation). Some things to keep in mind are:
- Once granted, you have sole access to the requested processors
- You won't necessarily get immediate access to the nodes. Depending on how many processors / for how long / how much memory you request, you may have to wait several minutes or more before being granted the allocation.
- After calling salloc, your terminal session will be automatically redirected to the assigned compute node. Once you're finished, exit relinquishes the compute nodes.
- The environment variable SLURM_NTASKS stores the number of requested processors, in case you forget.
To request interactive processors, use:
salloc --time=DD-HH:MM --mem-per-cpu=<number>G --ntasks=<number> --account=<your_account>
where --time=DD-HH:MM specifies the days (DD), hours (HH), and minutes (MM) for which you would like the compute nodes, --mem-per-cpu=<number>G specifies the number of GB per processor that you require, --ntasks=<number> specifies the number of processors, and --account=<your_account> specifies to which account the cpu-hours should be charged. In general, your default account should be fine.
It is worth noting that you can add the --begin=2018-01-15T12:34:00 flag to specify a start time for the job.
This means that you can submit the job the night before in order to have it be up and ready for you in the morning.
In the example, the job will wait until 15 January 2018, 12:34:00 before starting. The date is not necessary: if only a time is given, the next instance of that time will be used. For example, --begin=16:00 means that the job will be scheduled to start the next time 4pm rolls around.
Other notations exist, such as noon, midnight, or teatime (4pm). Usage for these is, for example, --begin=noon, and the job will run at the next instance of the given time.
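As a concrete sketch of the request described above, the following helper only prints an salloc command line so that it can be checked before being run by hand. The function name salloc_cmd and the example values (2 hours, 4 GB per processor, 8 tasks, account def-supervisor) are illustrative assumptions, not site recommendations.

```bash
# Hypothetical helper: print (do not run) an salloc request.
#   $1 time as DD-HH:MM   $2 GB per processor   $3 number of tasks
#   $4 account            $5 optional value for --begin
salloc_cmd() (
    cmd="salloc --time=$1 --mem-per-cpu=${2}G --ntasks=$3 --account=$4"
    if [ -n "${5:-}" ]; then
        cmd="$cmd --begin=$5"
    fi
    echo "$cmd"
)

# Two hours on 8 processors, starting as soon as possible
salloc_cmd 00-02:00 4 8 def-supervisor
# The same request, held until the next time 08:00 comes around
salloc_cmd 00-02:00 4 8 def-supervisor 08:00
```

Printing the command first makes it easy to spot a typo in the flags (for example --account- instead of --account=) before waiting in the queue.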
Notes
- The allocation, and any processes running therein, will be terminated, without warning, after the requested amount of run time expires, so be sure to request a sufficient amount of time.
- If you exceed the amount of requested memory, the salloc session will sometimes terminate. It is unclear when this does and does not kill your session.
- Depending on your usage style, it may be worth using screen, as it could let you run a few things concurrently within an interactive session.
See Changing Prompt Colour when on Compute Nodes to set up your terminal prompt to automatically change colour when using a compute node.
For more information, see the official Graham documentation (https://docs.computecanada.ca/wiki/Running_jobs#Interactive_jobs).
Setting OMP_NUM_THREADS
It seems that the system may not automatically set the environment variable OMP_NUM_THREADS when you start an interactive session.
If you want to have this value be automatically set according to the allocated resources, add the following snippet to your ~/.bashrc file.
# Set the OMP_NUM_THREADS appropriately
if [ -z "${SLURM_CPUS_PER_TASK+x}" ]; then
export OMP_NUM_THREADS=1
else
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
fi
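The same default can be written more compactly with the shell's default-value expansion. This is only a sketch, and the helper name omp_threads is hypothetical. One small difference from the snippet above: the ":-" form also falls back to 1 when SLURM_CPUS_PER_TASK is set but empty.

```bash
# Hypothetical helper: print SLURM_CPUS_PER_TASK if it is set and non-empty,
# otherwise print 1.
omp_threads() (
    echo "${SLURM_CPUS_PER_TASK:-1}"
)

export OMP_NUM_THREADS="$(omp_threads)"
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS}"
```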
VNC
One feature you might want to have is the ability to view figures, files, and images that are not renderable on the command line. A VNC (Virtual Network Computing) client/server application is what you want. You might have heard about X11 forwarding, but this can be insecure and slower than VNC, and should not be used.
VNC Installation
You will need to install the application TigerVNC on your local machine to create a virtual network. Follow these steps:
- Go to TigerVNC binaries (https://sourceforge.net/projects/tigervnc/)
- Install the appropriate tigervnc file (.exe for Windows, .dmg for Mac)
The developers now host their binaries on SourceForge. While it has a spotty past, the new owners have worked hard to rehabilitate this service by no longer bundling adware. If you're still wary of using this site, the developers have their source code available on GitHub (https://github.com/TigerVNC/tigervnc/releases).
Virtual Desktop
Perhaps the simplest way to get a desktop-like interface with Graham is to use gra-vdi.computecanada.ca, as it gives you a full desktop experience.
Using TigerVNC:
- Connect to gra-vdi.computecanada.ca and log in with your ComputeCanada credentials
- Install needed software through Nix
  - Some programs exist already on Graham but need to be loaded from the module manager or through Nix
For example, installing ffmpeg can be achieved via:
module load nix
nix-env -iA nixpkgs.ffmpeg
You can search for Nix packages at https://nixos.org/nixos/packages.html.
Running Matlab is a little different. It can be achieved via:
module load nixpkgs
module load matlab/2019a
matlab
Create a virtual network
Though it requires more steps, you can also create a virtual network.
After TigerVNC is installed you can connect to a virtual network on Graham by following these steps:
- Log into Graham
- Log into a compute node (using salloc)
- Run vncserver
  - You will get something like: New 'gra827:1 (USERID)' desktop is gra827:1
- In a new local terminal run ssh -NL 5900:gra827:5901 graham.sharcnet.ca, where gra827 matches that given in the previous step and :1 has become :5901.
  - If this runs properly it will appear to hang, but that's normal so move on to the next step.
  - If you get an error about an "Address already in use", then change 5900 to another number (like 5910).
- Run TigerVNC by either:
  - Running vncviewer localhost in another terminal, or
  - Opening TigerVNC, typing localhost in VNC server and selecting Connect.
    - If you needed to use the port 5910 then type localhost:5910 instead of just localhost
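The port in the ssh step follows the usual VNC convention that display :N listens on TCP port 5900+N, which is why :1 becomes 5901. The sketch below builds the tunnel command from the node name and display number reported by vncserver. The function name vnc_tunnel_cmd is hypothetical; it only prints the command and does not open a connection.

```bash
# Hypothetical helper: print the ssh tunnel command for a VNC display.
#   $1 compute node (e.g. gra827)   $2 display number (e.g. 1)
#   $3 optional local port (default 5900)
vnc_tunnel_cmd() (
    node=$1
    display=$2
    local_port=${3:-5900}
    remote_port=$((5900 + display))   # display :N listens on port 5900+N
    echo "ssh -NL ${local_port}:${node}:${remote_port} graham.sharcnet.ca"
)

vnc_tunnel_cmd gra827 1         # default local port
vnc_tunnel_cmd gra827 1 5910    # if local port 5900 is already in use
```

If a non-default local port such as 5910 is used, the viewer must be pointed at localhost:5910, as noted in the steps above.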
Interactive Matlab
See MATLAB on Graham for instructions on running Matlab on Graham.
Interactive Jupyter (iPython)
This sharcnet page (https://docs.computecanada.ca/wiki/Jupyter) provides information on running interactive Jupyter sessions on Graham. It allows you to host the session on Graham (using multiple processors, a large amount of memory, as well as GPUs), while interacting with the session on your local machine via your favourite web browser. There are a few steps required to set it up, but once done it is very straightforward.
Jupyter notebooks (formerly iPython notebooks) allow an interactive environment for python (it can also be used for other things, such as Matlab, R, etc., but in this instance is only set up for python), which, when combined with access to the Graham servers, can provide a useful way to interact with your data.
Jupyter sessions are killed without warning when the requested time limit has been exceeded, so remember to save often!
Visualization
See Visualization for more information about using ParaView or VisIt on Graham.
Submitting Jobs
The following are recommendations for the use of our contributed resources on Graham. As the UW fluids group, we have been allocated 832 processor years (for 2017-2018). Until these processor years have been expended, we have priority to run jobs on any processor ahead of regular users. Much of what you need to know for submitting and monitoring jobs is presented here. Further questions may be answered on the Compute Canada Graham page Running jobs.
Remember to be courteous about your memory usage and use only what you need! However, if you are using entire nodes then this does not apply since no one else will have access to that memory.
Submit script
An MPI job may be submitted to the Graham scheduler with either of the following bash scripts (A suggested name for them is submit.sh). The first script is for use of entire nodes, while the second allows for the processors to be spread out over many nodes. The first will take longer in queue, but should in theory run quicker since all processors have fewer connections. The second will likely start quicker since there is no requirement to wait for entire nodes to become available and can start whenever enough processors become available. Each script is broken into two parts: 1) the run dependent parameters and 2) the permanent parameters.
To submit the job, execute sbatch submit.sh in a login window. Since this file won't go anywhere, it is a handy way to look up what submission parameters you used (number of processors, memory, etc.).
One thing to be aware of is that the memory requirement MUST be an integer. Decimals will not be accepted. The unit can be changed to Gigs (G) or other.
The requested time will play a large role in the time the job spends in queue. This is because the nodes on Graham have been specified to only accept jobs with a run time less than specific values. That is, there are far fewer nodes that accept jobs with a run time of 28 days than there are which accept a 3 hour run time. This is because the 3 hour job can run on any node (through back-filling into gaps on the other nodes), while the longer job will only run on nodes which accept it. The partitions are split by maximum run time as follows:
- 3 hours or less
- 12 hours or less
- 24 hours (1 day) or less
- 72 hours (3 days) or less
- 7 days or less
- 28 days or less
In summary, pick a value on this list and not something just larger than that (for example, pick 24 hours, not 25).
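To illustrate the advice above, the following sketch maps a requested run time in hours to the smallest limit on the list, which shows why a 25 hour request is treated like a 72 hour one. The function name partition_limit_hours is hypothetical, and the limits are simply the ones listed above converted to hours (7 days = 168, 28 days = 672); the scheduler's actual partition configuration may differ.

```bash
# Hypothetical helper: print the smallest listed limit (in hours) that
# can hold a job requesting $1 hours. Fails if the request exceeds 28 days.
partition_limit_hours() (
    hours=$1
    for limit in 3 12 24 72 168 672; do
        if [ "$hours" -le "$limit" ]; then
            echo "$limit"
            exit 0
        fi
    done
    echo "requested time exceeds 28 days" >&2
    exit 1
)

partition_limit_hours 24    # fits the 24 hour limit
partition_limit_hours 25    # spills over into the 72 hour limit
```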
Complete Nodes
The run dependent parameters are the first three items: the number of nodes (and therefore the total number of processors since each node has 32), the duration of the job, and an identifiable name for the job. The remaining permanent parameters need not be changed from run to run. You will need to replace the mail option to use your email address. Additional options exist for when to be emailed (i.e. start/end/failure of job), and many other possibilities such as waiting for another job to complete (which is useful as a post-processing job). See the sbatch manual page for more information.
The following script will use the UW Fluids contributed account, which has high priority. If for any reason you decide to submit the job to the regular queue, change the account info to --account=def-supervisor, where supervisor is the username of your supervisor (this could be def-mmstastn, or def-kglamb, ...).
Here is the script to put into submit.sh:
#!/bin/bash
# bash script for submitting a job to the sharcnet Graham queue
#SBATCH --nodes=2 # number of nodes to use
#SBATCH --time=03-00:00:00 # time (DD-HH:MM:SS)
#SBATCH --job-name="A name" # job name
#SBATCH --ntasks-per-node=32 # tasks per node
#SBATCH --mem=128000M # memory per node
#SBATCH --output=sim-%j.log # log file
#SBATCH --error=sim-%j.err # error file
#SBATCH --mail-user=username@uwaterloo.ca # who to email
#SBATCH --mail-type=FAIL # when to email
#SBATCH --account=ctb-mmstastn # UW Fluids designated resource allocation
srun ./case1.x
Partial Nodes
The run dependent parameters are the first four items: the number of processors ("tasks"), the memory per processor, the duration of the job, and an identifiable name for the job. As in the complete node section the remaining permanent parameters need not be changed from run to run.
Here is the script to put into submit.sh:
#!/bin/bash
# bash script for submitting a job to the sharcnet Graham queue
#SBATCH --ntasks=64 # number of MPI processes
#SBATCH --mem-per-cpu=3G # memory per processor (default unit is MB)
#SBATCH --time=03-00:00:00 # time (DD-HH:MM:SS)
#SBATCH --job-name="A name" # job name
#SBATCH --output=sim-%j.log # log file
#SBATCH --error=sim-%j.err # error file
#SBATCH --mail-user=username@uwaterloo.ca # who to email
#SBATCH --mail-type=FAIL # when to email
#SBATCH --account=ctb-mmstastn # UW Fluids designated resource allocation
srun ./case1.x
Automatically Specifying SPINS Runtime
Newer versions of SPINS include the spins.conf-specified variable compute_time.
This variable tells SPINS how much time has been allocated, so that SPINS can save the state when nearing the end of the run-time, even if it's not near a regular output time (such saves have the suffix .dump).
Previously, the user had to specify this variable each time.
However, thanks to the SLURM scheduler used on Graham, it is possible to have the submit script automatically detect the requested run-time and update the spins.conf accordingly.
There are two steps to using this feature:
bash function
First, include the following function in your ~/.bashrc.
Note: it is very important that the string lines DO NOT begin with whitespace. Ugly though it may look, it is necessary, so please be careful to copy the function as-is.
# Determine, in seconds, the amount of time remaining for a job
# Input to the function is the jobid
function remainingJobTime() {
SLURM_JOB_ID=$1
# A bit of bash magic, courtesy of Tyson Whitehead:
# https://www.sharcnet.ca/my/problems/ticket/33549
local secondsleft=$( sacct -n -j ${SLURM_JOB_ID} --format=elapsed%12,timelimit%12 \
| awk -e '$2 != ""'\
'{ '\
' patsplit($1,now,"[0-9]+"); '\
' patsplit($2,end,"[0-9]+"); '\
'} END '\
'{ '\
' endraw = length(end)==4 ? ((end[1]*24+end[2])*60+end[3])*60+end[4] : (end[1]*60+end[2])*60+end[3]; '\
' nowraw = length(now)==4 ? ((now[1]*24+now[2])*60+now[3])*60+now[4] : (now[1]*60+now[2])*60+now[3]; '\
' print(endraw-nowraw); '\
'}' )
echo $secondsleft
}
export -f remainingJobTime
# Clean function wrapper to handle updating the spins.conf with accurate run-time variables
function updateSPINSruntime() {
sed -i '/compute_time/c\'"compute_time = `remainingJobTime ${SLURM_JOB_ID}`" spins.conf
}
export -f updateSPINSruntime
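The awk program above turns the elapsed time and the time limit reported by sacct into seconds before subtracting them. As a rough illustration of that conversion only, here is a minimal pure-bash sketch. The helper name slurm_time_to_seconds is hypothetical (it is not part of the function above), and it assumes the input is either D-HH:MM:SS or HH:MM:SS, the same two layouts the awk code distinguishes.

```shell
# Hypothetical helper, for illustration only: convert a time string of the
# form D-HH:MM:SS or HH:MM:SS into a number of seconds.
slurm_time_to_seconds() {
    local t=$1 days=0 h m s
    # Split off the optional "D-" prefix
    if [[ $t == *-* ]]; then
        days=${t%%-*}
        t=${t#*-}
    fi
    IFS=: read -r h m s <<< "$t"
    # 10# forces base 10, so fields such as "08" are not parsed as octal
    echo $(( ((10#$days * 24 + 10#$h) * 60 + 10#$m) * 60 + 10#$s ))
}

slurm_time_to_seconds 03-00:00:00   # the time limit used in submit.sh above
slurm_time_to_seconds 01:30:00
```

The remaining time is then just the difference between the converted time limit and the converted elapsed time.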
submit script command
Next, include the following line in your submit.sh script after the #SBATCH flags but before calling mpiexec (or mpirun, srun, etc.).
updateSPINSruntime
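To see what this call does to spins.conf without submitting a job, the sed command can be tried on a throwaway file. The sketch below uses the same sed expression as updateSPINSruntime, but the helper name set_compute_time and the file contents are made up for illustration. It assumes GNU sed, as the original function does.

```shell
# Hypothetical helper: the same sed command as updateSPINSruntime, with the
# file name and the number of seconds passed in explicitly.
set_compute_time() {
    local conf=$1 seconds=$2
    # "c\" replaces every line matching /compute_time/ with the given text
    sed -i '/compute_time/c\'"compute_time = ${seconds}" "$conf"
}

# Try it on a temporary file with made-up contents
demo_conf=$(mktemp)
printf 'Nx = 128\ncompute_time = 0\nNy = 1\n' > "$demo_conf"
set_compute_time "$demo_conf" 3600
cat "$demo_conf"
rm -f "$demo_conf"
```

Note that the c command only replaces an existing line. If spins.conf contains no compute_time line, the file is left unchanged, so make sure the variable is already present.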
Recording submitted jobs
It is sometimes useful to be able to review your submitted jobs that have already finished (trying to recall what you ran last week, etc.). While the sq_hist function provides that, it may not be the most convenient. Another option is to store a log of your submitted jobs. The following will allow you to automatically log information about jobs, as they start running, into the file ~/my_jobs.log.
~/.bashrc function
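# Update a logfile with job information
# first argument (optional) is the exit code of the job command;
# if it is non-zero, the script exits with that code and nothing is logged
# second argument (optional) is a string indicating the
# log-file. This is useful for keeping separate log files
# for separate simulation sets.
function updateJobSummary() {
# check if job ended with an error (treat a missing argument as success)
if [ "${1:-0}" -ne 0 ]; then
exit "$1"
fi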
# If a log file is not provided, use the default
DEFAULT_LOG_FILE="${HOME}/my_jobs.log"
LOGFILE=${2:-${DEFAULT_LOG_FILE}};
# Check if something has previously been logged for this jobid
# if yes: we're now recording the end
# if no : we're recording the beginning
oldSum=`grep ${SLURM_JOB_ID} ${LOGFILE}`
if [ "${oldSum}" == "" ]; then
FLAG="BEG";
else
FLAG="END";
fi
date_beg=`squeue --noheader -o "%S" -j ${SLURM_JOB_ID}`
if [ "$FLAG" == "BEG" ]; then
date_end=`squeue --noheader -o "%e" -j ${SLURM_JOB_ID}`
date_end="${date_end} (est.)";
else
date_end=$(perl -e 'use POSIX;print strftime "%Y-%m-%dT%H:%M:%S",localtime time;');
fi
if [ -z "${SLURM_NTASKS_PER_NODE+x}" ] || [ "${SLURM_NTASKS_PER_NODE}" == "" ] ; then
procs="${SLURM_NPROCS} (total procs)"
else
procs="${SLURM_NTASKS_PER_NODE} (procs per node)";
fi
if [ -z "${SLURM_MEM_PER_CPU+x}" ]; then
mem="${SLURM_MEM_PER_NODE} (mem per node)"
else
mem="${SLURM_MEM_PER_CPU} (mem per cpu)";
fi
jobsum="${date_beg} : ${date_end} : ${SLURM_JOB_ID} : ${SLURM_JOB_NAME} : ${SLURM_NNODES} (nodes) : ${procs} : ${mem} : ${SLURM_SUBMIT_DIR} "
if [ "$FLAG" == "BEG" ]; then
echo ${jobsum} >> ${LOGFILE};
else
sed -i "s@${oldSum}@${jobsum}@" $LOGFILE
fi
}
export -f updateJobSummary
submit.sh
After defining the updateJobSummary function in ~/.bashrc, all that you need to do is call updateJobSummary both before and after calling your code in your submit.sh script. An example is given below.
#SBATCH flags...
# Update job summary with initial information
updateJobSummary
# Update spins.conf with accurate run-time information
updateSPINSruntime
# run spins
mpiexec casefile.x
# Update the job summary with the final time
updateJobSummary $?
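One thing to be aware of is that updateJobSummary looks for an earlier entry with a plain grep on the job ID, which matches the ID anywhere in a line (job 1234 would also match an entry for job 12345, or a path containing those digits). The following is a minimal sketch of the append-or-replace step that matches the ID only as a " : "-delimited field instead. The function name record_job and its arguments are hypothetical, and it assumes GNU sed.

```shell
# Hypothetical stand-in for the append-or-replace step of updateJobSummary.
# Arguments: log file, job ID, one-line summary containing " : <jobid> : ".
record_job() {
    local logfile=$1 jobid=$2 summary=$3
    touch "$logfile"
    if grep -q " : ${jobid} : " "$logfile"; then
        # Job already logged: replace its entry (recording the end)
        sed -i "/ : ${jobid} : /c\\${summary}" "$logfile"
    else
        # First time this job is seen: append (recording the beginning)
        echo "${summary}" >> "$logfile"
    fi
}

# Example with made-up values
demo_log=$(mktemp)
record_job "$demo_log" 12345 "start : end (est.) : 12345 : demo"
record_job "$demo_log" 12345 "start : end : 12345 : demo"
cat "$demo_log"
rm -f "$demo_log"
```

After both calls the log holds a single line for job 12345, containing the final summary.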
~/.bashrc
Copy the following aliases and scripts into ~/.bashrc to make them available to you at the command line.
Changing Prompt Colour when on Compute Nodes
The following snippet sets the prompt to a golden-yellow colour when on a Graham login node, and to a red-purple colour when on a compute node (via salloc).
if [ -z "${SLURM_NTASKS+x}" ]; then
export PS1='\[\e[1;33m\][\u@\h \W]\$\[\e[0m\] '
else
export PS1='\[\e[38;5;125m\][\u@\h \W]\$\[\e[0m\] '
fi
Check Job Status and Nodes Usage
- sqm gives a summary of all of the running jobs by ${USER}
- sqa gives a summary of all jobs running with the UW Fluids group contributed resources
- ssm gives the fair share values and recent cpu-seconds used by ${USER}
- ssa gives the fair share values and recent cpu-seconds used by all users of the UW Fluids contributed resources
${USER} will be automatically replaced by your userid when called.
alias sqm='squeue -u ${USER} --format="%.9i %.8j %.4C %.4D %.7m %.3t %.21S %.21e %.12L"'
alias sqa='squeue --account=ctb-mmstastn_cpu,rrg-mmstastn_cpu --format="%.9i %.8j %.8u %.4C %.4D %.7m %.3t %.21S %.21e %.12L"'
alias ssm='sshare -U ${USER}'
alias ssa='sshare -a -A ctb-mmstastn_cpu -A rrg-mmstastn_cpu'
See Scheduling Policies to find out more about fair share, but the basic idea is that values closer to 1 have the highest priority and values close to 0 have the lowest. A value of 0.5 results in a job wait time of roughly the "average" wait time for all jobs on the cluster. Depending on recent usage of your default account or the ctb-mmstastn account, either account could have higher priority. The ssm command shows the fair share value of each of the accounts your group is allocated. A job submitted under the account with the higher fair share value will generally start sooner than one submitted under another account.
Currently, we have 832 processor-years allocated to our group for the year (2017-Apr 2018). We would like to have, on average, about 832 processors running at any given moment. There is no harm in going over or under this number, so long as we roughly complete 832 processor-years by the time our allocation expires. To check the current number of CPUs that are running or pending, use the following function:
function nodeUsage() {
jobsR=`squeue --account=ctb-mmstastn_cpu,rrg-mmstastn_cpu -o %C -t R`
jobsP=`squeue --account=ctb-mmstastn_cpu,rrg-mmstastn_cpu -o %C -t PD`
cpuSUM_R=0
cpuSUM_P=0
count=0
IFS='
'
for x in $jobsR;
do
if [ "$count" -gt "0" ]; then
cpuSUM_R=$((x + cpuSUM_R));
else
count=1;
fi
done
count=0;
IFS='
'
for x in $jobsP;
do
if [ "$count" -gt "0" ]; then
cpuSUM_P=$((x + cpuSUM_P));
else
count=1;
fi
done
perc_R=$((100*$cpuSUM_R/832 + 200*$cpuSUM_R/832 % 2)) # 2nd term is for rounding
perc_P=$((100*$cpuSUM_P/832 + 200*$cpuSUM_P/832 % 2)) # 2nd term is for rounding
perc_all=$((100*($cpuSUM_R + $cpuSUM_P)/832 + 200*($cpuSUM_R + $cpuSUM_P)/832 % 2))
echo "Processors running: $cpuSUM_R ($perc_R%)";
echo "Processors pending: $cpuSUM_P ($perc_P%)";
echo "Processors total: $((cpuSUM_P+cpuSUM_R)) ($perc_all%)";
echo "Processors available: $((832 - cpuSUM_R-cpuSUM_P))";
}
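The two loops in nodeUsage skip the squeue header line and add up the CPU counts, and the percentage lines round to the nearest integer. A more compact way to express the same arithmetic is sketched below. The helper names sum_cpus and percent_of are hypothetical, and the sample input is made up.

```shell
# Hypothetical helper: sum a column of CPU counts read from stdin,
# skipping the first (header) line. Prints 0 if there are no jobs.
sum_cpus() {
    awk 'NR > 1 { total += $1 } END { print total + 0 }'
}

# Hypothetical helper: integer percentage of an allocation, rounded to
# the nearest whole number using integer arithmetic only.
percent_of() {
    local used=$1 alloc=$2
    echo $(( (200 * used + alloc) / (2 * alloc) ))
}

# Example with made-up squeue-style output (header plus two jobs)
running=$(printf 'CPUS\n64\n32\n' | sum_cpus)
echo "Processors running: ${running} ($(percent_of "$running" 832)%)"
```

On Graham, the sample input would be replaced by the squeue calls already used in nodeUsage, for example `squeue --account=ctb-mmstastn_cpu,rrg-mmstastn_cpu -o %C -t R | sum_cpus`.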
Resource Allocation Usage
The total amount of cpu-years charged to our contributed resources is not accessible from a login node and must be accessed online. Follow these steps to see the allocated resource usage:
- sign into the Compute Canada Database
- Select View Group Usage under My Account
- Select By Resource Allocation Project
- Select a given year and then choose a project (Currently it is pim-260-ab)
Storage usage can be displayed in a login session with the command diskusage_report.
Job details
The built-in command seff provides a nice summary of the job resource usage. You should check this often for the amount of memory used and the efficiency of the parallel job.
The following commands give information about a particular job.
- seff (slurm efficiency): job memory and processors used and their efficiencies
- scj (slurm control job): processors and nodes used by a job
- saj (slurm account job): memory and time used by a job (use on completed jobs)
- ssj (slurm status job): job memory (use on running jobs)
- job_summary: print all of the above
function scj() {
scontrol show jobid -dd $1
}
function saj() {
sacct --format=jobid,JobName,ncpus,ntasks,state,reserved,elapsed,End,MaxVMSize,AveVMSize,MaxRSS,ReqMem -j $1
}
function ssj() {
sstat --format=jobid,ntasks,AveVMSize,MaxVMSize,AveRSS,MaxRSS -j $1
}
function job_summary() {
seff $1
scj $1;
saj $1;
ssj $1;
}
Usage is:
$ seff <jobID>
Getting the Path to a Log File
Using the above functions, you can get the log file (i.e. the output of stdout) directly from the jobID. This lets you check the state of a job without having to move to that directory. Three functions are given below. The first two simply get the path to the log/error file when provided a valid jobID. The third prints the tail of both the log and error files, with a bit of formatting to help readability.
function get_job_log() {
tmp_str=`scj $1 | grep StdOut`
log_name=${tmp_str#*=}
echo ${log_name}
}
function get_job_err() {
tmp_str=`scj $1 | grep StdErr`
log_name=${tmp_str#*=}
echo ${log_name}
}
function job_tail() {
# Define some colours to make output prettier
COL='\033[1;37m'
NC='\033[0m' # No Color
# Get the job name
tmp_str=`scj $1 | grep JobName`
job_name=${tmp_str##*=}
# Print a header
echo -e "${COL}== Summary of job ${1}: (${job_name}) ==${NC}";
echo ""
# Print the tail of the log file
echo -e "${COL}Tail of log for job ${1}${NC}"
tail `get_job_log ${1}`
echo ""
# Print the tail of the error file
echo -e "${COL}Tail of err for job ${1}${NC}"
tail `get_job_err ${1}`
}
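The functions above rely on bash parameter expansion to strip the key from a KEY=value line: get_job_log and get_job_err use `#*=` (remove the shortest prefix ending in "="), while job_tail uses `##*=` (remove the longest such prefix) because the job name is not the first field on its line. The short sketch below shows the difference on made-up sample lines. The helper names are hypothetical.

```shell
# Hypothetical helpers illustrating the two prefix-stripping forms.
value_after_first_equals() {
    echo "${1#*=}"     # shortest match of "*=" removed from the front
}
value_after_last_equals() {
    echo "${1##*=}"    # longest match of "*=" removed from the front
}

# Made-up sample lines in the style of scontrol output
value_after_first_equals '   StdOut=/home/user/sim/sim-123.log'
value_after_last_equals 'JobId=123 JobName=case1'
```

The first call prints the path and the second prints the job name. Using `#` rather than `##` for the paths matters if a directory name itself contains an "=" character.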
Move to a Simulation/Job Directory
You may find that your directory tree becomes rather involved after a while, and so changing into a simulation directory (or just remembering the path) can start to be cumbersome. A useful function is cdJob, which takes you into the working directory for a submitted job, provided that you know the jobID (which can be given by sqm).
This command only works for running jobs.
# Move to job directory
function cdJob() {
if [ -z "${1+x}" ]; then
echo "No job ID given."
else
pth=$(squeue -o %Z -j $1 | sed '1d')
echo "cd-ing to ${pth}";
cd ${pth};
fi
}
export -f cdJob
Usage is:
$ cdJob <jobID>
List Submitted Jobs
sq_hist defaults to showing the jobs that finished in the last three days. As an optional argument, you can specify the number of days, e.g. sq_hist 1 for just the jobs that finished in the last day.
Note that this does not display any jobs from salloc / interactive jobs that are not assigned a name.
function sq_hist() {
# Default to 3 day history
# (Optional) input argument
DEFAULT=3;
NUM_DAYS=${1:-${DEFAULT}};
NUM_SECS=$((${NUM_DAYS}*86400))
TIME=$(perl -e 'use POSIX;print strftime "%Y-%m-%d",localtime time-'${NUM_SECS};)
COL='\033[1;37m'
NC='\033[0m' # No Color
echo -e "${COL}Listing jobs completed (or failed) since ${TIME}${NC}";
sacct --starttime ${TIME} --state=CA,CD,DL,F,NF,OOM,TO -X --format=jobid%-9,jobname%-24,end,ncpus%5,state,exitcode | grep -v " sh ";
}
export -f sq_hist
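The perl one-liner in sq_hist only computes the calendar date a given number of days ago. If GNU date is available (an assumption, although it is the default on most Linux systems), the same date can be obtained without perl. The helper name days_ago below is hypothetical.

```shell
# Hypothetical helper: print the date N days ago as YYYY-MM-DD.
# Assumes GNU date, which understands relative expressions like "3 days ago".
days_ago() {
    local num_days=${1:-3}    # default to 3 days, as sq_hist does
    date -d "${num_days} days ago" +%Y-%m-%d
}

days_ago      # date three days ago
days_ago 1    # yesterday's date
```

One small difference: the perl version subtracts an exact number of seconds, whereas date works with calendar days, so the two could in principle differ around a daylight-saving change. For a job history listing this is unlikely to matter.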
Polite Parallel Compiling
If you're compiling multi-file programs using make, then you may find the -j# flag useful, where # is some number.
This flag tells make to split the compilation across multiple processors to compile in parallel.
If, however, you're like me, you often forget to do this, or you omit the number, which can swamp the node (if no number is given, make will not limit the number of jobs that can run simultaneously).
The following alias, pmake, automatically applies a default number of processors (hard-coded as make_num_cpus), while avoiding overloading the node (since you're likely on a login node with many other people) by setting an upper bound on CPU load (hard-coded as make_max_load).
With the values hard-coded below, pmake will use up to 10 processors, but will not push the processor load past 20 (so that if the processor load was 15 before you started compiling, only 5 processors would be used to compile).
How you set these values will depend on your system / willingness to consume resources.
Presumably you could leave make_num_cpus blank, which would tell make to use as many as possible up to make_max_load.
Prepending the nice command tells the OS that your compiling isn't high priority.
# parallel make - because I'm too lazy to add j every time
export make_num_cpus=10 # the target number of processors to use
export make_max_load=20 # make won't push the processor usage past this
# useful for shared resources
alias pmake='time nice make -j${make_num_cpus} --load-average=${make_max_load}'
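Bash only expands aliases in interactive shells by default, so pmake as defined above is not available inside scripts. If you want the same behaviour there, one option is a function. The sketch below is an assumption about how that might look, not a replacement for the alias. The name pmake_fn is hypothetical, chosen so that it does not clash with the alias, and the fallback values simply repeat the ones above.

```shell
# Hypothetical function version of the pmake alias.
# Falls back to 10 jobs and a load limit of 20 if the variables are unset.
pmake_fn() {
    time nice make -j"${make_num_cpus:-10}" \
        --load-average="${make_max_load:-20}" "$@"
}
export -f pmake_fn
```

Any extra arguments (targets, variable assignments) are passed straight through to make, e.g. `pmake_fn clean all`.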
Python on Graham
Graham comes with a variety of python modules available.
For your average user, the basic python module (e.g. python/3.6.3) and packages module (e.g. scipy-stack/2018b) should provide a lot of the functionality that you'll need.
Off-shoot packages can then be installed via pip (e.g. pip install --user cmocean).
Note that --user is necessary to avoid permissions issues.
Maintaining your own Python
Python on Graham isn't always as up-to-date as you may like it to be, and sometimes it is easier to just maintain your own packages.
One option for this is to define a bash function that switches over to your desired python install.
After including the below code snippet in your ~/.bashrc, you can then activate your desired python settings by simply calling the function loadPython3 in the terminal.
This can be mildly headache-inducing, since you now have to maintain all of your own packages, but it does give you better control.
Installing through pip is the same, but again make sure to use the --user flag.
function loadPython3() {
# The system-installed python packages are old
# so instead just load python and pip
# and install them manually
module load mpi4py;
module unload python35-scipy-stack/2017a
module load imkl; # for reasons, the above line removes this
module load python/3.5.4;
}
export -f loadPython3
Data Transferring
Compute Canada has the following tool which makes data transfers between their machines and any other machine extremely easy. This tool will automatically restart dropped connections and will test for file integrity.
Go to Globus for more information.