Difference between revisions of "Running With Slurm"

From New IAC Wiki

Latest revision as of 22:26, 2 August 2010

Slurm (https://computing.llnl.gov/linux/slurm/) is the queuing system used on Brems. It lets multiple users submit jobs to a queue, and the system schedules them to make efficient use of the machine.

The following instructions are for running MCNPX on Brems.

Adding your run to the queue

Method 1: Easy

Use the queuemcnpx script to submit multiple input files. This will run each of them in parallel, passing the n=inputfile option to MCNPX.

brian@brems:~/work$ queuemcnpx 14MeV.i 18MeV.i 22MeV.i 9MeV.i 
adding mcnpx n=14MeV.i | sbatch: Submitted batch job 119
adding mcnpx n=18MeV.i | sbatch: Submitted batch job 120
adding mcnpx n=22MeV.i | sbatch: Submitted batch job 121
adding mcnpx n=9MeV.i | sbatch: Submitted batch job 122

The new jobs are now in the queue:

brian@brems:~/work$ squeue
 JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(RE
   119     brems  14MeV.i    brian  PD       0:00      1 (Resources)
   120     brems  18MeV.i    brian  PD       0:00      1 (Resources)
   121     brems  22MeV.i    brian  PD       0:00      1 (Resources)
   122     brems   9MeV.i    brian  PD       0:00      1 (Resources)
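The job IDs printed at submission time are needed later for checking output and cancelling jobs. As a minimal sketch, the hypothetical helper below (submitted_job_ids is not part of Slurm or of queuemcnpx) pulls the IDs out of messages in the format shown above. It assumes the messages arrive on standard input; if your version of sbatch writes them to standard error, redirect with 2>&1 first.

```shell
#!/bin/bash
# Hypothetical helper: read submission messages on stdin and print
# one job ID per line. Lines without "Submitted batch job N" are ignored.
submitted_job_ids() {
    sed -n 's/.*Submitted batch job \([0-9][0-9]*\).*/\1/p'
}
```

For example, queuemcnpx 14MeV.i 18MeV.i 2>&1 | submitted_job_ids > jobids.txt would save the IDs of the new jobs to a file, assuming queuemcnpx passes the sbatch messages through as in the transcript above.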

Method 2: Allows custom options

Create a script file like this one, substituting your MCNPX parameters at the end of the last line:

#!/bin/bash
#number of processes to run:
#SBATCH -n 8 
export DATAPATH=/opt/mcnpx/data/
srun /opt/mcnpx/v27b_64_mpi_i8_slurm/bin/mcnpx i=14MeV.i o=14MeV.o

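Add your run to the queue using sbatch. The -J option is optional and lets you name your job so you can keep track of it in the queue more easily. Options must come before the script name, because anything after the script name is passed to the script as an argument rather than to sbatch.

brian@brems:~/work$ sbatch -J myjobname ./runmcnpx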
sbatch: Submitted batch job 3
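When several input files need the same custom options, writing one script per file by hand gets repetitive. The sketch below is one possible way to generate the script shown above for any input file. The function name make_mcnpx_script is hypothetical, and it assumes input files end in .i and that the output file should carry the same base name with a .o suffix, as in the 14MeV example.

```shell
#!/bin/bash
# Hypothetical helper: print a batch script for one MCNPX input file.
#   $1 = input file name (e.g. 14MeV.i)
#   $2 = number of processes to request (defaults to 8)
make_mcnpx_script() {
    local input="$1"
    local ntasks="${2:-8}"
    local output="${input%.i}.o"   # 14MeV.i -> 14MeV.o

    # Same contents as the hand-written script, with the
    # process count and file names substituted in.
    cat <<EOF
#!/bin/bash
#number of processes to run:
#SBATCH -n ${ntasks}
export DATAPATH=/opt/mcnpx/data/
srun /opt/mcnpx/v27b_64_mpi_i8_slurm/bin/mcnpx i=${input} o=${output}
EOF
}
```

A possible use is make_mcnpx_script 18MeV.i 8 > run18MeV followed by sbatch -J 18MeV ./run18MeV. Printing the script instead of submitting it directly keeps the helper easy to inspect before anything enters the queue.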

Checking runs currently in the queue

brian@brems:~/work$ squeue
 JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
   119     brems  14MeV.i    brian  PD       0:00      1 (Resources)
   120     brems  18MeV.i    brian  PD       0:00      1 (Resources)
   121     brems  22MeV.i    brian  PD       0:00      1 (Resources)
   122     brems   9MeV.i    brian  PD       0:00      1 (Resources)
   106     brems    1.0mm     neba   R    1:01:30      1 brems
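On a busy queue it can help to pick out only your own jobs in a given state. The following is a small sketch, not part of Slurm: the hypothetical function jobs_for_user filters squeue output on standard input, assuming the default column layout shown above (JOBID first, USER fourth, ST fifth).

```shell
#!/bin/bash
# Hypothetical helper: filter squeue output read from stdin.
#   $1 = user name, $2 = state code (e.g. PD for pending, R for running)
# Prints the matching job IDs, one per line. The header line is skipped.
jobs_for_user() {
    awk -v user="$1" -v state="$2" \
        'NR > 1 && $4 == user && $5 == state { print $1 }'
}
```

For example, squeue | jobs_for_user brian PD would list the IDs of brian's pending jobs. squeue can also do some of this filtering itself through its -u (user) and -t (state) options.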

Check the output of your run

Replace 33 with the JOBID of your run:

brian@brems:~/work$ less slurm-33.out
mcnpx    ver=27b   ld=Tue Aug 18 08:00:00 MST 2009   03/31/10 19:03:50
*************************************************************
*                                                           *
*                   MCNPX                                   *
*                                                           *
* Copyright 2007. Los Alamos National Security, LLC.        *
* All rights reserved.                                      *
*                                                           *
* This material was produced under U.S. Government contract *
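For a long run, the end of the output file is usually more interesting than the banner at the top. As a minimal sketch, the hypothetical helper below prints the last few lines of a job's output, assuming the output file is named slurm-JOBID.out and sits in the current directory, as in the example above.

```shell
#!/bin/bash
# Hypothetical helper: show the last lines of a job's output file.
#   $1 = job ID, $2 = number of lines to show (defaults to 20)
# Returns non-zero if the output file does not exist yet.
show_job_tail() {
    local jobid="$1"
    local lines="${2:-20}"
    local file="slurm-${jobid}.out"

    if [ ! -f "$file" ]; then
        echo "no output file for job ${jobid}: ${file}" >&2
        return 1
    fi
    tail -n "$lines" "$file"
}
```

For example, show_job_tail 33 5 prints the last five lines of slurm-33.out. To follow the file as it grows, tail -f slurm-33.out can be used directly.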

Cancel a queued or running job

Use the scancel command, replacing 33 with your JOBID:

brian@brems:~/work$ scancel 33


Attach to a currently running job

Use the sattach command, replacing 106 with your JOBID but keeping the trailing .0, which selects the job's first step:

brian@brems:~/work$ sattach 106.0
mcnpx    ver=27b   ld=Tue Aug 18 08:00:00 MST 2009   04/14/10 15:32:29
*************************************************************
*                                                           *
*                   MCNPX                                   *
*                                                           *
* Copyright 2007. Los Alamos National Security, LLC.        *
* All rights reserved.                                      *