Difference between revisions of "Brems"

From New IAC Wiki
==Brems==
See [[Running With Slurm]] for information on running brems.
 
 
===Specifications===
 
 
 
Brems is made of 21 nodes and 60 CPU cores.
 
*12 nodes have two 2.0 GHz Opteron CPUs, 2 GB of ECC RAM, and a 40 GB hard drive
 
*9 nodes have two dual-core 2.0 GHz Opteron CPUs, 4 GB of ECC RAM, and an 80 GB hard drive
 
*The head node also has 1.75 TB of redundant RAID5 storage.
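The node and core totals quoted above can be double-checked from the per-node counts:

```shell
# 12 nodes x 2 single-core CPUs, plus 9 nodes x 2 dual-core CPUs
nodes=$((12 + 9))
cores=$((12 * 2 + 9 * 2 * 2))
echo "$nodes nodes, $cores cores"    # prints: 21 nodes, 60 cores
```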
 
 
 
==Running on Brems==
 
===MPI===

MPI programs on Brems are launched with /brems/bin/mpirun; the Screen examples below include a complete invocation.
 
===Querying Brems usage===
 
This is typical output from the cps2 command:
 
<pre>
brian@brems:~$ cps2
************************* brems *************************
--------- brems1---------
pctcpu: 26.00    fname: process.pl    time: 260000    uname: brian
pctcpu: 2.50    fname: python    time: 50000    uname: brian
pctcpu: 1.00    fname: ssh    time: 20000    uname: brian
--------- brems2---------
pctcpu: inf    fname: process.pl    time: 210000    uname: brian
pctcpu: 1.31    fname: mcnpx-mpi-e2    time: 16761650000    uname: makavakh
pctcpu: 1.08    fname: mcnpx-mpi-e2    time: 25739210000    uname: makavakh
--------- brems3---------
pctcpu: 23.00    fname: process.pl    time: 230000    uname: brian
pctcpu: 0.01    fname: rpciod/1    time: 517950000    uname: root
pctcpu: 0.00    fname: sshd    time: 0    uname: brian

...

--------- brems21---------
pctcpu: 20.00    fname: process.pl    time: 200000    uname: brian
pctcpu: 0.00    fname: sshd    time: 0    uname: brian
pctcpu: 0.00    fname: sshd    time: 0    uname: root
pctcpu: 0.00    fname: lockd    time: 0    uname: root
pctcpu: 0.00    fname: rpciod/3    time: 20000    uname: root
pctcpu: 0.00    fname: rpciod/2    time: 0    uname: root
</pre>
 
The cps2 command shows all the processes taking up CPU time on each cluster node. If someone is already running jobs, you will see output similar to the following:
 
<pre>
Paste busy cps2 here
</pre>
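cps2 is a local IAC utility and its source is not shown on this page; purely as an illustration, a script producing similar per-node output might format ps like this (the report_node helper, the ps field list, and the ssh loop are all assumptions, not the real cps2):

```shell
#!/bin/sh
# Illustrative sketch only -- NOT the actual cps2 source. It prints the top
# CPU consumers on one node in a cps2-like format; on the cluster itself a
# wrapper would run something like this over brems1..brems21 via ssh.

report_node() {
    # $1 = node label printed in the section header
    echo "--------- $1---------"
    # top 3 processes by CPU usage: %cpu, command, cputime, user
    ps -eo pcpu=,comm=,time=,user= --sort=-pcpu | head -3 |
        awk '{printf "pctcpu: %s    fname: %s    time: %s    uname: %s\n", $1, $2, $3, $4}'
}

report_node "$(uname -n)"
# On Brems proper, a wrapper might loop over the nodes (hypothetical):
# for i in $(seq 1 21); do ssh brems$i /path/to/report-script; done
```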
 
 
 
===Screen===
 
Screen is currently the preferred way to run processes on Brems. It allows you to run programs in such a way that they won't terminate when you log off, and you can control the application and see the output in real time.
 
 
 
====Basic screen usage====
 
*Use screen -R to reattach to an existing screen session, or start a new one if none exists.
 
*Ctrl-a d detaches from within a screen session
 
=====Examples=====
 
Start a screen session:
 
<pre>
brian@brems:~$ screen -R
</pre>
 
Do some work in the screen session:
 
<pre>
brian@brems:~/work$ /brems/bin/mpirun -np 4 /brems/bin/mcnpx-mpi-e2 i=test1.i
mcnpx    ver=2.5e  ld=Mon Feb 23 09:00:00 MST 2004  06/26/08 15:34:56

*****************************************************
*                                                  *
*            Copyright Notice for MCNPX            *
</pre>
 
Hitting Ctrl-a d detaches from the screen session, and ps shows MCNPX still running:
 
<pre>
brian@brems:~$ screen -R
[detached]
brian@brems:~$ ps ux | grep mcnpx
brian    30339  0.5  0.0  9132  1844 pts/10  S+  15:40  0:00 /bin/sh /brems/bin/mpirun -np 4 /brems/bin/mcnpx-mpi-e2 i=test1.i
brian    30467  9.5  0.4  43300  9400 pts/10  S+  15:40  0:00 /brems/bin/mcnpx-mpi-e2 i=test1.i -p4pg /home/brian/work/PI30339 -p4wd /home/brian/work
brian    30468  0.0  0.0  21476  544 pts/10  S+  15:40  0:00 /brems/bin/mcnpx-mpi-e2 i=test1.i -p4pg /home/brian/work/PI30339 -p4wd /home/brian/work
brian    30469  0.3  0.1  21344  2380 pts/10  S+  15:40  0:00 /usr/bin/rsh brems2 -l brian -n /brems/bin/mcnpx-mpi-e2 brems.iac.isu.edu 45735 \-p4amslave \-p4yourname brems2 \-p4rmrank 1
brian    30470  0.0  0.1  21340  2376 pts/10  S+  15:40  0:00 /usr/bin/rsh brems3 -l brian -n /brems/bin/mcnpx-mpi-e2 brems.iac.isu.edu 45735 \-p4amslave \-p4yourname brems3 \-p4rmrank 2
brian    30471  0.3  0.1  21340  2376 pts/10  S+  15:40  0:00 /usr/bin/rsh brems4 -l brian -n /brems/bin/mcnpx-mpi-e2 brems.iac.isu.edu 45735 \-p4amslave \-p4yourname brems4 \-p4rmrank 3
brian    30473  0.0  0.0  4372  656 pts/0    S+  15:40  0:00 grep mcnpx
 
 
</pre>
 
And we can connect back to the screen session to see the MCNPX output:
 
<pre>
brian@brems:~$ screen -R
</pre>
 
<pre>
run terminated when    100000 particle histories were done.
warning.  tally  11 tfc bin did not pass  1 of 10 statistical checks.
warning.    1 of  7 tallies did not pass all 10 statistical checks.
warning.    2 of  7 tallies had bins with large relative errors.
dump  11 on file runtpr    nps =      100000  coll =      31827944
                              ctm =      3.75    nrn =      514286097
mcrun  is done
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
FORTRAN STOP
brian@brems:~/work$
</pre>
 
====Advanced screen tricks====
 
*Using screen without -r or -R allows you to start a new screen session even if one already exists
 
*Using screen -list shows your current screen sessions
 
<pre>
brian@brems:~$ screen
[detached]
brian@brems:~$ screen -list
There are screens on:
29819.pts-0.brems (Detached)
30502.pts-0.brems (Detached)
2 Sockets in /var/run/screen/S-brian.

brian@brems:~$
</pre>
 
You can choose which one to reattach to by specifying a unique part of the name:
 
<pre>
brian@brems:~$ screen -r 298
</pre>
 
 
 
 
 
===MCNPX===

A sample MCNPX run, launched through mpirun inside a screen session, is shown in the Screen examples above.
 
==Compiling on Brems==
 
 
 
===gcc===
 
 
 
===Portland Group===
 

Latest revision as of 17:56, 1 June 2011
