Installing A Cluster

From New IAC Wiki

Network configuration

The head node works as a NAT gateway for the slave nodes.

eth0 connects to the outside world; eth1 is the internal interface.

Internal network is 10.0.200.0/255.0.0.0
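
One way the head node's NAT role could be set up is with iptables masquerading. The sketch below is an assumption, not the configuration recorded on this page: only the interface names eth0 and eth1 come from the text above, and the helper name nat_rules is made up for illustration. It prints the commands for review instead of running them.

```shell
#!/bin/sh
# Sketch only: iptables masquerading is an assumed approach, not taken from this page.
WAN_IF=eth0   # outside world (from the text above)
LAN_IF=eth1   # internal cluster network (from the text above)

# nat_rules is a hypothetical helper: it prints the commands so they can be
# reviewed before anything is changed on the head node.
nat_rules() {
    # Let the kernel forward packets between interfaces.
    echo "sysctl -w net.ipv4.ip_forward=1"
    # Rewrite the source address of traffic leaving through the outside interface.
    echo "iptables -t nat -A POSTROUTING -o $WAN_IF -j MASQUERADE"
    # Allow slave nodes to open connections to the outside.
    echo "iptables -A FORWARD -i $LAN_IF -o $WAN_IF -j ACCEPT"
    # Allow only reply traffic back in.
    echo "iptables -A FORWARD -i $WAN_IF -o $LAN_IF -m state --state RELATED,ESTABLISHED -j ACCEPT"
}

nat_rules
```

Once the output looks right, it can be applied as root with nat_rules | sh. Rules added this way do not survive a reboot unless they are saved separately.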

OS installation

Do a normal Linux server install and add the following packages:

  • ssh server
  • tftpd-hpa
  • dhcp3-server
  • nfs-kernel-server
  • debootstrap
  • libpmi
  • mpich2
  • slurm-llnl

Download pxelinux from the web.

Netbooting

Setting up dhcp

Edit /etc/dhcp3/dhcpd.conf as follows:

dhcpd.conf (note the comma between the DNS servers)
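
A minimal sketch of what such a dhcpd.conf might contain for netbooting is given here. Apart from 10.0.200.1 (the address tftpd listens on, per the tftp section) and the 255.0.0.0 netmask, every value is an assumption: the address range is hypothetical and the DNS servers are documentation placeholders. The script writes the file locally so it can be reviewed before being copied to /etc/dhcp3/dhcpd.conf.

```shell
#!/bin/sh
# Sketch only: every value below is an assumption except 10.0.200.1 (the
# address tftpd listens on, per this page) and the 255.0.0.0 netmask.
OUT=./dhcpd.conf.sketch   # written locally; copy to /etc/dhcp3/dhcpd.conf after review

cat > "$OUT" <<'EOF'
# dhcpd wants the subnet address with the host bits cleared,
# so a 255.0.0.0 netmask is declared against 10.0.0.0.
subnet 10.0.0.0 netmask 255.0.0.0 {
    range 10.0.200.10 10.0.200.100;                   # hypothetical address pool
    option routers 10.0.200.1;                        # head node acts as gateway
    option domain-name-servers 192.0.2.1, 192.0.2.2;  # placeholders; note the comma
    next-server 10.0.200.1;                           # TFTP server for netbooting
    filename "pxelinux.0";                            # boot loader fetched over TFTP
}
EOF

echo "wrote $OUT"
```

The next-server and filename lines are what point a PXE client at the TFTP server; without them the slave nodes would get an address but nothing to boot.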

Edit /etc/default/dhcp3-server

INTERFACES=eth1

This keeps the DHCP server from answering requests on the outside network!

service dhcp3-server start


Setting up tftp

Edit /etc/default/tftpd-hpa

RUN_DAEMON="yes" #had problems with inetd in the past
OPTIONS="-l -a 10.0.200.1 -s /var/lib/tftpboot"
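
With tftpd serving /var/lib/tftpboot, pxelinux also needs a configuration file under pxelinux.cfg/ in that directory. The sketch below builds one in a local staging directory. The kernel and initrd file names and the NFS root path are hypothetical placeholders, and booting the slave nodes from an NFS root is itself an assumption.

```shell
#!/bin/sh
# Sketch only: kernel, initrd and NFS root names are hypothetical placeholders.
STAGE=./tftpboot.sketch          # staging copy; the real root is /var/lib/tftpboot
NFS_ROOT=/srv/nfsroot            # hypothetical export holding the slave root filesystem

mkdir -p "$STAGE/pxelinux.cfg"

# pxelinux falls back to pxelinux.cfg/default when no per-host file matches.
cat > "$STAGE/pxelinux.cfg/default" <<EOF
DEFAULT linux
LABEL linux
    KERNEL vmlinuz
    APPEND initrd=initrd.img root=/dev/nfs nfsroot=10.0.200.1:$NFS_ROOT ip=dhcp rw
EOF

# pxelinux.0 (from the pxelinux download), vmlinuz and initrd.img must sit next
# to pxelinux.cfg/ in the TFTP root.
cat "$STAGE/pxelinux.cfg/default"
```

After review, the contents of the staging directory can be copied into /var/lib/tftpboot as root.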


Setting up nfs
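
This page does not record the NFS settings. As a rough sketch only, assuming the slave nodes mount their root filesystem from the head node, an export entry could be generated as follows. The export path and the mount options are hypothetical.

```shell
#!/bin/sh
# Sketch only: the export path, the mount options and the idea of an NFS root
# for the slave nodes are assumptions; this page does not spell them out.
NFS_ROOT=/srv/nfsroot
OUT=./exports.sketch             # review, then append to /etc/exports

# Export to the whole internal network; no_root_squash lets netbooted nodes
# act as root on their own root filesystem.
echo "$NFS_ROOT 10.0.0.0/255.0.0.0(rw,no_root_squash,sync,no_subtree_check)" > "$OUT"

cat "$OUT"
```

After the line has been appended to /etc/exports, exportfs -ra reloads the export table. debootstrap, from the package list above, is one way to populate such a directory.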

Testing

Scheduler installation

The Slurm Quick Start Administrator Guide is very helpful.

  1. Install
    • slurm-llnl
    • slurm-llnl-slurmdbd
    • slurm-llnl-doc
  2. Create the runtime directory: mkdir /var/run/slurm-llnl

Munge

Munge is an authentication framework recommended by Slurm. All the configuration it needs is:

root@brems:# /usr/sbin/create-munge-key
Generating a pseudo-random key using /dev/urandom completed.
root@brems:# /etc/init.d/munge start
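
A quick way to confirm that the daemon is answering is a round trip through the stock munge and unmunge tools. The wrapper below is a sketch; check_munge is a hypothetical helper name and not part of Munge.

```shell
#!/bin/sh
# check_munge is a hypothetical helper name; munge -n and unmunge are the
# stock MUNGE command-line tools.
check_munge() {
    if ! command -v munge >/dev/null 2>&1 || ! command -v unmunge >/dev/null 2>&1; then
        echo "munge tools not found" >&2
        return 2
    fi
    # Encode an empty credential and decode it again through the local daemon.
    if munge -n | unmunge >/dev/null; then
        echo "munge round trip OK"
    else
        echo "munge round trip failed (is the daemon running?)" >&2
        return 1
    fi
}

check_munge || echo "munge check did not pass"
```

On a multi-node cluster, every node needs the same /etc/munge/munge.key for credentials created on one node to be accepted on another.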