Installing pdsh on an HPC cluster
Download the pdsh source code or src.rpm from http://sourceforge.net/projects/pdsh, then build the binary RPMs and install them on the head node:
[root@headnode data]# rpmbuild --rebuild pdsh-2.17-1.src.rpm
[root@headnode data]# rpm -ivh /usr/src/redhat/RPMS/x86_64/pdsh-*
Preparing... ########################################### [100%]
1:pdsh-rcmd-ssh ########################################### [ 14%]
2:pdsh ########################################### [ 29%]
3:pdsh-debuginfo ########################################### [ 43%]
4:pdsh-mod-dshgroup ########################################### [ 57%]
5:pdsh-mod-machines ########################################### [ 71%]
6:pdsh-mod-netgroup ########################################### [ 86%]
7:pdsh-rcmd-exec ########################################### [100%]
[root@headnode data]#
Now you can try executing commands on the cluster nodes, all at the same time. For example, let’s run “uptime” on all nodes:
[root@headnode data]# pdsh -w headnode,sm,node1,node2 uptime
failed to install module options for "misc/dshgroup"
headnode: Warning: Permanently added the RSA host key for IP address '192.168.20.100' to the list of known hosts.
sm: 16:03:00 up 7:14, 1 user, load average: 0.00, 0.00, 0.00
headnode: 16:53:16 up 7:23, 1 user, load average: 0.00, 0.04, 0.01
node2: 15:55:21 up 2:31, 1 user, load average: 0.04, 0.01, 0.00
node1: 15:56:07 up 2:31, 1 user, load average: 0.00, 0.00, 0.00
[root@headnode data]#
Instead of typing the node names every time, you can list all your machines in a file, one hostname per line. The default file is /etc/machines.
[root@headnode ~]# vi /etc/machines
headnode
sm
node1
node2
Note: the -a option makes pdsh read the hostnames from the default machines file (/etc/machines), as shown below:
[root@headnode ~]# pdsh -a uptime
failed to install module options for "misc/dshgroup"
sm: 14:09:58 up 5:10, 1 user, load average: 0.01, 0.00, 0.00
headnode: 15:00:11 up 5:11, 1 user, load average: 0.08, 0.02, 0.01
node2: 14:02:17 up 5:10, 1 user, load average: 0.00, 0.00, 0.00
node1: 14:03:05 up 5:10, 1 user, load average: 0.00, 0.00, 0.00
[root@headnode ~]#
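For comparison, the same host list can be built by hand from the machines file and passed to -w, as in the first example. The helper below is only a sketch: hosts_from_file is a hypothetical name, and it assumes the file holds one hostname per line, with blank lines and lines starting with # ignored.

```shell
# hosts_from_file (hypothetical helper): print the hostnames in a machines
# file as one comma-separated list suitable for "pdsh -w".
hosts_from_file() {
    # Drop comment lines and blank lines, then join the rest with commas.
    grep -v '^[[:space:]]*#' "$1" | grep -v '^[[:space:]]*$' | paste -s -d, -
}

# Example (requires pdsh and a populated /etc/machines):
#   pdsh -w "$(hosts_from_file /etc/machines)" uptime
```

This can be handy when you want to run a command on a different list of nodes kept in another file.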
The other utility included in the pdsh RPM is pdcp, which copies a file to multiple machines. However, for pdcp to work, every node involved in the copy must have pdcp installed locally.
So for convenience, we will copy the pdsh-* RPMs to all the compute nodes as well. Adjust the architecture directory in the path (i386 or x86_64) to match your own build.
[root@headnode modules-3.2.6]# scp /usr/src/redhat/RPMS/i386/pdsh-* node1:/root/
pdsh-2.17-1.i386.rpm 100% 252KB 251.5KB/s 00:00
pdsh-mod-dshgroup-2.17-1.i386.rpm 100% 10KB 10.2KB/s 00:00
pdsh-mod-machines-2.17-1.i386.rpm 100% 8293 8.1KB/s 00:00
pdsh-mod-netgroup-2.17-1.i386.rpm 100% 10KB 9.9KB/s 00:00
pdsh-rcmd-exec-2.17-1.i386.rpm 100% 9888 9.7KB/s 00:00
pdsh-rcmd-ssh-2.17-1.i386.rpm 100% 11KB 10.7KB/s 00:00
[root@headnode modules-3.2.6]# scp /usr/src/redhat/RPMS/i386/pdsh-* node2:/root/
pdsh-2.17-1.i386.rpm 100% 252KB 251.5KB/s 00:00
pdsh-mod-dshgroup-2.17-1.i386.rpm 100% 10KB 10.2KB/s 00:00
pdsh-mod-machines-2.17-1.i386.rpm 100% 8293 8.1KB/s 00:00
pdsh-mod-netgroup-2.17-1.i386.rpm 100% 10KB 9.9KB/s 00:00
pdsh-rcmd-exec-2.17-1.i386.rpm 100% 9888 9.7KB/s 00:00
pdsh-rcmd-ssh-2.17-1.i386.rpm 100% 11KB 10.7KB/s 00:00
Instead of running scp once per node, you can use a for loop to copy these files from the master node to all compute nodes.
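A minimal sketch of such a loop follows. The function name copy_rpms and the SCP override variable are assumptions made for illustration, not part of pdsh. Setting SCP=echo gives a dry run that only prints what would be copied.

```shell
# copy_rpms (hypothetical helper): copy the pdsh RPMs from a directory
# to /root/ on each node named on the command line.
# Usage: copy_rpms RPM_DIR NODE [NODE...]
copy_rpms() {
    rpm_dir=$1
    shift
    for node in "$@"; do
        # ${SCP:-scp} runs scp unless SCP is set to something else (e.g. echo).
        ${SCP:-scp} "$rpm_dir"/pdsh-*.rpm "$node":/root/ \
            || echo "copy to $node failed" >&2
    done
}

# Example (requires working ssh access to the nodes):
#   copy_rpms /usr/src/redhat/RPMS/i386 node1 node2
```

The loop keeps going if one node is unreachable and reports the failure on standard error, so a single bad node does not stop the rest of the copies.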
Install these RPMs on all nodes:
[root@headnode ~]# pdsh -a "rpm -ivh /root/pdsh-*.rpm"
Now you can copy a file to all nodes, for example:
[root@headnode modules-3.2.6]# pdcp -a /etc/hosts /etc
The options of pdcp are almost the same as those of pdsh.
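One way to confirm that a copied file is identical on every node is to compare checksums gathered with pdsh. The sketch below assumes md5sum is available on all nodes. all_checksums_match is a hypothetical helper name, and it expects pdsh-style lines of the form "host: checksum filename" on standard input.

```shell
# all_checksums_match (hypothetical helper): read "host: checksum file"
# lines on stdin and succeed only if exactly one distinct checksum appears.
all_checksums_match() {
    awk '{ seen[$2] = 1 }
         END { n = 0; for (k in seen) n++; exit (n == 1 ? 0 : 1) }'
}

# Example (requires pdsh and md5sum on every node):
#   pdsh -a md5sum /etc/hosts | all_checksums_match && echo "all nodes in sync"
```

The helper fails on empty input as well, so a run in which no node answered is not reported as a match.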
That is all! Happy super computing!