This combination should allow me to have a single DNS name that maps to a single Virtual IP that can move back and forth between two Virtual Machines (running in my vSphere environment). This way I can have them on different storage (i.e. one on local storage and the other on iSCSI storage), be able to turn one storage unit off for maintenance, and still allow access to the Windows share. This setup automatically handles the replication between the nodes.
I was trying to follow these instructions: http://www.howtoforge.com/setting-up-an-active-active-samba-ctdb-cluster-using-gfs-and-drbd-centos-5.5, but there are a number of errors in them, and some things weren't clear to me.
We will be using Samba running on top of the GFS clustered filesystem, using CTDB (the clustered version of the TDB database used by Samba) and DRBD to handle all the replication duties.
We will have static IPs for each machine; in my case:
smb1 - 192.168.1.30 and 10.10.10.1
smb2 - 192.168.1.31 and 10.10.10.2
The 2nd IP on each VM is for the DRBD replication interface.
Lastly, there will be two virtual IPs shared between them: 192.168.1.28 and 192.168.1.29.
- First, install the 32-bit version of CentOS 5.5 with no options checked. Single hard drive only, 20 GB thin provisioned. Set a static IP. Give it 2 NICs (one for network connectivity and the other for DRBD replication).
- Run setup and turn off SELinux.
- Next, perform a yum update, then reboot. This updates to CentOS 5.8 (as of this writing).
- Install VMware Tools.
- Clone the machine
- Add 2nd hard drive, add 2nd NIC to both VMs
- Boot the VM, fdisk /dev/sdb, n (new), p (primary partition), 1, accept defaults for cylinder start/end, w (write)
- No need to create a filesystem or format the partition, as we will have DRBD use the new partition (/dev/sdb1) directly
- Make sure the /etc/hosts file only has 127.0.0.1 localhost.localdomain localhost and not the hostname of the machine
- Adjust /etc/hosts on 2nd VM
- Adjust the IP of the 2nd NIC on both VMs (run setup and then service network restart; you might want to erase ifcfg-eth0.bak from /etc/sysconfig/network-scripts on the 2nd VM)
- Confirm that you can ping both machines from each other using the private IP as well as the Public IP
- Allow root login (edit /etc/ssh/sshd_config and change PermitRootLogin to yes; we are behind a firewall, right?!), then service sshd restart
- yum -y install drbd82 kmod-drbd82 samba joe autoconf automake gcc-c++
(I like the joe editor)
- yum -y groupinstall "Cluster Storage" "Clustering"
- Adjust /etc/drbd.conf on both nodes:
global {
    usage-count yes;
}

common {
    syncer {
        rate 100M;
        al-extents 257;
    }
}

resource r0 {

    protocol C;

    startup {
        become-primary-on both;     ### For Primary/Primary ###
        degr-wfc-timeout 60;
        wfc-timeout 30;
    }

    disk {
        on-io-error detach;
    }

    net {
        allow-two-primaries;        ### For Primary/Primary ###
        cram-hmac-alg sha1;
        shared-secret "mysecret";
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }

    on smb1.yniw.local {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.10.10.1:7788;
        meta-disk internal;
    }

    on smb2.yniw.local {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.10.10.2:7788;
        meta-disk internal;
    }
}
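Before creating the metadata, you can let DRBD itself parse the file; drbdadm dump reads /etc/drbd.conf and prints the resource back out, so a typo in the config shows up here rather than at start time:
drbdadm dump r0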
On both nodes, create the metadata:
drbdadm create-md r0
Then, on both nodes (at almost the same time), start drbd:
service drbd start
With the resource up and connected, put the two nodes as primary by running this on both nodes:
drbdsetup /dev/drbd0 primary -o
Make drbd service start automatically at boot:
chkconfig --level 35 drbd on
Check on status of the drbd replication:
cat /proc/drbd or you can do:
service drbd status
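Once the initial sync is done, the cat /proc/drbd output should look roughly like the line below. This is only a sketch (the exact layout and the version header vary between DRBD releases, and some versions label the roles field ro: instead of st:); the important parts are Connected, Primary/Primary, and UpToDate/UpToDate:
 0: cs:Connected st:Primary/Primary ds:UpToDate/UpToDate C r---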
Next we set up the cluster configuration that GFS relies on. Put the following into /etc/cluster/cluster.conf on each system.
<?xml version="1.0"?>
<cluster name="cluster1" config_version="3">

  <cman two_node="1" expected_votes="1"/>

  <clusternodes>
    <clusternode name="smb1.yniw.local" votes="1" nodeid="1">
      <fence>
        <method name="single">
          <device name="manual" ipaddr="192.168.1.30"/>
        </method>
      </fence>
    </clusternode>

    <clusternode name="smb2.yniw.local" votes="1" nodeid="2">
      <fence>
        <method name="single">
          <device name="manual" ipaddr="192.168.1.31"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>

  <fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="3"/>

  <fencedevices>
    <fencedevice name="manual" agent="fence_manual"/>
  </fencedevices>

</cluster>
Next, we start cman on both systems:
service cman start
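Once cman is up on both machines, cman_tool can confirm that they see each other; both nodes should be listed with a status of M (member):
cman_tool nodes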
And then we start the other services:
service clvmd start
service gfs start
service gfs2 start
chkconfig --level 35 cman on
chkconfig --level 35 clvmd on
chkconfig --level 35 gfs on
chkconfig --level 35 gfs2 on
Then we format the cluster filesystem (only on one node). The -j 2 creates two journals, one per node, and the -t argument is clustername:filesystemname, where the cluster name must match the one set in cluster.conf:
gfs_mkfs -p lock_dlm -t cluster1:gfs -j 2 /dev/drbd0
Then we create the mount point and mount the drbd device (on both nodes):
mkdir /clusterdata
mount -t gfs /dev/drbd0 /clusterdata
Then we insert the following line into the /etc/fstab (mine is different than the instructions):
/dev/drbd0 /clusterdata gfs
I found that the gfs argument was necessary. However, do not add the other items (defaults 1 1), as the non-zero fsck pass number will make the system try to fsck the filesystem at boot, which won't work.
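If you prefer a fully populated fstab entry, something like the line below should behave the same way. The final 0 is the fsck pass number, which is what prevents the boot-time fsck attempt; noatime is just a common GFS tuning option, not a requirement:
/dev/drbd0 /clusterdata gfs defaults,noatime 0 0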
Now…you should be able to copy data onto that /clusterdata mount point on one node and have it show up on the other automatically.
Next we configure samba. Again, my working file is different than the original instructions. Edit /etc/samba/smb.conf
[global]

    clustering = yes
    idmap backend = tdb2
    private dir = /clusterdata/ctdb
    fileid:mapping = fsname
    use mmap = no
    nt acl support = yes
    ea support = yes
    security = user
    map to guest = Bad Password
    max protocol = SMB2

[public]
    comment = public share
    path = /clusterdata/public
    public = yes
    writeable = yes
    only guest = yes
    guest ok = yes
Next, create the directories needed by samba:
mkdir /clusterdata/ctdb
mkdir /clusterdata/public
chmod 777 /clusterdata/public
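At this point you can sanity-check the Samba configuration with testparm, which simply parses smb.conf and reports anything it doesn't like (the clustering behaviour itself only kicks in once CTDB is running):
testparm -s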
To install CTDB, follow the same steps as the original article:
First, we need to download it:
cd /usr/src
rsync -avz samba.org::ftp/unpacked/ctdb .
cd ctdb/
Then we can compile it:
cd /usr/src/ctdb/
./autogen.sh
./configure
make
make install
Creating the init scripts and config links to /etc:
cd /usr/src/ctdb
cp config/ctdb.sysconfig /etc/sysconfig/ctdb
cp config/ctdb.init /etc/rc.d/init.d/ctdb
chmod +x /etc/init.d/ctdb
ln -s /usr/local/etc/ctdb/ /etc/ctdb
ln -s /usr/local/bin/ctdb /usr/bin/ctdb
ln -s /usr/local/sbin/ctdbd /usr/sbin/ctdbd
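A quick way to confirm the links ended up where you expect (these are just the paths created above):
ls -ld /etc/ctdb /usr/bin/ctdb /usr/sbin/ctdbd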
Next, we need to config /etc/sysconfig/ctdb on both nodes:
joe /etc/sysconfig/ctdb
Again, there are mistakes in the example originally given, so here is my copy:
CTDB_RECOVERY_LOCK="/clusterdata/ctdb.lock"
CTDB_PUBLIC_INTERFACE=eth0
CTDB_PUBLIC_ADDRESSES=/etc/ctdb/public_addresses
CTDB_MANAGES_SAMBA=yes
ulimit -n 10000
CTDB_NODES=/etc/ctdb/nodes
CTDB_LOGFILE=/var/log/log.ctdb
CTDB_DEBUGLEVEL=2
CTDB_PUBLIC_NETWORK="192.168.1.0/24"
CTDB_PUBLIC_GATEWAY="192.168.1.8"
Now, config /etc/ctdb/public_addresses on both nodes. These are the virtual IPs from the start of this article (not the node IPs):
vi /etc/ctdb/public_addresses
192.168.1.28/24
192.168.1.29/24
Then, config /etc/ctdb/nodes on both nodes. This lists the fixed IP of each node, one per line, and the file must be identical on both machines; I used the static addresses assigned earlier:
vi /etc/ctdb/nodes
192.168.1.30
192.168.1.31
Then, config /etc/ctdb/events.d/11.route on both nodes:
vi /etc/ctdb/events.d/11.route
#!/bin/sh

. /etc/ctdb/functions
loadconfig ctdb

cmd="$1"
shift

case $cmd in
    takeip)
        # we ignore errors from this, as the route might be up already when
        # we're grabbing a 2nd IP on this interface
        /sbin/ip route add $CTDB_PUBLIC_NETWORK via $CTDB_PUBLIC_GATEWAY dev $1 2> /dev/null
        ;;
esac

exit 0
Make the script executable:
chmod +x /etc/ctdb/events.d/11.route
Next…start the ctdb service:
service ctdb start
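Once ctdb is running on both nodes, you can check cluster health and see which node currently holds which public IP (right after startup the nodes may report UNHEALTHY for a little while before going OK):
ctdb status
ctdb ip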
Here is one place I differ from those other instructions: he says to have the samba service auto-start, but the ctdb service handles the starting and stopping of samba, so you don't do that.
Plus, I couldn't make everything start properly when done from init.d (using the chkconfig --level commands). The problem is that the GFS filesystem gets mounted from fstab before the other pieces are ready, so it doesn't work. So I wrote the following lines into /etc/rc.local:
service drbd start
mount -a
service ctdb start
Also, to stop one of the servers and take it offline:
service ctdb stop
umount /clusterdata
service drbd stop
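To bring that node back into the cluster, reverse the order (the same sequence as the rc.local above), and keep an eye on cat /proc/drbd while DRBD resyncs whatever changed while the node was offline:
service drbd start
mount -a
service ctdb start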
You should now have a working active/active Windows-style share available that is fully redundant.
You can get to it by using a Windows PC and going to \\virtualip\public
So…in my example: \\192.168.1.29\public
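If you want to test from a Linux box instead, smbclient can list the shares anonymously since the public share is guest-accessible:
smbclient -L //192.168.1.29 -N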
Enjoy!
Jim