		Filesystem Failover with DRBD and OpenSSI

The drbd-HowTO describes the following:
I. Failover of root file system using DRBD
II. Failover of non-root file systems using DRBD

I. Failover of root file system using DRBD

These are the steps involved in getting drbd root failover on OpenSSI.
These steps are for a fresh install. Towards the end of the section,
you can find steps to convert an existing OpenSSI cluster into
a drbd-enabled root-failover cluster.

1. Install base linux (Fedora Core 2). Make sure you partition your disk
   such that it has an extra partition of at least 128 MB for the
   drbd meta-device later on. (For example, in addition to the root (/),
   /boot and swap partitions, there will be an extra partition for drbd of
   at least 128 MB.) Also, make sure that /boot is on a different partition
   than the root file system.
2. Install the latest version of OpenSSI available from
   The current latest version is openssi-1.9.0. Make sure you 
   enable 'root failover' during installation.
3. Install openssi-enabled drbd.
   The tarball is available from
   The tarball contains the drbd package (drbd-0.7.7), a patch for linuxrc
   and a sample linuxrc.
   Currently, the drbd code needs to be built against the kernel source.
   (a) Download and install the OpenSSI kernel source rpm.
       rpm -Uvh kernel-ssi-source-2.6.10_ssi_1.i686.rpm
   (b) Edit EXTRAVERSION in /usr/src/linux-2.6-ssi/Makefile to reflect the
       version of the running OpenSSI kernel.
       - EXTRAVERSION = _ssi_1custom
       + EXTRAVERSION = _ssi_1smp
   (c) Prepare the kernel source tree.
       $ cd /usr/src/linux-2.6-ssi
       $ make mrproper
       $ cp configs/kernel-ssi-2.6.10-i686-smp.config .config
       $ make oldconfig_nonint; make oldconfig_nonint
       $ make dep
   (d) Build and install drbd.
       $ cd <where you downloaded drbd.tar>
       $ tar -xf drbd.tar
       $ cd drbd-fc2-1.9.0.i686/drbd-0.7.7
       $ make clean all doc
       $ make install
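
   Before building, it can help to confirm that the EXTRAVERSION edit in
   step 3(b) took effect. The following is a minimal sketch, not part of the
   drbd tarball: the function names are made up for illustration, and it
   assumes that the output of `uname -r` ends with the EXTRAVERSION string.

```shell
# Print the EXTRAVERSION value from a kernel Makefile (first match only).
makefile_extraversion() {
    sed -n 's/^EXTRAVERSION[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Succeed if the given kernel release string ends with the Makefile's
# EXTRAVERSION (assumption: the release is the version plus EXTRAVERSION).
extraversion_matches() {
    makefile=$1
    release=$2
    extra=$(makefile_extraversion "$makefile")
    [ -n "$extra" ] || return 1
    case "$release" in
        *"$extra") return 0 ;;
        *) return 1 ;;
    esac
}

# Example, using the source path from step 3:
#   extraversion_matches /usr/src/linux-2.6-ssi/Makefile "$(uname -r)" \
#       || echo "EXTRAVERSION does not match the running kernel"
```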
4. Edit /etc/drbd.conf.
   a. Modify the hostnames
   b. Modify the disk partitions (ensure that the size of the mirror device
      is greater than or equal to the current root device)
   c. Modify the nodenums.
   d. Make sure that the "incon-degr-cmd" line returns a non-zero value.
   e. Make sure that "wfc-timeout" is zero.
   The following instructions take the "device" value to be /dev/drbd/0, and
   the resource to be "root".
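
   The size requirements from steps 1 and 4b can be checked before drbd is
   started. This is a sketch under assumptions: the helper names are
   hypothetical, the device names in the comments are placeholders, and
   `blockdev --getsize64` is assumed to be available to report sizes in bytes.

```shell
# Succeed if a size in bytes is at least the given number of megabytes.
at_least_mb() {
    bytes=$1
    mb=$2
    [ "$bytes" -ge $((mb * 1024 * 1024)) ]
}

# Succeed if the mirror device is at least as large as the root device.
mirror_fits_root() {
    mirror_bytes=$1
    root_bytes=$2
    [ "$mirror_bytes" -ge "$root_bytes" ]
}

# Example with placeholder device names (substitute your own):
#   at_least_mb "$(blockdev --getsize64 /dev/hdXN)" 128 \
#       || echo "meta-device partition is smaller than 128 MB"
#   mirror_fits_root "$(blockdev --getsize64 /dev/hdXM)" \
#                    "$(blockdev --getsize64 /dev/hdXR)" \
#       || echo "mirror device is smaller than the root device"
```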
5. To ensure that there is no state on the meta-device from an earlier
   drbd setup, use 'dd if=/dev/zero of=<meta-device>' on both nodes.
6. Unpack the ramdisk and loop-mount it (for example, on /mnt/ramdisk/).
   Copy over the following:
   a. drbd.o 
        cp $LIBDIR/drbd.o /mnt/ramdisk/lib/
   b. drbd utilities: drbdsetup, drbdadm
        cp /sbin/drbdsetup /mnt/ramdisk/bin/
        cp /sbin/drbdadm /mnt/ramdisk/bin/ 
   c. drbd.conf 
        cp /etc/drbd.conf /mnt/ramdisk/etc/ 
7. Patch the linuxrc with linuxrc.patch.
   # patch -p0 < linuxrc.patch
8. Pack the ramdisk and copy it into /boot. Make sure you retain 
   a copy of the old ramdisk so that you can boot up with the older
   ramdisk if needed. 
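
   Steps 6 and 8 can be sketched as a pair of shell functions. This is only
   an illustration: it assumes the ramdisk is a gzip-compressed filesystem
   image, and the function names, the ".orig" backup suffix and the paths in
   the example are hypothetical. By default the commands are only printed;
   set DRY_RUN=0 to actually run them as root.

```shell
# Echo commands instead of running them unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Back up the old ramdisk, decompress it, and loop-mount the image.
unpack_ramdisk() {
    initrd=$1; work=$2; mnt=$3
    run cp "$initrd" "$initrd.orig"      # keep a copy of the old ramdisk
    run sh -c "gzip -dc '$initrd' > '$work'"
    run mount -o loop "$work" "$mnt"
}

# Unmount the image, compress it, and write it back in place.
pack_ramdisk() {
    initrd=$1; work=$2; mnt=$3
    run umount "$mnt"
    run sh -c "gzip -9c '$work' > '$initrd'"
}

# Example (the image name is a placeholder):
#   unpack_ramdisk /boot/<ramdisk-image> /tmp/ramdisk.img /mnt/ramdisk
#   ... copy the drbd files and patch linuxrc as in steps 6 and 7 ...
#   pack_ramdisk /boot/<ramdisk-image> /tmp/ramdisk.img /mnt/ramdisk
```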
9. Change /etc/init.d/SSIfailOver to enable failover for /dev/drbd/0
   The patch for SSIfailover will look like:
                return 0
-       DEVICE=`/sbin/findfs $FSDEV`
+#      DEVICE=`/sbin/findfs $FSDEV`
+       DEVICE=/dev/drbd/0
        cfs_setroot $TYPE $DEVICE
         action $"Enabled root failover: " /bin/true
        return 0
10. Change /etc/rc.d/rc.sysrecover to update /etc/mtab with /dev/drbd/0
    The patch for rc.sysrecover will look like:
-       DEVICE=`/sbin/findfs $FSDEV`
+       DEVICE=/dev/drbd/0
11. In /etc/fstab, replace the UUID=... with /dev/drbd/0
12. Shut down the cluster and boot only the first node.
    The boot will drop into a bash shell because the node cannot yet be
    set to primary. In the shell, do:
    $ drbdadm -s /bin/drbdsetup -- --do-what-I-say primary root
    $ exit
    The root partition should then be picked up by the drbd-enabled ramdisk,
    and you are now running with the root as a drbd primary. Forcing the
    node to primary by hand is only necessary the first time the device
    runs under drbd.
13. Once the first node completes booting up, boot up the second node.
    The second node will start the sync. This might take a little time. Use
         cat /proc/drbd
    to check the progress of the sync. A completed sync will show
    "ld:Consistent". Once the sync is complete, node2 is ready for failover.
14. Now if the first node crashes, the second node should take over.
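
Instead of re-running cat by hand, the sync check can be scripted. The
following is a rough sketch: the function names and the 10-second polling
interval are arbitrary choices, and it only looks for the "ld:Consistent"
marker mentioned above. With more than one resource configured, it succeeds
as soon as any of them reports that marker.

```shell
# Succeed if the drbd status file reports a consistent local disk.
# The status file defaults to /proc/drbd; another path can be passed.
drbd_consistent() {
    status_file=${1:-/proc/drbd}
    grep -q 'ld:Consistent' "$status_file"
}

# Poll until the sync has finished.
wait_for_sync() {
    status_file=${1:-/proc/drbd}
    interval=${2:-10}
    until drbd_consistent "$status_file"; do
        sleep "$interval"
    done
}

# Example, run on the second node:
#   wait_for_sync && echo "sync complete, node is ready for failover"
```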

Converting an existing openssi-cluster into a drbd-enabled root failover
Converting an existing OpenSSI cluster into a drbd-enabled root-failover
cluster is currently only possible if you have an extra partition of
at least 128 MB for the index partition (the drbd meta-device). The only
alternative is to use resize2fs to shrink the existing root partition so
that there is space for the 128 MB index partition. As can be guessed, this
is tricky to do on a running, in-use root filesystem.
If the cluster does have an extra 128 MB of space for the index partition,
you can first convert the existing cluster into a root-failover cluster and
then continue from step 3 above.
There are basically three steps involved in changing a non-failover cluster
into a failover cluster.
1. Run ssi-chnode to turn the secondary node into a takeover CLMS master.
2. Add the chard option to the root line in /etc/fstab, e.g.
   UUID=$UUID      /       ext3    chard,defaults,node=1:2 1       1
3. Run mkinitrd to create a new ramdisk.
Then go to step 3 above to engineer the ramdisk to do drbd failover.
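
Before running mkinitrd, it may be worth checking that the root line in
/etc/fstab really carries the chard option. A minimal sketch follows; the
function names are hypothetical and it only inspects the first
non-comment entry whose mount point is "/".

```shell
# Print the mount options (4th field) of the root (/) entry in an fstab file.
root_mount_options() {
    awk '$1 !~ /^#/ && $2 == "/" { print $4; exit }' "$1"
}

# Succeed if the root entry carries the chard option.
root_has_chard() {
    case ",$(root_mount_options "$1")," in
        *,chard,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   root_has_chard /etc/fstab || echo "root line is missing the chard option"
```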

II. Failover of non-root file systems using DRBD

Apart from the procedure below, do *not* use drbdadm and drbdsetup. Also, do
*not* make any changes to /etc/drbd.conf. The reason for this is that
changes to /etc/drbd.conf are not automatically propagated to the drbd.conf
in the ramdisk, and it is important for drbd devices to be brought up in
the ramdisk to prevent drbd from split-braining.
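
Since the two copies of drbd.conf can drift apart, a quick comparison can
catch a stale ramdisk copy. This is a sketch only: the function name is made
up, and the example assumes the ramdisk is loop-mounted on /mnt/ramdisk as
in Section I.

```shell
# Succeed if the two drbd.conf copies are byte-for-byte identical.
drbd_conf_in_sync() {
    cmp -s "$1" "$2"
}

# Example:
#   drbd_conf_in_sync /etc/drbd.conf /mnt/ramdisk/etc/drbd.conf \
#       || echo "drbd.conf in the ramdisk differs from /etc/drbd.conf"
```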

1. Download, build and install the OpenSSI enabled drbd.
2. Edit /etc/drbd.conf to indicate the resources that need
   to be failed over. If drbd is being layered over a device with
   an existing filesystem, use a separate meta-device.
   Ensure that the 'incon-degr-cmd' does not run 'halt -f' for the less
   important file systems (i.e. probably just the non-root ones). Also
   ensure that the resource is configured with an infinite timeout.
3. Bring up the drbd device on the first node.
   $ insmod drbd
   $ drbdadm up r1
   $ drbdadm -- --do-what-I-say primary r1
4. Create the filesystem if not layering over an existing filesystem.
   $ mkfs -t ext3 /dev/drbd/1 # where /dev/drbd/1 is the drbd 
                              # device name in drbd.conf
5. On the other node,
   $ insmod drbd
   $ drbdadm up r1
6. If a drbd.conf already exists in the ramdisk, modify it to
   include r1. If it doesn't exist, then follow steps 6-8 of setting up
   root filesystem failover to engineer the ramdisk to start drbd in the
   linuxrc. (This is necessary to prevent split-brain for these filesystems.)
7. Edit /etc/fstab to enable failover
     /dev/drbd/1 <mntpt>	ext3	defaults,chard,node=1:2	0	0
   The node= values are the numbers of the nodes listed in drbd.conf.
8. Mount the filesystems
     mount -a

This page last updated on Thu Dec 15 17:17:44 2005 GMT