Cheap HA KVM Cluster w/ Shared File System
This document walks through setting up a KVM install on CentOS with clustering and a shared file system.
Network Configuration
Configure hostname
# hostnamectl set-hostname centos-vm1.home.labrats.us
Configure Layer 3 Interface Bonding and Bridge Interface
For this documentation, we are going to use 2 ethernet interfaces, one on the system board (em1) and one on an expansion card (p1p1).
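Before editing the ifcfg files below, it can help to confirm the device names and their MAC addresses; ip link will list them (device names will differ on other hardware):
# ip -o link show
# ip link show em1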
Configure em1
/etc/sysconfig/network-scripts/ifcfg-em1. Make sure the MAC address (HWADDR) is correct.
DEVICE=em1
NAME=em1
UUID=a401d981-52cd-4c4d-8c0d-3f81f182fe45
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
MASTER=bond0
SLAVE=yes
Configure p1p1
/etc/sysconfig/network-scripts/ifcfg-p1p1. Make sure the MAC address (HWADDR) is correct.
DEVICE=p1p1
NAME=p1p1
UUID=75ba61cf-0781-4023-b451-911f2e0b69d3
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
MASTER=bond0
SLAVE=yes
Configure bond0
/etc/sysconfig/network-scripts/ifcfg-bond0.
DEVICE=bond0
NAME=bond0
TYPE=Bond
BOOTPROTO=none
#DEFROUTE=yes
#PEERDNS=yes
#PEERROUTES=yes
#IPV4_FAILURE_FATAL=no
#IPV6INIT=yes
#IPV6_AUTOCONF=yes
#IPV6_DEFROUTE=yes
#IPV6_PEERDNS=yes
#IPV6_PEERROUTES=yes
#IPV6_FAILURE_FATAL=no
ONBOOT=yes
BRIDGE=virbr0
/etc/modprobe.d/bond0.conf
alias bond0 bonding
options bond0 primary=em1 miimon=100 mode=1 updelay=30000
Configure virbr0
/etc/sysconfig/network-scripts/ifcfg-virbr0.
DEVICE="virbr0" TYPE=BRIDGE ONBOOT=yes BOOTPROTO=none IPADDR="192.168.1.201" NETMASK="255.255.255.0" GATEWAY="192.168.1.1" DNS1="216.136.95.2" DNS2="64.132.94.250" DEFROUTE=yes PEERDNS=yes PEERROUTES=yes IPV4_FAILURE_FATAL=no IPV6INIT=no IPV6_AUTOCONF=no IPV6_DEFROUTE=no IPV6_PEERDNS=no IPV6_PEERROUTES=no IPV6_FAILURE_FATAL=no NM_CONTROLLED="no"
Enable IP Bridging:
# systemctl start libvirtd
# systemctl enable libvirtd
# echo "net.ipv4.ip_forward = 1" | tee /etc/sysctl.d/99-ipforward.conf
# sysctl -p /etc/sysctl.d/99-ipforward.conf
Configure resolv.conf
domain home.labrats.us
search home.labrats.us labrats.us
nameserver 216.136.95.2
nameserver 64.132.94.250
Disable Network Manager
# /bin/systemctl disable NetworkManager
# /bin/systemctl disable NetworkManager-dispatcher
Delete Network Manager Packages
# yum erase NetworkManager-tui NetworkManager-glib NetworkManager
Restart networking
# /etc/init.d/network restart
Restarting network (via systemctl):  [  OK  ]
It is possible that you will need to reboot the server.
# shutdown -r now
Show Interface Bonding
# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Primary Slave: None
Currently Active Slave: p1p1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 30000
Down Delay (ms): 0

Slave Interface: em1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Slave queue ID: 0

Slave Interface: p1p1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Slave queue ID: 0
Setup Bond1 and Bond2
Use similar configuration steps as above to map em2/p1p2 to bond1, and em3/p1p3 to bond2. We will skip the bridging setup for these interfaces, as they are used for the Cluster and Storage networks.
Bond interface configurations look like this:
Bond1 interface
/etc/modprobe.d/bond1.conf
alias bond1 bonding
options bond1 primary=em2 miimon=100 mode=1 updelay=30000
/etc/sysconfig/network-scripts/ifcfg-em2
DEVICE=em2
NAME=em2
UUID=903d0880-7c69-4bfd-b928-35f507be0d73
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
MASTER=bond1
SLAVE=yes
/etc/sysconfig/network-scripts/ifcfg-p1p2
DEVICE=p1p2
NAME=p1p2
UUID=7516861d-5dc2-4cd7-a4b0-525bb398d8f9
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
MASTER=bond1
SLAVE=yes
/etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
NAME=bond1
TYPE=Bond
BOOTPROTO=none
IPADDR="192.168.101.201"
NETMASK="255.255.255.0"
NM_CONTROLLED="no"
Bond2 interface
/etc/modprobe.d/bond2.conf
alias bond2 bonding
options bond2 primary=em3 miimon=100 mode=1 updelay=30000
/etc/sysconfig/network-scripts/ifcfg-em3
DEVICE=em3
NAME=em3
UUID=2215a365-2d29-45c1-8f68-fbee29713c86
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
MASTER=bond2
SLAVE=yes
/etc/sysconfig/network-scripts/ifcfg-p1p3
DEVICE=p1p3
NAME=p1p3
UUID=7aff546a-1dec-4763-b2fc-8cfcfdaf086b
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
MASTER=bond2
SLAVE=yes
/etc/sysconfig/network-scripts/ifcfg-bond2
DEVICE=bond2
NAME=bond2
TYPE=Bond
BOOTPROTO=none
IPADDR="192.168.102.201"
NETMASK="255.255.255.0"
NM_CONTROLLED="no"
Setup /etc/hosts file
Run the following command to set up the /etc/hosts file.
# cat << EOF >> /etc/hosts
192.168.1.200   centos-vm.home.labrats.us        centos-vm
192.168.1.201   centos-vm1.home.labrats.us       centos-vm1
192.168.1.202   centos-vm2.home.labrats.us       centos-vm2
192.168.1.211   centos-vm1-ipmi.home.labrats.us  centos-vm1-ipmi
192.168.1.212   centos-vm2-ipmi.home.labrats.us  centos-vm2-ipmi
192.168.101.201 centos-vm1-cn.home.labrats.us    centos-vm1-cn
192.168.101.202 centos-vm2-cn.home.labrats.us    centos-vm2-cn
192.168.102.201 centos-vm1-sn.home.labrats.us    centos-vm1-sn
192.168.102.202 centos-vm2-sn.home.labrats.us    centos-vm2-sn
EOF
Configure Networking On Each Additional Node
Setup Gluster
Create Gluster Brick
# mkfs.xfs -i size=512 /dev/sdb1
# mkdir -p /data/brick1
# vi /etc/fstab
Mount Brick
Add to /etc/fstab
/dev/sdb1 /data/brick1 xfs defaults 1 2
Mount the brick
# mount /data/brick1
Install Gluster
Install EPEL and Gluster Repos
We'll need to install the Gluster repo for the server package, and EPEL repo to satisfy dependencies.
# rpm -i https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
# wget -P /etc/yum.repos.d http://download.gluster.org/pub/gluster/glusterfs/LATEST/CentOS/glusterfs-epel.repo
Install glusterfs-server
Now we can install glusterfs-server package.
# yum install glusterfs-server
Start Gluster Server
# service glusterd start
Redirecting to /bin/systemctl start glusterd.service
Check the status of the gluster service
# service glusterd status
Redirecting to /bin/systemctl status glusterd.service
glusterd.service - GlusterFS, a clustered file-system server
   Loaded: loaded (/usr/lib/systemd/system/glusterd.service; disabled)
   Active: active (running) since Wed 2015-09-09 12:27:58 MDT; 12s ago
  Process: 13774 ExecStart=/usr/sbin/glusterd -p /var/run/glusterd.pid (code=exited, status=0/SUCCESS)
 Main PID: 13775 (glusterd)
   CGroup: /system.slice/glusterd.service
           └─13775 /usr/sbin/glusterd -p /var/run/glusterd.pid

Sep 09 12:27:58 centos-vm1.home.labrats.us systemd[1]: Started GlusterFS, a clustered file-system server.
Add Firewall Rules
We'll use firewall-cmd in this example. IPTables would have similar rules.
First we have to open up the port for the Gluster service. This runs on TCP port 24007.
# firewall-cmd --add-port=24007/tcp
# firewall-cmd --permanent --add-port=24007/tcp
Next we have to open up a port for each brick, starting at 49152 (for GlusterFS 3.4 and later; 24009 for GlusterFS 3.3 and older). Since we are running two bricks on each node, we only need to open ports 49152 and 49153.
# firewall-cmd --add-port=49152-49153/tcp
# firewall-cmd --permanent --add-port=49152-49153/tcp
If we want to open up GlusterFS to external nodes, we will also need to open up TCP ports 38465, 38466, 38468, 38469 and 2049.
Let's validate that it shows up:
# firewall-cmd --list-all
public (default, active)
  interfaces: virbr0
  sources:
  services: dhcpv6-client ssh
  ports: 24007/tcp 49152-49153/tcp
  masquerade: no
  forward-ports:
  icmp-blocks:
  rich rules:
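If you do decide to open GlusterFS up to external (NFS) clients as noted above, the additional ports could be added the same way. A sketch, only needed for clients outside the trusted pool, so we skip it here:
# firewall-cmd --add-port=38465-38466/tcp --add-port=38468-38469/tcp --add-port=2049/tcp
# firewall-cmd --permanent --add-port=38465-38466/tcp --add-port=38468-38469/tcp --add-port=2049/tcp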
Configure each additional node
Configure each node in the Gluster cluster with the same configuration.
Configure Gluster Service
Probe each server from the other one
Node 1:
# gluster peer probe centos-vm2-sn
Node 2:
# gluster peer probe centos-vm1-sn
Create volume directory on each node
This needs to be run on each node.
# mkdir /data/brick1/gv0
Create Gluster Volume
This only needs to be run on ONE node.
# gluster volume create gv0 replica 2 centos-vm1-sn:/data/brick1/gv0 centos-vm2-sn:/data/brick1/gv0
volume create: gv0: success: please start the volume to access data
Start Gluster Volume
Again, this only needs to be done on one node.
# gluster volume start gv0
volume start: gv0: success
Check Gluster Volume Status
This can be run on either node, and should be run on each to validate operation.
# gluster volume info

Volume Name: gv0
Type: Replicate
Volume ID: 877e2a93-89c5-4f19-b98c-72d79abbae83
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: centos-vm1-sn:/data/brick1/gv0
Brick2: centos-vm2-sn:/data/brick1/gv0
Options Reconfigured:
performance.readdir-ahead: on
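To see which TCP port each brick process is actually listening on (the ports the firewall rules above account for), the volume status can also be checked; the exact output will vary:
# gluster volume status gv0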
Set Gluster Service to start with the system
Run this command on each node.
# systemctl enable glusterd.service
Testing Gluster
Perform the following steps to test and validate that Gluster is operating correctly.
Mount Gluster volume on each server
On each node, mount the local Gluster volume.
# mount -t glusterfs localhost:/gv0 /mnt
Test Write & Synchronization
On node 1, run the following commands.
# for i in `seq -w 1 100`; do cp -rp /var/log/messages /mnt/copy-test-$i; done
On node 1, check to see that this has been written to the local brick.
# ls -lA /data/brick1/gv0 | wc -l
There should be 100 files on the local brick.
On node 2, check both the /mnt directory, and the local brick.
# ls -lA /mnt | wc -l # ls -lA /data/brick1/gv0 | wc -l
Both should list 100 files. If the numbers are different, check that synchronization is working correctly.
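If the counts do not match, one way to check replication health (assuming GlusterFS 3.4 or later) is the heal command, which lists any files still pending synchronization:
# gluster volume heal gv0 info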
Setup Second Gluster Volume
We will also need to set up a "gv1" volume, following the same steps as above; a sketch is shown below.
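The brick directory /data/brick1/gv1 used in this sketch is an assumption, so adjust it to match your brick layout. Create the directory on each node, then create and start the volume from one node:
# mkdir /data/brick1/gv1
# gluster volume create gv1 replica 2 centos-vm1-sn:/data/brick1/gv1 centos-vm2-sn:/data/brick1/gv1
# gluster volume start gv1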
Setup KVM
Install KVM
# yum -y install kvm virt-manager libvirt virt-install qemu-kvm
# yum -y install xauth xorg-x11-apps
Mount GlusterFS
# mkdir -p /var/lib/libvirt/images /var/lib/libvirt/configs
# mount -t glusterfs localhost:/gv0 /var/lib/libvirt/images
# mount -t glusterfs localhost:/gv1 /var/lib/libvirt/configs
Start KVM (libvirtd)
We must start libvirtd temporarily to create the virtual machine disk images and config files.
# systemctl start libvirtd.service
Create Virtual Machines
We will create two machines, test-centos-1 and test-centos-2.
# virt-install --connect qemu:///system -n test-centos-1 -r 2048 --vcpus=1 \
  --disk path=/var/lib/libvirt/images/test-centos-1.img,size=10 --graphics vnc,listen=0.0.0.0 \
  --noautoconsole --os-type linux --os-variant rhel7 --accelerate --network=bridge:virbr0 --hvm \
  --cdrom /var/lib/libvirt/images/CentOS-7.0-1406-x86_64-DVD.iso
# virt-install --connect qemu:///system -n test-centos-2 -r 2048 --vcpus=1 \
  --disk path=/var/lib/libvirt/images/test-centos-2.img,size=10 --graphics vnc,listen=0.0.0.0 \
  --noautoconsole --os-type linux --os-variant rhel7 --accelerate --network=bridge:virbr0 --hvm \
  --cdrom /var/lib/libvirt/images/CentOS-7.0-1406-x86_64-DVD.iso
Connect to the VNC console and install the OS from the CD image. This can be done via a VNC client application, or via the virt-manager GUI application.
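To find which VNC display each guest was assigned (add 5900 to the display number to get the TCP port to connect to), virsh can report it:
# virsh vncdisplay test-centos-1
# virsh vncdisplay test-centos-2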
Configure Virtual Machines
Configure Guest Network
Remove Network Manager software.
# /bin/systemctl disable NetworkManager
# /bin/systemctl disable NetworkManager-dispatcher
# yum erase NetworkManager-tui NetworkManager-glib NetworkManager
Run the commands below to set up a static ip address (192.168.1.221) and hostname (test-centos-1).
# export remote_hostname=test-centos-1
# export remote_ip=192.168.1.221
# export remote_gateway=192.168.1.1
# hostnamectl set-hostname $remote_hostname
# sed -i.bak "s/.*BOOTPROTO=.*/BOOTPROTO=none/g" /etc/sysconfig/network-scripts/ifcfg-eth0
# cat << EOF >> /etc/sysconfig/network-scripts/ifcfg-eth0
IPADDR0=$remote_ip
PREFIX0=24
GATEWAY0=$remote_gateway
DNS1="216.136.95.2"
DNS2="64.132.94.250"
NM_CONTROLLED="no"
EOF
# systemctl restart network
# systemctl enable network.service
# systemctl enable sshd
# systemctl start sshd
# echo "checking connectivity"
# ping www.google.com
To simplify the tutorial we'll go ahead and disable selinux on the guest. We'll also need to poke a hole through the firewall on port 3121 (the default port for pacemaker_remote) so the host can contact the guest.
# setenforce 0
# sed -i.bak "s/^SELINUX=.*/SELINUX=disabled/g" /etc/selinux/config
# firewall-cmd --add-port 3121/tcp --permanent
# firewall-cmd --add-port 3121/tcp
At this point you should be able to ssh into the guest from the host.
Configure pacemaker_remote
On the 'host' machine, run these commands to generate an authkey and copy it to the /etc/pacemaker folder on both the host and guest.
# mkdir -p --mode=0750 /etc/pacemaker
# chgrp haclient /etc/pacemaker
# dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1
# scp -r /etc/pacemaker root@192.168.1.221:/etc/
Now on the 'guest', install the pacemaker-remote package, and enable the daemon to run at startup. In the commands below, you will notice the pacemaker package is also installed. It is not required; the only reason it is being installed for this tutorial is because it contains the Dummy resource agent that we will use later for testing.
# yum install -y pacemaker pacemaker-remote resource-agents
# systemctl enable pacemaker_remote.service
Now start pacemaker_remote on the guest and verify the start was successful.
# systemctl start pacemaker_remote.service
# systemctl status pacemaker_remote
pacemaker_remote.service - Pacemaker Remote Service
   Loaded: loaded (/usr/lib/systemd/system/pacemaker_remote.service; enabled)
   Active: active (running) since Thu 2013-03-14 18:24:04 EDT; 2min 8s ago
 Main PID: 1233 (pacemaker_remot)
   CGroup: name=systemd:/system/pacemaker_remote.service
           └─1233 /usr/sbin/pacemaker_remoted

Mar 14 18:24:04 guest1 systemd[1]: Starting Pacemaker Remote Service...
Mar 14 18:24:04 guest1 systemd[1]: Started Pacemaker Remote Service.
Mar 14 18:24:04 guest1 pacemaker_remoted[1233]: notice: lrmd_init_remote_tls_server: Starting a tls listener on port 3121.
Verify Host Connection to Guest
Before moving forward, it's worth verifying that the host can contact the guest on port 3121. Here's a trick you can use. Connect using ssh from the host. The connection will get destroyed, but how it is destroyed tells you whether it worked or not.
First, add test-centos-1 to the host machine's /etc/hosts file if you haven't already. This is required unless you have DNS set up so that the guest's address can be resolved.
# cat << EOF >> /etc/hosts
192.168.1.221   test-centos-1
EOF
If running the ssh command on one of the cluster nodes results in this output before disconnecting, the connection works.
# ssh -p 3121 test-centos-1
ssh_exchange_identification: read: Connection reset by peer
If you see this, the connection is not working.
# ssh -p 3121 test-centos-1
ssh: connect to host test-centos-1 port 3121: No route to host
Repeat for second (and additional) guest virtual machines.
Shut Down Virtual Machines
Power down guest virtual machines, as they will be controlled by Pacemaker, as documented below.
From the host, you can run these commands.
# virsh shutdown test-centos-1
# virsh shutdown test-centos-2
Export Virtual Machine Config Files
We will export the config files for the two virtual machines. This will later be used for loading into Pacemaker.
# virsh dumpxml test-centos-1 > /var/lib/libvirt/configs/test-centos-1.xml
# virsh dumpxml test-centos-2 > /var/lib/libvirt/configs/test-centos-2.xml
Shut Down KVM (libvirt)
We can shut down KVM, as it will be started by Pacemaker following the below steps.
# systemctl stop libvirtd.service
# systemctl disable libvirtd.service
Unmount Gluster File Systems
We need to unmount the Gluster file systems, as we will be mounting them with pacemaker below.
# umount /var/lib/libvirt/images
# umount /var/lib/libvirt/configs
Setup Pacemaker Cluster
Pacemaker Firewall rules
Add the following rules for pacemaker/pcs/corosync to communicate correctly.
# firewall-cmd --permanent --add-service=high-availability
# firewall-cmd --add-service=high-availability
# firewall-cmd --permanent --direct --add-rule ipv4 filter IN_public_allow 0 -p igmp -j ACCEPT
# firewall-cmd --direct --add-rule ipv4 filter IN_public_allow 0 -p igmp -j ACCEPT
# firewall-cmd --permanent --add-port=5405/tcp
# firewall-cmd --add-port=5405/tcp
Disable SELinux
# sed -i.bak "s/^SELINUX=.*/SELINUX=disabled/g" /etc/selinux/config
Install Pacemaker Packages
# yum install pacemaker corosync pcs resource-agents
Setup /etc/hosts file
Create the /etc/hosts file containing entries for each hypervisor host in the cluster.
# cat << EOF >> /etc/hosts
192.168.1.201   centos-vm1
192.168.1.202   centos-vm2
EOF
Setup Cluster Auth
Run the following command on ALL cluster nodes to set up user authentication.
# passwd hacluster
Run the following on ONE node to authenticate the cluster nodes.
# pcs cluster auth centos-vm1-cn centos-vm2-cn
Setup Cluster
Run this on ALL hypervisor machines.
# pcs cluster setup --local --name mycluster centos-vm1-cn centos-vm2-cn
Start Cluster Software
Start the cluster. Run the following on ONE node.
# pcs cluster start --all
Verify Cluster Operation
Verify corosync membership
# pcs status corosync

Membership information
----------------------
    Nodeid      Votes Name
         1          1 centos-vm1.home.labrats.us (local)
         2          1 centos-vm2.home.labrats.us
Verify pacemaker status. It may take a moment or two for the other node to appear.
# pcs status
Cluster name: mycluster
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Tue Sep 15 14:16:51 2015
Last change: Tue Sep 15 14:14:01 2015 by hacluster via crmd on centos-vm2
Stack: corosync
Current DC: centos-vm2 (version 1.1.13-a14efad) - partition with quorum
2 nodes and 0 resources configured

Online: [ centos-vm1 centos-vm2 ]

Full list of resources:

PCSD Status:
  centos-vm1: Offline
  centos-vm2: Offline

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: inactive/disabled
Verify the Corosync ring status.
# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
        id      = 192.168.101.201
        status  = ring 0 active with no faults
Setup Pacemaker Active/Standby Cluster
We will initially set up an active/standby cluster to get things running and tested. This only works in a two-node cluster.
Setup PCSD and Virtual IP
Start PCSD on Primary Node
We need to start PCSD on the Primary node, so it can generate the certificate and key files.
# systemctl start pcsd.service
# systemctl status pcsd.service
pcsd.service - PCS GUI and remote configuration interface
   Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled)
   Active: active (running) since Tue 2015-09-15 13:54:50 MDT; 7s ago
 Main PID: 19779 (pcsd)
   CGroup: /system.slice/pcsd.service
           ├─19779 /bin/sh /usr/lib/pcsd/pcsd start
           ├─19783 /bin/bash -c ulimit -S -c 0 >/dev/null 2>&1 ; /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb
           └─19784 /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb

Sep 15 13:54:50 centos-vm1-cn systemd[1]: Started PCS GUI and remote configuration interface.
Copy PCSD certificate and key files to other node
We need to make sure that both nodes have the same certificate and key files, or our browser will pester us each time the service fails over.
# cd /var/lib/pcsd
# scp -pv pcsd.* root@centos-vm2-cn:/var/lib/pcsd
Start PCSD on Secondary Node
# systemctl start pcsd.service
# systemctl status pcsd.service
pcsd.service - PCS GUI and remote configuration interface
   Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled)
   Active: active (running) since Tue 2015-09-15 14:01:24 MDT; 3s ago
 Main PID: 21311 (pcsd)
   CGroup: /system.slice/pcsd.service
           ├─21311 /bin/sh /usr/lib/pcsd/pcsd start
           ├─21315 /bin/bash -c ulimit -S -c 0 >/dev/null 2>&1 ; /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb
           ├─21316 /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb
           └─21319 python /usr/lib/pcsd/systemd-notify-fix.py

Sep 15 14:01:24 centos-vm2-cn systemd[1]: Started PCS GUI and remote configuration interface.
Create Virtual IP Resource
Now we create the virtual IP resource.
First we will make a backup of the current cluster config.
# cd /tmp
# pcs cluster cib /tmp/cluster-active_config.orig
Now we will configure the virtual IP in an offline file.
# pcs cluster cib /tmp/VirtualIP_cfg
# pcs -f /tmp/VirtualIP_cfg resource create ClusterIP ocf:heartbeat:IPaddr2 ip=192.168.1.200 cidr_netmask=24 op monitor interval=30
# pcs -f /tmp/VirtualIP_cfg constraint location ClusterIP prefers centos-vm1-cn=200
# pcs -f /tmp/VirtualIP_cfg constraint location ClusterIP prefers centos-vm2-cn=50
Now we will push the configuration to the cluster.
# pcs cluster cib-push /tmp/VirtualIP_cfg
CIB updated
Now we can show the status of the Virtual IP.
# pcs status
Cluster name: mycluster
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Tue Sep 15 14:26:37 2015
Last change: Tue Sep 15 14:25:45 2015 by root via cibadmin on centos-vm1-cn
Stack: corosync
Current DC: centos-vm2-cn (version 1.1.13-a14efad) - partition with quorum
2 nodes and 1 resource configured

Online: [ centos-vm1-cn centos-vm2-cn ]

Full list of resources:

 ClusterIP      (ocf::heartbeat:IPaddr2):       Stopped

PCSD Status:
  centos-vm1-cn: Online
  centos-vm2-cn: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled
Note that the service shows "stopped". This is because we have not configured or disabled fencing yet.
Disable Fencing
To disable fencing, run the following command. We will discuss enabling it later.
# pcs property set stonith-enabled=false
Now we can check the status again, to see if the new service is running.
# pcs status
Cluster name: mycluster
Last updated: Tue Sep 15 14:28:59 2015
Last change: Tue Sep 15 14:28:55 2015 by root via cibadmin on centos-vm1-cn
Stack: corosync
Current DC: centos-vm2-cn (version 1.1.13-a14efad) - partition with quorum
2 nodes and 1 resource configured

Online: [ centos-vm1-cn centos-vm2-cn ]

Full list of resources:

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started centos-vm1-cn

PCSD Status:
  centos-vm1-cn: Online
  centos-vm2-cn: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled
Enable PCSD at system startup
We can tell the system to start pcsd at startup by running the below on ALL nodes.
# systemctl enable pcsd.service
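If you also want corosync and pacemaker themselves to start at boot (they show active/disabled in the Daemon Status output above), the cluster can be enabled as well. Whether to auto-start the cluster stack after a reboot is a site policy decision, so treat this as optional:
# pcs cluster enable --all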
Setup GlusterFS and Libvirtd (PCS)
The Gluster resources control mounting of the GlusterFS file systems after the system boots. This is needed because glusterd is not yet running when the normal file systems are mounted, so the Gluster volumes cannot be mounted at that time.
To bring up the GlusterFS mounts, run the following commands against an offline configuration file.
# pcs cluster cib /tmp/GlusterFS_cfg
# pcs -f /tmp/GlusterFS_cfg resource create gluster-configs Filesystem device="localhost:/gv1" directory="/var/lib/libvirt/configs/" fstype="glusterfs"
# pcs -f /tmp/GlusterFS_cfg resource create gluster-images Filesystem device="localhost:/gv0" directory="/var/lib/libvirt/images/" fstype="glusterfs"
# pcs -f /tmp/GlusterFS_cfg resource create libvirtd systemd:libvirtd
# pcs -f /tmp/GlusterFS_cfg constraint colocation add gluster-configs with gluster-images INFINITY
# pcs -f /tmp/GlusterFS_cfg constraint colocation add libvirtd with gluster-images INFINITY
# pcs -f /tmp/GlusterFS_cfg constraint order set gluster-configs gluster-images libvirtd sequential=true require-all=true setoptions kind=Serialize symmetrical=true
Now we can push the GlusterFS configs to the cluster.
# pcs cluster cib-push /tmp/GlusterFS_cfg
CIB updated
Let's show the cluster status.
# pcs status
Cluster name: mycluster
Last updated: Tue Sep 15 14:51:03 2015
Last change: Tue Sep 15 14:50:17 2015 by root via cibadmin on centos-vm1-cn
Stack: corosync
Current DC: centos-vm2-cn (version 1.1.13-a14efad) - partition with quorum
2 nodes and 4 resources configured

Online: [ centos-vm1-cn centos-vm2-cn ]

Full list of resources:

 ClusterIP        (ocf::heartbeat:IPaddr2):       Started centos-vm1-cn
 gluster-configs  (ocf::heartbeat:Filesystem):    Started centos-vm2-cn
 gluster-images   (ocf::heartbeat:Filesystem):    Started centos-vm2-cn
 libvirtd         (systemd:libvirtd):             Started centos-vm2-cn

PCSD Status:
  centos-vm1-cn: Online
  centos-vm2-cn: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
Setup KVM (PCS)
We actually handled this in the previous section, setting the libvirtd resource to run co-located with the GlusterFS resources. We also serialized the order to ensure that the GlusterFS resources are running and have completed start-up before the libvirtd resource is started.
Setup Virtual Machines (PCS)
To bring up the Virtual Machine resources, run the following commands against an offline configuration file.
# pcs cluster cib /tmp/virtual-machine_cfg
# pcs -f /tmp/virtual-machine_cfg resource create test-centos-1_vm VirtualDomain hypervisor="qemu:///system" config="/var/lib/libvirt/configs/test-centos-1.xml" meta remote-node=test-centos-1
# pcs -f /tmp/virtual-machine_cfg resource create test-centos-2_vm VirtualDomain hypervisor="qemu:///system" config="/var/lib/libvirt/configs/test-centos-2.xml" meta remote-node=test-centos-2
# pcs -f /tmp/virtual-machine_cfg constraint colocation add test-centos-1_vm with libvirtd INFINITY
# pcs -f /tmp/virtual-machine_cfg constraint colocation add test-centos-2_vm with libvirtd INFINITY
# pcs -f /tmp/virtual-machine_cfg constraint order start libvirtd then start test-centos-1_vm kind=Serialize symmetrical=true
# pcs -f /tmp/virtual-machine_cfg constraint order start libvirtd then start test-centos-2_vm kind=Serialize symmetrical=true
We also want to tell the cluster not to attempt to run the GlusterFS resources on the virtual machines.
# pcs -f /tmp/virtual-machine_cfg constraint location gluster-configs avoids test-centos-1
# pcs -f /tmp/virtual-machine_cfg constraint location gluster-images avoids test-centos-1
# pcs -f /tmp/virtual-machine_cfg constraint location gluster-configs avoids test-centos-2
# pcs -f /tmp/virtual-machine_cfg constraint location gluster-images avoids test-centos-2
Now we can push the Virtual Machine configs to the cluster.
# pcs cluster cib-push /tmp/virtual-machine_cfg
CIB updated
Let's show the cluster status.
# pcs status
Cluster name: mycluster
Last updated: Tue Sep 15 15:01:23 2015
Last change: Tue Sep 15 15:01:00 2015 by root via crm_resource on centos-vm2-cn
Stack: corosync
Current DC: centos-vm2-cn (version 1.1.13-a14efad) - partition with quorum
4 nodes and 8 resources configured

Online: [ centos-vm1-cn centos-vm2-cn ]
GuestOnline: [ test-centos-1@centos-vm2-cn test-centos-2@centos-vm2-cn ]

Full list of resources:

 ClusterIP         (ocf::heartbeat:IPaddr2):       Started centos-vm1-cn
 gluster-configs   (ocf::heartbeat:Filesystem):    Started centos-vm2-cn
 gluster-images    (ocf::heartbeat:Filesystem):    Started centos-vm2-cn
 libvirtd          (systemd:libvirtd):             Started centos-vm2-cn
 test-centos-1_vm  (ocf::heartbeat:VirtualDomain): Started centos-vm2-cn
 test-centos-2_vm  (ocf::heartbeat:VirtualDomain): Started centos-vm2-cn

PCSD Status:
  centos-vm1-cn: Online
  centos-vm2-cn: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
Setup FAKE Services (PCS)
We want to run fake services on the virtual machines to test out the pacemaker-remote service. This is not a critical part of the cluster, but it can come in handy if we are trying to track cluster services, such as DNS, mail, etc.
To bring up the FAKE service resources, run the following commands against an offline configuration file.
# pcs cluster cib /tmp/FAKE_cfg
# pcs -f /tmp/FAKE_cfg resource create FAKE1 ocf:pacemaker:Dummy
# pcs -f /tmp/FAKE_cfg resource create FAKE2 ocf:pacemaker:Dummy
We will also set preference for the FAKE resources, so they only run on the virtual guests.
# pcs -f /tmp/FAKE_cfg constraint location FAKE1 prefers test-centos-1=200
# pcs -f /tmp/FAKE_cfg constraint location FAKE1 avoids centos-vm1-cn
# pcs -f /tmp/FAKE_cfg constraint location FAKE1 avoids centos-vm2-cn
# pcs -f /tmp/FAKE_cfg constraint location FAKE2 prefers test-centos-2=200
# pcs -f /tmp/FAKE_cfg constraint location FAKE2 avoids centos-vm1-cn
# pcs -f /tmp/FAKE_cfg constraint location FAKE2 avoids centos-vm2-cn
Now we can push the FAKE Service configs to the cluster.
# pcs cluster cib-push /tmp/FAKE_cfg
CIB updated
Let's show the cluster status.
# pcs status
Cluster name: mycluster
Last updated: Tue Sep 15 15:20:33 2015
Last change: Tue Sep 15 15:20:23 2015 by root via cibadmin on centos-vm1-cn
Stack: corosync
Current DC: centos-vm2-cn (version 1.1.13-a14efad) - partition with quorum
4 nodes and 10 resources configured

Online: [ centos-vm1-cn centos-vm2-cn ]
GuestOnline: [ test-centos-1@centos-vm2-cn test-centos-2@centos-vm2-cn ]

Full list of resources:

 ClusterIP         (ocf::heartbeat:IPaddr2):       Started centos-vm1-cn
 gluster-configs   (ocf::heartbeat:Filesystem):    Started centos-vm2-cn
 gluster-images    (ocf::heartbeat:Filesystem):    Started centos-vm2-cn
 libvirtd          (systemd:libvirtd):             Started centos-vm2-cn
 test-centos-1_vm  (ocf::heartbeat:VirtualDomain): Started centos-vm2-cn
 test-centos-2_vm  (ocf::heartbeat:VirtualDomain): Started centos-vm2-cn
 FAKE1             (ocf::pacemaker:Dummy):         Started test-centos-1
 FAKE2             (ocf::pacemaker:Dummy):         Started test-centos-2

PCSD Status:
  centos-vm1-cn: Online
  centos-vm2-cn: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
Set Gluster Resource to Clone
One last thing we want to do is set the GlusterFS resources to be cloned. This will start the resources on all hypervisor nodes, and will speed up recovery after a node failure.
To clone the GlusterFS resources, run the following commands against an offline configuration file.
# pcs cluster cib /tmp/GlusterFS-Clone_cfg
# pcs -f /tmp/GlusterFS-Clone_cfg resource clone gluster-configs
# pcs -f /tmp/GlusterFS-Clone_cfg resource clone gluster-images
Now we can push the GlusterFS Clone Service configs to the cluster.
# pcs cluster cib-push /tmp/GlusterFS-Clone_cfg
CIB updated
Let's show the cluster status.
# pcs status
Cluster name: mycluster
Last updated: Tue Sep 15 15:24:53 2015
Last change: Tue Sep 15 15:24:48 2015 by root via cibadmin on centos-vm1-cn
Stack: corosync
Current DC: centos-vm2-cn (version 1.1.13-a14efad) - partition with quorum
4 nodes and 16 resources configured

Online: [ centos-vm1-cn centos-vm2-cn ]
GuestOnline: [ test-centos-1@centos-vm2-cn test-centos-2@centos-vm2-cn ]

Full list of resources:

 ClusterIP         (ocf::heartbeat:IPaddr2):       Started centos-vm1-cn
 libvirtd          (systemd:libvirtd):             Started centos-vm2-cn
 test-centos-1_vm  (ocf::heartbeat:VirtualDomain): Started centos-vm2-cn
 test-centos-2_vm  (ocf::heartbeat:VirtualDomain): Started centos-vm2-cn
 FAKE1             (ocf::pacemaker:Dummy):         Started test-centos-1
 FAKE2             (ocf::pacemaker:Dummy):         Started test-centos-2
 Clone Set: gluster-configs-clone [gluster-configs]
     Started: [ centos-vm1-cn centos-vm2-cn ]
     Stopped: [ test-centos-1 test-centos-2 ]
 Clone Set: gluster-images-clone [gluster-images]
     Started: [ centos-vm1-cn centos-vm2-cn ]
     Stopped: [ test-centos-1 test-centos-2 ]

PCSD Status:
  centos-vm1-cn: Online
  centos-vm2-cn: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
Fencing
There are several ways to configure fencing, including isolating the network, isolating the power, or simply rebooting the machine. Some argue that the fence device should not share power with the device being fenced, since in the event of a power failure the cluster will not be able to fence the failed device and will remain in a partially failed state. This can be solved by configuring layered fencing, but that is left for future discussion.
Configure Fencing and Constraints
The following commands will configure fencing on an offline file, which will be applied later. We set constraints so that the fencing resource on each node does not run on the node that it would act upon.
# pcs cluster cib /tmp/fencing_cfg
# pcs -f /tmp/fencing_cfg stonith create fence_centos-vm1_ipmi fence_ipmilan pcmk_host_list="centos-vm1-cn" ipaddr="centos-vm1-ipmi" login=fencer passwd=2eznRKdeTu7cEee op monitor interval=60s
# pcs -f /tmp/fencing_cfg constraint location fence_centos-vm1_ipmi avoids centos-vm1-cn
# pcs -f /tmp/fencing_cfg stonith create fence_centos-vm2_ipmi fence_ipmilan pcmk_host_list="centos-vm2-cn" ipaddr="centos-vm2-ipmi" login=fencer passwd=2eznRKdeTu7cEee op monitor interval=60s
# pcs -f /tmp/fencing_cfg constraint location fence_centos-vm2_ipmi avoids centos-vm2-cn
Now we can apply the config.
# pcs cluster cib-push /tmp/fencing_cfg
Finally, we can enable fencing on the cluster.
# pcs property set stonith-enabled=true
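Before relying on fencing, it is worth confirming the devices. pcs can list the configured stonith resources, and manually fencing a node (which WILL reboot it) is the definitive test, so only do this during a maintenance window:
# pcs stonith show
# pcs stonith fence centos-vm2-cn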
Fencing Delay
It may be useful to set a delay for the fencing devices. The default delay is 0, which means that the fence will act immediately upon failure. A delay helps deal with momentary loss of communication between nodes, such as that caused by changes in switch topology.
If you wish to set a delay, you can run the following commands.
# pcs cluster cib /tmp/fencing-delay_cfg
# pcs -f /tmp/fencing-delay_cfg stonith update fence_centos-vm1_ipmi delay=15
# pcs -f /tmp/fencing-delay_cfg stonith update fence_centos-vm2_ipmi delay=15
# pcs cluster cib-push /tmp/fencing-delay_cfg
System Testing
For testing, it is useful to run the crm_mon command, which will continuously monitor the cluster state. Use Ctrl-C to exit the program.
Last updated: Wed Sep 16 16:17:14 2015
Last change: Wed Sep 16 16:11:54 2015 by root via cibadmin on centos-vm2-cn
Stack: corosync
Current DC: centos-vm2-cn (version 1.1.13-a14efad) - partition with quorum
4 nodes and 18 resources configured

Online: [ centos-vm1-cn centos-vm2-cn ]
GuestOnline: [ test-centos-1@centos-vm2-cn test-centos-2@centos-vm2-cn ]

 ClusterIP              (ocf::heartbeat:IPaddr2):       Started centos-vm1-cn
 libvirtd               (systemd:libvirtd):             Started centos-vm2-cn
 test-centos-1_vm       (ocf::heartbeat:VirtualDomain): Started centos-vm2-cn
 test-centos-2_vm       (ocf::heartbeat:VirtualDomain): Started centos-vm2-cn
 FAKE1                  (ocf::pacemaker:Dummy):         Started test-centos-1
 FAKE2                  (ocf::pacemaker:Dummy):         Started test-centos-2
 Clone Set: gluster-configs-clone [gluster-configs]
     Started: [ centos-vm1-cn centos-vm2-cn ]
 Clone Set: gluster-images-clone [gluster-images]
     Started: [ centos-vm1-cn centos-vm2-cn ]
 fence_centos-vm1_ipmi  (stonith:fence_ipmilan):        Started centos-vm2-cn
 fence_centos-vm2_ipmi  (stonith:fence_ipmilan):        Started centos-vm1-cn
Pacemaker_Remote Node Failure
To test a remote node failure, simply log into the node and run the following command.
# killall -9 pacemaker_remoted
You will see in crm_mon where the remote node changed to FAILED state, and was immediately restarted.
Cluster Node Failure
To test a full cluster node failure, run the following command on that node.
# killall -9 corosync
You will see in crm_mon where this node will go into "Offline/UNCLEAN" state. If you have fencing enabled without delay, it will reboot the node. If you have fencing disabled, or set for delay, you will see corosync restart, and the node rejoin the cluster.
You can do further testing by isolating the node, however care must be taken with a two-node cluster so that the isolation of the two nodes does not cause the nodes to reboot each other, resulting in a cluster without any nodes.
Shut Down PCS Cluster
You will be able to shut down a single node in a cluster using the following command. Shutting down the node gracefully will ensure that all resources are stopped, and that other nodes will restart those resources, as applicable, without enacting fencing and rebooting the node.
# pcs cluster stop
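pcs can also stop a named node from any cluster member, or stop the whole cluster at once; a stopped node rejoins when it is started again:
# pcs cluster stop centos-vm2-cn
# pcs cluster stop --all
# pcs cluster start centos-vm2-cn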
Other Observations While Testing
It has been observed that some resources will be restarted on the cluster under failure conditions, even though those resources do not move to another node. I have not been able to determine why this happens, only that it happens some of the time. Further research into this should be done.
Converting to Active/Active Cluster
To convert to an active/active cluster, we should only need to convert the libvirtd resource to a cloned resource, and then assign some location preference to the virtual machines.
Configure Libvirtd Clone
You can run the following commands to configure the libvirtd resource for cloning using an offline config file.
# pcs cluster cib /tmp/libvirtd-clone_cfg
# pcs -f /tmp/libvirtd-clone_cfg resource clone libvirtd
Now you can push the config to the cluster.
# pcs cluster cib-push /tmp/libvirtd-clone_cfg
CIB updated
Configure Resource Preference
You can determine where to normally run the resources by configuring location preferences.
We will do this on an offline config file.
# pcs cluster cib /tmp/vm-preferences_cfg
# pcs -f /tmp/vm-preferences_cfg constraint location test-centos-1_vm prefers centos-vm1-cn=200
# pcs -f /tmp/vm-preferences_cfg constraint location test-centos-1_vm prefers centos-vm2-cn=50
# pcs -f /tmp/vm-preferences_cfg constraint location test-centos-2_vm prefers centos-vm2-cn=200
# pcs -f /tmp/vm-preferences_cfg constraint location test-centos-2_vm prefers centos-vm1-cn=50
Now you can push the config to the cluster.
# pcs cluster cib-push /tmp/vm-preferences_cfg
CIB updated
Resource Stickiness
Stickiness is the cost of moving a resource from its present location. The default is unset (equivalent to 0). If the location preference is higher than the stickiness value, the resource will move to the preferred location, assuming it is available. If the location preference is lower than the stickiness value, the resource will remain in its current location until further action is taken on the resource, or the location becomes unavailable due to maintenance or failure.
To set the default value, run the following command.
# pcs resource defaults resource-stickiness=100
To set the value per resource, the commands are run as shown below.
# pcs resource meta test-centos-1_vm resource-stickiness=250
# pcs resource meta test-centos-2_vm resource-stickiness=250
To unset stickiness, use one of the following forms.
# pcs resource defaults resource-stickiness=
# pcs resource meta test-centos-1_vm resource-stickiness=
# pcs resource meta test-centos-2_vm resource-stickiness=
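To confirm what is currently configured, the resource defaults can be listed at any time:
# pcs resource defaults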
Retry Start Failures
One of the frustrating issues I have seen with Pacemaker is how fatally a failure to start is treated. This can be partially addressed by setting the start-failure-is-fatal default. By default it is set to true, so any failure to start a resource will never be retried.
# pcs resource defaults start-failure-is-fatal=false
# pcs resource update <resource> meta failure-timeout="30s"
# pcs resource op delete <resource> start
# pcs resource op add <resource> start interval=0s timeout=90 on-fail="restart"
# pcs resource op delete <resource> monitor
# pcs resource op add <resource> monitor interval=10 timeout=30 on-fail="restart"
Once set, the resource will look something like this:
# pcs resource show test-centos-1_vm
 Resource: test-centos-1_vm (class=ocf provider=heartbeat type=VirtualDomain)
  Attributes: hypervisor=qemu:///system config=/var/lib/libvirt/configs/test-centos-1.xml
  Meta Attrs: remote-node=test-centos-1 resource-stickiness=250 failure-timeout=30s
  Operations: stop interval=0s timeout=90 (test-centos-1_vm-stop-timeout-90)
              start interval=0s timeout=90 on-fail=restart (test-centos-1_vm-name-start-interval-0s-on-fail-restart-timeout-90)
              monitor interval=10 timeout=30 on-fail=restart (test-centos-1_vm-name-monitor-interval-10-on-fail-restart-timeout-30)
Manually Move Resource
You can manually move resources by running a series of commands.
# pcs constraint location test-centos-1_vm prefers centos-vm1-cn=INFINITY
# pcs constraint location test-centos-1_vm prefers centos-vm1-cn=200
The first command sets the location preference to INFINITY, ensuring the move if the location is available. The second command sets the location preference back to what it was previously.
Adding Additional Nodes
You can add additional nodes using the below procedures, however you should always have an odd number of nodes to ensure that you are able to achieve quorum. Quorum is achieved when there are 50% of the configured nodes present, plus one additional node. The exception to this is the two-node cluster, where a single node will be able to achieve quorum.
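As a rough sketch (the node name centos-vm3-cn here is hypothetical), a new node is authenticated and then added from an existing cluster member; the new node still needs the networking, Gluster, and firewall configuration from the earlier sections done by hand before it can host resources:
# pcs cluster auth centos-vm3-cn
# pcs cluster node add centos-vm3-cn
# pcs cluster start centos-vm3-cn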
Helpful Commands
crm_resource
Move Resource
To move a resource manually from one node to another, run one of the following commands.
# crm_resource --resource <resource> --move
OR
# crm_resource --resource <resource> --move --node <node>
NOTE: This will set the location preference on the current node to -INFINITY, which will need to be cleared to resume normal operations.
To permit, but not force, a resource to return to normal operation and move back to its preferred node on its own:
# crm_resource --resource <resource> --un-move
pcs resource
Show Resources
This shows the defined resources, and their current configuration values.
# pcs resource show --full
pcs constraint
Show Constraints
To show constraints, use the following command:
# pcs constraint show
To show full constraints, including constraint ID, use the following:
# pcs constraint show --full
Additional Resources
Useful web links: