Checking for swappers on XenServer

From Christoph's Personal Wiki

Revision as of 12:42, 5 September 2013

This article will outline the steps I take to check for swappers on my XenServer and XenClassic setups.

XenServer

  • Log into the huddle the host in question is located in
  • Log into the host the slice/instance is located on
  • Run iostat to check for any swappers on this host:
$ iostat -xkd 1
  • If swappers are found, cat the device of the slice to get its minor number:
$ cat /sys/block/tdk/dev
432:20 # this is the major:minor number of the device
  • Get the UUID for this slice

The `tap-ctl list` command lists all VHDs in use, along with each one's minor number and the process ID of the tapdisk2 process that is handling that .vhd file:

$ tap-ctl list | grep minor=20
# ~OR~
$ tap-ctl list -m 20
pid=12912 minor=20 state=0 args=vhd:/var/run/sr-mount/eee-eee-eee-eee-eee/fff-fff-fff-fff-fff.vhd

`tap-ctl list` identifies the .vhd file, and therefore the UUID of the VDI, behind the device that is generating the heavy I/O. In the example above, the VDI UUID is fff-fff-fff-fff-fff.

  • Check if this slice is a "swap" partition (make sure it is _not_ a root partition!):
$ xe vdi-list uuid=fff-fff-fff-fff-fff
uuid ( RO)                : fff-fff-fff-fff-fff
          name-label ( RW): slice01234
    name-description ( RW): 
             sr-uuid ( RO): 76fa2a40-5729-1297-a4d8-38df5ea4128c
        virtual-size ( RO): 163208757248
            sharable ( RO): false
           read-only ( RO): false
  • Check to make sure no other tasks are currently being performed on this host (except for "pending: reboot"-like statuses):
xe task-list
  • Now force a reboot
xe vm-reboot --force name-label=slice01234
  • Check that the slice has actually rebooted:
xe vm-list name-label=slice01234 params=start-time
  • Verify that the start time reported by the last command is close to the current time:
date
  • As a final step, you can bring up the slice's console to check for any errors, problems, or wait times (e.g., a running fsck):
xl console slice01234

That's it! You have successfully killed a swapper and brought the slice back up to a normal load.
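
The device-to-UUID lookup in the steps above can be wrapped in two small helpers. This is only a sketch: the function names `minor_of` and `vdi_uuid_for_minor` are made up for illustration, and it assumes `tap-ctl list` prints lines in the same `pid=... minor=... state=... args=vhd:/path/<uuid>.vhd` shape as the example output above.

```shell
# Hypothetical helpers; the names are illustrative, not part of XenServer.

# Print the minor number from a "major:minor" string,
# e.g. the contents of /sys/block/tdk/dev.
minor_of() {
  printf '%s\n' "${1##*:}"
}

# Read `tap-ctl list` output on stdin and print the VDI UUID
# (the .vhd basename) of the entry whose minor number equals $1.
vdi_uuid_for_minor() {
  awk -v want="minor=$1" '
    {
      hit = 0
      for (i = 1; i <= NF; i++) if ($i == want) hit = 1
      if (!hit) next
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^args=vhd:/) {
          n = split($i, parts, "/")
          uuid = parts[n]
          sub(/\.vhd$/, "", uuid)
          print uuid
        }
      }
    }'
}

# Possible usage on the host:
#   tap-ctl list | vdi_uuid_for_minor "$(minor_of "$(cat /sys/block/tdk/dev)")"
```

Comparing the whole `minor=N` field, rather than grepping for `minor=20`, avoids also matching an unrelated device such as `minor=200`.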

  • The following one-liner covers the identification steps above (without rebooting anything): it lists read and write throughput for each slice disk, sorted by total:
(echo "Slice IO_Read IO_Write Total"; \
(for uuid in $(xe vbd-list params=uuid | awk '$5{print $5}'); do \
  xe vbd-param-list uuid=$uuid | grep -P "^\s*(io_|vm-name-label|device)" | \
  awk '{if($1=="vm-name-label") name=$4; \
  if($1=="device") {\
    if($4=="xvdc" || $4=="xvdd") name=name"-swap"; \
    if($4=="xvda" || $4=="xvdb") name=name"-root";} \
  if($1=="io_read_kbs") ioread=$4; \
  if($1=="io_write_kbs") iowrite=$4}\
  END{if(substr(name,1,9)!="XenServer") print name" "ioread" "iowrite" "ioread+iowrite}';\
done) | sort -k4n) | column -t
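
To narrow that table down to likely swappers, its output could be piped through a small filter. The sketch below is an assumption-laden example: `busy_swaps` is a made-up name, the threshold is whatever you consider "heavy" (in the same units as the Total column), and it relies on the header being the first line and on swap disks carrying the `-swap` suffix that the script above appends.

```shell
# Hypothetical filter: keep the header line plus any "-swap" rows whose
# Total column (field 4) is at or above the threshold passed as $1.
busy_swaps() {
  awk -v min="$1" 'NR == 1 || ($1 ~ /-swap$/ && $4 + 0 >= min + 0)'
}

# Possible usage: append "| busy_swaps 1000" to the one-liner above.
```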

XenClassic

This is how you do the above on a XenClassic setup.

  • Log into the huddle the host in question is located in
  • Log into the host the slice/instance is located on
  • Run iostat to check for any swappers on this host:
iostat -xkd 1
  • If swappers are found, list the device mapper for this slice:
ls -l /dev/mapper/ | grep ' 20 '
  • Now "destroy" that slice (we are not permanently destroying this device, that's just the terminology):
xm destroy slice01234
  • Finally, re-create this slice (this takes information from a configuration flatfile found under /etc/xen/slice01234):
xm create slice01234
  • You can check that the server has fully booted back up with the following command:
xm console slice01234

That's it! You have successfully killed a swapper and brought the slice back up to a normal load.
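
The `grep ' 20 '` lookup above can also match a day of the month or another column that happens to read 20. A slightly stricter variant is sketched below; `mapper_name_for_minor` is a hypothetical name, and the sketch assumes the entries under /dev/mapper/ are block device nodes (lines starting with `b` in `ls -l`), not symlinks.

```shell
# Hypothetical helper: read `ls -l /dev/mapper/` output on stdin and print
# the name of the block device node whose minor number equals $1.
mapper_name_for_minor() {
  awk -v minor="$1" '
    /^b/ {
      for (i = 1; i < NF; i++) {
        # Block device lines show "major, minor" where regular files show a size.
        if ($i ~ /^[0-9]+,$/ && $(i + 1) == minor) { print $NF; next }
      }
    }'
}

# Possible usage on the host:
#   ls -l /dev/mapper/ | mapper_name_for_minor 20
```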

We can accomplish the above with a single script (well, a series of CLI calls) like so:

(echo "device name tps MB_read/s MB_write/s MB_total/s";\
(ls -l /dev/mapper/; iostat -m 1 2) |\
awk 'BEGIN {section=0} {\
  if($3=="root") devices[$6]=$9;\
  if($1=="Device:") section++;\
  else if(section==2 && $0) {\
    dev=$1; tps=$2; read=$3; write=$4; name="";\
    if(substr(dev,1,2)=="dm") {split(dev,parts,"-"); name=devices[parts[2]]} \
    if(!name) name="unknown";\
    print dev" "name" "tps" "read" "write" "read+write}}' |\
sort -k3n) | column -t

If the output above points to a swapping slice, restart it:

xm destroy slice01234 && xm create slice01234 && xm console slice01234
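
Because `xm create` depends on the configuration file, a missing file would leave the slice down after `xm destroy`. One way to guard against that is sketched below. The function name `restart_slice` and the `XEN_CONF_DIR` and `DRY_RUN` variables are invented for this example and are not Xen settings; the sketch also passes the full path of the configuration file to `xm create` instead of the bare slice name.

```shell
# Hypothetical wrapper around the destroy/create pair.
# XEN_CONF_DIR and DRY_RUN are illustrative knobs, not Xen settings.
restart_slice() {
  slice=$1
  conf="${XEN_CONF_DIR:-/etc/xen}/$slice"
  # Refuse to destroy a slice that could not be re-created afterwards.
  if [ ! -f "$conf" ]; then
    echo "refusing to destroy $slice: no config at $conf" >&2
    return 1
  fi
  # With DRY_RUN=1, only print what would be executed.
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "xm destroy $slice && xm create $conf"
    return 0
  fi
  xm destroy "$slice" && xm create "$conf"
}

# Possible usage:
#   DRY_RUN=1 restart_slice slice01234
#   restart_slice slice01234 && xm console slice01234
```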

XenClassic slice configuration file

Below is an example of what a slice configuration flatfile (which is, by default, located at /etc/xen/slice01234) would look like:

name="slice01234"
memory=512
vcpus=4
kernel="/etc/xen/seeds/01/vmlinuz-2.6.35.4-generic"
ramdisk="/etc/xen/seeds/01/initrd.img-2.6.35.4-generic"
vif=['bridge=eth0, ip=xxx.xxx.xxx.xxx, mac=4f:d0:c4:10:ab:0c','bridge=eth1, ip=10.x.x.x, mac=4f:e0:29:9f:c3:fe']
disk=[ 'phy:slices/slice01234_root,sda1,w', 'phy:slices/slice01234_swap,sda2,w' ]
root="/dev/sda1 ro"
cpu_weight=512
extra="xencons=tty console=tty1 clocksource=acpi_pm "

where,

memory = how much RAM should be allocated for the slice
vcpus = number of virtual CPUs that should appear within the customer's environment (default: 4)
kernel = which kernel seed the slice should use
ramdisk = the initrd that corresponds to the kernel being used by the slice
vif = sets the IPs and MAC addresses for the various network interfaces on the slice
disk = the storage volumes that the slice will use
root = the storage volume that should be used as the root filesystem
cpu_weight = weight used to calculate the slice's share of the CPU (set equal to the memory value)
extra_ips = if the slice has additional IPs, they will be listed here
extra = additional Xen settings
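
Since cpu_weight is meant to equal the memory value, a quick consistency check on a configuration file might look like the sketch below. Both function names are hypothetical, and the parsing only handles simple single-line `key=value` entries like the ones in the example above, not arbitrary configuration syntax.

```shell
# Hypothetical helper: print the value assigned to key $1 in config file $2,
# with surrounding double quotes removed. Simple key=value lines only.
conf_value() {
  sed -n "s/^$1[[:space:]]*=[[:space:]]*//p" "$2" |
    sed 's/^"//; s/"[[:space:]]*$//' |
    head -n 1
}

# Succeeds when cpu_weight matches memory in config file $1.
weight_matches_memory() {
  [ "$(conf_value memory "$1")" = "$(conf_value cpu_weight "$1")" ]
}

# Possible usage:
#   weight_matches_memory /etc/xen/slice01234 || echo "cpu_weight differs from memory"
```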