
Virtualization

Running virtual machines on compute nodes

Introduction

In some situations Anselm's environment does not suit user needs:

  • The application requires a different operating system (e.g. Windows) or is not available for Linux
  • The application requires different versions of base system libraries and tools
  • The application requires a specific setup (installation, configuration) of a complex software stack
  • The application requires privileged access to the operating system
  • ... and combinations of the above cases

We offer a solution for these cases: virtualization. Anselm's environment makes it possible to run virtual machines on compute nodes. Users can create their own operating system images with a specific software stack and run instances of these images as virtual machines on compute nodes. Virtual machines are run through the standard mechanism of Resource Allocation and Job Execution.

The solution is based on the QEMU-KVM software stack and provides hardware-assisted x86 virtualization.

Limitations

Anselm's infrastructure was not designed for virtualization. Compute nodes, storage and all other infrastructure are intended and optimized for running HPC jobs, which implies a suboptimal virtualization configuration and several limitations.

Anselm's virtualization does not provide the performance or all the features of the native environment. There is a significant degradation in I/O performance (storage, network), so Anselm's virtualization is not suitable for I/O-intensive (disk, network) workloads.

Virtualization also has some drawbacks: it is not easy to set up an efficient solution.

The solution described in the Howto chapter is suitable for single-node tasks; it does not introduce virtual machine clustering.

!!! note Please consider virtualization as a last-resort solution for your needs.

!!! warning Please consult IT4Innovations support before using virtualization.

For running a Windows application (when neither the source code nor a native Linux version is available), consider using Wine, a Windows compatibility layer. Many Windows applications can be run using Wine with less effort and better performance than with virtualization.

Licensing

IT4Innovations does not provide any licenses for the operating systems and software of virtual machines. Users are (in accordance with the Acceptable use policy document) fully responsible for licensing all software running in virtual machines on Anselm. Be aware that licensing conditions for software in virtual environments can be complex.

!!! note Users are responsible for licensing the OS (e.g. MS Windows) and all software running in their virtual machines.

Howto

Virtual Machine Job Workflow

We propose this job workflow:

Workflow

We recommend that the job script creates a distinct shared job directory, which serves as the central point for data exchange between Anselm's environment, the compute node (host; e.g. HOME, SCRATCH, local scratch and other local or cluster file systems) and the virtual machine (guest). The job script links or copies the input data and the instructions for the virtual machine (the run script) into the job directory. The virtual machine processes the input data according to those instructions and stores the output back in the job directory. We also recommend running the virtual machine in so-called snapshot mode: the image is immutable (it does not change), so one image can be used for many concurrent jobs.

Procedure

  1. Prepare image of your virtual machine
  2. Optimize image of your virtual machine for Anselm's virtualization
  3. Modify your image for running jobs
  4. Create job script for executing virtual machine
  5. Run jobs

Prepare Image of Your Virtual Machine

You can either use your existing image or create a new image from scratch.

QEMU currently supports these image types or formats:

  • raw
  • cloop
  • cow
  • qcow
  • qcow2
  • vmdk - VMware 3 & 4, or 6 image format, for exchanging images with that product
  • vdi - VirtualBox 1.1 compatible image format, for exchanging images with VirtualBox.

You can convert your existing image using the qemu-img convert command. This command supports the following formats: blkdebug blkverify bochs cloop cow dmg file ftp ftps host_cdrom host_device host_floppy http https nbd parallels qcow qcow2 qed raw sheepdog tftp vdi vhdx vmdk vpc vvfat.

We recommend using qcow2, the advanced native QEMU image format.
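As a rough sketch of the two options above (creating a new image or converting an existing one), the commands might look as follows. The file names, the 40G size and the `run`/`DRY_RUN` helper are assumptions made for illustration. With `DRY_RUN=1` the commands are only printed, so the sketch can be tried on a machine where the qemu module is not loaded.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create a new, empty qcow2 image.
# $1 = image file, $2 = virtual disk size (e.g. 40G)
create_image() {
    run qemu-img create -f qcow2 "$1" "$2"
}

# Convert an existing image to qcow2.
# $1 = source format (e.g. vmdk, vdi, raw), $2 = source file, $3 = target file
to_qcow2() {
    run qemu-img convert -f "$1" -O qcow2 "$2" "$3"
}

# Hypothetical file names; DRY_RUN=1 only prints the commands.
DRY_RUN=1
create_image linux.img 40G
to_qcow2 vmdk win.vmdk win.img
```

To perform the operations for real, load the module first (`module add qemu`) and set `DRY_RUN=0`.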

More about QEMU Images

Optimize Image of Your Virtual Machine

Use virtio devices (for the disk/drive and the network adapter) and install virtio drivers (paravirtualized drivers) into the virtual machine. Virtio drivers bring a significant performance gain. For more information, see Virtio Linux and Virtio Windows.

Disable all unnecessary services and tasks. Restrict all unnecessary operating system operations.

Remove all unnecessary software and files.

Remove all paging space, swap files, swap partitions, etc.

Shrink your image. (We recommend zeroing all free space and then reconverting the image using qemu-img.)
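The shrinking step could be sketched as below for a Linux guest. This is an illustration under assumptions, not a prescribed procedure: the fill-file path, the image names and the `run`/`DRY_RUN` helper are hypothetical. The first function is meant to run inside the guest, the second on the host after the guest has been shut down.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Inside the guest: fill the free space with zeros, then delete the fill file.
# $1 = path of a temporary fill file on the file system to be zeroed.
# dd ends with "No space left on device" once the disk is full; this is expected.
zero_free_space() {
    run dd if=/dev/zero of="$1" bs=1M
    run rm -f "$1"
}

# On the host: rewrite the image into a new file.
# qemu-img convert leaves zeroed regions unallocated in the target image.
# $1 = source image, $2 = target image
shrink_image() {
    run qemu-img convert -O qcow2 "$1" "$2"
}

# Hypothetical names; DRY_RUN=1 only prints the commands.
DRY_RUN=1
zero_free_space /zero.fill
shrink_image linux.img linux-small.img
```

Writing to a new target file keeps the original image intact until the smaller copy has been checked.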

Modify Your Image for Running Jobs

Your image should run some kind of operating system startup script. The startup script should run the application and, when the application exits, shut down or quit the virtual machine.

We recommend that the startup script:

  • maps the Job Directory from the host (from the compute node)
  • runs a script (we call it the "run script") from the Job Directory and waits for the application to exit
    • for management purposes, if the run script does not exist, waits for some time (a few minutes)
  • shuts down/quits the OS

For Windows operating systems we suggest using a Local Group Policy Startup script; for Linux operating systems, use rc.local, a runlevel init script or a similar service.

Example startup script for Windows virtual machine:

    @echo off
    set LOG=c:\startup.log
    set MAPDRIVE=z:
    set SCRIPT=%MAPDRIVE%\run.bat
    set TIMEOUT=300

    rem Start a fresh log (single ^>), then append (^>^>) for the rest of the script
    echo %DATE% %TIME% Running startup script>%LOG%

    rem Mount share
    echo %DATE% %TIME% Mounting shared drive>>%LOG%
    net use z: \\10.0.2.4\qemu >>%LOG% 2>&1
    dir z:\ >>%LOG% 2>&1
    echo. >>%LOG%

    if exist %MAPDRIVE%\ (
      echo %DATE% %TIME% The drive "%MAPDRIVE%" exists>>%LOG%

      if exist %SCRIPT% (
        echo %DATE% %TIME% The script file "%SCRIPT%" exists>>%LOG%
        echo %DATE% %TIME% Running script %SCRIPT%>>%LOG%
        rem The run script was found, so do not wait before shutdown
        set TIMEOUT=0
        call %SCRIPT%
      ) else (
        echo %DATE% %TIME% The script file "%SCRIPT%" does not exist>>%LOG%
      )

    ) else (
      echo %DATE% %TIME% The drive "%MAPDRIVE%" does not exist>>%LOG%
    )
    echo. >>%LOG%

    rem Wait (only when no run script was executed) to allow manual access
    timeout /T %TIMEOUT%

    echo %DATE% %TIME% Shut down>>%LOG%
    shutdown /s /t 0

The example startup script maps the shared job directory as drive z: and looks for a run script called run.bat. If the run script is found, it is run; otherwise the startup script waits for 5 minutes. In both cases the virtual machine is then shut down.
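For a Linux guest, the same logic could be sketched as a shell script called from rc.local. This is an assumption-laden illustration rather than a tested configuration: the mount point /mnt/job, the run script name run.sh, the log path and the CIFS mount options are hypothetical and may need adjusting for your distribution. The mount and shutdown commands are held in variables so the logic can be tried outside a virtual machine.

```shell
#!/bin/sh
# Sketch of a Linux guest startup script; all paths below are assumptions.
LOG=${LOG:-/var/log/startup.log}
MAPDIR=${MAPDIR:-/mnt/job}
SCRIPT=${SCRIPT:-$MAPDIR/run.sh}
TIMEOUT=${TIMEOUT:-300}
# QEMU's user network exposes the smb= directory as //10.0.2.4/qemu.
MOUNT_CMD=${MOUNT_CMD:-mount -t cifs //10.0.2.4/qemu $MAPDIR -o guest}
SHUTDOWN_CMD=${SHUTDOWN_CMD:-shutdown -h now}

# Append a timestamped message to the log.
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >>"$LOG"
}

startup() {
    log "Running startup script"
    log "Mounting shared directory"
    # Unquoted on purpose: the variable holds a command with arguments.
    $MOUNT_CMD >>"$LOG" 2>&1

    if [ -d "$MAPDIR" ] && [ -f "$SCRIPT" ]; then
        log "Running script $SCRIPT"
        sh "$SCRIPT" >>"$LOG" 2>&1
    else
        # No run script: keep the machine up for a while for management access.
        log "Script $SCRIPT not found, waiting $TIMEOUT seconds"
        sleep "$TIMEOUT"
    fi

    log "Shut down"
    $SHUTDOWN_CMD
}
```

When installing the script into the guest, add a call to `startup` as its last line. It is left out here so that running the sketch as-is does not shut down the machine.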

Create Job Script for Executing Virtual Machine

Create the job script according to the recommended Virtual Machine Job Workflow.

Example job for Windows virtual machine:

    #!/bin/sh

    JOB_DIR=/scratch/$USER/win/${PBS_JOBID}

    # Virtual machine settings
    VM_IMAGE=~/work/img/win.img
    VM_MEMORY=49152
    VM_SMP=16

    # Prepare job dir
    mkdir -p "${JOB_DIR}" && cd "${JOB_DIR}" || exit 1
    ln -s ~/work/win .
    ln -s /scratch/$USER/data .
    ln -s ~/work/win/script/run/run-appl.bat run.bat

    # Run virtual machine
    # With -snapshot, QEMU writes disk changes to temporary files under TMPDIR
    export TMPDIR=/lscratch/${PBS_JOBID}
    module add qemu
    qemu-system-x86_64 \
      -enable-kvm \
      -cpu host \
      -smp ${VM_SMP} \
      -m ${VM_MEMORY} \
      -vga std \
      -localtime \
      -usb -usbdevice tablet \
      -device virtio-net-pci,netdev=net0 \
      -netdev user,id=net0,smb=${JOB_DIR},hostfwd=tcp::3389-:3389 \
      -drive file=${VM_IMAGE},media=disk,if=virtio \
      -snapshot \
      -nographic

The job script links the application data (win), the input data (data) and the run script (run.bat) into the job directory and runs the virtual machine.

Example run script (run.bat) for Windows virtual machine:

    z:
    cd win\appl
    call application.bat z:\data z:\output

The run script runs the application from the shared job directory (mapped as drive z:), processes the input data (z:\data) from the job directory and stores the output in the job directory (z:\output).

Run Jobs

Run jobs as usual, see Resource Allocation and Job Execution. Use only full node allocation for virtualization jobs.
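A submission with a full node allocation might look like the sketch below. The project ID OPEN-0-0, the queue qprod, the job script name job.sh and the `run`/`DRY_RUN` helper are placeholders for illustration. The value ncpus=16 assumes one whole 16-core node, matching VM_SMP=16 in the example job script. Check Resource Allocation and Job Execution for the values that apply to your project.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Submit a virtualization job on one full node.
# $1 = project ID, $2 = queue, $3 = job script
submit_vm_job() {
    run qsub -A "$1" -q "$2" -l select=1:ncpus=16 "$3"
}

# Placeholder values; DRY_RUN=1 only prints the command.
DRY_RUN=1
submit_vm_job OPEN-0-0 qprod ./job.sh
```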

Running Virtual Machines

Virtualization is enabled only on compute nodes; it does not work on login nodes.

Load QEMU environment module:

    $ module add qemu

Get help

    $ man qemu

Run virtual machine (simple)

    $ qemu-system-x86_64 -hda linux.img -enable-kvm -cpu host -smp 16 -m 32768 -vga std -vnc :0

    $ qemu-system-x86_64 -hda win.img   -enable-kvm -cpu host -smp 16 -m 32768 -vga std -localtime -usb -usbdevice tablet -vnc :0

You can access the virtual machine with a VNC viewer (option -vnc) by connecting to the IP address of the compute node. To use VNC, you must be connected through the VPN.

Install virtual machine from ISO file

    $ qemu-system-x86_64 -hda linux.img -enable-kvm -cpu host -smp 16 -m 32768 -vga std -cdrom linux-install.iso -boot d -vnc :0

    $ qemu-system-x86_64 -hda win.img   -enable-kvm -cpu host -smp 16 -m 32768 -vga std -localtime -usb -usbdevice tablet -cdrom win-install.iso -boot d -vnc :0

Run virtual machine using optimized devices, user network back-end with sharing and port forwarding, in snapshot mode

    $ qemu-system-x86_64 -drive file=linux.img,media=disk,if=virtio -enable-kvm -cpu host -smp 16 -m 32768 -vga std -device virtio-net-pci,netdev=net0 -netdev user,id=net0,smb=/scratch/$USER/tmp,hostfwd=tcp::2222-:22 -vnc :0 -snapshot

    $ qemu-system-x86_64 -drive file=win.img,media=disk,if=virtio -enable-kvm -cpu host -smp 16 -m 32768 -vga std -localtime -usb -usbdevice tablet -device virtio-net-pci,netdev=net0 -netdev user,id=net0,smb=/scratch/$USER/tmp,hostfwd=tcp::3389-:3389 -vnc :0 -snapshot
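The hostfwd rules in the commands above have the form proto:[hostaddr]:hostport-[guestaddr]:guestport, so hostfwd=tcp::2222-:22 forwards port 2222 on the compute node to port 22 in the guest. The small sketch below splits such a rule into its two ports and prints a matching ssh command. The function names and the guest user name "user" are hypothetical.

```shell
#!/bin/sh
# Print the host-side port of a hostfwd rule such as tcp::2222-:22.
hostfwd_host_port() {
    host_part=${1%%-*}          # everything before the first "-": tcp::2222
    echo "${host_part##*:}"     # text after the last ":"
}

# Print the guest-side port of a hostfwd rule.
hostfwd_guest_port() {
    guest_part=${1#*-}          # everything after the first "-": :22
    echo "${guest_part##*:}"    # text after the last ":"
}

rule=tcp::2222-:22
echo "host port $(hostfwd_host_port "$rule") -> guest port $(hostfwd_guest_port "$rule")"

# On the compute node, the SSH server of a Linux guest would then be reachable
# with the command printed here ("user" is a placeholder guest account):
echo "ssh -p $(hostfwd_host_port "$rule") user@localhost"
```

For the Windows example, the rule tcp::3389-:3389 exposes the guest's Remote Desktop port on the same port number of the compute node.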