How do you approach VM snapshotting and quiescence?

muay_throwaway · January 11, 2024, 4:05am

For those hosting VMs, what is your approach to quiescence for backups? Do you just rely on journaling (no quiescence)? Do you quiesce the filesystems/applications prior to backup (flushing transactions, etc.)? Do you do full memory-state snapshots (which are in turn backed up on other storage)?

mercenary_sysadmin · January 11, 2024, 5:11pm

I rely on the journaling filesystems and database engines (and ACID compliant db-backed apps) running inside the VMs for crash consistency, when it comes to automated snapshots.

Generally speaking, any modern tech SHOULD very much be crash consistent. That’s something to directly investigate, of course, but at this point when you discover things which are not crash consistent, you can and should immediately root out and replace them before they can cause you problems. This is true whether you’re virtualizing or not: even on bare metal, crashes obviously do happen, and it’s simply unacceptable not to build for consistency across them anymore.

On the other hand, if I’m performing an operation that demands a clean snapshot, I will absolutely quiesce the VM before manually taking a clean snapshot. It’s a belt-and-suspenders approach.

So, if I value the suspenders in addition to the belt when manually taking clean snapshots, why don’t I set up automated quiescence as a part of my automated snapshot routine? The answer is that the automation is considerably dumber than I am. I trust myself to do complex tasks intelligently, but any scripting I build cannot be as flexible and responsive. So if I build in quiescence routines, I also build in a much larger area in which bugs can crop up, with consequences ranging from “the VM or apps did not unquiesce correctly” to “snapshots aren’t being taken at all because the quiescence routine didn’t complete as intended” (very obvious example of this: Microsoft’s Volume Shadow Copy service, which has triggers to quiesce DB engines, and as a result VERY frequently just doesn’t work at all on volumes that have a DB engine on them somewhere).

So, how DO I quiesce my workloads when taking a clean snapshot manually? For me, everything is always inside a VM… so I typically just shut the VM down before the snapshot. Doesn’t get any cleaner than that. If you can’t live with a reboot, there’s nothing left but full system state dumps along with your snapshot, and that tends to have a pretty severe performance penalty to go along with it.
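As a rough illustration of the shut-down-then-snapshot approach, here is a minimal sketch. It assumes a libvirt-managed guest whose disk lives on a ZFS dataset; the post names neither, so `virsh`, `zfs snapshot`, the `clean_snapshot` function name, and the domain, dataset, and timeout arguments are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch only: assumes libvirt (virsh) and ZFS. The function name and all
# arguments are hypothetical placeholders, not taken from the post.

clean_snapshot() {
    local domain=$1 dataset=$2 snapname=$3 timeout=${4:-300}
    local waited=0 rc=0

    # Ask the guest to shut down cleanly; the guest OS must honor the request.
    virsh shutdown "$domain" || return 1

    # Poll until libvirt reports the domain as shut off, or give up.
    while [ "$(virsh domstate "$domain")" != "shut off" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for $domain to shut down" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # Nothing is running inside the guest now, so the snapshot is clean.
    zfs snapshot "${dataset}@${snapname}" || rc=$?

    # Bring the guest back up whether or not the snapshot succeeded.
    virsh start "$domain" || rc=$?
    return "$rc"
}
```

A call might look like `clean_snapshot myvm tank/images/myvm pre-upgrade 120`. On a timeout the function returns without starting or killing the guest, so a person decides what to do next.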

If your VM is a tiny little throwaway, a full system state snapshot of it isn’t such a big deal. But the VMs that you just can’t bear to reboot are, in my experience, generally large and powerful beasts: and if your large and powerful beast has 16GiB of RAM, that’s 16GiB of writes you have to commit to disk every time you take that full system state snapshot.
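To put a rough number on that penalty, the sketch below estimates how long the memory dump alone would take. The 200 MiB/s sustained write throughput is purely an assumed figure for illustration, and `state_dump_seconds` is a hypothetical helper name; real storage will differ.

```shell
#!/usr/bin/env bash
# state_dump_seconds RAM_GIB THROUGHPUT_MIB_PER_S
# Prints the whole seconds (rounded up) needed to write RAM_GIB of guest
# memory at the given sustained throughput. Both inputs are assumptions
# you supply; nothing here measures real hardware.
state_dump_seconds() {
    local ram_gib=$1 mib_per_s=$2
    local total_mib=$((ram_gib * 1024))
    # Integer ceiling division: (a + b - 1) / b
    echo $(( (total_mib + mib_per_s - 1) / mib_per_s ))
}

state_dump_seconds 16 200   # prints 82
```

So under that assumption, each full system state snapshot of a 16GiB guest adds well over a minute of sustained writes on top of the snapshot itself.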

muay_throwaway · January 11, 2024, 9:40pm

Great and comprehensive points; thank you! Yes, it does seem modern systems are generally crash safe.

Halfwalker · January 11, 2024, 10:33pm

I mostly rely on the VM filesystem’s internal checks to handle things. I’m a little more careful with databases, though it should be fine. I use a method like this for a system with mysql-on-zfs:

snapshot_mysql() {
    echo " =- Stopping sphinxsearch, locking mysql"
    systemctl stop sphinxsearch

    mysql_locked=/var/run/mysql_locked
    # flush & lock MySQL, touch mysql_locked, and wait until it is removed
    mysql -NB <<-EOF &
        flush tables with read lock;
        delimiter ;;
        system touch $mysql_locked
        system while test -e $mysql_locked; do sleep 1; done
        exit
EOF

    # wait for the preceding command to touch mysql_locked
    i=1
    while ! test -e $mysql_locked; do
        echo -en "\r    Waiting for mysql to lock tables $i"
        sleep 1
        i=$(($i+1))
    done
    echo ""

    # take a snapshot of the filesystem, while MySQL is being held locked
    for dataset in ${db_datasets} ; do
        if [[ $(echo ${src_snaps} | grep -c "${db_pool}/${dataset}@${new_snap}") -eq 0 ]] ; then
            echo "    Taking snapshot ${db_pool}/${dataset}@${new_snap}"
            zfs snap ${db_pool}/${dataset}@${new_snap}
        else
            echo "    Snapshot exists ${db_pool}/${dataset}@${new_snap}"
        fi
    done

    # unlock MySQL
    rm -f $mysql_locked

    echo " =- Restarting sphinxsearch, unlocking mysql"
    systemctl start sphinxsearch
}

It’s a circular spin-lock basically. Mysql does a flush-with-read-lock, then sets a flag-file and waits for that to disappear. Externally, the script waits for that flag-file to appear, then snaps the datasets and removes the flag-file. When it gets removed, mysql exits the sleep loop.

Works well.
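One possible hardening of the wait loop, given the earlier point in this thread about quiescence routines that never complete: bound the wait so a stuck lock cannot hang the backup forever. The read lock only holds while the mysql client session stays connected, so if the flag file never appears the caller can kill the background client and give up. This is a sketch, not part of the original script; `wait_for_flag` and the 60-second figure are hypothetical.

```shell
#!/usr/bin/env bash
# wait_for_flag FILE TIMEOUT_SECONDS
# Returns 0 as soon as FILE exists, or 1 if it has not appeared after
# roughly TIMEOUT_SECONDS. Hypothetical helper, not from the original post.
wait_for_flag() {
    local flag=$1 timeout=$2 waited=0
    while [ ! -e "$flag" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use inside snapshot_mysql, right after the backgrounded mysql
# client is started (so $! is still that client's PID):
#
#   mysql_pid=$!
#   if ! wait_for_flag "$mysql_locked" 60; then
#       kill "$mysql_pid" 2>/dev/null   # ending the session drops the lock
#       rm -f "$mysql_locked"
#       systemctl start sphinxsearch
#       return 1
#   fi
```

Whether 60 seconds is sensible depends on how long a flush normally takes on the system in question.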

muay_throwaway · January 11, 2024, 11:40pm

Thank you! That’s a useful and creative solution!