Benchmarking ZVOL vs .raw VM disk

I know this has been discussed here before, but I am trying to compare the performance of raw files for VM disks in Proxmox vs ZVOLs, which are the default when using the ZFS storage type. From Jim’s Klara article and a recent 2.5 Admins episode, I am under the impression that raw files should give better performance than ZVOLs, but I am trying and failing to replicate those results.

Here are the details of the test I ran:

  • PVE 8.4.1
  • ZFS pool: 4x SSDs in 2 mirrored vdevs

ZVol Setup

  • Created dataset ssdtank/vm-testing with the relevant properties:
root@proxmox:~# zfs get recordsize,compression,xattr,atime ssdtank/vm-testing
NAME                PROPERTY     VALUE           SOURCE
ssdtank/vm-testing  recordsize   64K             local
ssdtank/vm-testing  compression  on              default
ssdtank/vm-testing  xattr        sa              inherited from ssdtank
ssdtank/vm-testing  atime        off             inherited from ssdtank
  • Created Proxmox storage:
    • Storage Type: ZFS
    • Block Size: 64k
    • Thin Provision: No

Raw File Setup

  • Created dataset ssdtank/vm-testing-dir with the exact same dataset properties:
root@proxmox:~# zfs get recordsize,compression,xattr,atime ssdtank/vm-testing-dir
NAME                    PROPERTY     VALUE           SOURCE
ssdtank/vm-testing-dir  recordsize   64K             local
ssdtank/vm-testing-dir  compression  on              default
ssdtank/vm-testing-dir  xattr        sa              inherited from ssdtank
ssdtank/vm-testing-dir  atime        off             inherited from ssdtank
  • Created Proxmox storage:
    • Storage Type: dir
    • Preallocation: Off
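For reference, here’s roughly what the two setups above amount to from the CLI. This is a sketch from memory, not a transcript: the storage IDs are placeholders and the pvesm flags should be double-checked against `pvesm help`.

```shell
# Sketch (not a transcript): the two backends above, created from the CLI.
# Storage IDs are placeholders; verify flags against `pvesm help add`.

# ZVOL-backed storage: Proxmox carves a zvol out of this dataset per disk.
# The 64k here is the storage's volblocksize; the dataset recordsize does
# not apply to zvols.
zfs create -o recordsize=64k ssdtank/vm-testing
pvesm add zfspool vm-testing --pool ssdtank/vm-testing --blocksize 64k --sparse 0

# Raw-file-backed storage: plain .raw files inside the dataset, where the
# 64k recordsize actually does govern the on-disk block size.
zfs create -o recordsize=64k ssdtank/vm-testing-dir
pvesm add dir vm-testing-dir --path /ssdtank/vm-testing-dir --content images \
    --preallocation off
```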

VM Setup

I created 2 VMs with the exact same settings, but using the 2 different storage backends:

  • OS: Debian 13 (Trixie)
  • Qemu Agent: Yes
  • Disk Settings: all default, size: 32GB
  • CPU: 4x cores, type: host
  • Memory: 2G
  • All Debian installation defaults, all files in one partition
  • No desktop environment just “standard system utilities”

Benchmark

Since I’m optimizing these VMs for a basic “general” workload, I used a mixed read/write job with a 64k block size:

fio --name=rand-rw-64k --ioengine=posixaio --rw=randrw --rwmixread=25 --bs=64k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based

Results

ZVol Performance

read

  • 233 IOPS
  • 15.3 MB/s

write

  • 700 IOPS
  • 45.9 MB/s

Raw File Performance

read

  • 234 IOPS
  • 15.4 MB/s

write

  • 704 IOPS
  • 46.2 MB/s
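Sanity-checking my own numbers before comparing backends: bandwidth should just be IOPS times block size, and the read share of total IOPS should land near the requested rwmixread=25. A quick shell check using the ZVol figures:

```shell
# Sanity-check: bandwidth = IOPS x block size; read share ~= rwmixread.
bs=65536                             # the 64k block size from the fio job
echo $(( 233 * bs ))                 # 15269888 bytes/s ~= 15.3 MB/s (matches)
echo $(( 700 * bs ))                 # 45875200 bytes/s ~= 45.9 MB/s (matches)
echo $(( 233 * 100 / (233 + 700) ))  # 24 -> ~25% reads, matching rwmixread=25
```

So at least the run is internally consistent; the mix and block size are doing what was asked.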

Is there something influencing these results that I’m not aware of? Is this a limitation of Linux vs FreeBSD?

Well, I don’t know; you aren’t directly showing what proxmox is actually doing with the VM that you created “directory storage” for.

I’ve been told that it’s not that easy to get proxmox to use raw files in the first place. I suspect you might actually be testing zvol vs zvol there.

Can you directly show me the VM storage back end? You should be able to get a literal directory listing of the dataset that supposed raw file backed VM lives in.

It’s very likely I messed something up.

VM Config

root@proxmox:/ssdtank/vm-testing-dir/images/108# cat /etc/pve/qemu-server/108.conf
agent: 1
boot: order=scsi0;ide2;net0
cores: 4
cpu: host
ide2: local:iso/debian-13.0.0-amd64-netinst.iso,media=cdrom,size=754M
memory: 2048
meta: creation-qemu=9.2.0,ctime=1755217325
name: vmtest-raw
net0: virtio=BC:24:11:66:EF:D1,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: vm-testing-dir:108/vm-108-disk-0.raw,iothread=1,size=32G
scsihw: virtio-scsi-single
smbios1: uuid=4b96ddc7-befe-417a-8d61-13660d7a1496
sockets: 1
vmgenid: c7035fb5-12bf-4a12-ad53-d6b13c8020b7

[PENDING]
boot: order=scsi0;net0
delete: ide2

Directory Storage
(if there’s a better name for this, I’m all ears. The Proxmox manual just refers to it as a “storage”)

Raw File

root@proxmox:~# ls -alh /ssdtank/vm-testing-dir/images/108/
total 4.9G
drwxr----- 2 root root   3 Aug 14 19:22 .
drwxr-xr-x 3 root root   3 Aug 14 19:22 ..
-rw-r----- 1 root root 32G Aug 14 22:12 vm-108-disk-0.raw
root@proxmox:~# qemu-img info /ssdtank/vm-testing-dir/images/108/vm-108-disk-0.raw
image: /ssdtank/vm-testing-dir/images/108/vm-108-disk-0.raw
file format: raw
virtual size: 32 GiB (34359738368 bytes)
disk size: 4.88 GiB
Child node '/file':
    filename: /ssdtank/vm-testing-dir/images/108/vm-108-disk-0.raw
    protocol type: file
    file length: 32 GiB (34359738368 bytes)
    disk size: 4.88 GiB
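Side note on the qemu-img output above: 32 GiB virtual vs 4.88 GiB on disk means the .raw file is sparse, as expected with Preallocation: Off. The same apparent-vs-allocated split shows up with plain du; a quick demo on a throwaway file:

```shell
# Demo: apparent size vs actual allocation of a sparse file, the same
# distinction qemu-img reports as "virtual size" vs "disk size".
truncate -s 1G sparse.img          # 1 GiB apparent, ~nothing allocated yet
du -h --apparent-size sparse.img   # shows ~1.0G
du -h sparse.img                   # shows the blocks actually allocated (~0)
rm sparse.img
```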

That does look correct. The performance advantage of raw vs zvol applies on both Linux and FreeBSD, but I don’t use virtio-scsi, I just use virtio. That might be the difference.

Hmm, I don’t see just “virtio” as an option in the web UI

I’ll play around with the different options and see if one of them offers better performance.

you need to choose “VirtIO Block” as the bus/device type when adding a new hard disk.

Where in the config file might it say “virtio”? Would you be able to post an example config file? That would be very helpful. I’m having a hard time finding info on QEMU storage drivers.

Whew, okay, I did some more digging and here’s what I found.

virtio-scsi vs virtio-blk

In Proxmox, the default SCSI controller in the UI is “VirtIO SCSI (Single)”, which translates to a KVM argument of -device virtio-scsi-pci,.... Another device type is available even though it isn’t in the UI as a controller option: if you create a VM but don’t start it, you can edit /etc/pve/qemu-server/ID.conf and change the value of scsihw to virtio-blk. Once the VM is started, the KVM argument will be -device virtio-blk-pci,....
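For anyone else hunting for where the config file says “virtio”: the disk-bus route (matching the “VirtIO Block” suggestion above) is a virtioN: line instead of a scsiN: line. A hypothetical fragment, reusing the disk from my config above:

```
# virtio-blk: the disk is itself a virtio PCI device; no SCSI controller involved
virtio0: vm-testing-dir:108/vm-108-disk-0.raw,iothread=1,size=32G

# virtio-scsi (the default): the disk hangs off the controller chosen by scsihw
scsihw: virtio-scsi-single
scsi0: vm-testing-dir:108/vm-108-disk-0.raw,iothread=1,size=32G
```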

There’s not much info online about the difference between these two device types beyond this QEMU article, so I decided to add it as a testing dimension.

Setup

For this round of testing, I used the same ZFS datasets, VM/OS settings, and fio benchmark as I listed in the top post.

For “raw” storage, I used the “directory” Proxmox storage and chose a .raw file (not .qcow2 or .vmdk).

Results

| test | storage | controller | Read Speed | Read IOPS | Write Speed | Write IOPS |
|------|---------|------------|------------|-----------|-------------|------------|
| 1 | ZFS | virtio-scsi | 18.5 MB/s | 295 | 58.0 MB/s | 884 |
| 2 | ZFS | virtio-blk | 18.0 MB/s | 274 | 53.9 MB/s | 821 |
| 3 | Raw | virtio-scsi | 16.8 MB/s | 256 | 50.3 MB/s | 767 |
| 4 | Raw | virtio-blk | 16.1 MB/s | 224 | 48.0 MB/s | 731 |

Again, it seems like the ZFS storage type (which backs the disk with a ZVol rather than a raw file) gives the best performance. Both Jim and Allan agree that raw files are faster than ZVols, and I believe them, which is why it’s annoying that I cannot replicate those results in my own testing.


It’s been a minute since I spun up a proxmox VM. If you’re really curious and you want to try some more, consider maybe booting temporarily from a vanilla Ubuntu installer, importing the pool, and testing in the live environment?

At this point I’m curious too, because I’ve never seen zvols perform well in what is, at this point, decades of testing across multiple OSes and hypervisors.

Those results look very low across the board for a pool with two mirror vdevs of SSDs, though. There may simply be a nastier bottleneck hitting before the difference between raw files and zvols can make a difference.

Are you running proxmox on bare metal, or is there some other layer (eg TrueNAS) in play?

I’m running Proxmox on bare metal, specifically:

  • Mobo: MSI B450 GAMING PLUS MAX
  • CPU: AMD Ryzen 5 1600
  • RAM: Corsair Vengeance LPX DDR4 16GB (2x8GB)
  • Storage: 4x 1TB WD Blue SATA SSD
  • GPU: Sparkle Intel Arc A380 6GB

SSDs are connected directly to the motherboard. Build log is here and gives more details if useful.

Here’s the topology of my pool:

root@proxmox:~# zpool status ssdtank
  pool: ssdtank
 state: ONLINE
  scan: scrub repaired 0B in 00:24:05 with 0 errors on Sun Aug 10 00:48:08 2025
config:

        NAME                                    STATE     READ WRITE CKSUM
        ssdtank                                 ONLINE       0     0     0
          mirror-0                              ONLINE       0     0     0
            ata-WDC_WDBNCE0010PNC_19416F800207  ONLINE       0     0     0
            ata-WDC_WDBNCE0010PNC_200559800082  ONLINE       0     0     0
          mirror-1                              ONLINE       0     0     0
            ata-WDC_WDBNCE0010PNC_200608800160  ONLINE       0     0     0
            ata-WDC_WDBNCE0010PNC_200608802113  ONLINE       0     0     0

OHHHHH, I just caught it:

numjobs=1, iodepth=1. This is not a very reasonable benchmark for testing a storage stack, because it doesn’t model what you actually do with the storage very well, and it removes any possibility of reordering operations for efficiency, which is a key part of getting the performance your full storage stack actually delivers in real-world use.

Try numjobs=8, iodepth=8 if you want to get closer to replicating my results. I’d have to go back to my notes from creating the article to find the exact settings I used, but that’s generally about the level of concurrency I test with.

If you think that’s a bit too busy for your system, I’d still advise at least numjobs=4, iodepth=4.
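Spelled out as a full command line, that’s the same job as the top post with just the concurrency raised (a sketch, not the exact article settings; group_reporting added so the eight jobs report as one aggregate):

```shell
# The job from the top post, with the suggested concurrency bumped to 8x8.
# group_reporting aggregates the eight jobs into a single result line.
fio --name=rand-rw-64k --ioengine=posixaio --rw=randrw --rwmixread=25 \
    --bs=64k --size=4g --numjobs=8 --iodepth=8 --runtime=60 --time_based \
    --group_reporting
```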

Those Blues are holding you back horribly, btw. This is on a system with a single mirror vdev of Kingston DC600m, using the same fio parameters you used, including numjobs=1,iodepth=1:

Run status group 0 (all jobs):
   READ: bw=135MiB/s (141MB/s), 135MiB/s-135MiB/s (141MB/s-141MB/s), io=8095MiB (8488MB), run=60001-60001msec
  WRITE: bw=402MiB/s (421MB/s), 402MiB/s-402MiB/s (421MB/s-421MB/s), io=23.5GiB (25.3GB), run=60001-60001msec

You might be tempted to call foul because those Kingstons are enterprise-grade SSDs, but… DC600s are actually slower than fast consumer SATA SSDs on single-process workloads like these; the DC600’s focus is on hardware QoS that smooths out latencies when the device is under extremely heavy, massively parallel load… which it does at the expense of the raw single-process throughput this fio job represents.