NetApp VMware VAAI performance test, part II: iSCSI
A few days ago I posted some performance test results with VMware vStorage APIs for Array Integration (VAAI). I ran the tests again, but instead of Fibre Channel I am using iSCSI this time.
The hardware is the same as before, except for the connection:
- Cisco UCS B200 blade system with two X5550 sockets and 48GB of RAM
- NetApp FAS2040 with 12×300 FC 15krpm disks. For this test I’ve created an aggregate of 9 disks.
In the last test I had a 4Gb Fibre Channel connection; now I have plain 1Gb iSCSI. I repeated the same steps. The only difference is the size of the VMDK I added to the virtual machine:
- add a new Hard Disk to the VM, 50GB (was 100GB in the FC post), thick, cluster supported (zeroed)
- clone the VM (with the added disk) within the same LUN
- clone the VM to another LUN
- Storage VMotion the VM
It’s not a surprise, the trend is the same.
| Operation | VAAI enabled (min:sec) | VAAI disabled (min:sec) |
| --- | --- | --- |
| 50GB VMDK creation with cluster support (zeroed) | 5:09 | 9:36 |
| Clone VM within datastore (LUN) | 8:36 | 13:38 |
| Clone VM between datastores (LUN) | 8:34 | 14:36 |
| Storage VMotion | 9:38 | 14:45 |
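To make the trend in the table easier to compare, here is a small sketch that turns the timings into speedup factors. It assumes the table values are minutes:seconds; the names `to_seconds`, `speedup` and `RESULTS` are made up for this illustration.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '5:09' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(enabled: str, disabled: str) -> float:
    """How many times faster the operation ran with VAAI enabled."""
    return to_seconds(disabled) / to_seconds(enabled)


# Timings copied from the table above: (VAAI enabled, VAAI disabled).
RESULTS = {
    "50GB VMDK creation (zeroed)": ("5:09", "9:36"),
    "Clone VM within datastore": ("8:36", "13:38"),
    "Clone VM between datastores": ("8:34", "14:36"),
    "Storage VMotion": ("9:38", "14:45"),
}

for operation, (enabled, disabled) in RESULTS.items():
    saved = to_seconds(disabled) - to_seconds(enabled)
    print(f"{operation}: {speedup(enabled, disabled):.2f}x faster, {saved} s saved")
```

Under that assumption, every operation in this test finishes roughly 1.5 to 1.9 times faster with VAAI enabled, and the zeroed VMDK creation benefits the most.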
A typical VAAI operation looks like this on the storage controller:
The CPU load is almost 100% and the operation count (ops/sec) is near zero. With VAAI disabled it looks different:
The CPU usage is 80%, and the chart above shows around 900-1000 iSCSI ops/sec. This is for the simple write (zeroed VMDK creation); for clone and Storage VMotion it looks a bit different:
The CPU load is the same, but the iSCSI rate is closer to ~2000 ops/sec. Let’s see what’s happening on the ESXi host.

With VAAI enabled there is no read or write rate (the host itself does no reads or writes), but the charts show latency of around 8-10ms. With VAAI disabled the chart looks a bit different. For the VMDK creation the write rate is around 100,000KBps with 160ms latency (write only, no reads). The read/write operations (clone and Storage VMotion) show a 70,000KBps I/O rate with 10-15ms latency.
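As a rough sanity check, the host-side write rate can be compared with the measured VMDK creation time when VAAI is disabled. This is only a sketch: it assumes 1GB = 1024 × 1024 KB, that the 100,000KBps figure is a steady average, and that 9:36 means 9 minutes 36 seconds. The function names are made up for this illustration.

```python
KB_PER_GB = 1024 * 1024  # assumption: binary units


def transfer_seconds(size_gb: float, rate_kbps: float) -> float:
    """Time needed to write size_gb at a constant rate of rate_kbps."""
    return size_gb * KB_PER_GB / rate_kbps


def average_rate_kbps(size_gb: float, seconds: float) -> float:
    """Average rate implied by writing size_gb in the given time."""
    return size_gb * KB_PER_GB / seconds


measured_seconds = 9 * 60 + 36  # 9:36 from the table, VAAI disabled

estimated = transfer_seconds(50, 100_000)
implied = average_rate_kbps(50, measured_seconds)

print(f"Estimated time at 100,000KBps: {estimated:.0f} s")
print(f"Measured time: {measured_seconds} s")
print(f"Average rate implied by the measured time: {implied:.0f} KBps")
```

With these assumptions the estimate is about 524 seconds against a measured 576 seconds, so the chart and the stopwatch roughly agree. For context, a 1Gb link tops out at 125MB/s before protocol overhead, so the host is writing close to line rate.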
Update: Yellow-Bricks author Duncan Epping was kind enough to include this post in his “VAAI sweetness” article with other examples. Must read!



Nice post. Question though: why does the CPU run at 100%?
Hi Dan,
Thanks for your comment/question.
It doesn’t necessarily have to run at 100%. The reason is that with VAAI, vSphere offloads the operation from the host to the storage controller. This literally means the copy happens inside the array. You have your data (in this case a VMDK file) in a LUN and you have to copy or move it to the same or another LUN (clone or Storage VMotion). With VAAI the storage controller does the copy or move on behalf of vSphere. Of course this puts additional load on the storage side, and there is no load on the ESX(i) side. Normal operation (I mean, without VAAI) puts around 80% load on the storage CPU (in this case, see the charts above), so 100% is not a big increase here. On the other hand, according to the Hardware Universe doc the FAS2040 has a dual-core CPU, and I think this load affects only one of the cores (I have to do more sophisticated charts). The FAS2040 is the smallest NetApp array with ONTAP8 support (ONTAP7 doesn’t support VAAI). So sizing storage and host for VAAI operation is more difficult, but nice job