X4500 ZFS and iSCSI Performance Characteristics

Benchmarks are useful in many ways, and they are particularly effective when you wish to validate an architectural design. In this case the design is a Sun X4500 acting as an iSCSI target for VMware ESX 3.5 servers with QLA4050c initiators. This benchmark is not a definitive measure of the architectural maximums of the X4500 or of the other components; it is a validation that the whole system performs as expected in its current configuration. Within this configuration several components have expected limitations, such as Ethernet switch buffering and flow control rates. We also need to recognize that any iSCSI configuration has a characteristic collapsing point and inherent latency caused by packet saturation peaks. In this architecture we can expect several specific limits, such as roughly 60% effective utilization of a 1Gb Ethernet connection before latency issues become prevalent. Additionally, with SATA disk interfaces we can expect high rates of small I/O to perform poorly when that behavior is sustained or occurs very frequently.
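To put the 60% figure into throughput terms, here is a rough back-of-the-envelope sketch. It assumes decimal units (1Gb/s = 125MB/s raw), ignores Ethernet, TCP and iSCSI header overhead, and treats the 60% threshold as a simple multiplier. The function name and parameters are only illustrative.

```python
def usable_mb_per_s(links, link_gbit=1.0, utilization=0.60):
    """Rough payload budget in MB/s (decimal) before latency is expected to climb.

    Assumptions: protocol overhead is ignored and the utilization
    ceiling is applied as a plain multiplier on the raw line rate.
    """
    bits_per_s = links * link_gbit * 1e9 * utilization
    return bits_per_s / 8 / 1e6


# One 1Gb link, then four links (the 4Gb aggregate mentioned later in this post).
print(round(usable_mb_per_s(1)))
print(round(usable_mb_per_s(4)))
```

Under these assumptions a single link budgets about 75MB/s and a four-link aggregate about 300MB/s, which is a handy yardstick when reading the iSCSI graph further down.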

The design details are on blog entry http://blog.laspina.ca/ubiquitous/running_zfs_over_iscsi_as

What becomes important to consider with this architecture is the cost to performance ratio, and on that measure this design is very attractive. The combined components of this system perform very well in this context, and the system delivers some pleasant surprises. When we look at the performance of this design, some elements seem to escape the traditional behaviors of the subcomponents involved. Within the current limiting parameters we would expect it to underperform in more use cases than not, but this is not occurring. There are several reasons for this result. ZFS is a big factor, since it commits writes in transactional blocks, which is complementary to SATA interface behavior. SATA interfaces work better with larger transfer segments than with small, short transfer operations, and thus ZFS optimizes performance over SATA disk arrays. Another factor is that the virtualization layer on the VMware hosts consolidates many of the smaller I/O operations and delivers larger read and write transfer requests, provided that we use vmdk files when provisioning virtual disk devices.

In the first graph we are observing the results of a locally executed dd command collected with iostat as follows:

dd if=/dev/zero of=/rp1/iscsi/iotest count=1024k bs=64k

iostat -x 15 13 (only the last 12 outputs are used for the graph plot)
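For anyone wanting to reproduce the plot, here is a minimal sketch of how the iostat output could be reduced to one throughput value per interval. It assumes the Solaris `iostat -x` layout with a header row beginning with `device` and `kr/s` / `kw/s` columns, and it assumes those columns are in units of 1024 bytes. The first report is dropped, presumably for the same reason it was left out of the graph: iostat's first report covers the time since boot rather than the sampling interval. With 13 reports at 15 second intervals that leaves 12 x 15s = 3 minutes. The sample text is made-up illustrative data, not output from the X4500.

```python
def parse_iostat(text):
    """Sum kr/s and kw/s across all devices for each iostat -x report."""
    reports = []
    columns = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "device":
            # A header row marks the start of a new report.
            columns = fields
            reports.append({"kr/s": 0.0, "kw/s": 0.0})
        elif columns and len(fields) == len(columns):
            row = dict(zip(columns, fields))
            for key in reports[-1]:
                reports[-1][key] += float(row[key])
    return reports


def interval_write_mb_s(text):
    """Write MB/s per interval, skipping the first (since-boot) report."""
    return [r["kw/s"] / 1024 for r in parse_iostat(text)[1:]]


# Illustrative data only (two devices, two reports).
SAMPLE = """\
                 extended device statistics
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
sd0       1.0    2.0   10.0   20.0  0.0  0.0    1.0   0   1
sd1       1.0    2.0   10.0   20.0  0.0  0.0    1.0   0   1
                 extended device statistics
device    r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
sd0       0.0  250.0    0.0 16000.0  0.0  1.0    4.0   0  90
sd1       0.0  250.0    0.0 16000.0  0.0  1.0    4.0   0  90
"""

print(interval_write_mb_s(SAMPLE))
```

The same function works for the read test by summing `kr/s` instead of `kw/s`.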

 

The graph reveals excellent 36-disk RAID-Z write performance of 600MB/s sustained over a 3 minute period.

Other RAID modes, such as 24 RAID 1 mirror pairs, can provide significantly better performance. However, they do not make optimal use of the available disk capacity and are not required for this application.

 

 X4500 Write Performance by Mike La Spina

 

This next graph reveals excellent read performance of 800MB/s sustained over a 3 minute period. A dd command was again used as follows:

dd if=/rp1/iscsi/iotest of=/dev/null count=1024k bs=64k

             

 X4500 Read Performance by Mike La Spina

This final graph plots the iostat values collected from the X4500 while the ESX 3.5 initiators were carrying real-time active application loads over the iSCSI network for a 3 minute period. Additionally, 8 virtual machines were added to the production real-time loads and were executing Microsoft's SQLIO tool against eight 2GB files, sustaining 100% writes at a block size of 64k.

We can observe 220MB/s of sustained I/O while both read and write activity was present, and also a surprising 320MB/s peak of final write activity. While this is not the maximum attainable across the possible configurations, it certainly validates the performance as excellent and definitely meets the cost to performance design objectives.

iSCSI Performance by Mike La Spina

There are some small improvements available to optimize this design's performance. Using jumbo frames on the network side will provide better performance for TCP stack operations, especially when using software iSCSI initiators on VMware. More importantly, using a RAID-Z array of 44 drives and two spares will improve I/O performance by 15-20% at zero additional cost. The option to upgrade to 10Gb Ethernet is also a next step if required, as the X4500 can deliver much more than the current 4Gb aggregate.
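To give a feel for why jumbo frames help the TCP stack, here is a small sketch that counts how many TCP segments one 64k I/O needs at each MTU and how much of the wire time carries payload. It assumes 40 bytes of IPv4 + TCP headers without options, 38 bytes of per-frame Ethernet overhead (preamble, header, FCS and inter-frame gap), and it ignores iSCSI PDU headers and VLAN tags, so treat the numbers as approximations.

```python
import math

IP_TCP_HEADERS = 40      # IPv4 (20) + TCP (20), no options assumed
ETHERNET_OVERHEAD = 38   # preamble 8 + header 14 + FCS 4 + inter-frame gap 12


def segments_per_io(io_bytes, mtu):
    """Number of full-size TCP segments needed to carry one I/O."""
    return math.ceil(io_bytes / (mtu - IP_TCP_HEADERS))


def wire_efficiency(mtu):
    """Fraction of on-wire bytes that are TCP payload for a full-size frame."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETHERNET_OVERHEAD)


for mtu in (1500, 9000):
    print(mtu, segments_per_io(64 * 1024, mtu), round(wire_efficiency(mtu), 3))
```

Under these assumptions a 64k transfer drops from 45 segments to 8, so the bigger win is in per-packet processing on the initiator rather than in raw wire efficiency, which only moves from about 95% to about 99%.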

Regards,

Mike

 


Site Contents: © 2008  Mike La Spina
