Destaging Writes from Acceleration Tier to Primary Storage – Part II

In the part I of this series, I introduced FVP’s asynchronous data destaging in write-back mode from flash to the primary storage. I discussed the various nuances of destaging and showed how asynchronous destaging helps applications by providing flash class latency using a typical I/O workload. In this blog, I will discuss the implications of accelerating a write-intensive workload and the impact of asynchronous destaging on the workload performance.

Accelerating Write Intensive workloads 

A VM running a bursty-write workload was used for this test. During the testing period, the workload issued only writes which peaked to a very high value periodically. This VM was selected to be accelerated by FVP and was put in write-back mode. Figure. 1 shows the write operations observed by the VM during the entire testing period. Writes reached as high as 15K/sec during the peak periods but were only ~250/sec otherwise. All the writes were serviced by the flash device during the entire testing period including the bursty periods. However unlike the experiment in part I of this series, during this test the primary storage couldn’t service writes at the same rate as that issued by the VM. As a result, the rate of destaging VM’s data from flash to the primary storage was slower (11K/sec) than the rate of writes issued by the VM (15K/sec) which meant all of the VM’s data couldn’t be destaged as soon as they arrived during the bursty period. Thanks to FVP, the writes were acknowledged as soon as they arrived allowing the VM to issue more writes, but were sent to the primary storage at a rate the storage was comfortable of handling. The non-overlapping write peaks in figure. 1 illustrates this behavior and highlights the advantage of having an acceleration tier that can service writes as soon as they arrive, but sends the data to its permanent residence asynchronously without overwhelming it.

(Click to enlarge)

IOPS-FCFig 1. Write Operations

As the VM starts issuing writes, the writes got serviced by flash at flash speed (flash  + network speed, when using peers) as shown in fig 2. However, since the rate of writes from the VM outpaced the rate of destaging, destaging region saw a continuous increase in the amount of data to be destaged.  FVP continued to service writes at flash speed till the occupancy of destaging region reached a threshold. If the occupancy crosses the threshold, FVP starts injecting additional latency when acknowledging a write back to the VM to throttle new writes. This threshold is a carefully selected value that gives destager enough cushion to flush the dirtied data even if the primary storage is slow in servicing writes. The injected latency depends on the destaging area occupancy and the SAN latency (latency experienced by destager when writing dirty blocks to the primary storage) and is added when acknowledging only those writes that fill the destaging area above the threshold. Thus, the effective write latency (blue line) seen by the VM during bursty write periods was higher than flash latency (orange line), but much lower than datastore latency (green line).

(click to enlarge)

Latency-FCFig 2. Latency of Write Operations

The throttling aggressiveness is determined using an intelligent algorithm and adjusts dynamically to maintain the occupancy of the destaging region under the threshold. If the occupancy doesn’t reduce, FVP increases the throttling rate further until destager is able to empty enough data from the destaging region so that the occupancy falls below the threshold. As soon as the occupancy drops below the threshold, FVP resumes servicing writes at flash speed. However, most often writes from enterprise applications occur in short-spurts. The default size chosen for the destaging area is adequate to handle the spurts. Writes, in such cases, should be serviced at flash speed.

In summary, even for write-intensive workloads,  FVP can still provide an SLA that is much better than that promised by the primary storage technologies available today. Even a high barrage of writes is easily handled by FVP at flash like latencies. With its intelligent capabilities, FVP handles the burst even when primary storage is incapable of handling it.

UP NEXT: Accelerating Write-only Workloads ….

Resources:

  1. Iometer configuration file used for the test: bursty_writes
  2. Destaging Writes from Acceleration Tier to Primary Storage – Part I

Running Virtual Center Database in a Virtual Machine

I just completed an interesting project. For years, we at VMware believed that SQL server databases run well when virtualized. We have illustrated this through several benchmark studies published as white papers. It was time for us to look at real applications. One such application that can be found in most of the vSphere based virtual environments is the database component of the vCenter server (the brain behind a vSphere environment). Using the vCenter database as the application and the resource intensive tasks of the vCenter databases (implemented as stored procedures in SQL server-based databases) as the load generator, I compared the performance of these resource intensive tasks in a virtual machine (in a vSphere 4.1 host) to that in a native server.

From a sheer size perspective, the database doesn’t pop out many eyes. But from the criticality standpoint it can give sleepless nights. The database though ~93GB in size, included an inventory that in reality can represent a very large vSphere based virtual datacenter. With 500 hosts and 8000 virtual machines, performance of this vCenter database becomes extremely important. The tasks whose performance I studied are among the most common operations that can affect the performance of a vCenter database. These operations included:

  1. Rolling up performance statistics
  2. Calculating top resource consumers
  3. Purging old performance statistics

I won’t go into the details of these operations. Refer to the white paper for an explanation. Though I expected the performance of the virtualized database to be very close that of the native database, I didn’t expect it to be this close. Both in terms of CPU utilization and I/O performance, the virtual machine was very close to native (in some cases better than native!). I also observed an unusual behavior of native I/O stack during my experiments (I will blog about it next time, but for now check the appendix of the paper).

Here, this is for you VI admins – if you are thinking of virtualizing vCenter database (why not? you can virtualize anything and everything these days ;-)) here is a study that should give you the confidence to take the next step. If you have any comments to share with me or rest of the community, interesting facts, or tough performance issues, feel free to drop a comment or two here.

Oh!, BTW here is the link to the white paper.

Previous studies:

  1. http://www.vmware.com/files/pdf/perf_vsphere_sql_scalability.pdf
  2. http://www.vmware.com/pdf/SQL_Server_consolidation.pdf
  3. http://blogs.vmware.com/performance/2009/07/summary——–vmware-distributed-resource-scheduler-drs-dynamically–allocates-and-balances-computing-resources-in-a-clust.html
  4. http://communities.vmware.com/blogs/chethank/tags/performance
%d bloggers like this: