Destaging Writes from Acceleration Tier to Primary Storage – Part II

In part I of this series, I introduced FVP’s asynchronous destaging of data from flash to the primary storage in write-back mode. I discussed the various nuances of destaging and showed how asynchronous destaging helps applications by providing flash-class latency for a typical I/O workload. In this blog, I will discuss the implications of accelerating a write-intensive workload and the impact of asynchronous destaging on its performance.

Accelerating Write-Intensive Workloads

A VM running a bursty-write workload was used for this test. During the testing period, the workload issued only writes, which periodically peaked to very high values. The VM was accelerated by FVP and put in write-back mode. Figure 1 shows the write operations observed by the VM during the entire testing period. Writes reached as high as 15K/sec during the peak periods but were only ~250/sec otherwise. All writes were serviced by the flash device throughout the test, including the bursty periods. However, unlike the experiment in part I of this series, the primary storage in this test could not service writes at the rate the VM issued them. As a result, the rate of destaging the VM’s data from flash to the primary storage (11K/sec) was slower than the rate of writes issued by the VM (15K/sec), which meant not all of the VM’s data could be destaged as soon as it arrived during the bursty period. Thanks to FVP, the writes were acknowledged as soon as they arrived, allowing the VM to issue more writes, while the data was sent to the primary storage at a rate the storage could comfortably handle. The non-overlapping write peaks in Figure 1 illustrate this behavior and highlight the advantage of an acceleration tier that services writes as soon as they arrive but sends the data to its permanent residence asynchronously, without overwhelming it.
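
To make the rate mismatch concrete, here is a minimal back-of-the-envelope sketch (mine, not FVP code) that models the destaging area as a simple queue fed at 15K writes/sec during a burst and drained at 11K writes/sec. The burst length and the quiet-period rate are assumptions chosen to mirror Figure 1; the point is simply that the backlog grows while the burst lasts and empties afterwards, which is why the destage peak trails the VM’s write peak.

```python
# Toy model of the destaging backlog during a write burst (illustrative only).

BURST_RATE = 15_000      # writes/sec issued by the VM during the burst
DESTAGE_RATE = 11_000    # writes/sec the primary storage can absorb
QUIET_RATE = 250         # writes/sec outside the burst
BURST_SECONDS = 10       # assumed burst length

backlog = 0
for second in range(BURST_SECONDS + 10):
    arrivals = BURST_RATE if second < BURST_SECONDS else QUIET_RATE
    drained = min(backlog + arrivals, DESTAGE_RATE)   # destager drains at its own pace
    backlog = backlog + arrivals - drained            # data still waiting on flash
    print(f"t={second:2d}s  arrived={arrivals:5d}  destaged={drained:5d}  backlog={backlog:6d}")
```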

Fig 1. Write Operations

As the VM started issuing writes, they were serviced by flash at flash speed (flash + network speed, when using peers), as shown in Figure 2. However, since the rate of writes from the VM outpaced the rate of destaging, the destaging region saw a continuous increase in the amount of data waiting to be destaged. FVP continued to service writes at flash speed until the occupancy of the destaging region reached a threshold. Once the occupancy crosses the threshold, FVP starts injecting additional latency when acknowledging writes back to the VM in order to throttle new writes. This threshold is a carefully selected value that gives the destager enough cushion to flush the dirtied data even when the primary storage is slow in servicing writes. The injected latency depends on the destaging-area occupancy and the SAN latency (the latency experienced by the destager when writing dirty blocks to the primary storage) and is added only when acknowledging writes that fill the destaging area above the threshold. Thus, the effective write latency (blue line) seen by the VM during bursty write periods was higher than the flash latency (orange line), but much lower than the datastore latency (green line).
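
The exact formula FVP uses is not published; the following is a hypothetical sketch of how threshold-based latency injection could look, with the threshold value, parameter names, and scaling all assumed for illustration. Writes that land while the destaging area is below the threshold are acknowledged at flash latency; writes that push occupancy above it get extra latency that grows with both the overshoot and the observed SAN latency.

```python
# Hypothetical sketch of threshold-based write throttling (not FVP internals).

def ack_latency_ms(flash_latency_ms: float,
                   san_latency_ms: float,
                   occupancy: float,          # fraction of destaging area in use, 0.0-1.0
                   threshold: float = 0.75) -> float:
    """Return the latency used to acknowledge a write back to the VM."""
    if occupancy <= threshold:
        # Plenty of cushion left: acknowledge at flash speed.
        return flash_latency_ms

    # Above the threshold: inject latency proportional to how far past the
    # threshold we are and to how slowly the primary storage is draining data.
    overshoot = (occupancy - threshold) / (1.0 - threshold)   # 0.0-1.0
    return flash_latency_ms + overshoot * san_latency_ms


# Example: flash at 0.3 ms, SAN at 3 ms, destaging area 90% full, threshold 75%.
print(ack_latency_ms(0.3, 3.0, 0.90))   # ~2.1 ms: above flash latency, below datastore latency
```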

Fig 2. Latency of Write Operations

The throttling aggressiveness is determined by an adaptive algorithm and adjusts dynamically to keep the occupancy of the destaging region under the threshold. If the occupancy does not drop, FVP increases the throttling further until the destager is able to empty enough data from the destaging region for the occupancy to fall below the threshold. As soon as the occupancy drops below the threshold, FVP resumes servicing writes at flash speed. In practice, writes from enterprise applications most often occur in short spurts, and the default size chosen for the destaging area is adequate to absorb such spurts, so writes in these cases should be serviced at flash speed.
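
Purely as an illustration of the feedback behavior described above (not FVP’s actual algorithm), a throttle factor could be ramped up while the occupancy stays above the threshold and reset the moment it drops back under it:

```python
# Illustrative feedback loop; the growth factor and cap are assumptions.

def adjust_throttle(throttle: float, occupancy: float, threshold: float = 0.75) -> float:
    """Return the new throttle factor (1.0 = no throttling; higher = more injected latency)."""
    if occupancy <= threshold:
        return 1.0                      # backlog drained: resume flash-speed acknowledgements
    return min(throttle * 1.25, 16.0)   # still above threshold: throttle harder, capped


# Example: occupancy stays high for a few intervals, then drops below the threshold.
throttle = 1.0
for occ in (0.80, 0.85, 0.88, 0.82, 0.70):
    throttle = adjust_throttle(throttle, occ)
    print(f"occupancy={occ:.2f}  throttle={throttle:.2f}")
```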

In summary, even for write-intensive workloads, FVP can provide an SLA that is much better than that promised by the primary storage technologies available today. Even a heavy barrage of writes is handled by FVP at flash-like latencies. With its adaptive throttling, FVP absorbs the burst even when the primary storage is incapable of handling it.

UP NEXT: Accelerating Write-only Workloads ….

Resources:

  1. Iometer configuration file used for the test: bursty_writes
  2. Destaging Writes from Acceleration Tier to Primary Storage – Part I

Destaging Writes from Acceleration Tier to Primary Storage – Part I

Frank posted a nice article on the write-acceleration policies supported in FVP. It is a great read for anyone looking for a quick intro to the two write-acceleration policies FVP supports. At the end, some readers asked a few interesting questions regarding ‘write destaging’, answers to which require a deeper dive than simple two-line replies. Hence, I decided to explain FVP’s destager architecture in a multi-part blog series. This blog offers an introduction to asynchronous destaging of a VM’s data from flash, using an example.

BTW, kudos to all those readers who raised these questions! It just shows how well these readers understood the technicalities of write acceleration. I tip my hat to you folks, and bow to you, Frank.

Destaging Writes from Flash to Primary Storage

In write-back mode, FVP acknowledges writes coming from a VM as soon as they are written to flash. The data is written to the primary storage (the permanent residence of the data) eventually, at a rate the primary storage is capable of receiving it. The task of destaging the data written by VMs to its primary residence is delegated to the ‘destager’, a key component of FVP that runs in the background. Essentially, in write-back mode, writes from the VMs are acknowledged at flash speed (flash + network speed, when using peers), while they are sent to their permanent residence asynchronously at SAN speed. Note that asynchronous data destaging is relevant only in write-back mode.
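
A minimal sketch of this idea, with all names and timings invented for illustration (this is not PernixData code): the write path acknowledges as soon as the block lands on flash, while a background destager thread drains the buffered blocks to the primary storage at its own pace.

```python
# Write-back acknowledgement with asynchronous destaging, as a producer/consumer sketch.
import queue
import threading
import time

destage_queue: "queue.Queue[bytes]" = queue.Queue()   # blocks waiting to reach primary storage

def write_to_flash(block: bytes) -> None:
    time.sleep(0.0002)   # stand-in for a ~0.2 ms flash write (plus peer write, if any)

def write_to_primary_storage(block: bytes) -> None:
    time.sleep(0.003)    # stand-in for a ~3 ms SAN write

def handle_vm_write(block: bytes) -> None:
    """Acknowledge the write at flash speed; destaging happens asynchronously."""
    write_to_flash(block)        # fast path on the VM's I/O critical path
    destage_queue.put(block)     # remember that this block still needs destaging
    # <- acknowledgement returns to the VM here, at flash latency

def destager_loop() -> None:
    """Background task: push dirty blocks to the SAN at the rate it can absorb."""
    while True:
        block = destage_queue.get()
        write_to_primary_storage(block)   # slow path, off the VM's critical path
        destage_queue.task_done()

threading.Thread(target=destager_loop, daemon=True).start()

# Example: acknowledge a handful of VM writes; destaging proceeds in the background.
for i in range(5):
    handle_vm_write(f"block-{i}".encode())
destage_queue.join()   # wait until everything has reached primary storage
```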

Destaging Area

At any given time, FVP uses flash in multiple ways: to host data read frequently by VMs (to accelerate reads), to buffer primary copies of data written by VMs running on the server that houses the flash (to accelerate writes), and to keep replicas of data written by VMs running on remote servers (to provide fault tolerance in write-back mode). In order to accelerate many VMs on a vSphere host, and to accelerate both their reads and writes, FVP has to manage the flash real estate very efficiently. FVP uses dynamically expanding and shrinking regions on flash to hold the writes coming from the VMs until all the data is moved to its permanent residence. This region is called the ‘destaging area’. Each VM that is configured to be in write-back mode gets a separate destaging area.
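
As a rough illustration of that bookkeeping (names and structures are my assumptions, not FVP internals), each write-back VM could be given its own queue-like region that grows as writes arrive and shrinks as the destager drains it:

```python
# Per-VM destaging areas, kept separate so one VM's backlog never mixes with another's.
from collections import deque
from typing import Dict, Optional

destaging_areas: Dict[str, deque] = {}    # one region per write-back VM

def buffer_write(vm: str, block: bytes) -> None:
    """Place a freshly written block in the VM's destaging area (created/grown on demand)."""
    destaging_areas.setdefault(vm, deque()).append(block)

def destage_one(vm: str) -> Optional[bytes]:
    """Remove the oldest dirty block once it has been written to primary storage."""
    area = destaging_areas.get(vm)
    return area.popleft() if area else None
```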

Destaging Frequency

FVP acknowledges a write issued by a VM in write-back mode as soon as it is written to the VM’s destaging region on flash. In the background, FVP activates the destager to migrate the VM’s data to its permanent residence. The migration happens at a rate the primary storage is capable of handling. When multiple VMs are configured to be in write-back mode, all their writes are acknowledged as soon as they are written to their individual destaging regions. In this case, the destager migrates data from the destaging regions of all the VMs simultaneously, but, more importantly, without overwhelming the underlying primary storage.
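
Here is an illustrative drain loop in the same spirit (the round-robin policy and the IOPS ceiling are assumptions on my part): dirty blocks from every write-back VM are destaged concurrently, but the aggregate rate sent to the primary storage is capped so the array is never overwhelmed.

```python
# Round-robin destaging across VMs with a shared rate cap toward primary storage.
import time
from collections import deque
from typing import Dict

MAX_DESTAGE_IOPS = 11_000                 # assumed ceiling the primary storage can absorb

def write_to_primary_storage(block: bytes) -> None:
    time.sleep(0.003)                     # stand-in for a ~3 ms SAN write

def drain_all(destaging_areas: Dict[str, deque]) -> None:
    """Cycle over every VM's destaging area, pacing writes to the shared ceiling."""
    interval = 1.0 / MAX_DESTAGE_IOPS
    while any(destaging_areas.values()):              # stop once every area is empty
        for area in destaging_areas.values():
            if area:
                block = area.popleft()                # oldest dirty block for this VM
                write_to_primary_storage(block)       # slow path, shared by all VMs
                time.sleep(interval)                  # simple global rate cap

# Example: two VMs with a few buffered blocks each.
drain_all({"vm-a": deque([b"a1", b"a2", b"a3"]), "vm-b": deque([b"b1", b"b2"])})
```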

Implications of Destaging on Write Acceleration: Flash-Class Application Latencies!

Let me illustrate the mechanics of the destager with an example. In this experiment, a Windows VM running Iometer issued writes in bursts to the primary storage. Figure 1 shows the rate of write operations during the experiment. Writes reached as high as 4K/sec during the bursty periods. The VM was accelerated by FVP and put in write-back mode. All writes were serviced by flash, and the written data was destaged to the primary storage asynchronously by the destager. In this experiment, the primary storage was able to service writes at a high rate, so the destager could empty the VM’s data as soon as it arrived.

The result: writes/sec seen by the VM = writes/sec serviced by flash = destaging rate = writes/sec written to the primary storage asynchronously (hence the lines representing the rate of writes serviced by the different components overlap in Figure 1).

Fig 1. Write Operations

However, the latency of write operations seen by the VM tells a different story. Figure 2 shows the latency of write operations observed by the different components during the test. By virtue of FVP’s write acceleration, all writes were serviced by flash at flash speed (orange line, “Local Flash Write” latency) even during periods of bursty writes. The write latency seen by the VM was almost the same as the flash write latency (blue line, “Total (Effective)” latency). Flash latency increased by only 200 microseconds during the bursty period. In contrast, the I/O latency witnessed by the destager when destaging the VM’s data to the primary storage reached as high as 3ms** (green line, “Datastore Write” latency). This is the latency the VM would have seen if it were issuing writes directly to the primary storage.

Fig 2. Latency of Write Operations

Most applications exhibit write behavior similar to that shown in the above illustration. For such workloads, FVP clearly offers an unprecedented boost in I/O QoS. This boost can be realized by merely adding an SSD to each vSphere host and creating a clustered acceleration tier on the SSDs using FVP.

NEXT UP: Accelerating write-intensive workloads…

** The primary storage used for this experiment was an all-flash SAN. In reality, latency could be even higher (a few tens of milliseconds) if the primary storage were configured on magnetic disks.

Resources:

  1. Iometer configuration file used for the test: Bursty_writes
  2. Frank’s blog on Write-Back and Write-Through policies in FVP
  3. FVP Writeback policy deep dive whiteboard session