Tech blog December 22, 2014
Microsoft recently released a new series of virtual machines called the D-Series. D-Series VMs have a relatively new CPU (Xeon E5-2660, 2.2GHz) and local Solid State Drives (SSDs). Amazon AWS already has SSD instances, and there are several SSD-only cloud vendors, so I'm excited that we can finally use SSDs on Azure. In this blog post, I will show you the I/O benchmark results for a D-Series VM.
An Azure VM has two types of local storage: persistent storage and ephemeral storage. Data on persistent storage is still available after the VM is rebooted, but data on ephemeral storage may be lost on reboot. On the D-Series, the SSD storage is "ephemeral". As shown below, the SSD performance is outstanding, but you have to save your data to persistent storage before the VM is turned off. The size of the new SSD storage is also limited: on a D14 (a high-end D-Series instance), the local storage is 800GB. If you need more, you can use Azure Storage, an elastic storage system for the cloud that can be accessed in several different ways. An Azure VM can simply mount Azure Storage as a local block device with the "mount" command.
I prepared one D14 instance running CentOS release 6.5 with 30GB of persistent storage, 800GB of ephemeral SSD storage, and 999GB of mounted Azure Storage (Fig. 1). The specification of the D14 is shown in Table 1. In this blog post, I evaluate the I/O performance of these three storage types.
|# of cores||16|
First, I tried a simple read/write test with "dd" and "hdparm". This test is useful for a quick check because these commands are pre-installed in many Linux distributions. The actual commands are:
– Read Test
# sync
# echo 3 > /proc/sys/vm/drop_caches
# hdparm -t
– Write Test
# dd if=/dev/zero of= ibs=1M obs=1M count=1024 oflag=direct
The commands are simple. However, if you leave out a command or an argument, the result will likely be incorrect. Lines 1 and 2 of the Read Test (sync and the write to drop_caches) flush dirty pages and drop the read buffer cache in RAM, so that the test measures storage I/O rather than memory. In the Write Test, 1GB of data is read from /dev/zero and written to the target file. The last argument of the dd command (oflag=direct) makes dd write directly to the storage medium, bypassing the Linux page cache. The result is shown in Fig. 2. Each measurement was repeated 12 times with an interval of 10 seconds between trials. These commands issue I/O operations sequentially.
For an HDD, typical sequential read/write performance is around 100MB/s. For a SATA/SAS SSD, it should be around 400MB/s. As shown in Fig. 2, Azure's ephemeral storage shows remarkably high performance. This suggests that it is backed by high-end PCIe SSDs, which typically deliver around 1GB/s. In the Read Test result, you can see that the performance of the persistent storage (blue line) depends on the trial number. As I mentioned, I dropped the Linux OS cache with the sync; echo 3 > /proc/sys/vm/drop_caches commands, but it still looks like a cache effect. I don't have a concrete explanation for this, but it might be due to caching on the host of the virtual machine, which we cannot control from the guest OS.
The simple read/write test is easy to use, but the result is not very reliable, so I also evaluated the three types of storage with FIO. FIO is a standard Linux tool for evaluating I/O performance and has a large number of options, so it may be difficult for a beginner to pick the right ones. In this evaluation, I used the job file published by WinKey. With this job file, you can run the same types of I/O performance tests as CrystalDiskMark (https://crystalmark.info/en/software/crystaldiskmark/), a well-known benchmark tool for Windows. The command is just:
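The repeated Write Test described above can be scripted. The following is a minimal sketch, not the script used for Fig. 2: the function names dd_rate_mbs and run_write_trials are hypothetical, and the target file is passed in by the caller. It assumes GNU dd, which prints a summary line of the form "N bytes ... copied, T s, ..." on stderr.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not part of dd or hdparm.

# Convert dd's summary line ("N bytes ... copied, T s, ...") into MB/s.
# The rate is computed from the byte count and the elapsed seconds, so it
# does not depend on how dd itself formats the throughput.
dd_rate_mbs() {
    LC_ALL=C awk '/copied/ {
        secs = 0
        for (i = 2; i <= NF; i++) if ($i == "s,") secs = $(i - 1)
        if (secs > 0) printf "%.1f\n", $1 / secs / 1000000
    }'
}

# Run the 1GB write test `trials` times against `target`,
# sleeping `pause` seconds between trials (defaults: 12 trials, 10 seconds).
run_write_trials() {
    target=$1; trials=${2:-12}; pause=${3:-10}
    n=1
    while [ "$n" -le "$trials" ]; do
        # dd prints its summary on stderr, hence 2>&1
        rate=$(LC_ALL=C dd if=/dev/zero of="$target" ibs=1M obs=1M \
                   count=1024 oflag=direct 2>&1 | dd_rate_mbs)
        echo "trial $n: $rate MB/s"
        [ "$n" -lt "$trials" ] && sleep "$pause"
        n=$((n + 1))
    done
}
```

After sourcing the file, calling run_write_trials with a file path on the storage under test prints one line per trial. Note that oflag=direct fails on filesystems that do not support direct I/O (tmpfs, for example), in which case the rate is printed empty.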
# fio crystaldiskmark.fio
The target storage area is /tmp by default. If you want to evaluate a different directory, edit line 7 (directory=/tmp/) in crystaldiskmark.fio. The result is shown in Fig. 3. Note that the Y-axis uses a logarithmic scale. The performance difference between the ephemeral storage and the persistent storage is dramatic, especially for random read/write at Queue Depth* = 32.
* Queue Depth: The number of simultaneous input/output requests for a storage device.
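To make the queue depth setting concrete, here is a minimal sketch of a shell function that writes a small fio job file for 4KB random reads at a queue depth of 32. This is not the WinKey job file used for Fig. 3. The function name write_qd32_job, the job name, and the size and runtime values are assumptions chosen for illustration; the option names themselves (ioengine, direct, size, runtime, directory, rw, bs, iodepth) are standard fio job parameters.

```shell
#!/bin/sh
# Sketch only: write_qd32_job is a hypothetical helper, and the size and
# runtime values are illustrative, not the settings behind Fig. 3.

# Write a minimal fio job file to `out` that issues 4KB random reads
# against files in `dir` with up to 32 requests in flight.
write_qd32_job() {
    dir=$1; out=$2
    cat > "$out" <<EOF
[global]
# asynchronous engine plus direct I/O, so that iodepth takes effect
ioengine=libaio
direct=1
size=1g
runtime=30
directory=$dir

[randread-4k-qd32]
rw=randread
bs=4k
iodepth=32
EOF
}
```

For example, after sourcing the file, write_qd32_job /tmp/ qd32.fio followed by fio qd32.fio runs the job against /tmp. Changing rw to randwrite gives the corresponding random write test.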
As the FIO results demonstrate, the SSD performance on the Azure D-Series is great, but again, the catch is that the SSDs are ephemeral storage. If your application boots up and shuts down instances often, you could develop a management tool or daemon that saves your essential data to Azure Storage or to persistent local storage.
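As a starting point for such a tool, the following sketch copies a working directory from the ephemeral SSD to a directory on persistent storage. The function name save_to_persistent is hypothetical, and both paths are supplied by the caller, since the mount points depend on how the VM is set up.

```shell
#!/bin/sh
# Sketch only: save_to_persistent is a hypothetical helper; the caller
# supplies the ephemeral source and persistent destination directories.

# Copy everything under `src` (on the ephemeral SSD) into `dst`
# (on persistent storage), preserving permissions and timestamps.
save_to_persistent() {
    src=$1; dst=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p "$dst" && cp -a "$src"/. "$dst"/
}
```

A function like this could be called periodically from cron or from a shutdown script. For large data sets, a tool that copies only changed files would be a better fit than a full copy.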