Tech blog October 14, 2015
Speed and capacity are the only metrics that matter to most users. This over-reliance on the “flashy” metrics often means that reliability gets overlooked. If you are a professional videographer, consider devoting some time to learning about reliability and its importance.
Consider that in the world of video capture, some scenes are truly once-in-a-lifetime. Whether you’re shooting a rare wild animal, a fantastic sports play, or an actor’s heart-wrenching performance, these are scenes that, if lost, cannot be replaced. That data is quite literally priceless. Even for scenes shot in a studio environment, lost data is a massive drain on time and resources. Losing data is simply unacceptable.
Technology to Prevent Data Loss
From consumer to enterprise level, all Solid State Drives (SSDs) implement some technology to prevent data loss. Right now, ECC (Error-Correcting Code) is the most well-known method. ECC allows us to reproduce data if some of it is corrupted or lost. Although ECC helps, the “size” and “distribution” of the data that it can recover are rather limited. For example, if the ECC in your SSD can correct 32 bits of error per 512 bytes (= 4096 bits) of data, then the SSD can fix, on average, only one error per 128 bits (4096/32 = 128). Though this number sounds small, it is normally enough to cover the charge leakage inherent in all SSDs, which is typically the main offender. The main problem arises when an entire NAND chip fails: ECC does not provide enough redundancy to recover from something of that magnitude. Another point should be made clear: ECC can only reproduce lost data if the information needed to rebuild it is stored in a different physical memory area. If the data and its correction information are lost together within the same physical area, ECC is useless.
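The arithmetic in the example above can be checked with a short sketch. It treats ECC as a simple per-sector error budget, which is a simplification for illustration; the constant and function names are hypothetical and not taken from any real SSD firmware.

```python
# Sketch of the article's ECC example, modelling ECC as a per-sector budget.
# Names are illustrative assumptions, not a real controller API.
SECTOR_BYTES = 512        # protected data size from the example
CORRECTABLE_BITS = 32     # bit errors the ECC can fix per sector (example value)

sector_bits = SECTOR_BYTES * 8
bits_per_correctable_error = sector_bits // CORRECTABLE_BITS


def is_recoverable(bit_errors: int) -> bool:
    """True if the number of flipped bits fits within the sector's ECC budget."""
    return bit_errors <= CORRECTABLE_BITS


print(sector_bits)                  # 4096
print(bits_per_correctable_error)   # 128
print(is_recoverable(20))           # True: a handful of leaky cells
print(is_recoverable(sector_bits))  # False: every bit lost, e.g. a dead chip
```

The last line is the case the article is concerned with: once a whole chip is gone, the loss is far larger than any per-sector budget.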
Improvement of Storage Reliability
We have now established why ECC alone is not enough for a drive to be considered truly reliable. So let’s add some more preventative technology: RAID (Redundant Array of Independent Disks). RAID is a staple of storage technology that improves both speed and reliability by using two or more disks in an array. For example, if you have two disks set up in a mirroring formation (known as RAID 1), all of the same data is copied to both devices. With this method, if one drive fails, all of your data is still safe on the mirrored drive. There are many variations of this basic idea, such as RAID 5, which uses three or more drives. This configuration prevents data loss when any one of the drives fails. Implementing RAID dramatically improves the reliability of data storage and is common in workstations and servers. Though RAID brings great benefits, it needs at least two drives and sometimes a special RAID card and/or enclosure. RAID is usually not available on portable devices for field shooting.
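RAID 5 tolerates a single drive failure by storing XOR parity alongside the data. The following is a minimal sketch of that idea with three drives, each holding one block; the `xor_blocks` helper and the sample block contents are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data blocks plus one parity block: one block per drive, three drives.
drive_a = b"SCENE-01"
drive_b = b"TAKE-007"
parity = xor_blocks([drive_a, drive_b])

# Drive B fails: rebuild its block from the two surviving drives.
rebuilt_b = xor_blocks([drive_a, parity])
print(rebuilt_b == drive_b)  # True
```

Because XOR is its own inverse, any one missing block equals the XOR of all the others, which is why the scheme survives exactly one failed drive.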
RAID 5 “Inside”
Fixstars’ new SSD, the IRIS-DRIVE, has RAID 5 internally in order to prevent a much wider range of data loss. Figure 1 shows the block diagram of our SSD. The “channel” is the data path between the NAND chip and the NAND interface (the figure only has 4 channels, but a real SSD has many more). In a general SSD, the NAND controller slices the data into small chunks and sends these chunks simultaneously over multiple channels, using parallelization similar to RAID 0. This technique improves I/O performance but results in lower reliability within the SSD. When a NAND chip fails, the data it stores is not recoverable by any means, not even through ECC. In the IRIS-DRIVE, the controller uses these channels in RAID 5 mode. That means the highly regarded RAID 5 configuration is available inside the IRIS-DRIVE without any additional drives.
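To illustrate what channel-level RAID 5 could look like, here is a rough sketch that stripes data across four channels with one parity chunk per stripe and then reads it back with one channel treated as failed. The chunk size, the rotating parity placement, and all function names are assumptions chosen for illustration; the article does not describe the IRIS-DRIVE controller's actual layout.

```python
from functools import reduce

CHANNELS = 4  # matches the simplified figure; real SSDs have more


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


def write_stripes(data: bytes, chunk: int = 4):
    """Split data into stripes of CHANNELS-1 data chunks plus one parity chunk.

    The parity chunk rotates across channels (an assumed layout).
    """
    per_stripe = (CHANNELS - 1) * chunk
    padded_len = -(-len(data) // per_stripe) * per_stripe  # round up
    data = data.ljust(padded_len, b"\0")
    stripes = []
    for s, off in enumerate(range(0, len(data), per_stripe)):
        chunks = [data[off + i * chunk: off + (i + 1) * chunk]
                  for i in range(CHANNELS - 1)]
        chunks.insert(s % CHANNELS, xor_blocks(chunks))  # place parity
        stripes.append(chunks)
    return stripes


def read_with_failed_channel(stripes, failed: int) -> bytes:
    """Read all data back while ignoring whatever the failed channel held."""
    out = b""
    for s, chunks in enumerate(stripes):
        survivors = [c for ch, c in enumerate(chunks) if ch != failed]
        row = list(chunks)
        row[failed] = xor_blocks(survivors)  # rebuild the lost chunk
        parity_ch = s % CHANNELS
        out += b"".join(c for ch, c in enumerate(row) if ch != parity_ch)
    return out


payload = b"once-in-a-lifetime scene"
stripes = write_stripes(payload)
recovered = read_with_failed_channel(stripes, failed=2)
print(recovered[:len(payload)] == payload)  # True
```

With plain RAID 0 style striping there would be no parity chunk, so the chunks on the failed channel would simply be gone. The cost of the parity is one channel's worth of capacity per stripe.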
Article by: Hironobu Asano, Translated by Akihiro Asahara, Israel Imru and Owen Stampflee