Musings from the Technology Underbelly: February 2018

By now you probably know that on Dec 19, 2017, Seagate announced their “Multi Actuator Technology” [1].

Here I will address some of the implications.

In a nutshell, instead of mounting one Head Stack Assembly (HSA) with X heads on a pillar, with one actuator sandwiched between two neodymium magnets, you mount N HSAs on the same pillar with X/N heads each (well, not exactly; more on that below). Each HSA has its own actuator, and instead of only two neodymium magnets, N+1 are required. At least for the time being, Seagate is setting N=2 and making the device appear to the system as two separate drives (even though they share a single SAS/SATA port).

A crude implementation could be to put a SATA port multiplier [2] on board, and from there on, two of pretty much everything. Of course, Seagate will not go for such a crude implementation. Most likely each physical drive will use only one connector (SATA or SAS), and the two halves will be distinguished by a driver (after all, the OS needs to know about this arrangement, so that it does not try silly things, like mirroring the top half of the drive to the bottom half of the same drive). Also, there will be only one head controller, but one that is aware of the dual actuator arrangement. Caches (RAM and Flash) will also be shared. Most likely, each “virtual drive” will have its own queue.

But there are some tricks that Seagate can pull out of their sleeve that most analysts seem to overlook, and that give the idea very interesting potential, not only in datacenters (Storage Arrays, Distributed Filesystems and Cloud), but in desktops to boot. We will use a 6 platter 12TB 3.5” HDD as an example (it is useful just because 12 is divisible by 6, 4, 3 and 2).

Trick #1: No one said that the split has to be in half. One actuator can have 2 platters and 4TB, and the other 4 platters and 8TB. This is the old split between OS and data, as we used to do on workstations of yore with two hard disks in order to gain performance, but now much cheaper. And if you decide to buy two of these drives, you can RAID1 the OS part for resiliency and RAID0 the data part for performance (say, scratch video/photos while editing, or Steam games, where the important/definitive data is backed up to a NAS/SAN or the Cloud), regardless of whether you have Intel’s Matrix RAID®™ or not.

Trick #2: No one said that the two actuators have to use the same write technology. One actuator can have 2 platters and 4TB using standard Perpendicular Magnetic Recording (PMR), while the other can have 4 platters and 10TB with Shingled Magnetic Recording (SMR) [3]. That way, one part of your drive can be devoted to your Read/Write tasks, while the other part can be devoted to those Write Seldom storage tasks. And of course, these can be used on Desktops/Laptops, or even on Storage Arrays. More capacity, for cheaper, and performance where you need it.
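The arithmetic behind Tricks #1 and #2 can be sketched in a few lines of Python. This is only a back-of-the-envelope calculation: the per-platter figures are derived from the numbers in this post (12TB over 6 platters for PMR, 10TB over 4 platters for SMR), and the function name `split_capacity` is made up for illustration.

```python
# Back-of-the-envelope capacities for a 6-platter drive split across two actuators.
# Per-platter figures are derived from the examples in the text, not from a datasheet.
PMR_TB_PER_PLATTER = 12 / 6   # 2.0 TB per platter (6-platter, 12TB example)
SMR_TB_PER_PLATTER = 10 / 4   # 2.5 TB per platter (4 platters, 10TB in Trick #2)


def split_capacity(platters_a, platters_b,
                   tb_a=PMR_TB_PER_PLATTER, tb_b=PMR_TB_PER_PLATTER):
    """Return (capacity_a, capacity_b, total) in TB for a two-actuator split."""
    cap_a = platters_a * tb_a
    cap_b = platters_b * tb_b
    return cap_a, cap_b, cap_a + cap_b


even = split_capacity(3, 3)                               # plain half-and-half split
uneven = split_capacity(2, 4)                             # Trick #1: OS / data split
mixed = split_capacity(2, 4, tb_b=SMR_TB_PER_PLATTER)     # Trick #2: PMR + SMR

for label, result in (("even", even), ("uneven", uneven), ("mixed", mixed)):
    print(label, result)
```

Under these assumptions the mixed PMR/SMR split ends up with more total capacity than the all-PMR drive, which is the point of Trick #2.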

Trick #3: If you go with SMR on one actuator, you need bigger caches for the occasional writes (we already stated that the workloads there are Write Seldom), but that extra cache will also help the performance of the Read/Write PMR side, and of reads on the SMR side (of course, this does not apply to Storage Arrays, only to personal devices).

Now, for the really interesting part, the performance.

Some commentators, like Chris Evans and Howard Marks, have raised doubts about the technology, while Seagate remains bullish. For a brief survey of these, see [4].

Again, I emphasize: this arrangement appears to the system as two separate drives. Every Block will be written by only one actuator. Not only that, every File/Object/“Higher level abstraction” will be written by one actuator, and read by that same actuator only.

So, let’s analyze the performance implications in an intuitive fashion, using everyday examples, from most intuitive to less intuitive.

If you are old enough, like me, you remember that having the OS on one Hard Drive (say, 6TB) and the Data on another (say, again, 6TB) was faster than having everything on a single 12TB HDD. Here, you achieve the same effect, but much cheaper.

If you have ever had a RAID array, you know that rebuilding two RAID6 arrays of 12 HDDs at 6TB per HDD in parallel will be significantly faster than rebuilding one RAID6 array of 12 HDDs at 12TB per HDD (if and only if your controller is well designed to start with, of course). The number 12 is no accident: 12 is the typical number of 3.5” HDDs that a 2U RAID array will take. With Multi Actuator, you can achieve the same effect, but much cheaper.
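A rough way to see the rebuild argument is to estimate how long it takes just to rewrite one replacement drive. The sketch below is a deliberately simplified model: the 200 MB/s sustained rate is an assumed, hypothetical figure, and parity computation, controller limits and foreground I/O are all ignored.

```python
# Simplified rebuild-time model: time to stream one replacement drive's capacity.
# ASSUMED_RATE_MB_S is a hypothetical sustained rate, not a measured or quoted figure.
ASSUMED_RATE_MB_S = 200.0


def rebuild_hours(capacity_tb, rate_mb_s=ASSUMED_RATE_MB_S):
    """Hours needed to write capacity_tb terabytes at rate_mb_s megabytes per second."""
    seconds = (capacity_tb * 1e12) / (rate_mb_s * 1e6)
    return seconds / 3600


# One array of 12TB drives: the whole 12TB is rewritten through a single actuator.
single_array = rebuild_hours(12)

# Two arrays of 6TB "drives" rebuilding in parallel: done when the slower one is done.
parallel_arrays = max(rebuild_hours(6), rebuild_hours(6))

print(f"one 12TB rebuild:      {single_array:.1f} h")
print(f"two 6TB rebuilds (||): {parallel_arrays:.1f} h")
```

In this model the parallel rebuild takes half the time, because each actuator only has to cover half the capacity. A real controller will rarely deliver the full 2x.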

Say that you are serving a pool of N VDI VMs from your RAID6 array. Things will go faster if you put half your machines on one 12HDD*6TB RAID6 and the other half on another 12HDD*6TB RAID6, instead of putting all the machines on just one 12HDD*12TB RAID6 array, because each machine will only be contending with half its peers for IOs. With Multi Actuator, you can get that effect, much cheaper.

The same goes for your traditional Video On Demand server. It is better to have your 1080p HDR copy of “The Shape of Water” on one RAID and your 1080p copy of “Lady Bird” on another than to have both on an array 2x the size.

(Note: I use RAID6 in all my examples because, at these capacity points, one cannot risk a second failure while rebuilding. In my opinion, RAID6 is the minimum nowadays.)

Say you work with Hadoop’s HDFS. If you have two 6TB HDDs in your node and it gets a request for two blocks, chances are good that the blocks will be on different HDDs and can therefore be served in parallel, as opposed to a node that has a single 12TB HDD. With writes it is even better: the node itself decides where to write, so it can write where there is less demand. With Multi Actuator HDDs, you achieve the same effect, but cheaper.
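To put a number on the read case, here is a small sketch that enumerates every way a handful of block reads can land on the available actuators. It assumes, purely for illustration, that each block is equally likely to live on any actuator and that every read takes one "service slot". Real HDFS placement and real seek times are messier, and `expected_makespan` is a made-up helper name.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def expected_makespan(requests, actuators):
    """Expected number of service slots until all reads finish.

    Assumes uniform, independent placement of blocks across actuators, and that
    each actuator serves its own reads one after another.
    """
    total = Fraction(0)
    outcomes = 0
    for placement in product(range(actuators), repeat=requests):
        # The busiest actuator determines when the whole batch is done.
        total += max(Counter(placement).values())
        outcomes += 1
    return total / outcomes


one_actuator = expected_makespan(2, 1)    # single 12TB HDD: reads are serialized
two_actuators = expected_makespan(2, 2)   # two 6TB "drives": parallel half the time

print(one_actuator, two_actuators)
```

Under this uniform assumption, two reads hit different actuators half of the time, so the expected completion time drops from 2 slots to 1.5. Writes can do better than that, since the node picks the less busy actuator instead of leaving it to chance.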

Even if your Storage does clever things like Huawei’s “RAID 2.0” or other such optimizations, having more independent actuators will mean faster reading and writing.

So, there is no doubt that, if Seagate can pull it off, this will lead to higher IOPS and better Read/Write performance in many areas, at a cheaper price than the solutions of the past (two actuators on different pivots in opposing corners of the drive, with weird restrictions in order to avoid writing something before it is read).

At least, that is, until we saturate the SATA 6Gb/s and SAS 12Gb/s interfaces. But hey, there is talk about SAS 22.5Gb/s!
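How soon would that saturation happen? A rough sketch follows. The line encodings are as I understand them (8b/10b for SATA 6Gb/s and SAS 12Gb/s, 128b/150b for SAS 22.5Gb/s), and the 250 MB/s sustained rate per actuator is an assumed, hypothetical figure, not a number from Seagate.

```python
# Usable payload bandwidth per interface, and how many actuators fit under it.
# Each entry: line rate in Gb/s, then (data bits, coded bits) of the line encoding.
INTERFACES = {
    "SATA 6Gb/s": (6.0, 8, 10),
    "SAS 12Gb/s": (12.0, 8, 10),
    "SAS 22.5Gb/s": (22.5, 128, 150),
}

# Hypothetical sustained throughput of one actuator; an assumption for illustration.
ASSUMED_ACTUATOR_MB_S = 250.0


def payload_mb_s(line_gbps, data_bits, coded_bits):
    """Payload bandwidth in MB/s after removing line-encoding overhead."""
    return line_gbps * 1e9 * data_bits / coded_bits / 8 / 1e6


def actuators_below_ceiling(payload, per_actuator=ASSUMED_ACTUATOR_MB_S):
    """Number of actuators that can stream at full rate without saturating the link."""
    return int(payload // per_actuator)


for name, (gbps, data_bits, coded_bits) in INTERFACES.items():
    payload = payload_mb_s(gbps, data_bits, coded_bits)
    print(f"{name}: ~{payload:.0f} MB/s payload, "
          f"room for {actuators_below_ceiling(payload)} actuators")
```

With these assumptions, two actuators streaming sequentially already come close to the SATA ceiling, while SAS leaves more headroom. Random workloads sit far below these rates, which is where the extra actuator helps most.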

I’ll be watching closely how this technology evolves. It may behoove you to watch as well.

Resources:

[1] https://blog.seagate.com/homefeatured/multi-actuator-technology-a-new-performance-breakthrough/

[2] https://sata-io.org/developers/sata-ecosystem/port-multipliers

[3] https://www.seagate.com/tech-insights/breaking-areal-density-barriers-with-seagate-smr-master-ti/

[4] https://www.theregister.co.uk/2018/01/04/doubling_up_disk_drive_actuator_pillars/


Wednesday, February 7, 2018

Some Quick Musings about Seagate’s Multi Actuator Technology