ORIGINALLY POSTED 21st October 2009
10,997 views on developerworks
So far I have concentrated on the hardware side of the new SVC 5.1 release. That’s probably because, after the work I did with IBM Almaden Research last year on Project Quicksilver, I became the architect for the SVC sub-project to add SSD support. The technology demo set the scene, and luckily we’d already been looking at how we could use existing SVC functionality to make a production storage device like Quicksilver using the SVC code base. One of the value propositions of SSD is a reduction in power usage, though that is possibly less of an advantage than the performance gains per device; maybe watts per IOP is a good way to look at it. Given that, it seemed a waste of power to put a few SSDs into their own box with, say, 500+W of power usage, their own FC ports and so on. Embedding them into the node itself meant we could make use of the same hardware, power supplies, FC ports and processing power that we already had for the SVC stack itself. But what about redundancy? Well, the Virtual Disk Mirroring (VDM) function we added to SVC at 4.3.0 was an almost perfect solution. Mirror the SSD between nodes, and now you have not only data redundancy but hardware redundancy too. Even if a node fails, you still have the partner node with a copy of the data.
So, VDM it was. But mirroring is expensive, and it always has been. It has always been the best-performing solution, but it’s also the most wasteful of space. What if we could do something a little bit different? Today we can use VDM to mirror data between disparate backend storage controllers. What if we could do the same with a single SSD-based copy and another, already RAID-protected, backend FC-based copy? Hmmm… Now you would get 100% of the capacity of the SSD you purchased as usable capacity. No wastage for RAID at all… and what if you could still maintain the performance of the SSD? Well, that would just be a holy grail, would it not?
I still remember the moment this all hit me.
Maybe I should step back and remind you, or educate those that don’t know, how SVC’s VDM works. The host system sees a single vdisk, LUN or volume, and SVC manages keeping two copies of the data, spread over two distinct pools of storage. The VDM function itself does no internal load balancing, so one copy of the vdisk is defined as ‘primary’. Any reads to this vdisk come from this primary copy. All writes are written to both copies.
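To make the read and write paths concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not SVC code: the class name `MirroredVdisk`, the copy names and the dict-based "storage" are all invented for illustration.

```python
class MirroredVdisk:
    """Toy model of a two-copy mirrored vdisk (all names are hypothetical)."""

    def __init__(self, primary, secondary):
        # Each copy is modelled as a dict mapping block number -> data.
        self.copies = {primary: {}, secondary: {}}
        self.primary = primary
        self.reads_served = {primary: 0, secondary: 0}

    def write(self, block, data):
        # Every write forks to both copies.
        for store in self.copies.values():
            store[block] = data

    def read(self, block):
        # No load balancing: reads always come from the primary copy.
        self.reads_served[self.primary] += 1
        return self.copies[self.primary][block]


vdisk = MirroredVdisk(primary="ssd", secondary="hdd_raid")
vdisk.write(0, b"hello")
for _ in range(3):
    vdisk.read(0)
print(vdisk.reads_served)  # {'ssd': 3, 'hdd_raid': 0}
```

Both copies hold the data, but only the primary ever serves a read, which is the property the rest of this post relies on.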
This function lies below SVC’s cache, so the server write will have long since completed by the time the SVC cache decides to destage the write. The destage results in the write forking into two, one to each of the copies of the vdisk.
If one write takes a long time, or there are issues with the storage controller that is providing that copy, we may have to perform an ERP (error recovery procedure) and either fail one write or wait for it to complete. In SVC 5.1 we have adapted this algorithm to include a ‘time to live’ on writes. So now we can complete the cache destage even if only one write has completed within the ‘time to live’ period. In that case, the copy that missed the write is marked as ‘out-of-sync’ and will be re-synced later, when the controller recovers or any faults are fixed.
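The ‘time to live’ rule can be sketched as a small function over simulated write latencies. This is an illustration only: the function name `destage`, the millisecond values and the time-to-live figure are made up, and the case where neither write lands inside the time to live is not described here, so the sketch simply assumes the destage waits for the first write to finish.

```python
def destage(latencies_ms, ttl_ms):
    """Return (completion_ms, out_of_sync) for one forked write.

    latencies_ms maps a copy name to how long its write takes (simulated).
    ttl_ms is the hypothetical 'time to live' for the forked writes.
    """
    fastest = min(latencies_ms.values())
    slowest = max(latencies_ms.values())
    if slowest <= ttl_ms:
        # Both writes finished inside the time to live: copies stay in sync.
        return slowest, set()
    # One write overran. The destage completes when the time to live expires,
    # or (assumption) when the first write lands if that happens even later.
    completion = max(fastest, ttl_ms)
    out_of_sync = {name for name, ms in latencies_ms.items() if ms > completion}
    return completion, out_of_sync


# Healthy controller: both writes land quickly, nothing to re-sync.
print(destage({"ssd": 1, "hdd_raid": 5}, ttl_ms=100))
# Struggling controller: destage completes at the TTL, HDD copy needs a re-sync.
print(destage({"ssd": 1, "hdd_raid": 5000}, ttl_ms=100))
```

The point of the design is that a slow or faulty controller delays the destage by at most the time to live, at the cost of a later re-sync of the copy that fell behind.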
Given the ‘primary’ read nature of VDM, if we set the VDM up to have the primary copy on SSD and the other copy on an external RAID controller (presumably HDD based), then we get the massive read performance boost that SSDs provide.
So what about writes, which have to go to both copies (one on SSD and one on HDD)? Well, given that the second copy is on an external RAID controller with its own write cache, this is a write into memory on the controller. Unless the write workload is overloading the controller cache, this write will complete very quickly.
Thus, for read-biased workloads, SVC really can provide a major differentiator over any other system implementing SSD-based storage. I don’t know of any other system that offers this ability today (I could be wrong, so feel free to comment if you know otherwise).
- We can provide every GB of SSD capacity you buy as usable space.
- We can maintain optimal SSD read performance.
- We can maintain SSD write performance – for workloads with a 70/30 or higher proportion of reads.
- We can suffer a loss of the SSD and/or node hardware and you still have RAID protection of your data within the backend controller itself, giving you very high IOPS RAID-51, RAID-61 or even RAID-11.
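The capacity claim in the first bullet is simple arithmetic, sketched below. The function name `ssd_capacity` and the 1000 GB figure are purely illustrative and do not correspond to any particular SVC configuration.

```python
def ssd_capacity(ssd_gb_purchased, second_copy_on_ssd):
    """Compare where the mirror's second copy lives (illustrative only)."""
    if second_copy_on_ssd:
        # SSD-to-SSD mirror: every block is stored twice on SSD,
        # so only half of the purchased SSD is usable.
        usable = ssd_gb_purchased / 2
        external = 0
    else:
        # SSD primary plus a copy on an external RAID controller:
        # all purchased SSD is usable, and the second copy consumes
        # the same amount of capacity on the backend controller.
        usable = ssd_gb_purchased
        external = ssd_gb_purchased
    return {"usable_gb": usable, "external_raid_gb_needed": external}


print(ssd_capacity(1000, second_copy_on_ssd=True))
print(ssd_capacity(1000, second_copy_on_ssd=False))
```

The trade is that the second copy moves onto cheaper backend capacity instead of consuming a second set of SSDs.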
This is just one of the areas where we have been innovating with SVC, doing more than just looking at SSD as a replacement for disk. The software sitting above the SSD, managing your storage, is just as important as the physical hardware itself, and thinking of using these devices in different ways can drive massive value to you. The disk bottleneck need no longer be an excuse for having to provision many more TB than you actually need. It just needs us all to think about storage in a different way.
I still need to move on to talking more about the other new software features of SVC 5.1, but this could be seen as a transitional post, just as SVC+SSD is a transitional way of thinking about storage. Don’t think about storage in a box-centric view. Think of it in a storage-service-driven view: what services can my storage provide? How dynamic can I be about moving data from one container or tier to another? Virtualization and commoditisation are today prevalent across most of our industry, so why shouldn’t ‘capacity’ be the same?