ORIGINALLY POSTED 3rd October 2010
16,162 views on developerworks
2000-2007 The “Virtual” Storage – SAN Virtualization Appliance Years
First of all, thanks again for all the feedback on this series of posts – great that people are enjoying a bit of a history lesson, or even a reminder of your own histories!
So where was I… The Hursley team was in a bit of flux around the end of the last century. We'd been masters of our own destiny in a way with the 7133 and SSA products, and had seen close to billion-dollar revenues as a result of our efforts. With the follow-on FC-AL adapter moving to being embedded in DS8000 only, there was time for the architecture team to start looking at: what next…
Two of the key members of the architecture team had been, or were, working on assignment in the Almaden storage research team on a “new horizon” storage architecture. The project name was Compass, or ComPaSS: Commodity Parts Storage Subsystem. (Remember CPSS, Common Parts Storage Subsystem, from my previous post?) Well, this architecture took that idea to the next level. What if you could have a loose collection of commodity hardware that, by its coupling or clustering, behaved much more like an enterprise system? More is better and all that. The concept was a plurality of engines, or processing units, that all knew about each other, but at the same time all did their own thing.
IBM was looking for “new horizon” projects to fund at the time, and three such projects were proposed and created the “Storage Software Group”. Those three projects became known externally as TPC (TotalStorage Productivity Center), SanFS (SAN FileSystem – oh, how this was just 5 years too early) and SVC (SAN Volume Controller). The fact that two out of the three of them still exist today is actually pretty good. All of these products came out of research, and it's a sad state of affairs when research teams are measured by the percentage of the projects they work on that turn into revenue-generating streams… (but that's for another day)
SanFS was doomed. It required host-side agents to make it work. Trying to get storage and server teams in any organization to sync up… well, you know better than me!
TPC has morphed from TotalStorage, through System Storage, and is now Tivoli Storage! [2019 comment: and now Spectrum!]
Meanwhile, the flagship of that original trio has to be SVC. As I have discussed before, SVC was originally known as “Lodestone”, a reference back to the original ComPaSS name. When the team inherited what the guys had done in Almaden, we had the basis for a cache component, plus simple SCSI front-end and back-end layers that sandwiched the cache.
In the years leading up to 2000 I had been working on porting the RSM tool to AIX, and was cutting my teeth on AIX device driver development, when, with my UI and configuration knowledge, I was asked to look at how we could configure a virtualization engine. One of the nightmare aspects of any configuration interface is that it has to understand, or at least be able to represent, everything. We spent a lot of time working out how to make a “self-generating” interface, using some of the SDS (Self-Describing Structure) concepts we already had in the 7133 products. I was heavily involved in various UCD (User Centered Design) studies that ended up naming the “virtual” and “managed” disks we have today. One thing that I do pride myself on is how the CLI (Command Line Interface) is taken for granted: intuitive, fully functional, and the tab completion that was added in 2008 is a godsend. Anyway… back to 2001…
The rest of the team was being staffed up quickly. We had a core set of people that had worked on the 7133 products, and it's no surprise that a lot of the architectural concepts embedded in the SSA adapters and DS8000 DA cards made their way into SVC. A debug strategy is key: the idea that we can dump the entire memory contents of a running SVC node and run a binary editor that can decode the exact state of every variable, structure and the like is more like kernel development than typical application software. (I was recently scrounging through a hardware scrap pile and found an SVC node with display panel number 000003 – if only it were 000001 – but anyway, number three sits in pride of place on my desk!)
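That dump-and-decode approach hinges on structures that describe themselves, much like the SDS concept mentioned earlier. Here is a minimal sketch of the idea in Python; the descriptor table, field names and offsets are my own illustration, not SVC's actual format:

```python
import struct

# Hypothetical descriptor table: each entry names a field in the dumped
# image and records where it lives and how to decode it. In a real
# self-describing scheme this table would itself travel with the dump,
# so an offline tool can decode any firmware level without recompiling.
DESCRIPTORS = {
    "node_id":     (0,  "<I"),   # offset 0, little-endian uint32
    "cache_dirty": (4,  "<Q"),   # offset 4, uint64 count of dirty pages
    "state":       (12, "<H"),   # offset 12, uint16 state-machine value
}

def decode_dump(raw: bytes) -> dict:
    """Decode every described field from a raw memory image."""
    return {name: struct.unpack_from(fmt, raw, off)[0]
            for name, (off, fmt) in DESCRIPTORS.items()}

# Simulate a "dump": pack some live state into its binary layout.
dump = struct.pack("<IQH", 3, 1024, 7)

print(decode_dump(dump))  # {'node_id': 3, 'cache_dirty': 1024, 'state': 7}
```

The point is that the decoder never needs the running system: given the raw bytes and the descriptors, every field is recoverable after the fact.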
It took us from 2001 to the middle of 2003 to finish the design, development and, most importantly, testing of the first SVC 1.1 release – to get it to the level of quality and reliability we had been used to with our historical (but by then >10-year-old) adapter firmware.
Over these years, I've learned there are at least three KEY features that are essential to the success of a storage system “architecture”. That's not to say there aren't a lot of other important things, but unless you have these KEY features, and they are built into the architecture from day one, you are, well, to be blunt, screwed.
- Concurrent Upgrade
  - You must be able to take your system from level X to level Y without any loss of access for the using systems.
- First Time Data Capture
  - All software has bugs in it.
  - It sounds like a simple problem: take a bunch of zeros and ones, move them through some post-processing, and then store them on some medium. I often wonder if our test team (what always seems a huge part of our development cycle) actually over-tests. Would our sizings be any different if it were a life-critical system? Probably not!
  - At some point, something will go wrong. You'd better be able to tell what it was, why it was, and how it can be fixed / resolved.
  - The last thing a customer wants to hear is “can you do that again so we can capture the trace…” or “we aren't sure, but maybe upgrading to the latest level will fix it?!”
  - I still pay my dues – you may be lucky, or is that unlucky, enough to get me on the end of the Level 3 support phone at some ungodly hour. So I know the pain…
- Scalability, in particular N-way-ness (Clustering)
  - 640KB… who would want more than that… Need I say more: plan for scalability.
  - What today seems excessive will tomorrow be laughable.
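To make the Concurrent Upgrade point concrete, here is a toy sketch of a rolling upgrade, where nodes are taken offline and upgraded one at a time so the remaining peers keep serving I/O. The function and node names are mine; this is an illustration of the general technique, not SVC's actual upgrade mechanism:

```python
# Toy rolling-upgrade sketch: at every step at least one peer stays
# online, so the using systems never lose access to their volumes.

def rolling_upgrade(nodes, new_level):
    """Upgrade each node in turn; peers keep serving I/O meanwhile."""
    log = []
    for node in nodes:
        peers = [n["name"] for n in nodes if n is not node]
        assert peers, "never take the only node offline"
        log.append(f"{node['name']}: offline, peers {peers} serving I/O")
        node["level"] = new_level  # the actual code load happens here
        log.append(f"{node['name']}: back online at level {new_level}")
    return log

cluster = [{"name": "node1", "level": "X"},
           {"name": "node2", "level": "X"}]
for line in rolling_upgrade(cluster, "Y"):
    print(line)
```

The essential property, which cannot be retrofitted, is that every component tolerates running alongside a peer at a different code level for the duration of the upgrade.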
SVC was designed from the bottom up by a set of core architects who had come from the team that lived the 24×7 7133 days, and we knew you couldn't retrofit any of these things. The base architecture not only benefited from being built with cluster-scalability, upgradability and debuggable-ness (I think I just made up a word) from day one, but also grew from the scalable, flexible, component-ised (encapsulated) firmware style that had evolved out of the SSA RAID adapter family.
In 2003 SVC was released as a 2-way dual-controller appliance (engine); in 2004 we added 8-way, and pretty much every 18 months since we have doubled the capability of an individual node (and hence the cluster). I often get asked when we are going to take SVC beyond 8 nodes. The truth is, the architecture will scale today to 32 nodes easily, and way beyond with a little tuning. However, since the “commodity” hardware base doubles in performance every 18 months, we have effectively scaled out the cluster some four-fold since 2003, so what would have been a 32-node cluster in 2004 is like an 8-node cluster in 2010! Or to put it another way, next year's 4-node cluster is like last year's 8-node…
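That node-doubling arithmetic is easy to sanity-check. A trivial sketch (the 18-month doubling rate is the assumption stated above; the function name is mine):

```python
# If per-node performance doubles every 18 months, a cluster needs half
# as many nodes every 18 months to match a fixed capability target.

def equivalent_nodes(nodes_then, years_later):
    """How many newer nodes match `nodes_then` older nodes."""
    return nodes_then / 2 ** (years_later / 1.5)

# One doubling period on: an 8-node cluster's work fits in 4 newer nodes.
print(equivalent_nodes(8, 1.5))   # 4.0

# Two periods (3 years): a 32-node cluster shrinks to 8 nodes.
print(equivalent_nodes(32, 3.0))  # 8.0
```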
In the last 7 years we've all been busy. The adapter silicon team has been working on new PCIe ASICs, the adapter hardware team has been working on FC and SAS protocols, and the firmware developers have been adding things like RAID-6 and atomic parity.
Those that moved into and joined the SVC team have been leading the way with enterprise-class snapshots (FlashCopy) and replication (Metro/Global Mirror), not to mention Space-Efficient virtual disks with zero detection, breaking that storage vendor lock-in, adding performance to existing storage (like a turbo unit), and of course bringing every IOP that enterprise SSDs can provide in our latest CF8 nodes with the SSD attachment…
Part 4 soon…