ORIGINALLY POSTED 17th October 2010
17,538 views on developerworks
There are two reasons that my blogging this year has been sporadic at best. One is the sheer volume of work: it’s become the norm at the moment for the whole team to be spending evenings and weekends working on our baby. Why have we all been so dedicated over the last year or so? Mainly because we’ve wanted to do this box for at least 15 years! There have been various proposals using the SSA RAID, and then of course SVC, and at last we got the chance to do it. (I urge you to read this insightful post from David Floyer. All I will say is that since this is built from the SVC core stack, the core of that stack is the clustering software…) We’ve all known that we had something really special, what with the enterprise class features and functions, the field hardened code bases and of course the combined team’s experience of something like 3000 years of storage development!
The second reason for my lack of blogging was the excitement. When you know (or at least think you know) you are working on something special, something that could lead the market in a new all-encompassing manner, it’s difficult to keep it to yourself. To a large extent we did; we even had internal NDA documents inside IBM, signed on a “need to know” basis only. Some of the sales guys, for example, only found out about this a week or two back. It was great to see that the excitement was not just within the development team. The press, analysts and customers that attended the launch events, our critical sales and channel partners, and the many independent writers and bloggers out there all saw that although this is just another modular storage array, it is at the same time just a little bit special…
One thing that wasn’t mentioned much, and that people have commented on (other than the name, and please, let’s not go there), is the VMware integration. The box is “SVC” to the outside world, i.e. at the SCSI layer, when asked for inquiry data, the box reports as an SVC model. Therefore, all the interop we have built up in the last 10 years with SVC comes on day one. Not only that, but the much loved “svctask” and “svcinfo” command line is unchanged. I say unchanged, but what I mean is that the legacy commands are unchanged; we have added a set of new objects like “drive”, “array”, “enclosure”, “enclosureslot” and “enclosurecanister”, and some new “fabric” views such as “sasfabric”. So any host level integration that requires recognising the LUN type, and of course connecting into our CLI to run snapshots or site failover, is unchanged (VSS, VMware SRM, TSM etc).
Now all that is out of the way, down to the nitty gritty. Let’s look at the hardware first.
Hardware in Detail
Back before the turn of the millennium, those clever bods in Almaden could see that the Intel server hardware was coming of age. I’ve talked about this on many occasions before, and at the time of SVC’s first release back in 2003 we had to spend a lot of time talking up Intel. Why didn’t we use Power? Why use just a 1U server? Is that really prime time? Is that reliable enough? I know our stalwarts of criticism (them with the other TLA) spent at least 6 years spreading FUD before doing a COMPLETE U-TURN with the latest max-i release of their aging architecture, which basically put a rubber stamp of “ok, they got it right, but we ain’t ever gonna admit that” on it.
x86 hardware has (despite selling into server shops) always been driven by the gaming industry. With the need for massive bandwidth between PCI (AGP) and now PCIe slots, CPU and memory, it’s always been pushing the forefront of technology to get those gaming graphics FPS out to the monitor. In a storage controller we also need that same bandwidth, mainly between memory (cache) and PCIe, where we can hang our Fibre-Channel ports, and of course now in the Storwize V7000 the SAS interfaces.
So what if we took what we had been doing in SVC for years, but shrunk the 1U server into a much more storage friendly form factor? Say Storage Bridge Bay (SBB) for example. We could still have the Fibre-Channel “PCIe card”, a SAS “PCIe card”, maybe even a PCIe internode connection, and still have spare room for another TBD “PCIe card”… people have been doing micro and even smaller PC form factors for years…
The same goes for the battery, since we need a way to hold up power/memory to enable a graceful shutdown should the site power be lost. With SVC we’ve been using a separate 1U UPS, but why not embed the battery in the PSU itself, rather than take up space in the control canister?
The SVC code stack has a hefty investment in the PMC Sierra Tachyon based Fibre Channel cards, and like the latest CF8 node hardware the Storwize V7000 uses their QE8, a 4 port 8Gbit protocol chip. In addition we are using their SPC SAS protocol chip, which the observant will know is the same chip we provided as an optional feature on the CF8 node hardware with the SSD attachment. The base controller unit also includes a PMC SAS Expander chip, which is connected to one of the wide ports on the SPC chip and to the 12 or 24 internal drive phys on the control enclosure.
Each of the expansion enclosures comes with two expansion canisters, each with an “in” and an “out” wide SAS port, both of which again connect into a PMC SAS Expander chip, which connects to the 12 or 24 drives in the expansion enclosure. The system can support up to 120 dual ported drives on each SAS chain. We have two chains, marked as ports 1 and 2 on the back of the control canisters. Each chain can therefore have 5 SAS Expanders, so port 2 (what we also call the internal chain) is connected to the first 12 or 24 drives, and can then connect up to 4 expansion units. Port 1 on the control canister can connect up to 5 expansion units. We recommend you add expansion units on each port alternately, hence the kinda inverse naming of the ports, so that when you connect your first expansion unit, you use the logically named port 1.
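To make the chain arithmetic concrete, here is a rough Python sketch of the limits just described. The function and constant names are mine, purely for illustration, and it assumes every enclosure is the 24 drive model (the 12 drive enclosures would simply lower the totals).

```python
# Illustrative sketch only: names are hypothetical, limits are taken from the text.
# Port 2 is the internal chain, so one of its 5 expanders is the control enclosure.
EXPANSION_LIMIT_PER_PORT = {1: 5, 2: 4}


def assign_expansions(count):
    """Spread `count` expansion enclosures across ports 1 and 2 alternately,
    starting with port 1, as recommended above."""
    max_total = sum(EXPANSION_LIMIT_PER_PORT.values())
    if not 0 <= count <= max_total:
        raise ValueError(f"expected between 0 and {max_total} expansion enclosures")
    per_port = {1: 0, 2: 0}
    for i in range(count):
        port = 1 if i % 2 == 0 else 2
        if per_port[port] >= EXPANSION_LIMIT_PER_PORT[port]:
            port = 2 if port == 1 else 1  # that chain is full, use the other one
        per_port[port] += 1
    return per_port


def drives_per_chain(per_port, slots_per_enclosure=24):
    """Drive slots behind each port, assuming uniform enclosures.
    The control enclosure's own drives sit on port 2."""
    return {
        1: per_port[1] * slots_per_enclosure,
        2: (per_port[2] + 1) * slots_per_enclosure,
    }


if __name__ == "__main__":
    layout = assign_expansions(9)
    print(layout)                    # expansion enclosures per port
    print(drives_per_chain(layout))  # drive slots per chain
```

With all nine expansion units attached, both chains end up with five expanders and 120 drive slots each, which is where the 240 drive maximum comes from.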
At the back of the enclosures we have dual redundant control canisters, or expansion canisters. The bottom canister is upside down, as is required by the SBB standard. The rule for connecting SAS cables is that you connect all “top” canisters to other “top” canisters, and all “bottom” canisters to other “bottom” canisters. The cabling really is pretty damn simple, certainly when compared to the fun and games of FC-AL cabling!
I guess I should finally talk about performance. SPC results will be published in due course, but here is some internal benchmark data. The rough comparison for existing SVC users is that a Storwize V7000 has roughly 75% of the power of the current CF8 node hardware when used in “SVC mode”, i.e. with no internal disks. That means we are talking about needing about 800-1000 external 15K RPM disk spindles to come close to saturating its virtualization capabilities.
The following HDD tests were all done using 240x 10K RPM 2.5″ SAS drives configured in RAID-5 7+P. The SSD results were using 24 SSDs in 5+P RAID-5 arrays (note that this many SSDs is not required to achieve these results). Those marked * are disk limited and have room for improvement with 15K RPM drives, or other additional options next year…
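As a quick sanity check on those test configurations, the array counts fall out of simple division. The sketch below is illustrative only: the helper name is made up, and it assumes every drive is placed in an array with none held back as a spare, which the test description does not actually state.

```python
# Illustrative arithmetic only; assumes no drives are reserved as hot spares.
def raid5_layout(drives, data_per_array):
    """Split `drives` into RAID-5 arrays of `data_per_array` data drives + 1 parity."""
    width = data_per_array + 1
    arrays, leftover = divmod(drives, width)
    return {
        "arrays": arrays,
        "data_drive_equivalents": arrays * data_per_array,
        "leftover_drives": leftover,
        "usable_fraction": data_per_array / width,
    }


if __name__ == "__main__":
    print("HDD 7+P:", raid5_layout(240, 7))
    print("SSD 5+P:", raid5_layout(24, 5))
```

Under that assumption the HDD configuration works out to 30 arrays and the SSD configuration to 4 arrays, with no drives left over in either case.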