ORIGINALLY POSTED 14th October 2009
17,301 views on developerworks
I’ve been waiting for what seems like far too long for this series of posts. The development and test teams have been working flat out for the last year to bring SVC 5.1 to market.
Today IBM issued a press release detailing a little of what’s coming next month with SVC release 5 software and the new ‘CF8’ node hardware. The node is based on the recently released Intel Xeon 5500 ‘Nehalem’ CPUs and provides a PCIe Gen2 bus into which we’ve put a new 8 Gbit, 4-port, QE8-based HBA.
Last year’s Quicksilver technology demo threw another gambit into the mix, and the plans had to be adjusted accordingly when the immediate customer and sales interest made the demand for such a solution evident. Requests came flooding in along the lines of “I’ll take 10TB today”. What started as a way to demonstrate the value of SSD in an already scale-out, high-performance architecture like SVC has become a reality, and in very short order too, thanks to the hard work and dedication of the SVC team. Development of SVC 5.1 had already begun when we took the decision to include native SSD support in the nodes themselves.
What may surprise some is that this isn’t done using the FusionIO ioDrive cards that we used in Quicksilver, but with STEC’s latest SSD form factor and interface, namely SFF 2.5″ SAS ZeusIOPs drives – other than IBM’s System p servers, this makes us the first vendor to offer enterprise-class SAS SFF SSDs in a storage system. There were various reasons for switching from FusionIO to STEC; the most obvious is that a hot-pluggable drive is much easier to service than an internal PCIe adapter (when it comes to a 24×7 enterprise storage system, anyway). The STEC drives also offload all of the ‘Flash Management’ required by enterprise-class SSDs (the garbage collection of old, no-longer-active copies, the wear levelling, the collation of writes, etc.) into the device itself.
Edit, 23rd October: I’ve had quite a few comments, queries and requests to clarify what I briefly alluded to here. I guess competition is fierce, and it’s not only the Evil Machine Corp that reads my posts and decides to generate FUD based on what looks like hearsay… shame on you all. Anyway, the key concept here is that SVC is an in-band appliance, and as such all I/O flows through the SVC ports; we have designed the product to make 100% efficient use of each core. That means we can’t spare MIPs for anything other than the SVC threads… thus, to enable a high-performance SSD solution without any impact on our existing I/O processing, we needed a device that offloaded all the small-block I/O request handling, wear levelling, garbage collection and so on.
The SVC SSD solution is aimed at IOPs, not MB/s. The ioDrive cards provided great IOPs AND MB/s, but the SVC packaging limited the number of ioDrives we could accommodate. If you want a low-latency, high-IOPs AND MB/s solution with locality of access – as IBM and FusionIO provide in the System x solutions – then you cannot get a better option than the ioDrive, but since SVC is a very different, MIPs-optimised solution we had to look at alternative options.
SVC has stringent requirements on serving I/O, and our architecture tries to avoid kernel switching as much as possible. SVC runs as a user-mode process; the hardware we use can be mapped into user mode, and SVC manages the scheduling itself. The other main reason to switch from an ioDrive to a ZeusIOPs device was form factor. SVC is what’s commonly known in the industry as a ‘pizza box’ server, that is, 1U in height. These boxes are always the first out with the latest and greatest IBM has to offer in the Intel server market. However, space is at a premium and so we only have two available PCIe slots. This would only have allowed one PCIe-based SSD, versus the multiple 2.5″ drive bays.
Anyway, here is a bit more detail about the new CF8 node – this really is a leap forward in both bandwidth and performance:
New premium node hardware – CF8
- Double the throughput (IOPs and GB/s) capability of the existing (already industry leading) 8G4 nodes
- Upgrade option to add SAS SSD drives to each node via an IBM-developed, very high throughput SAS HBA
- Triple the cache capacity per node – 24GB, for 192GB in an 8-node cluster (with future upgrade possibilities) – ECC DDR3
- 4x 8Gbit Fabric ports (capable of running at 95% of potential bandwidth)
- Dual-redundant internal power supplies
- Native iSCSI host attachment… (not only for new, but also for existing hardware platforms!)
New Software Functions
- Multiple Cluster Mirroring (support for up to 4 clusters in various disaster recovery configurations)
- Split-cluster GA Support (ability to split the nodes in the cluster across 2 sites as an HA solution)
- Improvements in I/O handling for Vdisk Mirroring, massively reducing failover times
- Multi-target Reverse FlashCopy (fast recovery from backups, before they have completed and without disrupting backup cascades)
- Space-efficient Enhancements
- Use Vdisk Mirroring to migrate a “thick” virtual disk to a “thin” one
- Zero Impact Zero Detection on incoming writes – don’t allocate any space when an incoming write contains all zeros
- Cache Algorithm Optimisations – Enhanced cache management algorithms provide response time reductions and increased throughput – even on existing node hardware
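To illustrate the zero-detection idea in the list above: on an incoming write to a space-efficient vdisk, the data is scanned and, if it is all zeros, no backing space is allocated at all. Here is a minimal C sketch of that decision path – the `grain` structure, `GRAIN_SIZE` and `handle_write` are hypothetical names for illustration, not SVC internals:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GRAIN_SIZE 32768  /* hypothetical allocation unit for a thin vdisk */

/* Hypothetical thin-provisioned grain: backing store is allocated lazily. */
struct grain {
    uint8_t *backing;     /* NULL until real (nonzero) data arrives */
};

/* True if every byte in buf is zero. Uses the overlapping-memcmp trick:
   if buf[0] == 0 and buf[i] == buf[i+1] for all i, every byte is zero. */
static bool all_zero(const uint8_t *buf, size_t len)
{
    return buf[0] == 0 && memcmp(buf, buf + 1, len - 1) == 0;
}

/* Handle an incoming write: skip allocation entirely when all-zero data
   lands on a still-unallocated grain. (Writes at grain offset 0 only,
   for brevity.) */
static void handle_write(struct grain *g, const uint8_t *data, size_t len)
{
    if (g->backing == NULL) {
        if (all_zero(data, len))
            return;  /* zero write to an unallocated grain: nothing to do */
        g->backing = calloc(1, GRAIN_SIZE);
    }
    memcpy(g->backing, data, len);
}
```

The same check is what makes the thick-to-thin migration via Vdisk Mirroring work: as the mirror copy streams through, all-zero regions of the thick disk simply never get allocated on the thin copy.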
All in all, the team has been very busy – qualifying a new hardware platform (the Nehalem-based Xeons really are quite stunning – more on this later), adding native SAS SSD mdisk attachment (much more on this later – think two RamSan-500s in just 2U), Multiple Cluster Mirroring, Reverse FlashCopy, Zero Detection for SEV, and an enhanced cache solution.
In my next post I’ll provide more details of the hardware and SSD performance – needless to say, SVC will be the first system that can actually drive an STEC ZeusIOPs drive to its maximum IOPs potential while maintaining sub-1ms response times. I had to hold off from commenting on so many recent Register posts from Chris Mellor – in the comments, some sceptics claimed (without obviously understanding where existing storage system bottlenecks lie) that STEC’s SSDs don’t match their claims. Well, I can assure you they do – and not just at peak, but sustained, when coupled with the right enclosure. Similarly, this week 3Par have been making more news about their custom ASIC that can be used to do thick-to-thin migrations and, I think, inline zero detection. I will now have to go round and comment on those posts… why waste the time and effort on a custom ASIC when the Intel Xeon already provides 64bit SSE instructions that can do 64bit memory compares at *MASSIVE* bandwidths with almost no impact to the normal I/O processing? Thanks to one of our lead architects, BillS, the 20 or so lines of Intel assembly can use the Xeon as a very efficient zero-detect offload engine.
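For a flavour of how a general-purpose Xeon can act as a zero-detect engine, here is a rough C sketch using SSE2 compiler intrinsics. The actual SVC code is hand-written assembly and is not reproduced here, so treat this purely as an illustration of the technique: wide SIMD loads and compares chew through a buffer 16 bytes at a time.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the buffer is entirely zero, 0 otherwise.
   Assumes buf is 16-byte aligned and len is a multiple of 16,
   as the aligned loads below require. */
static int block_is_zero(const uint8_t *buf, size_t len)
{
    const __m128i zero = _mm_setzero_si128();
    for (size_t i = 0; i < len; i += 16) {
        __m128i v = _mm_load_si128((const __m128i *)(buf + i));
        /* Compare all 16 bytes against zero; movemask collapses the
           result to one bit per byte, so 0xFFFF means all 16 matched. */
        __m128i cmp = _mm_cmpeq_epi8(v, zero);
        if (_mm_movemask_epi8(cmp) != 0xFFFF)
            return 0;  /* found a nonzero byte: bail out early */
    }
    return 1;
}
```

Because the compare runs at near memory bandwidth and touches no device, the “offload engine” is just a handful of spare instructions per cache line on a core that was reading the data anyway.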
If you have been thinking you need a high performance solution to help make better use of what you already have, or can no longer afford to short stroke lots of expensive FC drives, there’s never been a better time to look at what SVC can do for you.