Would you trust Windows to Virtualize your Storage?

ORIGINALLY POSTED 2nd December 2007

12,831 views on developerworks

Back in June, Chris Mellor stirred up a few people by suggesting that an unnamed source (I’m guessing someone at DataCore with a chip on their shoulder) was claiming that IBM and HDS stole, or even reverse engineered, the product to produce our respective SVC and USP products. Tony Pearson quite clearly outlined why this was not the case. I can back up his comments from experience, as I was part of the SVC Architecture team around the millennium, when SVC had been solidified into a product and the “Storage Software Group”, as we were then, had been formed. Prior to that, several members of the Hursley development team had been working in Almaden in the late 1990s on a research project looking to build a cluster infrastructure specifically aimed at using ‘commodity hardware’ to build a reliable, enterprise-ready storage cluster. I remember the discussions at the time about what re-selling DataCore would mean for our customers, as we knew then that SVC was only a year or two from production. Luckily the actual uptake of the re-sold product was small, but enough to validate and reinforce our thinking on the appliance-based solution.

One thing we considered was making SVC a ‘software only’ product. At the time, the implications for RAS (Reliability, Availability and Serviceability) of going software only were too great. The DataCore experience backed this up, and the decision was to continue with the modular node approach, using relatively low-cost xSeries hardware as the building block. The only custom hardware in the nodes is the front panel (a very simple display, watchdog timer, and flash device for recovery from a failed or failing internal HDD) and, I guess, the Tachyon-based HBA, which is a reference design from Agilent (as they were) and now PMC Sierra.

Anyway, to bring things back to the present day, there is an interesting article from Jon Toigo [DataCore has reason to Celebrate]. First of all, it is interesting that this appears on the “Enterprise Systems” website. I guess there are a few, but I’d question how many Enterprise data-centers are run using Windows at their heart. Specifically, if I were deploying a Virtualized Storage SAN I would not trust Windows to manage the most critical of systems, the virtualizer itself. This all comes back to why we chose a dedicated platform for SVC and as thin an operating system as possible, and one that is under our own control. If you want to make any kind of statement about performance or enterprise RAS, especially availability, you need a very fine grain of control over every level of the product. I’ve discussed at length before the three major approaches to virtualization: Switch, Controller and Appliance. While SVC and DataCore share the same high-level ‘Appliance’ approach, the devil is in the details. I must confess that I haven’t looked in too much detail at DataCore recently, but from previous experience, although you can get relatively low-level access on Windows these days, there is so much bloat in the base operating system (BOS) that who knows what will be switched into context at the expense of your critical storage process. Since it’s Windows, you will have to run some form of anti-virus and firewall, and be forever updating to the latest security levels. With SVC and its very, very limited TCP/IP access (purely for configuration access to one node in the cluster, using ssh and a very restricted shell) you don’t have the same risks or the need to update the software every other day. I could go on…

Back to the article. A couple of things stand out. The grab line says they are having somewhat of a renaissance in the US, yet the article states that they are selling well in Europe. Some vague numbers are quoted with an error percentage of 66%, but is this customers or licenses for each box? The CTO goes on to discuss some of the problems they’ve faced, with different Storage Controllers handling multi-pathing in different ways. This is true, and it’s why SVC also has to qualify storage controller families. It’s not the actual multi-pathing that’s the issue, at least not with SVC, as it handles multi-pathing internally in the SVC code and does not rely on any 3rd party software or thin MPIO layers. It’s the error recovery procedures and actual path failover that need to be validated. Each family of controllers handles things differently, and even some firmware levels change the behaviour. SVC will find almost any SCSI-based disk that is presented to it, and while all may work well on the good path, when something goes wrong on the controller all bets are off. With SVC controlling the internal Fibre ports, however, at least there is one less unknown in the mix. The same goes for host multi-pathing. IBM provides the Subsystem Device Driver (SDD), which is a common multi-pathing solution, both pure and MPIO-based (for the operating systems that provide MPIO), across almost all the major OS types.

The CTO of DataCore is quoted as saying:

We can stripe across LUNs exposed by different vendors, but it isn’t a perfect result. At a minimum, the slowest LUN sets the performance of the resulting volume

I’m not sure what problems they are having. The only issue with SVC is that if you stick a slow disk in the middle of your faster disks then you will slow the striped virtual disk down. This is common sense, and no different from arrays. You wouldn’t put a SATA disk in the middle of a 15K RPM RAID-5 array. It’s the same for virtualization. SVC helps you manage this by providing “Groups” of Managed disks, where the guideline is that you only add similar disks to a single group, so you can then guarantee a set performance characteristic from the Virtual disks provisioned. What strikes me here is his ‘minimum’ comment. What other problems do they have?
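
To put rough numbers on the slow-disk point, here is a minimal sketch in Python. It is a deliberately simplified model of my own, not how SVC or DataCore actually schedule I/O: it assumes a large sequential transfer split evenly across every member of the stripe, so the transfer only completes when the slowest member finishes. The function names, disk names and MB/s figures are all hypothetical.

```python
from collections import defaultdict


def striped_throughput(member_rates_mb_s):
    """Estimate throughput (MB/s) of a volume striped evenly over its members.

    Simplified model (an assumption): every member receives an equal share
    of the data, so the whole transfer waits for the slowest member.
    """
    if not member_rates_mb_s:
        raise ValueError("need at least one member disk")
    slowest = min(member_rates_mb_s)
    if slowest <= 0:
        raise ValueError("member rates must be positive")
    return len(member_rates_mb_s) * slowest


def group_by_class(disks):
    """Group (name, disk_class, rate) tuples so each group holds similar disks."""
    groups = defaultdict(list)
    for _name, disk_class, rate in disks:
        groups[disk_class].append(rate)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical managed disks: three fast ones and one slow one.
    disks = [
        ("mdisk0", "fc15k", 200),
        ("mdisk1", "fc15k", 200),
        ("mdisk2", "sata", 80),
        ("mdisk3", "fc15k", 200),
    ]
    mixed = [rate for _name, _cls, rate in disks]
    print("all disks in one group:", striped_throughput(mixed), "MB/s")
    for disk_class, rates in group_by_class(disks).items():
        print(disk_class, "group:", striped_throughput(rates), "MB/s")
```

In this toy model, the one slow disk drags the four-way stripe down to four times the slow disk's rate, whereas keeping like with like gives each group a predictable figure. That is the whole reason for the "similar disks in a single group" guideline.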

With respect to VMware, SVC was the first Storage Virtualization product to be certified for VMware Infrastructure 3.0.2 in their new Storage Virtualization category. I don’t see any of the DataCore products listed as VMware certified, only the community support statement that it works under 3.0.1 and older versions.

While it’s good to see that another Appliance-based approach is providing customers with Storage Virtualization, anyone considering or comparing DataCore to SVC (as I often see in the search hits on my blog) should consider whether they would trust Windows at the heart of their SAN, and whether they can live with the uncertainty of performance given the lack of control that comes with an ‘off the shelf’ Operating System and a hardware/software stack in the appliance that you now have to maintain yourself. Isn’t that adding more complexity than is necessary?
