ORIGINALLY POSTED 19th August 2007
11,419 Views on developerworks
Cornerstone #1 – Online data migration
It’s always nice to influence people’s lives in a positive way. Those of you reading this have no doubt had the fun of being ‘on call’ for your respective company: that frantic storage call at 3am on a Sunday morning when things haven’t quite gone to plan. Wouldn’t it be great if the tasks that storage administrators usually spend the wee small hours of a Sunday morning doing could be done during the normal working day?
Chuck Hollis is one of the first to acknowledge that today’s storage has a shelf-life of two or three years. What does a customer do when they wheel in the next-generation storage controller? They somehow have to copy everything onto it before they can start using it. Before storage virtualization this would require a tape or LAN backup/restore. With a virtualized SAN, not only can you do this without affecting the hosts themselves, but you can do it while users still access the data.
One of the first SVC reference customers discovered very quickly that with SVC this was a reality, not a pipe-dream. Attach the new storage to SVC, create your managed disk group (pool) from the new array’s storage and simply migrate from old to new. Because the host itself talks to the virtual disks, that mapping doesn’t change. Only the meta-data that describes where the data actually lives has to change, and this is updated as the data moves, extent by extent. This customer’s storage admin said that the person most pleased about this was his girlfriend – who soon became his wife – as she got to spend more time with him in the evenings and at weekends. Which takes me back to my opening sentence: it’s always nice to influence people’s lives in a positive way – assuming that your girlfriend becoming your wife is a positive thing 🙂
It’s like telling a DOS user in the late ’80s to imagine that they didn’t have to worry about EMM386 or HIMEM for their particular application’s needs – the system would just virtualize the memory. Today we take virtual memory for granted, and what kind of operating system would it be if it didn’t provide it as standard? That’s how I think storage virtualization will be seen in a few years.
The ability to migrate data online, without wasting host MIPS or LAN bandwidth, has to be one of the key applications that ‘comes for free’ once you’ve virtualized. Once you start to realise what this can do for you, the uses become a bit more exciting. You now have the basic tools that allow you to implement a true information lifecycle. With policy-based software sitting above the virtualization device, data can be moved between tiers as and when your policy dictates. This does however require some additional information to fully automate. For example, one use case would be some ‘hot’ data that gets hit constantly for a month, then occasionally for a few months, and may still be accessed for several years. The low-level data migration is there, but you need a policy engine and the statistics regarding how the data has been accessed and how it fits with the policy set down. The access statistics need to be collected somewhere, and in general this is done today by host-based software. With an in-band appliance or virtualizing controller the data flows through the device, and therefore these devices themselves could record access statistics, not only for the virtual disks but also the managed disks. This is much more difficult to do with a switch-based approach. Because there is nothing ‘caching’ or viewing the data as it passes through the device, you need some additional hardware or software layer and a place to record access statistics – without influencing the latency of the frames. I’ll discuss this more in the next part of this story.
One other great advantage of online data migration is the ability to manage hot-spots and adjust performance online. No matter how good the original design of an enterprise and the layout of the arrays, pools and virtual disks, user requirements change and application workloads usually grow. What could satisfy the users’ performance requirements yesterday may not cut it today. Previously you would have to schedule migrating the data from one array to another; with a virtualized SAN you have more choices, all of which can be done during the working day. Maybe you do migrate to a new set of arrays or controllers with better performance characteristics, but what about adding more arrays to the pool and re-striping the virtual disk across more spindles? Of course this assumes you have some way of determining the performance at both the virtual and physical ends of the device – again, another topic to be discussed in detail later.
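To make the extent mechanics concrete, here is a minimal sketch of extent-by-extent online migration and re-striping. This is a hypothetical model for illustration, not SVC’s actual implementation; all class and function names are invented.

```python
# Hypothetical model of extent-based online migration. A virtual disk is a
# list of extents; each map entry records which managed disk (mdisk) and
# which extent on that mdisk currently holds the data. Hosts only ever see
# the virtual disk, so moving data is a copy plus a map update.

class VirtualDisk:
    def __init__(self, extent_map):
        # extent_map[i] = (mdisk_name, extent_index_on_that_mdisk)
        self.extent_map = list(extent_map)

    def migrate_extent(self, i, new_mdisk, new_index, copy_fn):
        """Copy one extent to its new location, then flip the map entry.
        The host addressing never changes; only this meta-data does."""
        old_location = self.extent_map[i]
        copy_fn(old_location, (new_mdisk, new_index))   # back-end data copy
        self.extent_map[i] = (new_mdisk, new_index)     # map update

    def restripe(self, mdisks, copy_fn):
        """Re-stripe all extents round-robin across a (larger) pool of mdisks,
        spreading the workload over more spindles."""
        for i in range(len(self.extent_map)):
            target = mdisks[i % len(mdisks)]
            self.migrate_extent(i, target, i // len(mdisks), copy_fn)

# Migrate a six-extent virtual disk from one old array onto three new ones.
vdisk = VirtualDisk([("old_array", n) for n in range(6)])
vdisk.restripe(["new_a", "new_b", "new_c"], copy_fn=lambda src, dst: None)
print(vdisk.extent_map[:3])   # [('new_a', 0), ('new_b', 0), ('new_c', 0)]
```

In a real device each map update would have to be atomic with respect to in-flight host I/O, which is exactly why the virtualizer, and only the virtualizer, can do this safely.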
So how do the three approaches stack up when thinking about migration:
- In-band appliance and Controller based
Because of the re-driving of I/O requests to downstream storage you get this for free: data is copied extent by extent, and after each extent is online in its new location the virtualization map (meta-data) is updated.
- Switch based – split-path
Because the I/O request is re-directed it also just works; again a copy of the source data is required, but once the ‘extent’ is online in the new location the virtualization map can be updated.
So far so good: all three approaches can provide online data migration. But as noted above, those devices that do actually manage the I/O requests and have local resources can provide the additional ability to gather access statistics for both virtual and managed disks, without the need for additional hardware or software. (It should be noted that split-path approaches don’t completely negate this ability – it’s just seriously more difficult to implement.)
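The statistics-gathering point can be sketched too. Because every I/O flows through an in-band appliance or virtualizing controller, the device can count accesses against both the virtual disk the host addressed and the managed disk that actually served the data. This is a hypothetical illustration; the names and the extent size are invented.

```python
# Hypothetical in-band access-statistics recorder. On each I/O the device
# already resolves the virtual LBA to a managed disk via the extent map,
# so counting hits on both ends costs almost nothing extra.

from collections import Counter

class StatsRecorder:
    def __init__(self):
        self.vdisk_hits = Counter()   # per-virtual-disk access counts
        self.mdisk_hits = Counter()   # per-managed-disk access counts

    def record(self, vdisk, extent_map, lba, extent_size=16):
        self.vdisk_hits[vdisk] += 1
        # The same map lookup the data path performs tells us which
        # back-end managed disk holds this LBA.
        mdisk = extent_map[lba // extent_size]
        self.mdisk_hits[mdisk] += 1

stats = StatsRecorder()
extent_map = {0: "tier1_array", 1: "tier3_array"}   # extent -> mdisk
for lba in (0, 3, 20):
    stats.record("vdisk0", extent_map, lba)
print(stats.vdisk_hits["vdisk0"], stats.mdisk_hits["tier1_array"])  # 3 2
```

These counters are exactly the raw input a tiering policy engine would need, and in a split-path switch there is no equivalent place in the data path to accumulate them without extra hardware.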
Cornerstone #3 – Enterprise-level Copy Services
Copy Services is an IBM term used to describe anything that provides some level of data copying – whether for backup, archive, application cloning, or disaster recovery. The term is generally used to cover ‘point-in-time copies’ (FlashCopy) and remote replication (MetroMirror/GlobalMirror).
FlashCopy services are great. Let’s pretend that we’ve made a complete copy of a disk (or group of disks) when in actual fact we’ve done nothing. I like it. What actually happens is that any change made after the point in time the copy was taken causes a ‘split’ of a small ‘chunk’ between the source and target. The old data ‘chunk’ is copied to the target before the new data is written to the source. A read to the target then comes either from the target (where the data requested has changed since the flash was taken) or from the source.
Such services can reside below a cache, but require a preparation step where the cache is flushed for those disks – thus ensuring you have a consistent image of what is about to be ‘flashed’.
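The copy-on-write behaviour described above can be sketched in a few lines. This is a simplified, hypothetical model of a ‘sparse’ point-in-time copy, not any product’s actual code; the bitmap here is the meta-data that makes the target usable the instant the flash is taken.

```python
# Hypothetical sparse point-in-time copy (copy-on-write). A 'split' bitmap
# records which chunks have been preserved; the target is online as soon as
# the bitmap exists, with nothing actually copied yet.

class FlashCopy:
    def __init__(self, source):
        self.source = source     # dict: chunk index -> data (the live disk)
        self.target = {}         # only split (preserved) chunks live here
        self.split = set()       # bitmap: chunks already copied to target

    def write_source(self, chunk, data):
        if chunk not in self.split:
            # First write to this chunk since the flash: preserve the old
            # data on the target before overwriting the source.
            self.target[chunk] = self.source.get(chunk)
            self.split.add(chunk)
        self.source[chunk] = data

    def read_target(self, chunk):
        # Split chunks come from the target; untouched chunks still read
        # straight through from the (unchanged) source.
        return self.target[chunk] if chunk in self.split else self.source.get(chunk)

src = {0: "A", 1: "B"}
fc = FlashCopy(src)              # 'copy' complete: nothing moved yet
fc.write_source(0, "A2")         # triggers copy-on-write of chunk 0
print(fc.read_target(0), fc.read_target(1))   # A B  (the point-in-time image)
```

Every host write to the source and read from the target must consult that bitmap first, which is why the speed of the meta-data lookup matters so much in the cluster discussion below.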
Data mirroring, or replication, is not a new concept; RAID-1 and RAID-10 are well-known simple techniques for ensuring data is not lost due to an individual drive failure, and most operating systems provide some form of volume mirroring. Locality of data is an issue however, especially after world events like 9/11. Remote replication is a must for most medium and enterprise customers. All data is of value to its users and a loss of that data is generally not acceptable. IBM calls short-distance replication MetroMirror, and long-distance replication GlobalMirror. In general these can be distilled to the common synchronous and asynchronous techniques.
Such services generally have to reside above any cache to ensure a consistent mirror image at all times.
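The synchronous/asynchronous distinction can be illustrated with a toy model. This is a hypothetical sketch, not MetroMirror or GlobalMirror themselves: the key difference is simply where the host acknowledgement happens relative to the remote write.

```python
# Hypothetical model of synchronous vs. asynchronous replication.
# Synchronous: the host ack waits until both sites hold the write.
# Asynchronous: the host ack follows the local write; the remote copy
# lags behind in an ordered queue drained in the background.

from collections import deque

class ReplicatedVolume:
    def __init__(self, synchronous=True):
        self.local, self.remote = {}, {}
        self.synchronous = synchronous
        self.pending = deque()   # buffered writes awaiting replication --
                                 # real buffering needs grow with link distance

    def write(self, lba, data):
        self.local[lba] = data
        if self.synchronous:
            self.remote[lba] = data          # remote completes before the ack
        else:
            self.pending.append((lba, data)) # ack now, replicate later
        return "ack"                         # host sees completion here

    def drain(self):
        # Async background replication, preserving write order.
        while self.pending:
            lba, data = self.pending.popleft()
            self.remote[lba] = data

vol = ReplicatedVolume(synchronous=False)
vol.write(0, "x")
print(vol.remote)        # {}  -- the remote site lags until drain()
vol.drain()
print(vol.remote)        # {0: 'x'}
```

The `pending` queue is the buffering referred to later in the switch-based discussion: it has to live somewhere with real memory behind it, and it grows with link latency.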
These copy services have become a consideration in most storage controller purchases. Even some entry-level controllers provide a level of replication or ‘flashing’. The problem now arises that if you virtualize above any controller you have just made its copy services redundant. This may not be instantly obvious, so let me clarify. Because only the virtualizer knows the mapping between a virtual disk and the managed disks (arrays), only the virtualizer can make a consistent copy or mirror of all the ‘extents’ that make up any one virtual disk. Therefore your virtualizer has to provide competitive copy services. This means you need to ensure that the virtualization approach and product you select provides the same level of copy services you use today – or plan to use.
Mark (storagezilla) was quick to point out that adding virtualization above a controller essentially blows its brains out – and this is true of all three approaches. Only the virtualizer knows the mapping between virtual disks and actual arrays in terms of LUNs and LBAs, therefore any copy services in the underlying storage are now practically useless. I say practically because most products do provide exceptions where you can still make use of the storage controller’s copy services; however these usually mean you have to disable or bypass some or all of the features that provide the ‘Cornerstones of Virtualization’…
For these reasons, the virtualization device must provide enterprise-level copy services if it is to virtualize the enterprise. This is actually a major advantage over traditional controller-based copy services. You can now use the same advanced copy services to replicate between your Tier 1 enterprise storage and your Tier 3 low-end storage, and between different vendors’ products – whilst still using the same copy services, tools and processes. Most enterprise array controllers are now marketing or planning SAS, FATA or SATA drives in the same box; virtualization allows you to keep these devices separate and vendor-neutral. Most importantly, your purchasing decision can now be based on the reliability, availability, serviceability (RAS) and performance of the arrays/controllers themselves and not the copy services they provide. The ideal purchase now becomes a ‘low-cost RAID brick’, assuming the brick itself has satisfactory RAS and performance characteristics.
This is where the approach taken starts to make a big difference:
- In-band Appliance
Because the appliance maintains the virtualization mapping it must also provide the copy services, for the reasons stated above. With an appliance that provides copy services you need to be sure that services like FlashCopy, MetroMirror and GlobalMirror (if you need them) can be achieved. By this I mean that something like FlashCopy (in a true implementation, i.e. not 100% cloning) requires very fast and non-disruptive meta-data or bitmap updates across the entire cluster of virtualizers. When a ‘sparse’ FlashCopy is created – with no background copy – the target disk is only populated with the blocks or ‘extents’ that are modified after the point in time. This means that the target ‘copy’ is online and available as soon as the meta-data or bitmap has been created. When any system writes to the source disk or reads from the target disk you need to be able to determine very quickly whether something other than a plain write or read needs to happen. This requires a system that can perform very quick meta-data lookups and synchronize updates between all cluster members. Unless caching and striping are disabled, the underlying controller copy services are now redundant.
- Controller based
Here again the controller maintains the virtualization mapping, and probably already has code to provide copy services (for internal disks), so it’s not a huge leap for it to provide the same copy services to its external storage. This is simple when you have only one controller (pair) providing access to the external storage, but it requires a higher-level clustering application or embedded controller-cluster software layer to enable ‘multi-virtualizing-controller’ copy services. That is, extending the copy services outside the domain of the virtualizing controller and its external storage is very complex. As with in-band appliances, the external storage controller’s native copy services are now redundant.
- Switch based – split-path
This is where the problems start. You can take a point-in-time ‘clone’ of a device, but you end up having to copy all the data from the source to the target. This means the target ‘copy’ cannot be brought online until the entire copy has completed – some minutes or hours later. So in a way you do get a full-copy flash, but implementing a ‘sparse’ flash or an ‘incremental’ series of ‘cascaded’ copies is extremely difficult and causes major headaches when trying to scale out to n-way clusters of intelligent line cards. A multi-way switch design is also very difficult to code and implement because of the difficulty of keeping the meta-data synchronized across all processing blades with fast updates – this has to be done at wire speed or you lose that claim. For the same reason, space-efficient copies and replication are also seriously difficult to implement. Both synchronous and asynchronous replication require some level of buffering of I/O requests – while switches do have buffering built in, the number of additional buffers would be huge and grows as the link distance increases. Most intelligent line cards do not today provide anywhere near this level of local storage. The most common solution is to use an external box to provide the replication services – yet another box to manage and maintain, going against what virtualization promises. As with the other approaches, unless striping is disabled, the underlying controller copy services are now redundant.
It’s important to weigh up your data migration and copy services needs before you make the decision on which device to install. I’ve outlined here why I see problems with switch-based split-path solutions when it comes to copy services – and it may explain why, for example, Invista only provides ‘cloning’ capabilities today – I’m sure there is more to come. But the complexity of coding in a switch-based wire-speed environment should not be under-estimated.
A virtualizing controller can provide its copy services to externally attached storage, but the box must be able to cluster with other such boxes to enable out-of-the-box multi-virtualizing-controller capabilities.
An in-band appliance is the only approach that gives you the ability to ‘code anything a controller can do’ in a less complicated manner. Such a device must be designed around a cluster to ensure scale-out across the enterprise.
I look forward to your thoughts and comments on this 🙂