SVC: How it works – Part 3: Manipulate your virtual data

ORIGINALLY POSTED 14th December 2008

11,836 views on developerWorks

In part 1 of this series I covered the terminology of SVC. Part 2 showed how you would introduce existing volumes into a virtualized environment using SVC. In part 3 I cover the ‘now what’…

Making the most of it

Once you have imported the data, you have a set of disks that are simply running in ‘image mode’ – it’s likely you want to make the most of the benefits that SVC can provide. You have various choices.

You could be approaching the end-of-lease of your existing storage, moving to a new vendor, or upgrading to the latest your preferred vendor has to offer. Or maybe you need to move some applications up or down a tier, or just try to get better performance from the tier you are in. All of this can be done using SVC's online data migration features.

My Data is now Virtual

You now have SVC presenting your disks to your servers, or host systems. Why does that help? Well, now you have one place to go when you need to provision some storage to a host. Where before you had to decide which storage device had some space, would give the right SLA characteristics, and could grow over time… now you simply have to decide which SLA is needed.

Online Migration

Because SVC provides, in essence, an abstraction layer between virtual disks and physical (managed) disks, the system that is using the virtual disk doesn’t have direct knowledge of where the data lives. This means we can move the physical location of blocks (extents) from one managed disk to another without the host system even knowing – or, most importantly, without having to reconfigure the host system or SAN.
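
To make that abstraction concrete, here is a minimal toy model in Python – purely illustrative, not SVC internals. A vdisk is little more than a table mapping each virtual extent to a location on some managed disk, and host I/O is resolved through that table. The class names and the 1 MB toy extent size are assumptions for the sketch; real SVC extent sizes are configurable and much larger.

    EXTENT = 1 * 1024 * 1024   # 1 MB here to keep the toy small; real extents are far bigger

    class MDisk:
        """Stand-in for one managed disk (a RAID array presented to SVC)."""
        def __init__(self, name, total_extents):
            self.name = name
            self.free = list(range(total_extents))   # unallocated extent slots
            self.data = {}                           # extent index -> bytearray

        def allocate(self):
            idx = self.free.pop(0)
            self.data[idx] = bytearray(EXTENT)
            return idx

        def release(self, idx):
            del self.data[idx]
            self.free.append(idx)

    class VDisk:
        """A virtual disk: little more than a name and an extent mapping table."""
        def __init__(self, name):
            self.name = name
            self.extent_map = {}                     # virtual extent index -> (MDisk, extent index)

        def resolve(self, byte_offset):
            """Translate a virtual byte offset into (mdisk, physical byte offset)."""
            mdisk, idx = self.extent_map[byte_offset // EXTENT]
            return mdisk, idx * EXTENT + (byte_offset % EXTENT)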

Taking the “image mode” virtual disks that you have just imported, you can now move them elsewhere. When you create an image mode vdisk, you specify a “managed disk group” to which it will belong. Think of this as a pool of storage, of the same performance and redundancy characteristics, made up of many different RAID arrays. You may actually have ~100 physical spindles, all protected by RAID-6, but made up of 6+P+S arrays. Instead of getting the benefit of just the 8 spindles in a single array, you can now stripe over many tens of arrays.
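
Continuing the toy model above (again illustrative only, with hypothetical names), a striped vdisk simply places consecutive extents round-robin across every array in the managed disk group – which is what spreads the I/O over tens of arrays rather than eight spindles:

    def create_striped_vdisk(name, num_extents, mdisk_group):
        """Place consecutive extents round-robin across every array in the group."""
        vdisk = VDisk(name)
        for i in range(num_extents):
            mdisk = mdisk_group[i % len(mdisk_group)]        # next array in the pool
            vdisk.extent_map[i] = (mdisk, mdisk.allocate())
        return vdisk

    # e.g. eight RAID-6 arrays in one managed disk group, 100 extents striped over all of them
    pool = [MDisk(f"mdisk{i}", total_extents=256) for i in range(8)]
    vol = create_striped_vdisk("vdisk0", num_extents=100, mdisk_group=pool)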

Let’s assume we have just rolled in some new DS5300 storage using RAID-6 arrays, and our existing data was on an HP EVA (not picking on HP, but we do seem to migrate off a lot of these boxes). You create a new managed disk group with, say, 8 DS5K RAID-6 arrays. You create a temporary managed disk group and add the old EVA LUN to it in image mode. With one single operation, “migrate vdisk”, you specify the new DS5K managed disk group as the target… and some time later you are running completely from the DS5K and can unmap the old EVA LUN.

In a bit more detail, what’s happening in the background is:

  • SVC looks up the source location of the first extent on the EVA-based vdisk.
  • It finds this location and issues a 256K block read at the start of this extent.
  • It decides which managed disk in the new group to place the first extent on.
  • It writes the 256K to the new disk.
  • It finds the second 256K block on the source disk… etc.
  • Once the entire extent has been copied (cloned, really), the location of this extent is updated to reflect that it now resides on the new disk, not the old.
  • The old disk extent is marked as free space.
  • The second extent on the source disk is located… etc, etc.

During all of this, consistency is maintained as you’d expect and only once the entire extent has been copied (with any merged new writes) is the virtualization mapping table updated to reflect the new location of the extent.
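
In terms of the same toy model, the background copy might be sketched like this (illustrative pseudologic, not SVC code): each extent is cloned a 256K block at a time, and only when the whole extent sits on the target does the mapping table flip, after which the source extent is freed.

    BLOCK = 256 * 1024   # bytes per background read/write

    def migrate_vdisk(vdisk, target_group):
        """Move every extent of a vdisk into a new managed disk group, online."""
        for i, (src, src_idx) in sorted(vdisk.extent_map.items()):
            dst = target_group[i % len(target_group)]
            dst_idx = dst.allocate()
            for off in range(0, EXTENT, BLOCK):              # clone the extent 256K at a time
                dst.data[dst_idx][off:off + BLOCK] = src.data[src_idx][off:off + BLOCK]
            # any host writes that arrived mid-copy would be merged before this point
            vdisk.extent_map[i] = (dst, dst_idx)             # mapping table updated only now
            src.release(src_idx)                             # old extent becomes free space

    # e.g. move the striped vdisk from the earlier sketch onto a new set of arrays
    new_pool = [MDisk(f"ds5k{i}", total_extents=256) for i in range(8)]
    migrate_vdisk(vol, new_pool)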

The key feature is that all of this happens without the need for any change in application loading, host server configuration, switch zoning and so on. You can control the target rate of migration to meet your needs – lower performance impact but a longer migration, versus higher performance impact but the fastest possible copy.

NB: Because, at the end of the day, there is really only one physical spindle performing the read operation, migration is usually limited to about 50MB/s by the underlying storage.
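
That figure makes rough planning arithmetic easy. A back-of-the-envelope helper (assuming the ~50MB/s rate above holds, and using binary GB/MB):

    def migration_hours(size_gb, rate_mb_s=50):
        """Hours to copy a vdisk of size_gb at rate_mb_s."""
        return size_gb * 1024 / rate_mb_s / 3600

    print(f"2 TB at 50 MB/s: ~{migration_hours(2048):.1f} hours")   # roughly 11.7 hours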

Tiered migrate

Of course you can extend this migration of vdisks to anything, not just newly imported image mode disks. You can use it to move vdisks between different tiers. If an application that was said to be tier 1 critical, needing the highest performance possible, turns out to be perfectly suited to tier 2, then move it. And vice versa.
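
In terms of the earlier toy model, that is literally the same call with a different target pool (the tier pool names here are hypothetical):

    # demoting an application volume is the same online operation as before
    tier1_pool = [MDisk(f"tier1_{i}", total_extents=256) for i in range(8)]
    tier2_pool = [MDisk(f"tier2_{i}", total_extents=256) for i in range(8)]
    app_vol = create_striped_vdisk("app_vol", num_extents=50, mdisk_group=tier1_pool)
    migrate_vdisk(app_vol, tier2_pool)   # tier 1 down to tier 2; promotion is the reverse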

Enterprise reliability and performance

Unlike some vendors, we don’t publish gazillion-9s availability numbers, but we do track them internally, and by any definition of enterprise reliability I understand, our collected data more than meets it.

Not only is the software and cluster highly fault tolerant, but you can also start to look to save money while gaining performance. Do you need to splash out on an expensive monolithic box when a couple of reliable midrange boxes can give you the same performance thanks to the wide striping ability of SVC? For the super paranoid you can also get local HA across the cluster (the recently added vdisk mirroring) and take a couple of even lower cost boxes and push them higher up the pecking order. Then there is all the enterprise copy services function: FlashCopy™, MetroMirror™, GlobalMirror™, Space-Efficient vdisks and so on. When you have all these functions sitting above the storage, you don’t need to pay for them again underneath SVC – RAID and reliability is all we ask of the storage you present to SVC.

Quality not Quantity

This could be a big topic. I’ve been getting more and more involved with customer briefings, sales bids and so on. Most of the time we are up against the other big three: Hitachi, EMC and HP. SUN “curiously” seldom figures. I don’t like vendor bashing, and IBM sales teams aren’t like EMC’s: we present the facts that customers want, we try to solve the issues they have, and we don’t have presentations dedicated to attacking the competition. BTW, for my EMC vendor readers, you do realise that SVC pitch you have is 5 years out of date? A lot has changed since the first hardware – I know you have some of them (4F2) – and we’ve had 7 hardware refreshes since then. The throughput has increased exponentially, so it’s fun answering the EMC-funded questions that customers ask me.

Anyway, when it comes to Fibre ports, more is not necessarily better. Look at what a single Fibre Channel port should be capable of: at 4Gbit that’s 400MB/s, and at 8Gbit, 800MB/s… Each SVC node today has 4 ports at 4Gbit. That’s 1.6GB/s duplex. Curiously (as if by magic) an 8-lane PCIe (Gen1) bus also has 1.6GB/s of throughput. Using the latest and greatest Intel hardware provides the best memory and bus throughput you can get today. The Intel market is driven by the PC gaming industry, which demands more and more throughput for those graphics cards, not to mention memory interfaces and CPU throughput – at as low a cost, and as high a performance per device, as possible. Often I’m asked why we chose Intel appliances; explaining all of the above turns that magic light on. For example, much as I hate providing cache hit numbers, the theoretical read cache hit bandwidth for an SVC I/O group (2 nodes) is 3.2GB/s – and I measure just over 3GB/s. Pretty close to theoretical, so there is very little overhead in the code and maximal usage of the Fibre connectivity. For those of you with time on your hands, I’d suggest you try getting 400MB/s out of any of your storage controller ports… no matter what vendor… and for a laugh try that full duplex too…
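
For reference, the port arithmetic above spelled out (a trivial sketch; the per-port figure is just the nominal 4Gbit line rate quoted in the paragraph):

    per_port_mb_s = 400                     # ~400MB/s per port at 4Gbit
    node_mb_s = 4 * per_port_mb_s           # 4 ports per node -> 1600MB/s (1.6GB/s)
    io_group_mb_s = 2 * node_mb_s           # an I/O group is 2 nodes -> 3200MB/s (3.2GB/s)
    print(node_mb_s, io_group_mb_s)         # 1600 3200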

Intel’s new i7 (Nehalem) architecture is another leap forward; the i7 Xeon platforms help to further prove this was the correct architectural decision. 2009: “Watch out for Stella mix”… (bonus points to any readers that can ‘name that tune’)

Compare that to a device that provides many tens of Fibre ports – why? When everything is centralised on that single device, the only reason could be congestion or other such box limits – like not being able to share multiple OS interfaces on one card. Why? How much backend bus or CPU bandwidth is used? How many of said interface cards can be run at full speed? One, two, maybe eight at most. That’s a lot of extra ports you are paying for that aren’t actually needed… hence my “it’s not quantity…” comment.

Anyway, in this part I planned to cover how, once you have imported your data into an SVC environment, you can move it, manipulate it and maintain it as you desire, without any downtime, while enhancing the performance characteristics of the underlying devices. I couldn’t help but tackle a few of the common misconceptions about SVC, and what other vendors may try to lead you to believe are limitations… don’t be fooled; ask for solutions, not vendor bashing.

In the final part, I plan to cover the topic of upgrades. It’s come up a couple of times recently in the blogosphere, and yes, once you are virtual you’d better be sure you can upgrade your virtualizer without disruption.
