Online Data Migration

ORIGINALLY POSTED 1st July 2009

17,738 views on developerworks

These days I’m spending more and more of my time with customers visiting Hursley who want to hear about our smart planet initiatives and how we can help to make their infrastructure dynamic. Most of my involvement, of course, is in presenting how and where SVC and storage virtualization fit into the bigger picture. The executive briefing center in Hursley has the added benefit that it can draw on the SVC development team based here, not to mention CICS, Messaging, Java, and WebSphere. This is especially helpful when a customer is interested in a technical deep dive into SVC and the impressive test facilities in the lab.

I’ve been presenting SVC, the reasons and benefits, for a few years now, and I watch for that moment when ‘the penny drops’. It usually comes at the same point, especially when the audience includes the guys who actually know about and manage a storage infrastructure: once you have explained what SVC does (see my last post), and then explain how this can be used to provide advanced functions, especially online data migration…

Imagine you have a set of disks today presented to a collection of servers. These disks are all being provided by one or more storage controllers you bought a few years ago. It’s time to think about replacing them, because they are coming up for lease renewal, being aged out, or just reaching the end of their life. But your business is running and dependent on them. They hold all your production data, all your customer records, your website, your inventory and so on. You can’t afford any downtime. If you do have to bring them down, you will probably lose money, from lost sales or disgruntled customers. In today’s ‘24x7x365’ planet, there is no such thing as a maintenance window anymore.

Storage and service providers will offer you expensive services that can help ease the pain. They can manage the migration, but at what cost, both financially and from a time perspective? Whether you move one disk at a time at 3am or go for a big bang weekend approach, it’s all risk. It is going to be expensive, and you are going to have to ask your staff to work long hours or weekends. If the boxes are leased, how much real use do you get from a 3 year lease? 2 years? That leaves 6 months at either side to plan and think about migration…

But aren’t we supposed to be working towards a ‘smart planet’ and a ‘dynamic infrastructure’ based enterprise? Well, yes, and that is where SVC can help.

Let’s start by thinking about what I discussed last time and explained as an abstraction. It’s what SVC does: it separates the physical disks (RAID arrays) from the logical (virtual) disks. The servers only ever know about and talk to the virtual disks. This means they are sending I/O to a logical entity. This logical entity can reside on one or more physical resources (which of course can be logical entities themselves!).

Here is a picture I drew several years ago to help explain. The orange, pink and yellow are the physical resources, the purple is the logical, virtual disk.

SVC builds up a virtual disk based on a policy. Here we have created a striped virtual disk. This not only spreads data across multiple physical resources, which helps to improve performance for random workloads, but also helps to pool storage together in tiers and gives you a better idea of just how much storage is in use and how much is free. This can help to drive your storage utilisation up, reducing costs and management overheads.
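To make the striping idea concrete, here is a minimal sketch in Python. It is only an illustration of the concept, not how SVC is implemented: the function name `build_striped_vdisk`, the colour names borrowed from the picture, and the idea of a plain list as the map are all my own assumptions.

```python
from itertools import cycle


def build_striped_vdisk(mdisks, extent_count):
    """Build a map: virtual extent number -> (physical disk, extent on that disk).

    Extents are handed out round-robin, so consecutive virtual extents
    land on different physical disks. Hypothetical helper, for illustration only.
    """
    next_free = {name: 0 for name in mdisks}  # next unused extent per physical disk
    vmap = []
    for _, name in zip(range(extent_count), cycle(mdisks)):
        vmap.append((name, next_free[name]))
        next_free[name] += 1
    return vmap


# Three physical resources, one six-extent virtual disk striped across them.
vmap = build_striped_vdisk(["orange", "pink", "yellow"], 6)
for virtual_extent, (disk, physical_extent) in enumerate(vmap):
    print(virtual_extent, "->", disk, physical_extent)
```

The server only ever sees the virtual extent numbers on the left; everything on the right is private to the virtualization layer.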

Back to the migration. How does this all help? Because the server no longer has a physical mapping of where each chunk (or what we call an extent) of data lives, we can move it without disruption. In the above picture, say we moved extent 1a to another physical disk entirely. We copy the data, then update the virtualization map to point to the new location, and the virtual disk looks no different to the server. Repeat this process for all the extents that make up a virtual disk, et voilà. Your data is now on a completely different set of physical disks, but your server and applications haven’t noticed any change. This is usually where the attendees at the briefing have that ‘light bulb moment’, especially if they are the ones that until now have had to come in at 3am or give up their weekend!
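The extent-by-extent move described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the names `read_vdisk`, `migrate_vdisk`, `old_a`, `old_b` and `new` are made up for the example, storage is just a dictionary of lists, and it ignores what a real product has to handle, such as writes arriving while an extent is being copied.

```python
def read_vdisk(vmap, storage):
    """Read the virtual disk the way a server sees it: in virtual extent order."""
    return [storage[disk][slot] for disk, slot in vmap]


def migrate_vdisk(vmap, storage, target):
    """Move every extent onto `target`, one extent at a time."""
    for i, (source, slot) in enumerate(vmap):
        if source == target:
            continue  # this extent already lives on the target disk
        storage[target].append(storage[source][slot])  # 1. copy the data
        vmap[i] = (target, len(storage[target]) - 1)   # 2. repoint the map
        storage[source][slot] = None                   # 3. release the old extent


# Two old physical disks, one empty new one (all hypothetical).
storage = {"old_a": ["1a", "2a"], "old_b": ["1b", "2b"], "new": []}
vmap = [("old_a", 0), ("old_b", 0), ("old_a", 1), ("old_b", 1)]

before = read_vdisk(vmap, storage)
migrate_vdisk(vmap, storage, "new")
after = read_vdisk(vmap, storage)

print(before == after)  # the server's view of the data is unchanged
print(vmap)             # but every extent now points at the new disk
```

The only thing that changes is the map. The virtual extent numbers, which are all the server ever addresses, stay exactly as they were.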

This is what we call online data migration. It’s the same thing that can be done in virtual server environments, because you have abstracted the logical from the physical.

Let’s summarise. To migrate some storage:

In a non-virtualized environment:

  • Stop applications, flush all caches to disk.
  • Move data from old to new storage, usually through a server and network (takes a long time)
  • Remove old disk mappings from server.
  • Create new disk mappings to server.
  • Restart applications.

In a virtualized environment:

Move the data…

It really is that simple, and it can even be done during the working day. Users of SVC can control how fast or slow the migration runs, to ensure production workloads don’t suffer.
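One generic way to picture that speed control is a simple rate limit on the copy loop. The sketch below is an assumption on my part and says nothing about how SVC actually throttles a migration: `throttled_migrate`, its parameters, and the 16 MB extent size are all hypothetical, and the `sleep` and `clock` arguments exist only so the pacing can be checked without really waiting.

```python
import time


def throttled_migrate(extents, copy_extent, extent_mb, max_mb_per_s,
                      sleep=time.sleep, clock=time.monotonic):
    """Copy extents one by one without exceeding max_mb_per_s on average."""
    budget = extent_mb / max_mb_per_s  # seconds each extent is allowed to take
    for extent in extents:
        start = clock()
        copy_extent(extent)
        elapsed = clock() - start
        if elapsed < budget:
            sleep(budget - elapsed)  # idle so production I/O gets the bandwidth


# Dry run: record the copies and the pauses instead of really sleeping.
copied, pauses = [], []
throttled_migrate(range(4), copied.append, extent_mb=16, max_mb_per_s=64,
                  sleep=pauses.append, clock=lambda: 0.0)
print(copied)
print(pauses)
```

Lowering `max_mb_per_s` stretches the migration out over a longer period, which is the trade you make to keep daytime workloads happy.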

This is just one of the ways in which SVC can be used to move a static, homogeneous storage environment to a heterogeneous dynamic infrastructure. SVC was designed around openness. This is shown by the 20+ operating systems and 180+ storage controllers we support.

Follow these links to find out more about IBM’s Smart Planet, and Dynamic Infrastructure.


