What do we mean when we talk about Clustering on the Storwize Products

ORIGINALLY PUBLISHED 9th October 2015

There’s a particular business partner (I’ll be shocked if he isn’t the first person to comment on this post) who has asked for a “what is clustering” session at almost every one of the UK user groups for what seems like a couple of years now.  I really couldn’t see what I would talk about when answering this question – so I didn’t give it too much thought.  Even more confusing was the fact that this business partner is well known to us as someone who has a really good grasp on how the technology works…

At the last user group I finally got around to asking him what he meant by “what is clustering” and he basically explained that, in his experience, Storwize customers often don’t realise that Storwize systems can be clustered.  Many people who do know about clustering may not know when they should use it.  This made much more sense to me, so I did a session on it at the user group.

Here I’m going to try and put a basic explanation down for the rest of you to reference.


Many of you are probably not new to storage controllers.  Pretty much all storage controllers have some “head” or “processing” unit which communicates with the hosts and a number of expansion units which allow you to add more capacity to your storage system.  In Storwize we call the “head” unit a Control Enclosure and the expansion units Expansion Enclosures.

I’m sure you all know about adding extra capacity and extra drive spindles to a Storwize system by adding Expansion Enclosures.  But Storwize has an extra trick up its sleeve – adding Control Enclosures.  What I’m going to try and do here is explain when and why you would want to do that.  Note that the Storwize V3500 and V3700 only support a single Control Enclosure, so they aren’t able to use this extra trick.

How to Cluster

  • A clustered Storwize system is basically just a single Storwize system which contains more than one Control Enclosure.  The clustered Storwize still has a single management IP address and you use this to manage the entire clustered system.
     
  • Licensing is (to the best of my knowledge) based on how many units you have purchased, rather than how they are configured.  So configuring the same hardware as two standalone systems should require the same licenses as if you cluster the two systems together.
     
  • All you will need to cluster two Storwize Control Enclosures together is either a Fibre Channel or an FCoE network to allow the two control enclosures to communicate with each other. 
     
  • You will need to cable and zone the Fibre Channel network correctly and then you run the “Add Control Enclosure” wizard in the GUI to configure the new control enclosure into your existing system.
     
  • There is a little bit more to it than this (although not much more), so if you’re planning to configure clustering I would recommend reading the manuals before trying to do it.
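To make the shape of a clustered system concrete, here’s a tiny toy model in Python (all of the names are made up for illustration – this is not the Storwize API): a cluster keeps a single management IP, and a new Control Enclosure can only join if the FC/FCoE fabric lets it reach the enclosures that are already in the cluster.

```python
# Toy model of a clustered Storwize system: one management IP,
# one or more Control Enclosures.  Names are illustrative only.

class ClusterError(Exception):
    pass

class Fabric:
    """A trivial stand-in for an FC/FCoE network: a set of zoned pairs."""
    def __init__(self):
        self.zones = set()

    def zone(self, a, b):
        # Zoning is symmetric, so store the pair as a frozenset.
        self.zones.add(frozenset((a, b)))

    def connected(self, a, b):
        return frozenset((a, b)) in self.zones

class Cluster:
    def __init__(self, management_ip, first_enclosure):
        self.management_ip = management_ip          # single IP for the whole cluster
        self.control_enclosures = [first_enclosure]

    def add_control_enclosure(self, enclosure, fabric):
        # The new enclosure must be able to reach every existing one
        # over the fabric (i.e. it is cabled and zoned correctly).
        if not all(fabric.connected(enclosure, existing)
                   for existing in self.control_enclosures):
            raise ClusterError("new enclosure is not zoned to the cluster")
        self.control_enclosures.append(enclosure)
```

In the real world the “fabric” is your cabling and zoning configuration, and the “Add Control Enclosure” wizard does the join for you – the sketch just shows why the zoning has to be in place first.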

When and why to cluster

So if you have that network, when should you use the clustering feature?

Adding Capacity

Sometimes you’ve reached the maximum number of expansion enclosures and you simply want to add more capacity to your existing system.  At this point you can add an additional control enclosure (as well as more expansion enclosures attached to that control enclosure) and add the capacity into your existing storage pools.  The automatic storage configuration will normally create new storage pools for your new control enclosure – however for a small incremental capacity increase this is not necessary.  If you are adding a lot of extra capacity then new storage pools are probably a good idea (I discuss this more in the Adding Performance section below).

Adding Volumes

Each Storwize Control Enclosure supports a maximum number of volumes (at present 2048 per Control Enclosure for most Storwize systems). 

If you find that you need more than that, adding an extra Control Enclosure will increase the maximum number of volumes in the system.  This doesn’t require you to add any more drives – because Control Enclosure A can provide volumes whilst the data is stored on RAID arrays which are attached to Control Enclosure B.  The IO messages are simply forwarded between the Control Enclosures using the Fibre Channel or FCoE network.
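As a quick back-of-the-envelope check, using the 2048-per-enclosure figure above (the helper function itself is just an illustration, not anything from the product):

```python
# How many Control Enclosures do we need for a given volume count,
# using the per-enclosure limit quoted above (2048 for most Storwize
# systems at the time of writing)?

VOLUMES_PER_CONTROL_ENCLOSURE = 2048

def enclosures_needed(volume_count, per_enclosure=VOLUMES_PER_CONTROL_ENCLOSURE):
    # Ceiling division: one enclosure presents up to `per_enclosure` volumes.
    return -(-volume_count // per_enclosure)
```

So, for example, 5000 volumes would need 3 Control Enclosures – even though, as above, the drives behind those volumes could all sit in arrays attached to just one of them.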

Adding Performance

Each Control Enclosure contains two node canisters which run all of the IO processing.  If you have a very busy system you could start to find that the Control Enclosures become saturated.  You could be saturating the IO ports, the CPUs, the Cache or maybe the SAS network (to name a few).  In this case, by adding a new Control Enclosure you get more of all of the above. 

In the “Adding Volumes” section above I pointed out that you can have a volume which is presented by Control Enclosure 1 getting capacity from drives which are attached to Control Enclosure 2.  This is absolutely supported and will work just fine.  However it doesn’t always give you the best possible performance, mainly because it adds additional “hops” through the Fibre Channel network as well as additional traffic on the same network.  So if you’re interested in performance it’s normally better to keep separate storage pools for different Control Enclosures and try to keep the volumes in the same Control Enclosure as their storage.  The good news is that when creating new volumes, the GUI will do this for you automatically unless you create a storage pool containing arrays which come from multiple Control Enclosures.
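That locality rule can be sketched roughly like this (hypothetical names – this is not the real allocator logic): given the arrays in a pool and which Control Enclosure owns each one, prefer the enclosure that owns all of them; if the pool spans enclosures, the best you can do is the enclosure that owns most of the arrays, and some IO will be forwarded.

```python
# Pick the Control Enclosure that should present a new volume: if every
# array in the pool is owned by one enclosure, use it (no inter-enclosure
# forwarding); otherwise fall back to the enclosure owning most of the
# pool's arrays, accepting some forwarded IO.

from collections import Counter

def preferred_enclosure(pool_arrays):
    """pool_arrays: list of (array_name, owning_enclosure) tuples."""
    owners = Counter(enclosure for _, enclosure in pool_arrays)
    best, _count = owners.most_common(1)[0]
    fully_local = len(owners) == 1
    return best, fully_local
```

The `fully_local` flag corresponds to the “no forwarding” case that the GUI tries to give you by keeping a volume on the same Control Enclosure as its storage.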

Performance testing shows that in a “perfect” SAN environment (i.e. a lab environment):

  • Maximum IOPS is achieved by striping the data across all of the arrays in all of the control enclosures.
  • Maximum MB/s is achieved by keeping the storage pools in silos – so that there is no forwarding between control enclosures.

However my experience in working on real world performance cases shows that avoiding unnecessary SAN traffic is the best way to configure systems – because not all SANs are perfect all of the time, and avoiding as much of the inter-canister forwarding as possible will reduce the impact of SAN problems.

When shouldn’t you cluster

  • There is no value in clustering two or more V7000 Control enclosures if you are putting them behind an SVC.  The SVC will treat the two control enclosures like two different storage controllers anyway – so just keep them as separate systems.
     
  • If you want to maintain complete isolation between different sets of storage, then you may decide that you would prefer to have two independent systems rather than one clustered Storwize system.
     
  • If you don’t have a Fibre Channel or FCoE network then you won’t be able to perform clustering at any of the currently supported software levels.
     
  • If you’ve already created two independent Storwize systems then there is no method to merge two existing Storwize systems into a single Clustered system.
     
  • If your Storwize systems are in different locations then you shouldn’t (normally) cluster them together unless you are using the Hyperswap feature.

I hope this was interesting.  Let me know if you have any other interesting scenarios for when to use clustering.

Andrew

8 thoughts on “What do we mean when we talk about Clustering on the Storwize Products”


  1. Hello,

    I would like to know: does clustering two IBM Storwize V5100 NVMe systems (4 “heads”) add additional fault tolerance?  Is it possible to reach and access a LUN from either head?

    Thanks,


    1. Hi Jon,

      Adding additional control enclosures doesn’t add any additional fault tolerance in “standard” configurations, which is why it’s not mentioned above.

      However this blog post pre-dates the Hyperswap feature that was added in 7.5 (so ages ago now).  If you configure a Hyperswap system using 2 or more control enclosures, then you can get additional availability by reaching the same volume through 2 control enclosures.  However this does require 2 copies of the data.

      There is an ability that can allow a volume to be accessed from 2 or more control enclosures – but without Hyperswap this doesn’t actually add any availability, because not all drives can be accessed from all control enclosures.

      I hope this helps clarify things slightly.


      1. First, thanks for your fast reply – I didn’t expect that.
        But if I add Hyperswap functionality to a clustered 4 “head” system, will my IOPS performance be the same as with a “standard” configuration?


      2. Hi Jon,

        Performance is a complex question, so the full answer is that you would have to model it.  However from a simple perspective, Hyperswap has to do twice as much work per write, but the same amount of work per read.  Therefore the maximum throughput will be reduced.  But many clients are nowhere near the maximum throughput of the systems – so it may not make any difference to your specific workload.

        One other note: it’s not easy to convert an existing 4-enclosure system to Hyperswap – you would normally make this conversion when adding an IO group to the system.


      3. Andrew thank you so much, the last question :
        Can a Storwize V5100 NVMe system (with 4 controllers in one site) have an active-active mode and guarantee that any disk can be accessed by any port on any controller?  Would this only be correct if we had 2 controllers?


      4. But isn’t it that, at a time, you can access a LUN through only 2 controllers, and you can only migrate that LUN to another 2 to make them the “owners”?


      5. Hi Jon,

        The V5100 does support Hyperswap (also known as active-active).  In a Hyperswap configuration a single volume is accessible through 2 control enclosures rather than 1 (and critically the volume will be online as long as at least 1 of the enclosures is online).

        So a volume is available through any port on 2 of the control enclosures rather than all of them (apart from the special case where you only have 2 control enclosures).


  2. I see from a demo that with a single control enclosure the host sees 2 active and 2 passive paths to the LUN.  If I make a Hyperswap cluster from 2 control enclosures (4 nodes) I see 8 paths, but only 2 active paths, so this means that at a time I can access the LUN only via 2 nodes, right?  If you cluster the system you get performance gains and all the other things, but you don’t get double the active paths like in a real cluster system such as XtremIO.

