ORIGINALLY PUBLISHED 9th October 2015
There’s a particular business partner (I’ll be shocked if he isn’t the first person to comment on this post) who has asked for a “what is clustering” session at almost every one of the UK user groups for what seems like a couple of years now. I really couldn’t see what I would talk about when answering this question – so I didn’t give it too much thought. Even more confusing was the fact that this business partner is well known to us as someone who has a really good grasp on how the technology works…
At the last user group I finally got around to asking him what he meant by “what is clustering” and he basically explained that, in his experience, Storwize customers often don’t realise that Storwize systems can be clustered. Many people who do know about clustering may not know when they should use it. This made much more sense to me so I did a session on it at the user group.
Here I’m going to try and put a basic explanation down for the rest of you to reference.
Many of you are probably not new to storage controllers. Pretty much all storage controllers have some “head” or “processing” unit which communicates with the hosts and a number of expansion units which allow you to add more capacity to your storage system. In Storwize we call the “head” unit a Control Enclosure and the expansion units Expansion Enclosures.
I’m sure you all know about adding extra capacity and extra drive spindles to a Storwize system by adding Expansion Enclosures. But Storwize has an extra trick up its sleeve – adding Control Enclosures. What I’m going to try and do here is explain when and why you would want to do that. Note that the Storwize V3500 and V3700 only support a single Control Enclosure, so they aren’t able to use this extra trick.
How to Cluster
- A clustered Storwize system is basically just a single Storwize system which contains more than one Control Enclosure. The clustered Storwize system still has a single management IP address and you use this to manage the entire clustered system.
- Licensing is (to the best of my knowledge) based on how many units you have purchased, rather than how they are configured. So configuring the same hardware as two standalone systems should require the same licenses as clustering the two Control Enclosures together.
- All you will need to cluster two Storwize Control Enclosures together is either a Fibre Channel or an FCoE network to allow the two control enclosures to communicate with each other.
- You will need to cable and zone the Fibre Channel network correctly and then you run the “Add Control Enclosure” wizard in the GUI to configure the new control enclosure into your existing system.
- There is a little bit more to it than this (although not much more), so if you’re planning to configure clustering I would recommend reading the manuals before trying to do it.
When and why to cluster
So if you have that network, when should you use the clustering feature?
Adding Capacity
Sometimes you’ve reached the maximum number of expansion enclosures and you simply want to add more capacity to your existing system. At this point you can add an additional control enclosure (as well as more expansion enclosures attached to that control enclosure) and add the capacity into your existing storage pools. The automatic storage configuration will normally create new storage pools for your new control enclosure – however, for a small incremental capacity increase this is not necessary. If you are adding a lot of extra capacity then new storage pools are probably a good idea (I discuss this more in the Adding Performance section below).
Adding Volumes
Each Storwize Control Enclosure supports a maximum number of volumes (at present 2048 per Control Enclosure for most Storwize systems). If you find that you need more than that, adding an extra Control Enclosure will increase the maximum number of volumes in the system. This doesn’t require you to add any more drives – because Control Enclosure A can provide volumes whilst the data is stored on RAID arrays which are attached to Control Enclosure B. The IO messages are simply forwarded between the Control Enclosures using the Fibre Channel or FCoE network.
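As a rough back-of-the-envelope sketch of that arithmetic, here is how you might work out the minimum number of Control Enclosures for a given volume count. It assumes the 2048-per-Control-Enclosure figure quoted above (check the limits for your own model and software level), and the function name is purely illustrative rather than anything from the Storwize CLI.

```python
import math

# Per-Control-Enclosure volume limit quoted in this post. Treat it as an
# assumption for any particular model or software level.
VOLUMES_PER_CONTROL_ENCLOSURE = 2048


def control_enclosures_needed(volume_count: int) -> int:
    """Return the minimum number of Control Enclosures for volume_count volumes."""
    if volume_count < 0:
        raise ValueError("volume_count must not be negative")
    # Even a system with no volumes still has one Control Enclosure.
    return max(1, math.ceil(volume_count / VOLUMES_PER_CONTROL_ENCLOSURE))


if __name__ == "__main__":
    for wanted in (1500, 2048, 2049, 5000):
        print(wanted, "volumes ->", control_enclosures_needed(wanted), "Control Enclosure(s)")
```

Note that this ignores the upper limit on how many Control Enclosures a given model supports, which the sketch does not try to model.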
Adding Performance
Each Control Enclosure contains two node canisters which run all of the IO processing. If you have a very busy system you could start to find that the Control Enclosures become saturated. You could be saturating the IO ports, the CPUs, the cache or maybe the SAS network (to name a few). In this case, by adding a new Control Enclosure you get more of all of the above.
In the “Adding Volumes” section above I pointed out that you can have a volume which is presented by Control Enclosure 1 getting capacity from drives which are attached to Control Enclosure 2. This is absolutely supported and will work just fine. However, it doesn’t always give you the best possible performance, mainly because it adds additional “hops” through the Fibre Channel network as well as additional traffic on the same network. So if you’re interested in performance it’s normally better to keep different storage pools for different Control Enclosures and try to keep the volumes in the same Control Enclosure as their storage. The good news is that when creating new volumes, the GUI will do this for you automatically unless you create a storage pool containing arrays which come from multiple Control Enclosures.
Performance testing shows that in a “perfect” SAN environment (i.e. a lab environment):
- Maximum IOPS is achieved by striping the data across all of the arrays in all of the control enclosures.
- Maximum MB/s is achieved by keeping the storage pools in silos – so that there is no forwarding between control enclosures.
However, my experience in working on real world performance cases shows that avoiding unnecessary SAN traffic is the best way to configure systems – because not all SANs are perfect all of the time, and avoiding as much of the inter-canister forwarding as possible will reduce the impact of SAN problems.
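To make the “keep volumes next to their storage” idea concrete, here is a minimal sketch that flags volumes whose IO might be forwarded between Control Enclosures. Everything in it (the `Array` and `Volume` classes, the pool and enclosure names) is a hypothetical model made up for illustration. It does not talk to a real system or reflect how the GUI makes its choice.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Array:
    name: str
    enclosure: str  # Control Enclosure the array's drives are attached to


@dataclass(frozen=True)
class Volume:
    name: str
    enclosure: str  # Control Enclosure that presents the volume to hosts
    pool: str       # storage pool providing the volume's capacity


def pool_enclosures(pools: dict[str, list[Array]]) -> dict[str, set[str]]:
    """Map each pool name to the set of Control Enclosures backing it."""
    return {name: {a.enclosure for a in arrays} for name, arrays in pools.items()}


def forwarded_volumes(volumes: list[Volume], pools: dict[str, list[Array]]) -> list[str]:
    """Names of volumes whose IO may be forwarded between Control Enclosures."""
    backing = pool_enclosures(pools)
    # A volume is "local" only if every array in its pool sits behind the
    # same Control Enclosure that presents the volume.
    return [v.name for v in volumes if backing[v.pool] != {v.enclosure}]


if __name__ == "__main__":
    pools = {
        "pool_ce1": [Array("array0", "CE1"), Array("array1", "CE1")],
        "pool_ce2": [Array("array2", "CE2")],
        "pool_mixed": [Array("array3", "CE1"), Array("array4", "CE2")],
    }
    volumes = [
        Volume("vol_local", "CE1", "pool_ce1"),
        Volume("vol_remote", "CE1", "pool_ce2"),
        Volume("vol_striped", "CE2", "pool_mixed"),
    ]
    print(forwarded_volumes(volumes, pools))
```

In this made-up layout only `vol_local` stays entirely within its own Control Enclosure. The other two volumes rely on the Fibre Channel or FCoE network for at least some of their IO.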
When shouldn’t you cluster
- There is no value in clustering two or more V7000 Control enclosures if you are putting them behind an SVC. The SVC will treat the two control enclosures like two different storage controllers anyway – so just keep them as separate systems.
- If you want to maintain complete isolation between different sets of storage, then you may decide that you would prefer to have two independent systems rather than one clustered Storwize system.
- If you don’t have a Fibre Channel or FCoE network then you won’t be able to perform clustering at any of the currently supported software levels.
- If you’ve already created two independent Storwize systems then there is no method to merge them into a single clustered system.
- If your Storwize systems are in different locations then you shouldn’t (normally) cluster them together unless you are using the HyperSwap feature.
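If you like checklists, the list above can be sketched as a small function. This is only my summary of the bullets in code form. The `Plan` fields and the wording of the reasons are names I have invented for the example, not anything a Storwize system exposes.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    """Hypothetical description of a proposed clustering setup."""
    behind_svc: bool = False                # Control Enclosures virtualised by an SVC
    needs_isolation: bool = False           # storage sets must stay fully separate
    has_fc_or_fcoe: bool = True             # Fibre Channel or FCoE network available
    merging_existing_systems: bool = False  # both systems are already configured
    different_locations: bool = False       # enclosures are at different sites
    uses_hyperswap: bool = False            # HyperSwap is in use


def reasons_not_to_cluster(plan: Plan) -> list[str]:
    """Return the reasons from the list above that apply to this plan."""
    reasons = []
    if plan.behind_svc:
        reasons.append("behind an SVC: keep them as separate systems")
    if plan.needs_isolation:
        reasons.append("complete isolation wanted: use independent systems")
    if not plan.has_fc_or_fcoe:
        reasons.append("no Fibre Channel or FCoE network")
    if plan.merging_existing_systems:
        reasons.append("two existing systems cannot be merged")
    if plan.different_locations and not plan.uses_hyperswap:
        reasons.append("different locations without HyperSwap")
    return reasons


if __name__ == "__main__":
    plan = Plan(has_fc_or_fcoe=False, different_locations=True)
    for reason in reasons_not_to_cluster(plan):
        print("Don't cluster:", reason)
```

An empty list just means none of the reasons above apply. Whether clustering is actually worth doing is still down to the capacity, volume and performance arguments earlier in the post.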
I hope this was interesting. Let me know if you have any other interesting scenarios for when to use clustering.