Quorum, How does it work? SVC and Storwize

ORIGINALLY POSTED 10th January 2017

19,712 views on developerworks

Spectrum Virtualize Quorum

Happy New Year to all. I’ve been getting a few questions relating to quorum devices, and in particular the IP quorum and what happens when various different failure scenarios occur. I thought it was worth detailing things here. So first some background.

What are quorum devices?

All Spectrum Virtualize systems are clusters. Even a single control enclosure of a Storwize system is a 2 node cluster. The system uses a voting set of nodes to ensure that the majority of the cluster nodes continue when there is a failure state. This is fine when a larger number of nodes remain than have failed or gone missing. The majority always wins, so even if there is just a communication failure between parts of the system, if 5 nodes out of 8 can see each other, then the 5 win. Even if the other 3 nodes can see each other, they know they would need at least 4 (half of the voting set) before they could look for a tie break (quorum) device.

The real fun begins when there is an even split in the cluster, most commonly caused by a split-brain scenario where the communication between the two halves of the cluster fails. With 8 nodes, this means 4 nodes in each half can see each other, and so neither has a majority. At this point the active quorum device is used as the tie break, acting like a 9th member of the voting set: whichever half locks the quorum wins and continues.
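
To make the rule concrete, here is a minimal sketch of the decision each group of nodes makes, based only on the description above. The function name and its parameters are hypothetical and are not part of any Spectrum Virtualize interface.

```python
def partition_outcome(total_nodes: int, visible_nodes: int,
                      holds_quorum_lock: bool = False) -> str:
    """Decide whether a group of nodes that can see each other keeps running.

    total_nodes:       size of the active voting set
    visible_nodes:     how many of those nodes this group contains
    holds_quorum_lock: True if this group locked the active quorum device
    """
    if not 0 < visible_nodes <= total_nodes:
        raise ValueError("visible_nodes must be between 1 and total_nodes")
    if 2 * visible_nodes > total_nodes:
        return "continue"   # clear majority, no tie break needed
    if 2 * visible_nodes == total_nodes:
        # Exactly half: the quorum device acts as the extra vote.
        return "continue" if holds_quorum_lock else "halt"
    return "halt"           # minority never even tries the tie break


print(partition_outcome(8, 5))                          # continue
print(partition_outcome(8, 3))                          # halt
print(partition_outcome(8, 4, holds_quorum_lock=True))  # continue
print(partition_outcome(8, 4))                          # halt
```

The only case where the quorum device matters is the exact half split. Every other split is settled by node count alone.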

The system automatically defines 3 quorum devices from the attached storage. For SVC, this is 3 mdisks, which it attempts to spread over 3 different storage controllers. For Storwize, this is 3 ‘used’ drives (i.e. not unused drives, only drives that are parts of arrays or spares).

In addition, more recently we added the ability to define up to 5 additional IP Quorum devices. These are applications running on servers that are IP connected to the same management IP network as the cluster itself. This is most commonly used when deploying a Stretched or HyperSwap cluster.

Now this means you could have up to 8 quorum devices. So how does the cluster know which one to use? Well, at any given time only one quorum device is the active quorum. Every node in the cluster knows which is the active voting set of nodes, and which is the active quorum. The active quorum can only be changed when all nodes in the active voting set agree and can confirm.

Worked Examples

Let’s go through an example to explain some of the uses / failure scenarios.

Assume we have a 4 node cluster (Storwize V7000) made up of two control enclosures, deployed in a HyperSwap solution, so there is one control enclosure (2 nodes) at each of Sites A and B.

The quorum devices are defined as :

Type	Site	Active Quorum
IP	C	Yes
Drive 0	A	No
Drive 1	B	No
Drive 2	A	No

Assume this is the starting point for each of the failures below: the cluster is whole, with 4 nodes and the IP quorum device active.

Failure 1. HyperSwap Intersite Link down – Site A to Site B links dead (Split-Brain)

Starts from a fully online state as described above.

The cluster here suffers a split-brain scenario. Although both Sites A and B are online, the cluster has been split into two online halves, so the active quorum is used as the tie break device. Whichever half talks to the IP quorum first and locks it wins; it now has 3 votes and continues. The other site contacts the IP quorum, sees that it is locked, and halts any I/O through its nodes.
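
The race can be pictured as a first-come lock. The sketch below is a toy model of that idea only: the class, the function and the way arrival order is passed in are all assumptions made for illustration, and the real IP quorum application is not implemented this way as far as this post describes.

```python
class TieBreakDevice:
    """Hypothetical stand-in for the active quorum device: first locker wins."""

    def __init__(self, name):
        self.name = name
        self.locked_by = None

    def try_lock(self, site):
        """Return True if `site` holds the lock once this call completes."""
        if self.locked_by is None:
            self.locked_by = site
        return self.locked_by == site


def resolve_split_brain(device, arrival_order, nodes_per_site=2):
    """Each half contacts the device in turn; report its votes and outcome."""
    outcome = {}
    for site in arrival_order:
        won = device.try_lock(site)
        votes = nodes_per_site + (1 if won else 0)   # the lock is the extra vote
        outcome[site] = ("continue" if won else "halt", votes)
    return outcome


ip_quorum = TieBreakDevice("IP quorum at Site C")
print(resolve_split_brain(ip_quorum, ["B", "A"]))
# {'B': ('continue', 3), 'A': ('halt', 2)}
```

Swapping the arrival order swaps the winner, which is the point: with a third-site quorum, neither site is favoured in advance.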

Failure 2. IP Quorum Connection Lost, Subsequent Split-Brain

Starts from a fully online state as described above.

The cluster loses IP connectivity to Site C, and so loses access to the IP quorum device. Since this is the active quorum device and the cluster cannot communicate with it, the cluster will re-assign one of the drives as the active quorum. In this case, Drive 0 becomes the active quorum. This re-assignment is possible because the 4 voting members (nodes) in Sites A and B are still online and communicating.

Quorum now looks like :

Type	Site	Active Quorum
IP	C	Offline
Drive 0	A	Yes
Drive 1	B	No
Drive 2	A	No

Now if we have a split-brain scenario, Site A will always win, because only Site A can communicate with the active quorum device (Drive 0).

Worst case here, if Site A actually fails (power failure or such) then Site B will also halt, because it can’t see the active quorum device (which was at Site A). You can get Site B online again by manually running the quorum override command. Essentially you become the active quorum device here, and tell Site B to continue because you know Site A is offline.
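
The Failure 2 sequence can be walked through with a small model. This is a sketch under stated assumptions: the `QuorumDevice` class, the `reachable_from` field and the "first reachable device in list order" selection are all invented here for illustration, and the post does not say how the system actually chooses Drive 0.

```python
from dataclasses import dataclass


@dataclass
class QuorumDevice:
    name: str
    reachable_from: set   # sites that can reach this device on their own


def reassign_active(devices, active, all_voting_nodes_online):
    """Move the active role only if it is unreachable and every node can agree."""
    if active.reachable_from or not all_voting_nodes_online:
        return active
    for candidate in devices:          # assumption: first reachable device wins
        if candidate.reachable_from:
            return candidate
    return active


def half_outcome(site, active, override=False):
    """A half of the cluster continues if it can lock the active quorum,
    or if an operator overrides the quorum by hand."""
    if override or site in active.reachable_from:
        return "continue"
    return "halt"


ip = QuorumDevice("IP", set())                 # link to Site C is down
drive0 = QuorumDevice("Drive 0", {"A"})
drive1 = QuorumDevice("Drive 1", {"B"})
drive2 = QuorumDevice("Drive 2", {"A"})

active = reassign_active([ip, drive0, drive1, drive2], ip,
                         all_voting_nodes_online=True)
print(active.name)                                            # Drive 0
print(half_outcome("A", active), half_outcome("B", active))   # continue halt
print(half_outcome("B", active, override=True))               # continue
```

Site B gets the same "halt" result whether the inter-site link drops or Site A loses power, because in both cases it cannot reach Drive 0. Only the manual override changes that.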

Failure 3. Simultaneous IP Quorum Connection Loss, and Site A Power Loss

Starts from a fully online state as described above.

Here the communication with the IP quorum device fails at the same moment that Site A loses power, so all of Site A goes offline. This leaves Site B looking for the active quorum device, but that device is now unreachable. While there is another quorum device available to Site B (Drive 1), it was not marked as the active quorum device, and so Site B will halt. Again, you can become the quorum and enable Site B to continue by using the quorum override command.
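
The reason Drive 1 cannot simply take over is the rule that the active quorum only moves when the whole voting set agrees. A minimal sketch of that precondition follows, with hypothetical node names and function names. Site A is assumed to be powered off, so nobody races Site B for the lock.

```python
VOTING_SET = frozenset({"A1", "A2", "B1", "B2"})   # hypothetical node names


def can_move_active_quorum(online_nodes, voting_set=VOTING_SET):
    """The active quorum may only change if every voting node is present."""
    return voting_set <= set(online_nodes)


def site_b_outcome(online_nodes, ip_quorum_reachable, override=False):
    """Outcome for Site B when the IP quorum is the recorded active device."""
    if can_move_active_quorum(online_nodes):
        return "continue"   # cluster is whole; the active quorum can be moved
    if ip_quorum_reachable or override:
        return "continue"   # tie break won, or the operator stood in for it
    return "halt"           # Drive 1 exists but was never made active


site_b_only = {"B1", "B2"}
print(can_move_active_quorum(site_b_only))                              # False
print(site_b_outcome(site_b_only, ip_quorum_reachable=False))           # halt
print(site_b_outcome(site_b_only, ip_quorum_reachable=True))            # continue
print(site_b_outcome(site_b_only, ip_quorum_reachable=False,
                     override=True))                                    # continue
```

Compare the second and third lines of output: losing Site A alone is survivable, and it is the simultaneous loss of the active quorum that forces the halt.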

Conclusions

You can manually assign the quorum devices (mdisks and drives) on any system with the chquorum command. This can be done to ensure the quorum devices are spread over the available controllers or enclosures if the system hasn’t done this automatically, for example because you added more controllers after the initial setup.

The system will define the active quorum automatically, and will move it as needed.

For normal (non Stretched/HyperSwap) systems you generally don’t need to worry as much. Just check that the system has spread the quorum devices over multiple controllers in the SVC case, and SAS chains in the Storwize case.

For HyperSwap/Stretched systems it is critical that you ensure the quorum setup is correct for your config. As you can see from the examples, if you are using IP quorum we’d recommend you set up at least 2, and probably 3, IP quorum devices across different systems, so that if one fails you maintain the ‘3rd site’ nature of the config, i.e. the active quorum remains on a 3rd site and doesn’t end up local to one of the cluster sites. If it does end up local, that site is forced to become the one that continues, which could result in a temporary loss of access while you manually override the quorum. Ensure that the drive or disk based backup quorum devices are also spread over the sites, so that in the event of multiple failures a quorum device can still be assigned even after a site failure.
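
As a rough checklist, those recommendations could be expressed as a small review function. This is purely illustrative: the dictionary layout and the function are invented for this sketch, nothing here reads from a real system, and the quorum table would have to be typed in by hand.

```python
def review_quorum_layout(devices, cluster_sites=("A", "B")):
    """Return a list of warnings for a hand-entered quorum table.

    Each device is a dict with 'type' ('ip', 'drive' or 'mdisk'),
    'site' and 'active' (bool).
    """
    warnings = []

    ip_count = sum(1 for d in devices if d["type"] == "ip")
    if ip_count < 2:
        warnings.append("fewer than 2 IP quorum devices defined")

    active = [d for d in devices if d["active"]]
    if len(active) != 1:
        warnings.append("expected exactly one active quorum device")
    elif active[0]["site"] in cluster_sites:
        warnings.append("active quorum is local to cluster site "
                        + active[0]["site"])

    disk_sites = {d["site"] for d in devices if d["type"] != "ip"}
    for site in cluster_sites:
        if site not in disk_sites:
            warnings.append("no drive/mdisk quorum device at site " + site)

    return warnings


# The starting layout used in the worked examples
layout = [
    {"type": "ip",    "site": "C", "active": True},
    {"type": "drive", "site": "A", "active": False},
    {"type": "drive", "site": "B", "active": False},
    {"type": "drive", "site": "A", "active": False},
]
print(review_quorum_layout(layout))
# ['fewer than 2 IP quorum devices defined']
```

The example layout passes every check except the IP quorum count, which is exactly the weakness Failures 2 and 3 exposed.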

PS. One final thing, there are two uses of the quorum drives/mdisks. They act as the tie break quorum (as discussed here) but they also store the virtualization table maps and other such cluster ‘recovery’ information in the event of all nodes having their brains blown out. The IP quorum devices ONLY act as tie break devices. So both types of quorum device are needed even when using IP quorum.

Hope this is useful, and feel free to ask any questions raised.


5 responses

Jose Parra

August 24, 2020 at 3:49 pm

Barry, I was reviewing your quorum post. In an environment with two FlashSystem 7200 systems in a HyperSwap configuration, Sites A and B have production servers that are cross-replicated. What happens if the HyperSwap communication between both sites is lost, with the IP quorum on the third site? Does only one FlashSystem 7200 remain operational, delivering I/O to the production servers?

Also, do servers that are not using HyperSwap volumes lose access to their local disks on the storage system whose nodes have their I/O stopped after the quorum race?

Thank you for answering my questions.

1. Barry Whyte
  
  September 3, 2020 at 10:50 am
  
  Hi, yes. If the HyperSwap system suffers a ‘split brain’ and just the link between the two 7200s is lost, one site (7200) will win the quorum race and will continue to operate. The other 7200 will halt all operations until the link is restored.
  
  For this reason, ALL volumes in a HyperSwap cluster should be mirrored, and no local volumes should be used at just one site. If they are local to the site that halts, they will not be accessible until the link is recovered.
  
  If you do have local volumes you must either understand the risk or use the winner site or preferred site options. Winner site will guarantee that one site wins in a split brain. Preferred site will bias the race in favor of one site.
  
  1. Hasan Afzal
    
    October 8, 2022 at 9:36 pm
    
    Can we use IP Quorum when ISL is not being used? I have two FS3700 systems which would be installed 150 meters apart. I have only one set/pair of FC switches (G610, which do not allow logical partitioning to cater for Public/Private SAN) at each site.
    
    Is it possible to set up the Public SAN via ISL, use direct connectivity for the Private SAN, and use an IP quorum device?
    
    Do we need to use FC storage for quorum when topology does not make use of ISLs?
    
Jorge Barriga

May 19, 2021 at 8:53 pm

Barry, you can now configure the quorum mode (preferred, winner, standard), which is the value or importance that the IP quorum will have when the two sites cannot access the IP quorum and the ISL drops.

Anonymous

December 11, 2025 at 1:18 am

Hi Barry,

We have two FlashSystem 7300 arrays configured in a HyperSwap topology across Site A and Site B.
A third independent location, Site C, hosts an IP quorum.
Site A is defined as the preferred site.
On Site B, the LAN and SAN networks are not powered by the same electrical circuit, meaning one can fail while the other remains operational.

Two scenarios are considered, and the question is what would happen if they occurred one after the other?

T0: Loss of the LAN network on Site B:
The two FlashSystem systems can still communicate through the SAN network, but only the FlashSystem at Site A continues to communicate with the IP quorum on Site C.

T0 + 30 sec: Loss of Site B:
All connectivity to the FlashSystem at Site B is lost.

Thanks

