Weight doesn’t ALWAYS have to be AlwaysOn

One thing I keep finding myself mentioning more and more in conversation (most recently in a discussion group at SQLBits a few days ago) is the ability to configure your Windows Cluster Quorum for situations where each cluster node may not be of equal importance.

In SQL Server 2012 we have various high availability enhancements and improvements, and most of these are branded under the term AlwaysOn. In particular we have enhancements to AlwaysOn SQL Failover Clustering and a new technology known as AlwaysOn Availability Groups. Whilst I won’t bore you with any specific details about what these are or how to use them (that is no doubt something for another day), you are probably already aware that both require the existence of a Windows Cluster in order to use them.

One of the biggest reasons why Windows Clustering has been adopted as a pre-requisite for AlwaysOn Availability Groups is its Quorum mechanism, which provides a voting scheme to determine which nodes can collectively be considered still “alive” and which nodes can singularly or collectively be considered to have “failed”. This mechanism is a very important concept to AlwaysOn since it prevents the classic split-brain problem. With respect to Availability Groups it also means that the Database Witness server used for Database Mirroring is not needed to implement this technology, and because of this the technology is more scalable and reliable than it otherwise would be. *1

Quorum is not so CUTE!

As you may also be aware, in Windows 2008/R2 there are four types of Quorum model that you may use and these are:-

  • Node Majority -where more than half the nodes must remain active.
  • Node and Disk Majority -where the disk acts as an extra vote and more than half the votes must be available.
  • Node and File Share Majority -where the file share acts as an extra vote and more than half the votes must be available.
  • No Majority: Disk Only -where only the quorum disk must remain available.

In all cases, should the required condition no longer hold, each Windows Cluster Node’s Cluster Service will automatically shut down since “Quorum” is lost.

In situations where the total number of Cluster Nodes is an odd number you would traditionally use the Node Majority Quorum model, allowing you to lose just under half of your cluster nodes (for n nodes, up to (n−1)/2 of them) before having a cluster failure. Otherwise the other three Quorum models should be considered (generally Disk Only Quorum can be a valid option when the node count is one or two, but should otherwise only be considered in special cases).
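The quorum arithmetic above is easy to sketch. As a purely illustrative aside (this is just the voting maths, not any real clustering API), the required majority and the number of tolerable failures for a given vote count work out like this:

```python
def majority(total_votes):
    """Votes that must remain visible: strictly more than half."""
    return total_votes // 2 + 1

def tolerable_failures(total_votes):
    """Voting members you can lose before quorum is lost."""
    return total_votes - majority(total_votes)

# An odd vote count wastes nothing: five votes tolerate two failures,
# while four votes still only tolerate one.
for votes in (3, 4, 5):
    print(votes, majority(votes), tolerable_failures(votes))
```

This is why the extra disk or file share vote is typically added only when the node count is even.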

What is not commonly known is that a Cluster Node does not HAVE to have a quorum vote. By default they all do, but it is possible to set what is known as the Node Weight to zero. Before we come onto that though, you are probably wondering exactly why you would want to do this. Well, there are several scenarios that make this a desirable thing to do, such as when you have implemented a Geo-Cluster (a Cluster spanning geographic locations).

Consider the following diagram :-

AlwaysOn FCI or AG nodes across sites

As you can see we have Site A and Site B each hosting a selection of Cluster Nodes. It is possible that each node might be hosting a standalone instance of SQL Server and is simply part of a Windows Cluster because we are using Availability Groups (so each can house a Replica), OR it might be that we have implemented SQL Failover Clustered Instances, perhaps because we require the ability to fail SQL instances over across sites.

Now in the scenario we propose (and assuming we choose a Quorum model of Node Majority) the total number of Quorum votes equals five. Imagine then that we suddenly have a failure of Node C. Since the number of votes would then equal four we would still have “quorum”. Now consider the prospect that we lose connectivity to Site B. This event would not only cause the Cluster Service for nodes on Site B to shut down (since their total votes of two would be less than the required majority of three) but would also cause the Cluster Service for nodes on Site A to shut down since there would also only be two votes available.

Although the Cluster Nodes on Site A could be forced to start, we have unfortunately lost availability (however temporarily) and, more importantly, recovery requires manual intervention. A slightly more elegant solution would be to set the Node Weight of Nodes D and E to zero, meaning that nodes A, B and C are the only ones that can cast a vote towards quorum (making a total Quorum count of three). In the event of a loss of one node on Site A and a loss of connectivity to Site B, the two remaining voting nodes would still constitute a quorum majority, thereby keeping the Cluster available.
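To make the vote counting concrete, here is a small Python sketch of the partition scenario just described. It is illustrative only: the node names and weights mirror the diagram above, not any real cluster API.

```python
def has_quorum(visible_votes, total_votes):
    # A partition survives only if it sees a strict majority of all votes.
    return visible_votes > total_votes // 2

# Site A hosts nodes A, B and C; Site B hosts D and E.
default_weights  = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 1}
adjusted_weights = {"A": 1, "B": 1, "C": 1, "D": 0, "E": 0}

def site_a_survives(weights):
    # Node C has failed and the link to Site B is lost,
    # so Site A can only see the votes of nodes A and B.
    total   = sum(weights.values())
    visible = weights["A"] + weights["B"]
    return has_quorum(visible, total)

print(site_a_survives(default_weights))   # False: 2 of 5 votes
print(site_a_survives(adjusted_weights))  # True: 2 of 3 votes
```

With the default weights Site A sees only two of five votes and shuts down; with D and E weighted to zero it sees two of three and stays up.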

So let’s now move onto how you can set the Node Weight for your Cluster nodes. Unsurprisingly we set the Cluster node “Node Weight” through a property called “NodeWeight”, but this is not accessible by default. This can be demonstrated by using the following PowerShell script (first we must import the failover clustering module):-

Import-Module FailoverClusters
Get-ClusterNode | Format-Table -AutoSize -Property Name, State, NodeWeight

We get the following result:-

Name         State NodeWeight
----         ----- ----------
wonko           Up
wowbagger       Up

However, upon installing Hotfix 2494036 (which, I should add, requires a reboot to take effect) the NodeWeight property becomes accessible:-

Name         State NodeWeight
----         ----- ----------
wonko           Up          1
wowbagger       Up          0

As you can see from the above, I have already set the NodeWeight of wowbagger to zero and I did so by running the following PowerShell command:-

(Get-ClusterNode "wowbagger").NodeWeight = 0

Before you get all Jackie Chan on me and set some of your Cluster Nodes’ node weights to zero, you should first sit down and seriously consider whether this makes sense as a design strategy. In the scenario I proposed, “the Business” had stipulated that a loss of Site B or any of the Cluster Nodes within it should not in any way affect the availability of the primary Site A, but by meeting that requirement we reduce the number of failures Site A can tolerate to a maximum of one, in a Cluster containing five nodes! Therefore be very careful and cautious, only do this when necessary, and remember that your Cluster Node weight doesn’t always have to be AlwaysOn.

*1 This is a contentious issue for some (including me) since, unlike with Database Mirroring, AlwaysOn Availability Groups (because the feature uses Windows Clustering) requires that a single Active Directory Domain spans each geographic location.

This entry was posted in availability, clustering, scale-out, SQLServerPedia Syndication. Bookmark the permalink.

13 Responses to Weight doesn’t ALWAYS have to be AlwaysOn

  1. Very nice blog post Mark! You’ve covered quorum, node weight, and how they relate to AlwaysOn quite well. I’m doing a presentation about AlwaysOn Availability Groups in June and was going to blog about quorum as well, but you’ve done such an awesome job that I think I’ll just refer people here instead. Thanks again!

  2. kwisatz78 says:

    Hi and nice article. However I am confused as to what will happen should Site A be lost entirely. Would the cluster be broken as Nodes D and E have no votes? I ask because I have a situation where I am going to build a cluster across data centres, but it’s only 1 node in each.

    • retracement says:

      Hi kwisatz78, good question and this is an important point to raise. The most important thing with respect to quorum for each node is not necessarily *if* they cast a vote, but the number of votes they can see out of the total number of votes constituting quorum. Therefore in the scenario I present, the loss of the link between Site A and B would result in nodes D and E seeing a total of zero votes and in that case will stop their cluster service. This might be a desirable thing if you don’t want Site B to have any effect on site A period. But it is also important to point out three things:

      One, the choice of having zero votes in the scenario provided is not necessarily the best design choice for that situation (but it depends on what your business requirements are). Really that was an over-simplified example.
      Two, cluster weighting can be dynamically changed, meaning that the loss of one node might result in you wanting to change your weighting setup for the quorum.
      Three, quorum can be forced -so in the situation that Site A really was down (not just the link), the cluster on Site B can be forced online and reconfigured where appropriate. In other words, by removing the quorum vote, you are not disabling the ability of the node to host clustered resources, only its ability to cast a vote.

      In the situation you are looking at implementing (having a geo-cluster consisting of 1 node per site), I would avoid using zero node weighting because there would be no benefit to it UNLESS you were not going to use a majority node set + fileshare and would therefore need to reduce your quorum count to 1 (meaning the loss of the node with the vote would bring down the cluster). I use that scenario on my test rig because I want to simplify quorum and only really need to run a single cluster node most of the time (so cluster node 2 can usually be kept offline). In the majority node set + fileshare configuration for your 2 node geo-cluster, you would be able to lose either site and have the cluster remain online. You generally always want to aim for this type of scenario unless there is a deviation in the design and requirements from the norm.

      Hope this made sense!

      • kwisatz78 says:

        Hi retracement, thanks for that, and yes it makes sense apart from one thing you said about my cluster being up regardless of which site is lost in a Majority + Fileshare. As I only have 2 sites, the Fileshare must be in one of them, and as the share and the 2 nodes will each have a vote, if I lost the site that contains the fileshare the cluster would go down, would it not?

  3. retracement says:

    That is correct. I assumed you would have a Site C, which would be desirable when using a fileshare. If not and your nodes remain the same, then you can either host the fileshare in the “primary” site or set node weight on the “secondary” site to zero. Either implementation would give the same result should site A or the link go down (i.e. site B nodes will also go down). And should site B or the link go down, you would still have quorum in site A due to the fileshare + node vote, OR in the case of node weighting, a maximum vote of 1.

    • kwisatz78 says:

      OK that’s great. A fileshare in my situation would probably be better, as if I lost Node A then at least it would auto failover to Node B (I am wanting to use AGs btw).

      Many thanks for the interesting article and help

  4. Raunak Jhawar says:

    Hi, will this setting be required in the case of a single-site setup of 2 nodes and a witness disk? I had observed that originally number_of_quorum_votes for the two nodes that I have was NULL and the witness disk was assigned a value of 1. Can you help me understand more?

    • retracement says:

      Hi Raunak,
      Thanks for your comment. So in a situation where you are using a Quorum model of majority node set and witness, yes the nodes would have a quorum vote (and the disk would count as one as well) -so in a 2 node cluster under that model you would have a total quorum vote of 3 and required majority would be 2. Essentially you could lose 1 node (or your disk) but not both together. What you describe suggests you are running under a disk only quorum model which means that the nodes do not play a part in the quorum vote. The disk *is* the quorum which means that if that fails your cluster will go offline. My advice to you is to change to the node + disk quorum model for your scenario.

  5. Raunak Jhawar says:

    I already have configured the cluster in node + disk model. My concern is in a single subnet implementation where all nodes are in same site, should I assign the nodeWeight = 1 to all nodes?
    The disk has a vote right = 1.

  6. David Williams says:

    Hi Mark,

    You can configure multiple SQL instances to be part of different AlwaysOn availability groups on the same Windows Server. My question is what happens when you have a 2 node cluster with multiple availability groups? In the case of a failover of one availability group and not the other, you end up with each of the two Windows nodes holding the primary copy of one database but not the other.

    How does Windows handle quorum in this event?

    I haven’t found much information on this config and am not sure if it’s even supported.


  7. Pingback: 10 reasons why I HAte you! | tenbulls.co.uk

  8. Pingback: 10 reasons why I HAte you! - SQL Server - SQL Server - Toad World

Comments are closed.