Replica Set Elections¶
Replica sets use elections to determine which set member will become primary. Elections occur after initiating a replica set, and also any time the primary becomes unavailable. The primary is the only member in the set that can accept write operations. If a primary becomes unavailable, elections allow the set to recover normal operations without manual intervention. In the following three-member replica set, the primary is unavailable. One of the remaining secondaries holds an election to elect itself as a new primary.
Elections are essential for independent operation of a replica set; however, elections take time to complete. While an election is in progress, the replica set has no primary and cannot accept writes, and all remaining members are read-only.
If a majority of the replica set is inaccessible or unavailable to the current primary, the primary will step down and become a secondary. The replica set cannot accept writes after this occurs, but remaining members can continue to serve read queries if such queries are configured to run on secondaries.
Factors and Conditions that Affect Elections¶
Replication Election Protocol¶
New in version 3.2: MongoDB introduces version 1 of the replication protocol (protocolVersion: 1) to reduce replica set failover time and accelerate the detection of multiple simultaneous primaries. New replica sets use protocolVersion: 1 by default. Previous versions of MongoDB use version 0 of the protocol.
Heartbeats¶
Replica set members send heartbeats (pings) to each other every two seconds. If a heartbeat does not return within 10 seconds, the other members mark the delinquent member as inaccessible.
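The timing rule above can be modeled in a few lines. The following is a minimal sketch for illustration only, not MongoDB's implementation: the function `isInaccessible` and both constants are hypothetical names that mirror the two-second interval and 10-second timeout described above.

```javascript
// Illustrative model only; these names are hypothetical and not part of MongoDB.
const HEARTBEAT_INTERVAL_MS = 2000;  // members ping each other every two seconds
const HEARTBEAT_TIMEOUT_MS = 10000;  // no reply within 10 seconds => inaccessible

// Returns true when the last heartbeat reply from a member is older
// than the timeout, measured against the supplied current time.
function isInaccessible(lastReplyMs, nowMs) {
  return nowMs - lastReplyMs > HEARTBEAT_TIMEOUT_MS;
}

// Number of heartbeat intervals that fit inside one timeout window.
const intervalsPerTimeout = HEARTBEAT_TIMEOUT_MS / HEARTBEAT_INTERVAL_MS;
```

Under these assumptions, a member is marked inaccessible only after several consecutive heartbeat intervals pass without a reply, so a single delayed ping does not trigger an election by itself.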
Member Priority¶
After a replica set has a stable primary, the election algorithm will make a “best-effort” attempt to have the secondary with the highest priority available call an election.
Member priority affects both the timing and the
outcome of elections; secondaries with higher priority call elections
relatively sooner than secondaries with lower
priority, and are also more likely to win. However, a lower priority
instance can be elected as primary for brief periods, even if a higher
priority secondary is available. Replica set members continue
to call elections until the highest priority member available becomes
primary.
Members with a priority value of 0 cannot become primary and do not seek election. For details, see Priority 0 Replica Set Members.
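As a rough illustration of adjusting priorities, the sketch below edits a replica set configuration document of the shape returned by `rs.conf()`. The helper `withPriority`, the set name, and the host names are hypothetical placeholders introduced for this example. The check for non-voting members reflects the rule, stated later on this page, that non-voting members must have a priority of 0.

```javascript
// Returns a copy of a replica set configuration document with one member's
// priority changed. The input document is not modified.
// `withPriority` is a hypothetical helper, not a MongoDB API.
function withPriority(cfg, host, priority) {
  const members = cfg.members.map(function (m) {
    if (m.host !== host) {
      return m;
    }
    if (m.votes === 0 && priority !== 0) {
      throw new Error("non-voting members must have priority 0");
    }
    return Object.assign({}, m, { priority: priority });
  });
  return Object.assign({}, cfg, { members: members });
}

// Hypothetical three-member configuration; host names are placeholders.
const exampleCfg = {
  _id: "rs0",
  members: [
    { _id: 0, host: "a.example.net:27017", priority: 1, votes: 1 },
    { _id: 1, host: "b.example.net:27017", priority: 1, votes: 1 },
    { _id: 2, host: "c.example.net:27017", priority: 1, votes: 1 }
  ]
};

// Prefer member "a" as primary and make member "c" ineligible to become primary.
const preferred = withPriority(
  withPriority(exampleCfg, "a.example.net:27017", 2),
  "c.example.net:27017",
  0
);
```

In the mongo shell, a change like this is typically applied on the primary by editing the document returned by `rs.conf()` and passing the result to `rs.reconfig()`.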
Loss of a Data Center¶
With a distributed replica set, the loss of a data center may affect the ability of the remaining members in the other data center or data centers to elect a primary.
If possible, distribute the replica set members across data centers to maximize the likelihood that even with a loss of a data center, one of the remaining replica set members can become the new primary.
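One way to reason about a proposed layout is to check which data center losses still leave a majority of voting members reachable. The sketch below is a simplified planning aid under stated assumptions: `survivableLosses` is a hypothetical helper, the data center names are placeholders, and the model only counts voting members. It ignores whether the surviving members are actually eligible to become primary (for example, priority 0 members and arbiters are not).

```javascript
// Given the number of voting members in each data center, returns the names
// of the data centers whose loss would still leave a strict majority of all
// voting members available to elect a primary.
// `survivableLosses` is a hypothetical helper, not a MongoDB API.
function survivableLosses(votersByDataCenter) {
  const names = Object.keys(votersByDataCenter);
  const total = names.reduce(function (sum, dc) {
    return sum + votersByDataCenter[dc];
  }, 0);
  const majority = Math.floor(total / 2) + 1;
  return names.filter(function (dc) {
    return total - votersByDataCenter[dc] >= majority;
  });
}

// Two sites: losing "east" leaves only 1 of 3 voters, so no primary can be elected.
const twoSites = survivableLosses({ east: 2, west: 1 });

// Three sites: losing any single site still leaves 2 of 3 voters.
const threeSites = survivableLosses({ east: 1, west: 1, central: 1 });
```

In this model, the two-site layout tolerates only the loss of the smaller site, while the three-site layout tolerates the loss of any one site.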
Network Partition¶
A network partition may segregate a primary into a partition with a minority of nodes. When the primary detects that it can only see a minority of nodes in the replica set, the primary steps down as primary and becomes a secondary. Independently, a member in the partition that can communicate with a majority of the nodes (including itself) holds an election to become the new primary.
Vetoes in Elections¶
Changed in version 3.2: The protocolVersion: 1 obviates the need for vetoes. The following veto discussion applies to replica sets that use the older protocolVersion: 0.
For replica sets using protocolVersion: 0, all members of a replica set can veto an election, including non-voting members. A member will veto an election:
- If the member seeking an election is not a member of the voter’s set.
- If the current primary has more recent operations (i.e. a higher optime) than the member seeking election, from the perspective of another voting member.
- If the current primary has the same or more recent operations (i.e. a higher or equal optime) than the member seeking election.
- If a priority 0 member [1] is the most current member at the time of the election. In this case, another eligible member of the set will catch up to the state of the priority 0 member and then attempt to become primary.
- If the member seeking an election has a lower priority than another member in the set that is also eligible for election.
[1] Hidden and delayed members imply a priority 0 configuration.
Voting Members¶
The replica set member configuration setting members[n].votes and member state determine whether a member votes in an election.
All replica set members that have their members[n].votes setting equal to 1 vote in elections. To exclude a member from voting in an election, change the value of the member’s members[n].votes configuration to 0.
Only voting members in the following states are eligible to vote:
Non-Voting Members¶
Although non-voting members do not vote in elections, these members hold copies of the replica set’s data and can accept read operations from client applications.
Because a replica set can have up to 50 members, but only 7 voting members, non-voting members allow a replica set to have more than seven members.
Non-voting members must have priority of 0.
For instance, the following nine-member replica set has seven voting members and two non-voting members.
A non-voting member has both votes and priority equal to 0:
{
   "_id" : <num>,
   "host" : <hostname:port>,
   "arbiterOnly" : false,
   "buildIndexes" : true,
   "hidden" : false,
   "priority" : 0,
   "tags" : { },
   "slaveDelay" : NumberLong(0),
   "votes" : 0
}
Important
Do not alter the number of votes to control which members will become primary. Instead, modify the members[n].priority option. Only alter the number of votes in exceptional cases, for example, to permit more than seven members.
To configure a non-voting member, see Configure Non-Voting Replica Set Member.
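The limits and rules in this section can be checked against a configuration document before it is applied. The following is a minimal sketch, assuming a document of the shape returned by `rs.conf()`. The helpers `countVoting`, `makeNonVoting`, and `checkLimits`, the set name, and the host names are all hypothetical and introduced only for illustration.

```javascript
const MAX_MEMBERS = 50;        // maximum replica set members
const MAX_VOTING_MEMBERS = 7;  // maximum voting members

// Counts members whose votes setting is not 0.
function countVoting(cfg) {
  return cfg.members.filter(function (m) { return m.votes !== 0; }).length;
}

// Returns a copy of the configuration in which the given host has both
// votes and priority set to 0, as required for a non-voting member.
function makeNonVoting(cfg, host) {
  const members = cfg.members.map(function (m) {
    return m.host === host ? Object.assign({}, m, { votes: 0, priority: 0 }) : m;
  });
  return Object.assign({}, cfg, { members: members });
}

// Returns a list of human-readable problems; an empty list means the
// configuration satisfies the rules described in this section.
function checkLimits(cfg) {
  const problems = [];
  if (cfg.members.length > MAX_MEMBERS) {
    problems.push("more than 50 members");
  }
  if (countVoting(cfg) > MAX_VOTING_MEMBERS) {
    problems.push("more than 7 voting members");
  }
  cfg.members.forEach(function (m) {
    if (m.votes === 0 && m.priority !== 0) {
      problems.push(m.host + " is non-voting but has a non-zero priority");
    }
  });
  return problems;
}

// Hypothetical nine-member set in which every member votes.
const nine = { _id: "rs0", members: [] };
for (let i = 0; i < 9; i++) {
  nine.members.push({ _id: i, host: "host" + i + ".example.net:27017", priority: 1, votes: 1 });
}

// Make the last two members non-voting, leaving seven voting members.
const fixed = makeNonVoting(
  makeNonVoting(nine, "host7.example.net:27017"),
  "host8.example.net:27017"
);
```

Here `checkLimits(nine)` reports the voting-member limit, while `checkLimits(fixed)` reports no problems, which matches the nine-member example described above.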