.. _versioning_gss_pk:

Primary key generation
=======================
Primary keys are generated all over the distributed network: features created in the various nodes will eventually be merged in Central and distributed to the rest of the units.

This means the system has to make sure that no two primary keys will ever conflict. This section discusses three possible algorithms for conflict avoidance.


GUID based
----------
Most distributed version control systems (Git, Mercurial) mark the items that have to be uniquely identified across the distributed network with globally unique identifiers. In their case these are content hashes rather than GUIDs proper, but the principle is the same.

A Globally Unique Identifier (GUID) is a 128-bit number generated by an algorithm providing uniqueness guarantees across a distributed network. Often, but not always, the algorithm uses the Ethernet card MAC address and a very precise notion of the current time to generate the identifier.

The advantage of this approach is that it requires neither configuration nor modifications to the synchronization protocol. The downside is that the generated identifier is long and hard to read.
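
As a rough illustration, the sketch below uses Python's standard ``uuid`` module to generate both a node and time based GUID and a purely random one. The helper name ``new_feature_id`` is hypothetical and is not part of the system described here.

.. code-block:: python

  import uuid

  # uuid1 combines a node id (the MAC address when it can be determined,
  # otherwise a random number) with the current timestamp.
  time_based = uuid.uuid1()

  # uuid4 is built from random bits only and needs no hardware address.
  random_based = uuid.uuid4()


  def new_feature_id() -> str:
      """Return a GUID in its canonical 36-character text form.

      Hypothetical helper, shown only to illustrate the approach.
      """
      return str(uuid.uuid4())


  # Both variants are 128-bit values; the text form is what makes them
  # long and hard to read compared to a plain sequence number.
  print(time_based, random_based, new_feature_id())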

Explicit discriminator
----------------------
An explicit discriminator approach combines a locally unique number (such as the one generated by a local sequence) with a separate identifier for the Unit. Assuming ``unitId`` is a number, the following expression will generate unique ids::

  unitId * 1000000000000 + localSequenceValue 
  
With unit ids below one million and local sequence values below 10^12, the generated ids stay below 10^18 and therefore fit in a signed 64-bit integer, without needing the 128 bits of a GUID. The expression allows for up to one million different units and makes it possible to know which unit generated a certain feature just by looking at the identifier. The main downside is that it requires configuring a unique ``unitId`` for each unit: if two Units are configured with the same id by mistake, the synchronization will consistently fail.
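
A minimal sketch of the expression above, with the range checks it implicitly relies on. The function and constant names are hypothetical, and the limit of 10^12 local values per unit is an assumption derived from the multiplier.

.. code-block:: python

  from typing import Tuple

  UNIT_MULTIPLIER = 1_000_000_000_000  # 10**12, the multiplier in the expression
  MAX_UNITS = 1_000_000                # "up to one million different units"


  def make_feature_id(unit_id: int, local_sequence_value: int) -> int:
      """Combine a unit id and a local sequence value into one identifier."""
      if not 0 <= unit_id < MAX_UNITS:
          raise ValueError("unit_id out of range")
      # A larger local value would spill into the digits reserved for the
      # unit id and could collide with ids generated by another unit.
      if not 0 <= local_sequence_value < UNIT_MULTIPLIER:
          raise ValueError("local_sequence_value out of range")
      return unit_id * UNIT_MULTIPLIER + local_sequence_value


  def split_feature_id(feature_id: int) -> Tuple[int, int]:
      """Recover (unit_id, local_sequence_value) from an identifier."""
      return divmod(feature_id, UNIT_MULTIPLIER)

For instance, ``make_feature_id(42, 7)`` yields ``42000000000007``, and ``split_feature_id`` recovers ``(42, 7)`` from it.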

Central replacement
-------------------
In this approach all units generate locally unique ids using sequences. When the synchronization happens, the GSS in Central looks at the newly inserted features, generates a new unique identifier for each of them based on one or more local sequences, and then sends a replacement id for each feature back to the Unit.

The advantages of this approach are the lack of configuration and the use of standard 64-bit sequences.

There are two significant downsides to this approach:

* It makes the synchronization protocol more complex and makes it harder to restart cleanly in the face of network communication issues. An extra call is required, and the id replacement table has to be stored so that the protocol can be restarted should this extra exchange fail to take place due to a network issue.
* If the synchronization happens while a user is still editing the features, the change of id will break the user edits, requiring a restart of the operation with consequent loss of time and user frustration.
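
The id replacement exchange can be sketched as follows. This is only an illustration under simplifying assumptions: the ``Central`` class, ``accept_inserts`` and ``apply_replacements`` are hypothetical names, and in-memory dictionaries stand in for the databases and for the network calls.

.. code-block:: python

  import itertools
  from typing import Dict


  class Central:
      """Toy stand-in for the central node and its own sequence."""

      def __init__(self) -> None:
          self._sequence = itertools.count(1)
          self.features: Dict[int, dict] = {}

      def accept_inserts(self, unit_features: Dict[int, dict]) -> Dict[int, int]:
          """Store the new features and return a {local_id: central_id} table."""
          replacements: Dict[int, int] = {}
          for local_id, feature in unit_features.items():
              central_id = next(self._sequence)
              self.features[central_id] = feature
              replacements[local_id] = central_id
          return replacements


  def apply_replacements(unit_features: Dict[int, dict],
                         replacements: Dict[int, int]) -> Dict[int, dict]:
      """Re-key the unit's features using the table sent back by Central."""
      return {replacements.get(local_id, local_id): feature
              for local_id, feature in unit_features.items()}


  central = Central()
  # Two units both used local id 1: Central hands out distinct ids.
  table_a = central.accept_inserts({1: {"name": "road"}, 2: {"name": "river"}})
  table_b = central.accept_inserts({1: {"name": "bridge"}})

In a real exchange the replacement table would have to be persisted until the unit confirms it has applied it, which is the extra protocol step discussed in the first downside.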

Chosen implementation
---------------------
The system will initially be implemented following the GUID approach, which ensures key uniqueness with the least effort and offers the best guarantees of proper operation.