Abstract
This chapter provides a general overview of essential MySQL Cluster, NDB API, and MGM API concepts, terminology, and programming constructs.
Abstract
This section introduces the NDB Transaction and Scanning APIs as well as the NDB Management (MGM) API for use in building applications to run on MySQL Cluster. It also discusses the general theory and principles involved in developing such applications.
The NDB API is an object-oriented application programming interface for MySQL Cluster that implements indexes, scans, transactions, and event handling. NDB transactions are ACID-compliant in that they provide a means to group together operations in such a way that they succeed (commit) or fail as a unit (rollback). It is also possible to perform operations in a "no-commit" or deferred mode, to be committed at a later time.
        NDB scans are conceptually rather similar to the SQL cursors
        implemented in MySQL 5.0 and other common enterprise-level
        database management systems. These allow for high-speed row
        processing for record retrieval purposes. (MySQL Cluster
        naturally supports set processing just as does MySQL in its
        non-Cluster distributions. This can be accomplished via the
        usual MySQL APIs discussed in the MySQL Manual and elsewhere.)
        The NDB API supports both table scans and row scans; the latter
        can be performed using either unique or ordered indexes. Event
        detection and handling is discussed in
        Section 3.5, “The NdbEventOperation Class”, as well as
        Section 6.5, “NDB API Event Handling Example”.
      
In addition, the NDB API provides object-oriented error-handling facilities in order to provide a means of recovering gracefully from failed operations and other problems. See Section 6.2, “Handling Errors and Retrying Transactions”, for a detailed example.
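The general shape of such retry logic can be sketched without the NDB API itself. In the sketch below, `ExecStatus` and `executeWithRetry` are illustrative stand-ins for NdbTransaction::execute() and the NdbError status checks, not part of the API:

```cpp
#include <functional>

// Illustrative error classification standing in for NdbError::Status values.
enum class ExecStatus { Success, TemporaryError, PermanentError };

// Retry a transaction-like callable while it reports a temporary error,
// giving up after maxRetries additional attempts. Returns the final status.
ExecStatus executeWithRetry(const std::function<ExecStatus()> &executeOnce,
                            int maxRetries) {
    ExecStatus status = ExecStatus::PermanentError;
    for (int attempt = 0; attempt <= maxRetries; ++attempt) {
        status = executeOnce();
        if (status != ExecStatus::TemporaryError)
            break;  // success or permanent failure: do not retry
    }
    return status;
}
```

A real application would, as in Section 6.2, re-create the transaction and its operations on each attempt rather than re-running a closure.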
        The NDB API provides a number of classes implementing the
        functionality described above. The most important of these
        include the Ndb,
        Ndb_cluster_connection,
        NdbTransaction, and
        NdbOperation classes. These model
        (respectively) database connections, cluster connections,
        transactions, and operations. These classes and their subclasses
        are listed in Chapter 3, NDB API CLASSES. Error conditions in
        the NDB API are handled using NdbError, a
        structure which is described in
        Section 4.1, “The NdbError Structure”.
      
The MySQL Cluster Management API, also known as the MGM API, is a C-language programming interface intended to provide administrative services for the cluster. These include starting and stopping Cluster nodes, handling Cluster logging, backups, and restoration from backups, as well as various other management tasks. A conceptual overview of MGM and its uses can be found in Chapter 5, THE MGM API.
        The MGM API's principal structures model the states of
        individual nodes (ndb_mgm_node_state), the
        state of the Cluster as a whole
        (ndb_mgm_cluster_state), and management
        server response messages (ndb_mgm_reply). See
        Section 5.4, “MGM Structures”, for detailed descriptions
        of these.
      
Abstract
Provides a glossary of terms which are unique to the NDB and MGM APIs, or have a specialised meaning when applied therein.
The following terms are useful to an understanding of MySQL Cluster and the NDB API, or have a specialised meaning when used in one of these contexts. You may also find the MySQL Manual's Cluster Glossary useful.
Backup: A complete copy of all cluster data, transactions and logs, saved to disk.
Restore: Returning the cluster to a previous state as stored in a backup.
          Checkpoint: Generally
          speaking, when data is saved to disk, it is said that a
          checkpoint has been reached. When working with the
          NDB storage engine, there are two sorts of
          checkpoints which work together in order to ensure that a
          consistent view of the cluster's data is maintained:
        
Local Checkpoint (LCP): This is a checkpoint that is specific to a single node; however, LCPs take place for all nodes in the cluster more or less concurrently. An LCP involves saving all of a node's data to disk, and so usually occurs every few minutes, depending upon the amount of data stored by the node.
More detailed information about LCPs and their behaviour can be found in the MySQL Manual, in the sections Defining MySQL Cluster Data Nodes and Configuring Parameters for Local Checkpoints.
Global Checkpoint (GCP): A GCP occurs every few seconds, when transactions for all nodes are synchronised and the REDO log is flushed to disk.
A related term is GCI, which stands for “Global Checkpoint ID”. This marks the point in the REDO log where a GCP took place.
Node: A component of MySQL Cluster. Three node types are supported:
Management (MGM) nodes
SQL nodes (also known as API nodes, particularly when speaking of applications other than mysqld that access cluster data)
data nodes (sometimes referred to as “storage nodes”, although this usage is now discouraged)
For more information about these node types, please refer to Section 1.3.3, “Review of MySQL Cluster Concepts”, or to Cluster Process Management in the MySQL Manual.
Node Failure: MySQL Cluster is not dependent upon any single one of the nodes making it up; the cluster can continue to run even when one node fails.
Node Restart: The process of restarting a cluster node which has stopped on its own or been stopped deliberately. This can be done for several different reasons, including the following:
Restarting a node which has shut down on its own (when this has occurred, it is known as forced shutdown or node failure; the other cases discussed here involve manually shutting down the node and restarting it)
To update the node's configuration
As part of a software or hardware upgrade
                In order to defragment the node's
                DataMemory
              
Initial Node Restart: The process of starting a cluster node with its filesystem removed. This is sometimes used in the course of software upgrades and in other special circumstances.
System Crash (or System Failure): This can occur when so many cluster nodes have failed that the cluster's state can no longer be guaranteed.
System Restart: The process of restarting the cluster and reinitialising its state from disk logs and checkpoints. This is required after either a planned or an unplanned shutdown of the cluster.
          Fragment: Contains a portion
          of a database table; in other words, in the
          NDB storage engine, a table is broken up
          into and stored as a number of subsets, usually referred to as
          fragments. A fragment is sometimes also called a
          partition.
        
          Replica: Under the
          NDB storage engine, each table fragment has
          a number of replicas in order to provide redundancy.
        
Transporter: A protocol providing data transfer across a network. The NDB API supports four different types of transporter connections: TCP/IP (local), TCP/IP (remote), SCI, and SHM. TCP/IP is, of course, the familiar network protocol that underlies HTTP, FTP, and so forth, on the Internet. SCI (Scalable Coherent Interface) is a high-speed protocol used in building multiprocessor systems and parallel-processing applications. SHM stands for Unix-style shared memory segments. For an informal introduction to SCI, see this essay at dolphinics.com.
          NDB: This originally stood for
          “Network Database”. It now refers to the storage
          engine used by MySQL AB to enable its MySQL Cluster
          distributed database.
        
ACC: Access Manager. Handles hash indexes of primary keys providing speedy access to the records.
TUP: Tuple Manager. This handles storage of tuples (records) and contains the filtering engine used to filter out records and attributes when performing reads and/or updates.
TC: Transaction Coordinator. Handles co-ordination of transactions and timeouts; serves as the interface to the NDB API for indexes and scan operations.
Abstract
This section defines and discusses the high-level architecture of the NDB API, and introduces the NDB classes which are of greatest use and interest to the developer. It also covers the most important NDB API concepts, including a review of MySQL Cluster Concepts.
The NDB API is a MySQL Cluster application interface that implements transactions. It consists of the following fundamental classes:
            Ndb_cluster_connection represents a
            connection to a cluster.
          
            Ndb is the main class, and represents a
            connection to a database.
          
            NdbDictionary provides meta-information
            about tables and attributes.
          
            NdbTransaction represents a transaction.
          
            NdbOperation represents an operation
            using a primary key.
          
            NdbScanOperation represents an operation
            performing a full table scan.
          
            NdbIndexOperation represents an operation
            using a unique hash index.
          
            NdbIndexScanOperation represents an
            operation performing a scan using an ordered index.
          
            NdbRecAttr represents an attribute value.
          
        In addition, the NDB API defines an NdbError
        structure, which contains the specification for an error.
      
        It is also possible to receive events triggered when data in the
        database is changed. This is accomplished through the
        NdbEventOperation class.
      
          The NDB event notification API is not
          supported prior to MySQL 5.1. (Bug#19719)
        
        For more information about these classes as well as some
        additional auxiliary classes not listed here, see
        Chapter 3, NDB API CLASSES.
      
The main structure of an application program is as follows:
            Connect to a cluster using the
            Ndb_cluster_connection object.
          
            Initiate a database connection by constructing and
            initialising one or more Ndb objects.
          
            Identify the tables, columns, and indexes on which you wish
            to operate, using NdbDictionary and one
            or more of its subclasses.
          
            Define and execute transactions using the
            NdbTransaction class.
          
            Delete Ndb objects.
          
            Terminate the connection to the cluster (terminate an
            instance of Ndb_cluster_connection).
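One way to make the ordering of the final two steps automatic is RAII. The wrapper classes below are hypothetical illustrations, not part of the NDB API; they simply log construction and destruction so the required teardown order (Ndb objects deleted before the cluster connection is terminated) is visible:

```cpp
#include <string>
#include <vector>

// Records the lifecycle events as they happen.
struct StepLog { std::vector<std::string> events; };

// Hypothetical stand-in for Ndb_cluster_connection.
class ClusterConnection {
public:
    explicit ClusterConnection(StepLog &log) : log_(log) {
        log_.events.push_back("connect");
    }
    ~ClusterConnection() { log_.events.push_back("disconnect"); }
private:
    StepLog &log_;
};

// Hypothetical stand-in for Ndb (a database connection).
class Database {
public:
    explicit Database(StepLog &log) : log_(log) {
        log_.events.push_back("open db");
    }
    ~Database() { log_.events.push_back("close db"); }
private:
    StepLog &log_;
};

StepLog runLifecycle() {
    StepLog log;
    {
        ClusterConnection conn(log);   // step 1: connect to the cluster
        Database db(log);              // step 2: initiate a database connection
        log.events.push_back("work");  // steps 3-4: dictionary lookups, transactions
    }   // db destroyed first (step 5), then conn (step 6), automatically
    return log;
}
```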
          
The procedure for using transactions is as follows:
              Start a transaction (instantiate an
              NdbTransaction object).
            
              Add and define operations associated with the transaction
              using instances of one or more of the
              NdbOperation,
              NdbScanOperation,
              NdbIndexOperation, and
              NdbIndexScanOperation classes.
            
              Execute the transaction (call
              NdbTransaction::execute()).
            
              The execution can be of two different types —
              Commit or NoCommit:
              
              
              
                    If the operation is of type
                    NoCommit, then the application
                    program requests that the operation portion of a
                    transaction be executed, but without actually
                    committing the transaction. Following the execution
                    of a NoCommit operation, the
                    program can continue to define additional
                    transaction operations for later execution.
                  
                    NoCommit operations can also be
                    rolled back by the application.
                  
                    If the operation is of type
                    Commit, then the transaction is
                    immediately committed. The transaction must be
                    closed after it has been committed (even if the
                    commit fails), and no further operations can be
                    added to or defined for this transaction.
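The difference between the two execution types can be modelled with a small mock. This is a toy illustration of the semantics just described, not the real NdbTransaction class:

```cpp
#include <stdexcept>
#include <vector>

enum class ExecType { NoCommit, Commit };

// Toy model: NoCommit applies pending operations without committing, so more
// operations may follow; Commit makes the transaction final, after which no
// further operations may be added.
class MockTransaction {
public:
    void addOperation(int value) {
        if (committed_)
            throw std::logic_error("transaction already committed");
        pending_.push_back(value);
    }
    void execute(ExecType type) {
        // Pending operations are applied on every execute() call.
        for (int v : pending_) applied_.push_back(v);
        pending_.clear();
        if (type == ExecType::Commit) committed_ = true;
    }
    bool committed() const { return committed_; }
    const std::vector<int> &applied() const { return applied_; }
private:
    std::vector<int> pending_, applied_;
    bool committed_ = false;
};
```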
                  
Synchronous transactions are defined and executed as follows:
              Begin (create) the transaction, which is referenced by an
              NdbTransaction object typically created
              using Ndb::startTransaction(). At this
              point, the transaction is merely being defined; it is not
              yet sent to the NDB kernel.
            
Define operations and add them to the transaction, using one or more of the following:
                    NdbTransaction::getNdbOperation()
                  
                    NdbTransaction::getNdbScanOperation()
                  
                    NdbTransaction::getNdbIndexOperation()
                  
                    NdbTransaction::getNdbIndexScanOperation()
                  
              along with the appropriate methods of the
              respective NdbOperation class (or
              possibly one or more of its subclasses). Note that, at
              this point, the transaction has still not yet been sent to
              the NDB kernel.
            
              Execute the transaction, using the
              NdbTransaction::execute() method.
            
              Close the transaction by calling
              Ndb::closeTransaction().
            
For an example of this process, see Section 6.1, “Using Synchronous Transactions”.
          To execute several synchronous transactions in parallel, you
          can either use multiple Ndb objects in
          several threads, or start multiple application programs.
        
          An NdbTransaction consists of a list of
          operations, each of which is represented by an instance of
          NdbOperation,
          NdbScanOperation,
          NdbIndexOperation, or
          NdbIndexScanOperation (that is, of
          NdbOperation or one of its child classes).
        
After the operation is created using NdbTransaction::getNdbOperation() or NdbTransaction::getNdbIndexOperation(), it is defined in the following three steps:
                Specify the standard operation type using
                NdbOperation::readTuple().
              
                Specify search conditions using
                NdbOperation::equal().
              
                Specify attribute actions using
                NdbOperation::getValue().
              
Here are two brief examples illustrating this process. For the sake of brevity, we omit error handling.
            This first example uses an NdbOperation:
          
// 1. Retrieve table object
myTable= myDict->getTable("MYTABLENAME");
// 2. Create an NdbOperation on this table
myOperation= myTransaction->getNdbOperation(myTable);
// 3. Define the operation's type and lock mode
myOperation->readTuple(NdbOperation::LM_Read);
// 4. Specify search conditions
myOperation->equal("ATTR1", i);
// 5. Perform attribute retrieval
myRecAttr= myOperation->getValue("ATTR2", NULL);
For additional examples of this sort, see Section 6.1, “Using Synchronous Transactions”.
            The second example uses an
            NdbIndexOperation:
          
// 1. Retrieve index object
myIndex= myDict->getIndex("MYINDEX", "MYTABLENAME");
// 2. Create an NdbIndexOperation on this index
myOperation= myTransaction->getNdbIndexOperation(myIndex);
// 3. Define type of operation and lock mode
myOperation->readTuple(NdbOperation::LM_Read);
// 4. Specify Search Conditions
myOperation->equal("ATTR1", i);
// 5. Attribute Actions 
myRecAttr = myOperation->getValue("ATTR2", NULL);
Another example of this second type can be found in Section 6.4, “Using Secondary Indexes in Scans”.
We now discuss in somewhat greater detail each step involved in the creation and use of synchronous transactions.
Define single row operation type. The following operation types are supported:
                    NdbOperation::insertTuple():
                    Inserts a non-existing tuple.
                  
                    NdbOperation::writeTuple():
                    Updates a tuple if one exists, otherwise inserts a
                    new tuple.
                  
                    NdbOperation::updateTuple():
                    Updates an existing tuple.
                  
                    NdbOperation::deleteTuple():
                    Deletes an existing tuple.
                  
                    NdbOperation::readTuple(): Reads
                    an existing tuple using the specified lock mode.
                  
                All of these operations operate on the unique tuple key.
                When NdbIndexOperation is used, then
                each of these operations operates on a defined unique
                hash index.
              
                  If you want to define multiple operations within the
                  same transaction, then you need to call
                  NdbTransaction::getNdbOperation()
                  or
                  NdbTransaction::getNdbIndexOperation()
                  for each operation.
                
Specify Search Conditions. 
                  The search condition is used to select tuples. Search
                  conditions are set using
                  NdbOperation::equal().
                
Specify Attribute Actions. Next, it is necessary to determine which attributes should be read or updated. It is important to remember that:
Deletes can neither read nor set values, but only delete them.
Reads can only read values.
Updates can only set values. Normally the attribute is identified by name, but it is also possible to use the attribute's identity to determine the attribute.
                NdbOperation::getValue() returns an
                NdbRecAttr object containing the
                value as read. To obtain the actual value, one of two
                methods can be used; the application can either
              
                    Use its own memory (passed through a pointer
                    aValue) to
                    NdbOperation::getValue(), or
                  
                    receive the attribute value in an
                    NdbRecAttr object allocated by
                    the NDB API.
                  
                The NdbRecAttr object is released
                when Ndb::closeTransaction() is
                called. For this reason, the application cannot
                reference this object following any subsequent call to
                Ndb::closeTransaction(). Attempting
                to read data from an NdbRecAttr
                object before calling
                NdbTransaction::execute() yields an
                undefined result.
              
            Scans are roughly the equivalent of SQL cursors, providing a
            means to perform high-speed row processing. A scan can be
            performed on either a table (using an
            NdbScanOperation) or an ordered index (by
            means of an NdbIndexScanOperation).
          
Scan operations have the following characteristics:
They can perform read operations which may be shared, exclusive, or dirty.
They can potentially work with multiple rows.
They can be used to update or delete multiple rows.
They can operate on several nodes in parallel.
            After the operation is created using
            NdbTransaction::getNdbScanOperation() or
            NdbTransaction::getNdbIndexScanOperation(),
            it is carried out as follows:
          
                Define the standard operation type, using
                NdbScanOperation::readTuples().
              
                Specify search conditions, using
                NdbScanFilter,
                NdbIndexScanOperation::setBound(), or
                both.
              
                Specify attribute actions using
                NdbOperation::getValue().
              
                Execute the transaction using
                NdbTransaction::execute().
              
                Traverse the result set by means of successive calls to
                NdbScanOperation::nextResult().
              
Here are two brief examples illustrating this process. Once again, in order to keep things relatively short and simple, we forego any error handling.
            This first example performs a table scan using an
            NdbScanOperation:
          
// 1. Retrieve a table object
myTable= myDict->getTable("MYTABLENAME");
// 2. Create a scan operation (NdbScanOperation) on this table
myOperation= myTransaction->getNdbScanOperation(myTable);
// 3. Define the operation's type and lock mode
myOperation->readTuples(NdbOperation::LM_Read);
// 4. Specify search conditions
NdbScanFilter sf(myOperation);
sf.begin(NdbScanFilter::OR);
sf.eq(0, i);   // Return rows with column 0 equal to i or
sf.eq(1, i+1); // column 1 equal to (i+1)
sf.end();
// 5. Retrieve attributes
myRecAttr= myOperation->getValue("ATTR2", NULL);
The second example uses an NdbIndexScanOperation to perform an index scan:
// 1. Retrieve index object
myIndex= myDict->getIndex("MYORDEREDINDEX", "MYTABLENAME");
// 2. Create an operation (NdbIndexScanOperation object)
myOperation= myTransaction->getNdbIndexScanOperation(myIndex);
// 3. Define type of operation and lock mode
myOperation->readTuples(NdbOperation::LM_Read);
// 4. Specify search conditions
// All rows with ATTR1 between i and (i+1)
myOperation->setBound("ATTR1", NdbIndexScanOperation::BoundGE, i);
myOperation->setBound("ATTR1", NdbIndexScanOperation::BoundLE, i+1);
// 5. Retrieve attributes 
myRecAttr = myOperation->getValue("ATTR2", NULL);
Some additional discussion of each step required to perform a scan follows:
Define Scan Operation Type. 
                  It is important to remember that only a single
                  operation is supported for each scan operation
                  (NdbScanOperation::readTuples() or
                  NdbIndexScanOperation::readTuples()).
                
                  If you want to define multiple scan operations within
                  the same transaction, then you need to call
                  NdbTransaction::getNdbScanOperation()
                  or
                  NdbTransaction::getNdbIndexScanOperation()
                  separately for each operation.
                
Specify Search Conditions. 
                  The search condition is used to select tuples. If no
                  search condition is specified, the scan will return
                  all rows in the table. The search condition can be an
                  NdbScanFilter (which can be used on
                  both NdbScanOperation and
                  NdbIndexScanOperation) or bounds
                  (which can be used only on index scans - see
                  NdbIndexScanOperation::setBound()).
                  An index scan can use both
                  NdbScanFilter and bounds.
                
When NdbScanFilter is used, each row is examined, whether or not it is actually returned. However, when bounds are used, only rows within the bounds are examined.
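That cost difference can be illustrated with a toy model. This is not NDB internals; the counters simply mirror the examined-versus-returned distinction described above:

```cpp
#include <algorithm>
#include <vector>

struct ScanCost { int examined; int returned; };

// Filter-style scan: every row is examined; the predicate decides what is
// returned (here, values in [lo, hi]).
ScanCost scanWithFilter(const std::vector<int> &rows, int lo, int hi) {
    ScanCost c{0, 0};
    for (int v : rows) {
        ++c.examined;
        if (v >= lo && v <= hi) ++c.returned;
    }
    return c;
}

// Bound-style scan over an ordered index (sorted input): only rows inside
// [lo, hi] are ever touched.
ScanCost scanWithBounds(const std::vector<int> &index, int lo, int hi) {
    auto first = std::lower_bound(index.begin(), index.end(), lo);
    auto last  = std::upper_bound(index.begin(), index.end(), hi);
    int n = static_cast<int>(last - first);
    return ScanCost{n, n};
}
```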
Specify Attribute Actions. 
                  Next, it is necessary to define which attributes
                  should be read. As with transaction attributes, scan
                  attributes are defined by name, but it is also
                  possible to use the attributes' identities to define
                  attributes. As discussed elsewhere in this
                  document (see
                  Section 1.3.2.2, “Synchronous Transactions”),
                  the value read is returned by the
                  NdbOperation::getValue() method as
                  an NdbRecAttr object.
                
Scanning can also be used to update or delete rows. This is performed by
                Scanning with exclusive locks using
                NdbOperation::LM_Exclusive.
              
                (When iterating through the result
                set:) For each row, optionally calling either
                NdbScanOperation::updateCurrentTuple()
                or
                NdbScanOperation::deleteCurrentTuple().
              
                (If performing
                NdbScanOperation::updateCurrentTuple():)
                Setting new values for records simply by using
                NdbOperation::setValue().
                NdbOperation::equal() should not be
                called in such cases, as the primary key is retrieved
                from the scan.
              
              The update or delete is not actually performed until the
              next call to NdbTransaction::execute()
              is made, just as with single row operations.
              NdbTransaction::execute() also must be
              called before any locks are released; for more
              information, see
              Section 1.3.2.3.4, “Lock Handling with Scans”.
            
Features Specific to Index Scans. 
              When performing an index scan, it is possible to scan only
              a subset of a table using
              NdbIndexScanOperation::setBound(). In
              addition, result sets can be sorted in either ascending or
              descending order, using
              NdbIndexScanOperation::readTuples().
              Note that rows are returned unordered by default unless
              sorted is set to
              true. It is also important to note
              that, when using
              NdbIndexScanOperation::BoundEQ() on a
              partition key, only fragments containing rows will
              actually be scanned. Finally, when performing a sorted
              scan, any value passed as the
              NdbIndexScanOperation::readTuples()
              method's parallel argument will
              be ignored and maximum parallelism will be used instead.
              In other words, all fragments which it is possible to scan
              are scanned simultaneously and in parallel in such cases.
            
            Performing scans on either a table or an index has the
            potential to return a great many records; however, Ndb locks
            only a predetermined number of rows per fragment at a time.
            The number of rows locked per fragment is controlled by the
            batch parameter passed to
            NdbScanOperation::readTuples().
          
            In order to allow the application to handle how locks are
            released, NdbScanOperation::nextResult()
            has a Boolean parameter
            fetchAllowed. If
            NdbScanOperation::nextResult() is called
            with fetchAllowed equal to
            false, then no locks may be released as
            result of the function call. Otherwise the locks for the
            current batch may be released.
          
This next example shows a scan delete that handles locks in an efficient manner. For the sake of brevity, we omit error-handling.
     
int check;
// Outer loop for each batch of rows
while((check = MyScanOperation->nextResult(true)) == 0)
{
  do
  {
    // Inner loop for each row within the batch
    MyScanOperation->deleteCurrentTuple();
  } 
  while((check = MyScanOperation->nextResult(false)) == 0);
  // When there are no more rows in the batch, execute all defined deletes       
  MyTransaction->execute(NdbTransaction::NoCommit);
}
For a more complete example of a scan, see Section 6.3, “Basic Scanning Example”.
            Errors can occur either when operations making up a
            transaction are being defined, or when the transaction is
            actually being executed. Catching and handling either sort
            of error requires testing the value returned by
            NdbTransaction::execute(), and then, if
            an error is indicated (that is, if this value is equal to
            -1), using the following two methods in
            order to identify the error's type and location:
          
                NdbTransaction::getNdbErrorOperation()
                returns a reference to the operation causing the most
                recent error.
              
                NdbTransaction::getNdbErrorLine()
                yields the method number of the erroneous method in the
                operation, starting with 1.
              
This short example illustrates how to detect an error and to use these two methods to identify it:
theTransaction = theNdb->startTransaction();
theOperation = theTransaction->getNdbOperation("TEST_TABLE");
if(theOperation == NULL)
  goto error;
theOperation->readTuple(NdbOperation::LM_Read);
theOperation->setValue("ATTR_1", at1);
theOperation->setValue("ATTR_2", at1);  //  Error occurs here
theOperation->setValue("ATTR_3", at1);
theOperation->setValue("ATTR_4", at1);
if(theTransaction->execute(NdbTransaction::Commit) == -1) 
{
  errorLine = theTransaction->getNdbErrorLine();
  errorOperation = theTransaction->getNdbErrorOperation();
}
            Here, errorLine is 3,
            as the error occurred in the third method called on the
            NdbOperation object (in this case,
            theOperation). If the result of
            NdbTransaction::getNdbErrorLine() is
            0, then the error occurred when the
            operations were executed. In this example,
            errorOperation is a pointer to the object
            theOperation. The
            NdbTransaction::getNdbError() method
            returns an NdbError object providing
            information about the error.
          
              Transactions are not automatically
              closed when an error occurs. You must call
              Ndb::closeTransaction() or
              NdbTransaction::close() to close the
              transaction.
            
              See Section 3.1.1.8, “Ndb::closeTransaction()”, and
              Section 3.9.2.7, “NdbTransaction::close()”.
            
One recommended way to handle a transaction failure (that is, when an error is reported) is as shown here:
                Roll back the transaction by calling
                NdbTransaction::execute() with a
                special ExecType value for the
                type parameter.
              
                See Section 3.9.2.5, “NdbTransaction::execute()” and
                Section 3.9.1.3, “The NdbTransaction::ExecType Type”, for
                more information about how this is done.
              
                Close the transaction by calling
                NdbTransaction::close() or
                Ndb::closeTransaction().
If the error was temporary, attempt to restart the transaction.
            Several errors can occur when a transaction contains
            multiple operations which are simultaneously executed. In
            this case the application must go through all operations and
            query each of their NdbError objects to
            find out what really happened.
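The inspection loop can be sketched with mock types; MockOperation and MockError below are illustrative stand-ins for NdbOperation and NdbError, not the real classes:

```cpp
#include <string>
#include <vector>

// Illustrative stand-ins: an error code of 0 means the operation succeeded.
struct MockError { int code; std::string message; };
struct MockOperation { MockError error; };

// After a batched execute(), walk all operations and collect the indexes of
// those whose error object reports a failure.
std::vector<int> failedOperations(const std::vector<MockOperation> &ops) {
    std::vector<int> failed;
    for (int i = 0; i < static_cast<int>(ops.size()); ++i)
        if (ops[i].error.code != 0)
            failed.push_back(i);
    return failed;
}
```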
          
              Errors can occur even when a commit is reported as
              successful. In order to handle such situations, the NDB
              API provides an additional
              NdbTransaction::commitStatus() method
              to check the transaction's commit status.
            
Abstract
This section covers the NDB Kernel, and discusses MySQL Cluster transaction handling and transaction coordinators. It also describes NDB record structures and concurrency issues.
The NDB Kernel is the collection of data nodes belonging to a MySQL Cluster. The application programmer can, for most purposes, view the set of all data nodes as a single entity. Each data node is made up of three main components: the transaction coordinator (TC), the access manager (ACC), and the tuple manager (TUP), described in the glossary above.
When an application executes a transaction, it connects to one transaction coordinator on one data node. Usually, the programmer does not need to specify which TC should be used, but in some cases where performance is important, the programmer can provide “hints” to use a certain TC. (If the node with the desired transaction coordinator is down, then another TC will automatically take its place.)
Each data node has an ACC and a TUP which store the indexes and data portions of the database table fragment. Even though a single TC is responsible for the transaction, several ACCs and TUPs on other data nodes might be involved in that transaction's execution.
The default method is to select the transaction coordinator (TC) determined to be the "nearest" data node, using a heuristic for proximity based on the type of transporter connection. In order of nearest to most distant, these are:
SCI
SHM
TCP/IP (localhost)
TCP/IP (remote host)
If there are several connections available with the same proximity, one is selected for each transaction in a round-robin fashion. Optionally, you may set the method for TC selection to round-robin mode, where each new set of transactions is placed on the next data node. The pool of connections from which this selection is made consists of all available connections.
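The selection policy just described can be sketched as follows. This is an illustrative model, not NDB API code: the TcSelector class and numeric proximity ranks are invented for the example, though the ordering (SCI, SHM, TCP/IP localhost, TCP/IP remote) follows the list above.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// One candidate connection to a data node's transaction coordinator.
struct Connection {
    std::string node;
    int proximity;  // 0 = SCI, 1 = SHM, 2 = TCP/IP localhost, 3 = TCP/IP remote
};

// Pick the nearest connection; break ties among equally-near candidates
// in round-robin fashion, one pick per transaction.
class TcSelector {
    std::vector<Connection> pool_;
    std::size_t next_ = 0;
public:
    explicit TcSelector(std::vector<Connection> pool) : pool_(std::move(pool)) {}

    const Connection& pick() {
        int best = pool_[0].proximity;
        for (const Connection& c : pool_)
            best = std::min(best, c.proximity);
        std::vector<const Connection*> nearest;
        for (const Connection& c : pool_)
            if (c.proximity == best) nearest.push_back(&c);
        return *nearest[next_++ % nearest.size()];
    }
};
```

Setting the policy to pure round-robin mode, as described above, would correspond to skipping the proximity filter and rotating over the entire pool.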
          As noted in Section 1.3.3, “Review of MySQL Cluster Concepts”, the
          application programmer can provide hints to the NDB API as to
          which transaction coordinator should be used. This is done by
          providing a table and a partition key (usually the primary
          key). If the primary key is used as the partition key, then the
          transaction is placed on the node where the primary replica of
          that record resides. Note that this is only a hint; the system
          can be reconfigured at any time, in which case the NDB API
          chooses a transaction coordinator without using the hint. For
          more information, see
          Section 3.4.2.2.16, “Column::getPartitionKey()”, and
          Section 3.1.1.7, “Ndb::startTransaction()”. The application
          programmer can specify the partition key from SQL by using
          this construct:
        
CREATE TABLE ... ENGINE=NDB PARTITION BY KEY (attribute_list);
          For additional information, see
          Partitioning
          and in particular
          KEY
          Partitioning in the MySQL Manual.
        
          The NDB Cluster storage engine used by
          MySQL Cluster is a relational database engine storing records
          in tables as with other relational database systems. Table
          rows represent records as tuples of relational data. When a
          new table is created, its attribute schema is specified for
          the table as a whole, and thus each table row has the same
          structure. Again, this is typical of relational databases, and
          NDB is no different in this regard.
        
Primary Keys. Each record has from 1 up to 32 attributes which belong to the primary key of the table.
Transactions. Transactions are committed first to main memory, and then to disk, after a global checkpoint (GCP) is issued. Since all data are (in most NDB Cluster configurations) synchronously replicated and stored on multiple data nodes, the system can handle processor failures without loss of data. However, in the case of a system-wide failure, all transactions (committed or not) occurring since the most recent GCP are lost.
Concurrency Control. 
            NDB Cluster uses pessimistic
            concurrency control based on locking. If a
            requested lock (implicit, and dependent on the database
            operation) cannot be acquired within a specified time, then
            a timeout error results.
          
Concurrent transactions as requested by parallel application programs and thread-based applications can sometimes deadlock when they try to access the same information simultaneously. Thus, applications need to be written in a manner such that timeout errors occurring due to such deadlocks are handled gracefully. This generally means that the transaction encountering a timeout should be rolled back and restarted.
Hints and Performance. Placing the transaction coordinator in close proximity to the actual data used in the transaction can in many cases improve performance significantly. This is particularly true for systems using TCP/IP. For example, a Solaris system using a single 500 MHz processor has a cost model for TCP/IP communication which can be represented by the formula
[30 microseconds] + ([100 nanoseconds] * [number of bytes])
This means that if we can ensure that we use “popular” links we increase buffering and thus drastically reduce the costs of communication. The same system using SCI has a different cost model:
[5 microseconds] + ([10 nanoseconds] * [number of bytes])
This means that the efficiency of an SCI system is much less dependent on selection of transaction coordinators. Typically, TCP/IP systems spend 30 to 60% of their working time on communication, whereas for SCI systems this figure is in the range of 5 to 10%. Thus, employing SCI for data transport means that less effort from the NDB API programmer is required and greater scalability can be achieved, even for applications using data from many different parts of the database.
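The two cost formulas above can be expressed as a small helper for comparison; the constants are taken directly from the text, with all results in nanoseconds.

```cpp
#include <cassert>

// Communication cost models from the text, in nanoseconds:
//   TCP/IP: 30 microseconds setup + 100 nanoseconds per byte
//   SCI:     5 microseconds setup +  10 nanoseconds per byte
long tcp_cost_ns(long bytes) { return 30000 + 100 * bytes; }
long sci_cost_ns(long bytes) { return  5000 +  10 * bytes; }
```

For a 1000-byte message, these models give 130 microseconds over TCP/IP versus 15 microseconds over SCI, which illustrates why coordinator placement matters so much more on TCP/IP systems.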
          A simple example would be an application that uses many simple
          updates, where each transaction needs to update one record. This
          record has a 32-bit primary key, which also serves as the
          partitioning key. Then keyData is set to
          the address of the integer primary key and
          keyLen is 4.
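The hint parameters in this example can be sketched as shown below. The TransactionHint struct and make_hint helper are invented for illustration; a real application would pass the resulting pointer and length to Ndb::startTransaction().

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical holder for the hint parameters described above:
// keyData points at the partition key value, keyLen is its size in bytes.
struct TransactionHint {
    const char*   keyData;
    std::uint32_t keyLen;
};

// For a 32-bit integer primary key used as the partition key,
// keyData is simply the address of the integer and keyLen is 4.
TransactionHint make_hint(const std::uint32_t& primaryKey) {
    return { reinterpret_cast<const char*>(&primaryKey),
             sizeof primaryKey };
}
```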
        
Abstract
Discusses the mechanics of transaction handling and transmission in MySQL Cluster and the NDB API, and the objects used to implement these.
        When transactions are sent using
        NdbTransaction::execute(), they are not
        immediately transferred to the NDB Kernel. Instead, transactions
        are kept in a special send list (buffer) in the
        Ndb object to which they belong. The adaptive
        send algorithm decides when transactions should actually be
        transferred to the NDB kernel.
      
        The NDB API is designed as a multi-threaded interface, and so it
        is often desirable to transfer database operations from more
        than one thread at a time. The NDB API keeps track of which
        Ndb objects are active in transferring
        information to the NDB kernel and the expected number of threads
        to interact with the NDB kernel. Note that a given instance of
        Ndb should be used in at most one thread;
        different threads should not share the same
        Ndb object.
      
        There are four conditions leading to the transfer of database
        operations from Ndb object buffers to the NDB
        kernel:
      
            The NDB Transporter (TCP/IP, SCI, or shared memory) decides
            that a buffer is full and sends it off. The buffer size is
            implementation-dependent and may change between MySQL
            Cluster releases. When TCP/IP is the transporter, the buffer
            size is usually around 64 KB. Since each
            Ndb object provides a single buffer per
            data node, the notion of a “full” buffer is
            local to each data node.
          
The accumulation of statistical data on transferred information may force sending of buffers to all storage nodes (that is, when all the buffers become full).
Every 10 ms, a special transmission thread checks whether or not any send activity has occurred. If not, then the thread will force transmission to all nodes. This means that 20 ms is the maximum amount of time that database operations are kept waiting before being dispatched. A 10-millisecond limit is likely in future releases of MySQL Cluster; checks more frequent than this require additional support from the operating system.
            For methods that are affected by the adaptive send algorithm
            (such as NdbTransaction::execute()),
            there is a force parameter that
            overrides its default behaviour in this regard and forces
            immediate transmission to all nodes. See the individual NDB
            API class listings for more information.
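Two of the conditions above, the full per-node buffer and the periodic no-activity check, can be modeled with a short sketch. This is an illustrative model only, not NDB API code: the SendBuffer class is invented, and the buffer capacity is an arbitrary test value rather than the ~64 KB mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Model of two flush triggers: the buffer is sent when it fills up, and
// a periodic check (the 10 ms transmission thread in the text) forces a
// send if no send activity has occurred since the previous check.
class SendBuffer {
    std::vector<char> buf_;
    std::size_t capacity_;
    bool sent_since_check_ = false;
public:
    explicit SendBuffer(std::size_t capacity) : capacity_(capacity) {}

    // Returns true if enqueuing this operation caused a flush.
    bool enqueue(const char* data, std::size_t len) {
        buf_.insert(buf_.end(), data, data + len);
        if (buf_.size() >= capacity_) { flush(); return true; }
        return false;
    }

    // Simulated periodic check: force a send only when nothing has been
    // sent since the last check and operations are still pending.
    bool periodic_check() {
        bool forced = !sent_since_check_ && !buf_.empty();
        if (forced) flush();
        sent_since_check_ = false;
        return forced;
    }

    void flush() { buf_.clear(); sent_since_check_ = true; }
};
```

The worst-case wait noted above (roughly two check intervals) arises because an operation enqueued just after one check is only force-sent at the following one.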
          
The conditions listed above are subject to change in future releases of MySQL Cluster.