37.2 Data Encoding

The key goals of the Ice data encoding are simplicity and efficiency. In keeping with these principles, the encoding does not align primitive types on word boundaries and therefore eliminates the wasted space and additional complexity that alignment requires. The Ice data encoding simply produces a stream of contiguous bytes; data contains no padding bytes and need not be aligned on word boundaries.

Data is always encoded using little-endian byte order for numeric types. (Most machines use a little-endian byte order, so the Ice data encoding is “right” more often than not.) Ice does not use a “receiver makes it right” scheme because of the additional complexity this would introduce. Consider, for example, a chain of receivers that merely forward data along the chain until that data arrives at an ultimate receiver. (Such topologies are common for event distribution services.) The Ice protocol permits all the intermediates to forward the data without requiring it to be unmarshaled: the intermediates can forward requests by simply copying blocks of binary data. With a “receiver makes it right” scheme, the intermediates would have to unmarshal and remarshal the data whenever the byte order of the next receiver in the chain differs from the byte order of the sender, which is inefficient.

Ice requires clients and servers that run on big-endian machines to incur the extra cost of byte swapping data into little-endian layout, but that cost is insignificant compared to the overall cost of sending or receiving a request.

37.2.1 Sizes

Many of the types involved in the data encoding, as well as several protocol message components, have an associated size or count. A size is a non-negative number. Sizes and counts are encoded in one of two ways:

1. If the number of elements is less than 255, the size is encoded as a single byte indicating the number of elements.

2. If the number of elements is greater than or equal to 255, the size is encoded as a byte with value 255, followed by an int indicating the number of elements.

Using this encoding to indicate sizes is significantly cheaper than always using an int to store the size, especially when marshaling sequences of short strings: counts of up to 254 require only a single byte instead of four. This comes at the expense of counts greater than 254, which require five bytes instead of four. However, for sequences or strings of length greater than 254, the extra byte is insignificant.
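The two size forms can be sketched in a few lines of Python. This is only an illustration of the rules above, not the Ice implementation; the function names `encode_size` and `decode_size` are hypothetical.

```python
import struct

def encode_size(n):
    """Encode a non-negative count in the one-byte or five-byte form."""
    if n < 0:
        raise ValueError("a size must be non-negative")
    if n < 255:
        return bytes([n])                     # counts 0..254: a single byte
    return b"\xff" + struct.pack("<i", n)     # marker byte 255, then a little-endian int

def decode_size(buf, pos=0):
    """Return (count, position of the first byte after the encoded size)."""
    first = buf[pos]
    if first < 255:
        return first, pos + 1
    (n,) = struct.unpack_from("<i", buf, pos + 1)
    return n, pos + 5

print(encode_size(254).hex())   # fe
print(encode_size(255).hex())   # ffff000000
```

Note that a count of exactly 255 already needs the five-byte form, because the byte value 255 is reserved as the marker.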

37.2.2 Encapsulations

An encapsulation is used to contain variable-length data that an intermediate receiver may not be able to decode, but that the receiver can forward to another recipient for eventual decoding. An encapsulation is encoded as if it were the following structure:

struct Encapsulation {
    int size;
    byte major;
    byte minor;
    // [... size - 6 bytes ...]
};

The size member specifies the size of the encapsulation in bytes (including the size, major, and minor fields). The major and minor fields specify the encoding version of the data contained in the encapsulation (see Section 37.5.2). The version information is followed by size‑6 bytes of encoded data.

All the data in an encapsulation is context-free, that is, nothing inside an encapsulation can refer to anything outside the encapsulation. This property allows encapsulations to be forwarded among address spaces as a blob of data.

Encapsulations can be nested, that is, contain other encapsulations.

An encapsulation can be empty, in which case its byte count is 6.
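As a rough sketch of the layout described in this section, the following Python functions wrap and unwrap an encapsulation. The function names are hypothetical, and the default version bytes (1 and 0) are placeholder values chosen for the example, not values taken from this section.

```python
import struct

def encode_encapsulation(payload, major=1, minor=0):
    # The size counts the 4-byte size field, the two version bytes, and the payload.
    size = 6 + len(payload)
    return struct.pack("<iBB", size, major, minor) + payload

def decode_encapsulation(buf, pos=0):
    """Return ((major, minor), payload, position after the encapsulation)."""
    size, major, minor = struct.unpack_from("<iBB", buf, pos)
    if size < 6:
        raise ValueError("an encapsulation is at least 6 bytes long")
    payload = buf[pos + 6:pos + size]
    return (major, minor), payload, pos + size

empty = encode_encapsulation(b"")
nested = encode_encapsulation(encode_encapsulation(b"abc"))
print(len(empty), len(nested))   # 6 15
```

Because the size is known up front, an intermediary can copy `size` bytes verbatim without interpreting the payload, which is what makes forwarding cheap.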

37.2.3 Slices

Exceptions and classes are subject to slicing if the receiver of a value only partially understands the received value (that is, only has knowledge of a base type, but not of the actual run-time derived type). To allow the receiver of an exception or class to ignore those parts of a value that it does not understand, exception and class values are marshaled as a sequence of slices (one slice for each level of the inheritance hierarchy). A slice is a byte count encoded as a fixed-length four-byte integer, followed by the data for the slice. (The byte count includes the four bytes occupied by the count itself, so an empty slice has a byte count of four and no data.) The receiver of a value can skip over a slice by reading the byte count b, and then discarding the next b−4 bytes in the input stream.
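The skipping rule can be illustrated with a short Python sketch. The helper names `make_slice` and `skip_slice` are hypothetical; the only rule they rely on is that the four-byte count includes itself.

```python
import struct

def make_slice(data):
    # The byte count covers its own four bytes plus the slice data.
    return struct.pack("<i", len(data) + 4) + data

def skip_slice(buf, pos):
    """Given the position of a slice's byte count, return the position after the slice."""
    (count,) = struct.unpack_from("<i", buf, pos)
    if count < 4:
        raise ValueError("a slice byte count is at least 4")
    return pos + 4 + (count - 4)   # read the count, then discard count - 4 bytes

stream = make_slice(b"xyz") + make_slice(b"")
print(skip_slice(stream, 0))   # 7
print(skip_slice(stream, 7))   # 11
```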

37.2.4 Basic Types

The basic types are encoded as shown in Table 37.1. Integer types (short, int, long) are represented as two’s complement numbers, and floating point types (float, double) use the IEEE standard formats [6]. All numeric types use a little-endian byte order.

Table 37.1. Encoding for basic types.

Type      Encoding
bool      A single byte with value 1 for true, 0 for false
byte      An uninterpreted byte
short     Two bytes (LSB, MSB)
int       Four bytes (LSB .. MSB)
long      Eight bytes (LSB .. MSB)
float     Four bytes (23-bit fractional mantissa, 8-bit exponent, sign bit)
double    Eight bytes (52-bit fractional mantissa, 11-bit exponent, sign bit)
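For readers who want to experiment, Python's `struct` module can reproduce these encodings. The mapping below is a sketch: the dictionary `FORMATS` and the function `encode_basic` are names invented for this example. The `<` prefix selects little-endian byte order with no alignment padding, matching the rules above.

```python
import struct

# Slice basic type -> struct format ('<' = little-endian, standard sizes, no padding).
FORMATS = {
    "bool": "<?",     # 1 byte: 1 for true, 0 for false
    "byte": "<B",     # 1 uninterpreted byte
    "short": "<h",    # 2 bytes, two's complement
    "int": "<i",      # 4 bytes, two's complement
    "long": "<q",     # 8 bytes, two's complement
    "float": "<f",    # 4 bytes, IEEE single precision
    "double": "<d",   # 8 bytes, IEEE double precision
}

def encode_basic(type_name, value):
    return struct.pack(FORMATS[type_name], value)

print(encode_basic("short", 1).hex())    # 0100
print(encode_basic("int", -1).hex())     # ffffffff
print(encode_basic("float", 1.0).hex())  # 0000803f
```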

37.2.5 Strings

Strings are encoded as a size (see Section 37.2.1), followed by the string contents in UTF‑8 format [23]. Strings are not NUL-terminated. An empty string is encoded with a size of zero.

37.2.6 Sequences

Sequences are encoded as a size (see Section 37.2.1) representing the number of elements in the sequence, followed by the elements encoded as specified for their type.

37.2.7 Dictionaries

Dictionaries are encoded as a size (see Section 37.2.1) representing the number of key–value pairs in the dictionary, followed by the pairs. Each key–value pair is encoded as if it were a struct containing the key and value as members, in that order.
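Strings, sequences, and dictionaries all share the same pattern: a size followed by the contents. The following Python sketch shows the three encodings side by side. All function names are hypothetical, and the element encoders are passed in as callables purely for brevity.

```python
import struct

def encode_size(n):
    return bytes([n]) if n < 255 else b"\xff" + struct.pack("<i", n)

def encode_int(n):
    return struct.pack("<i", n)

def encode_string(s):
    data = s.encode("utf-8")
    return encode_size(len(data)) + data          # size counts UTF-8 bytes; no NUL terminator

def encode_sequence(items, encode_item):
    return encode_size(len(items)) + b"".join(encode_item(x) for x in items)

def encode_dictionary(entries, encode_key, encode_value):
    out = encode_size(len(entries))               # number of key-value pairs
    for key, value in entries.items():
        out += encode_key(key) + encode_value(value)   # key first, then value
    return out

print(encode_string("Hello"))                               # b'\x05Hello'
print(encode_sequence(["a", "bc"], encode_string))          # b'\x02\x01a\x02bc'
print(encode_dictionary({"k": 7}, encode_string, encode_int))
```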

37.2.8 Enumerators

Enumerated values are encoded depending on the number of enumerators:

• If the enumeration has 1 to 127 enumerators, the value is marshaled as a byte.

• If the enumeration has 128 to 32767 enumerators, the value is marshaled as a short.

• If the enumeration has more than 32767 enumerators, the value is marshaled as an int.

The value is the ordinal value of the corresponding enumerator, with the first enumerator value encoded as zero.
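A minimal Python sketch of this rule follows. The function name `encode_enum` is hypothetical; the thresholds are the ones listed above.

```python
import struct

def encode_enum(ordinal, enumerator_count):
    """Encode a zero-based ordinal; the width depends on how many enumerators the type has."""
    if not 0 <= ordinal < enumerator_count:
        raise ValueError("ordinal out of range for this enumeration")
    if enumerator_count <= 127:
        return struct.pack("<B", ordinal)    # marshaled as a byte
    if enumerator_count <= 32767:
        return struct.pack("<h", ordinal)    # marshaled as a short
    return struct.pack("<i", ordinal)        # marshaled as an int

# BinaryOp-style enumeration with 6 enumerators: Multiply is the third, so its ordinal is 2.
print(encode_enum(2, 6).hex())   # 02
```

The width is a property of the enumeration type, not of the value: ordinal 0 of a 128-enumerator type still occupies two bytes.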

37.2.9 Structures

The members of a structure are encoded in the order they appear in the struct declaration, as specified for their types.

37.2.10 Exceptions

Exceptions are marshaled as shown in Figure 37.1.

Figure 37.1. Marshaling format for exceptions.

Every exception instance is preceded by a single byte that indicates whether the exception uses class members: the byte value is 1 if any of the exception members are classes (or if any of the exception members, recursively, contain class members) and 0 otherwise.

Following the header byte, the exception is marshaled as a sequence of pairs: the first member of each pair is the type ID for an exception slice, and the second member of the pair is a slice containing the marshaled members of that slice. The sequence of pairs is marshaled in derived-to-base order, with the most-derived slice first, and ending with the least-derived slice. Within each slice, data members are marshaled as for structures: in the order in which they are defined in the Slice definition.

Following the sequence of pairs, any class instances that are used by the members of the exception are marshaled. This final part is optional: it is present only if the header byte is 1. (See Section 37.2.11 for a detailed explanation of how class instances are marshaled.)

To illustrate the marshaling, consider the following exception hierarchy:

exception Base {
    int baseInt;
    string baseString;
};

exception Derived extends Base {
    bool derivedBool;
    string derivedString;
    double derivedDouble;
};

Assume that the exception members are initialized to the values shown in Table 37.2.

Table 37.2. Member values of an exception of type Derived.

Member           Type      Value       Marshaled Size (in bytes)
baseInt          int       99          4
baseString       string    "Hello"     6
derivedBool      bool      true        1
derivedString    string    "World!"    7
derivedDouble    double    3.14        8

From Table 37.2, we can see that the total size of the members of Base is 10 bytes, and the total size of the members of Derived is 16 bytes. None of the exception members are classes. An instance of this exception has the on-the-wire representation shown in Table 37.3. (The size, type, and byte offset of the marshaled representation are indicated for each component.)

Table 37.3. Marshaled representation of the exception in Table 37.2.

Marshaled Value               Size in Bytes    Type      Byte Offset
0 (no class members)          1                bool      0
"::Derived" (type ID)         10               string    1
20 (byte count for slice)     4                int       11
1 (derivedBool)               1                bool      15
"World!" (derivedString)      7                string    16
3.14 (derivedDouble)          8                double    23
"::Base" (type ID)            7                string    31
14 (byte count for slice)     4                int       38
99 (baseInt)                  4                int       42
"Hello" (baseString)          6                string    46

Note that the size of each string is one larger than the actual string length. This is because each string is preceded by a count of its number of bytes, as explained in Section 37.2.5.

The receiver of this sequence of values uses the header byte to decide whether it eventually must unmarshal any class instances contained in the exception (none in this example) and then examines the first type ID (::Derived). If the receiver recognizes that type ID, it can unmarshal the contents of the first slice, followed by the remaining slices; otherwise, the receiver reads the byte count that follows the unknown type (20) and then skips 20−4 bytes in the input stream, which is the start of the type ID for the second slice (::Base). If the receiver does not recognize that type ID either, it again reads the byte count following the type ID (14), skips 14−4 bytes, and attempts to read another type ID. (This can happen only if client and server have been compiled with mismatched Slice definitions that disagree in the exception specification of an operation.) In this case, the receiver will eventually encounter an unmarshaling error, which it can report with a MarshalException.
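The layout in Table 37.3 and the skipping procedure just described can be checked with a short Python sketch. The helper names are hypothetical, and the string encoder handles only strings shorter than 255 bytes, which is sufficient for this example.

```python
import struct

def encode_string(s):
    data = s.encode("utf-8")
    assert len(data) < 255            # sketch: single-byte size form only
    return bytes([len(data)]) + data

def encode_slice(data):
    return struct.pack("<i", len(data) + 4) + data   # count includes its own 4 bytes

derived_members = struct.pack("<?", True) + encode_string("World!") + struct.pack("<d", 3.14)
base_members = struct.pack("<i", 99) + encode_string("Hello")

wire = (
    b"\x00"                                                  # header byte: no class members
    + encode_string("::Derived") + encode_slice(derived_members)
    + encode_string("::Base") + encode_slice(base_members)   # derived-to-base order
)

def skip_unknown_slice(buf, pos):
    """pos is the offset of a type ID; return the offset just past the slice that follows it."""
    pos += 1 + buf[pos]                                  # skip the type ID string
    (count,) = struct.unpack_from("<i", buf, pos)
    return pos + count                                   # 4 bytes of count + (count - 4) of data

print(len(wire))                     # 52
print(skip_unknown_slice(wire, 1))   # 31, the offset of the "::Base" type ID
print(skip_unknown_slice(wire, 31))  # 52, the end of the data
```

A receiver that recognizes neither type ID ends up at offset 52 with no input left, which is where the unmarshaling error mentioned above would surface.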

If an exception contains class members, these members are marshaled following the exception slices as described in the following section.

37.2.11 Classes

The marshaling for classes is complex, due to the need to deal with the pointer semantics for graphs of classes, as well as the need for the receiver to slice classes of unknown derived type. In addition, the marshaling for classes uses a type ID compression scheme to avoid repeatedly marshaling the same type IDs for large graphs of class instances.

Basic Marshaling Format

Classes are marshaled similarly to exceptions: each instance is divided into a number of pairs containing a type ID and a slice (one pair for each level of the inheritance hierarchy) and marshaled in derived-to-base order. Only data members are marshaled; no information is sent that would relate to operations. Unlike exceptions, no header byte precedes a class. Instead, each marshaled class instance is preceded by a positive (non-zero) integer that provides an identity for the instance. The sender assigns this identity during marshaling such that each marshaled instance has a different identity. The receiver uses that identity to correctly reconstruct graphs of classes. The overall marshaling format for classes is shown in Figure 37.2.

Figure 37.2. Marshaling format for classes.

Class Type IDs

Unlike for exception type IDs, class type IDs are not simple strings. Instead, a class type ID is marshaled as a boolean followed by either a string or a size, to conserve bandwidth. To illustrate this, consider the following class hierarchy:

class Base {
    // ...
};

class Derived extends Base {
    // ...
};

The type IDs for the class slices are ::Derived and ::Base. Suppose the sender marshals three instances of ::Derived as part of a single request. (For example, two instances could be out-parameters and one instance could be the return value.)

The first instance that is sent on the wire contains the type IDs ::Derived and ::Base preceding their respective slices. Because marshaling proceeds in derived-to-base order, the first type ID that is sent is ::Derived. Every time the sender sends a type ID that it has not sent previously in the same request, it sends the boolean value false, followed by the type ID. Internally, the sender also assigns a unique positive number to each type ID. These numbers start at 1 and increment by one for each type ID that has not been marshaled previously. This means that the first type ID is encoded as the boolean value false, followed by ::Derived, and the second type ID is encoded as the boolean value false, followed by ::Base.

When the sender marshals the remaining two instances, it consults a lookup table of previously-marshaled type IDs. Because both type IDs were sent previously in the same request (or reply), the sender encodes all further occurrences of ::Derived as the value true followed by the number 1 encoded as a size (see Section 37.2.1), and it encodes all further occurrences of ::Base as the value true followed by the number 2 encoded as a size.

When the receiver reads a type ID, it first reads its boolean marker:

• If the boolean is false, the receiver reads a string and enters that string into a lookup table that maps integers to strings. The first new class type ID received in a request is numbered 1, the second new class type ID is numbered 2, and so on.

• If the boolean value is true, the receiver reads a number encoded as a size and uses it to retrieve the corresponding class type ID from the lookup table.

Note that this numbering scheme is re-established for each new encapsulation. (As we will see in Section 37.3, parameters, return values, and exceptions are always marshaled inside an enclosing encapsulation.) For subsequent or nested encapsulations, the numbering scheme restarts, with the first new type ID being assigned the value 1. In other words, each encapsulation uses its own independent numbering scheme for class type IDs to satisfy the constraint that encapsulations must not depend on their surrounding context.

Encoding class type IDs in this way provides significant savings in bandwidth: whenever an ID is marshaled a second or subsequent time, it is marshaled as a two-byte value (assuming no more than 254 distinct type IDs per request) instead of as a string. Because type IDs can be long, especially if you are using nested modules, the savings are considerable.
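The sender-side and receiver-side lookup tables can be sketched as two small Python classes. The class names `TypeIdWriter` and `TypeIdReader` are hypothetical; one instance of each would be created per encapsulation so that the numbering restarts as described above.

```python
import struct

def encode_size(n):
    return bytes([n]) if n < 255 else b"\xff" + struct.pack("<i", n)

def decode_size(buf, pos):
    if buf[pos] < 255:
        return buf[pos], pos + 1
    return struct.unpack_from("<i", buf, pos + 1)[0], pos + 5

class TypeIdWriter:
    """Sender side: remembers which type IDs were already sent."""
    def __init__(self):
        self._numbers = {}                        # type ID -> number, starting at 1

    def encode(self, type_id):
        number = self._numbers.get(type_id)
        if number is not None:
            return b"\x01" + encode_size(number)  # true + previously assigned number
        self._numbers[type_id] = len(self._numbers) + 1
        data = type_id.encode("utf-8")
        return b"\x00" + encode_size(len(data)) + data   # false + full string

class TypeIdReader:
    """Receiver side: rebuilds the same numbering from the stream."""
    def __init__(self):
        self._ids = []                            # index i holds the type ID numbered i + 1

    def decode(self, buf, pos):
        marker, pos = buf[pos], pos + 1
        if marker == 0:
            length, pos = decode_size(buf, pos)
            type_id = buf[pos:pos + length].decode("utf-8")
            self._ids.append(type_id)
            return type_id, pos + length
        number, pos = decode_size(buf, pos)
        return self._ids[number - 1], pos

writer = TypeIdWriter()
print(len(writer.encode("::Derived")))   # 11: marker + 10-byte string
print(len(writer.encode("::Derived")))   # 2: marker + one-byte number
```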

Simple Class Marshaling Example

To make the preceding discussion more concrete, consider the following class definitions:

interface SomeInterface {
     void op1();
};

class Base {
    int baseInt;
    void op2();
    string baseString;
};

class Derived extends Base implements SomeInterface {
    bool derivedBool;
    string derivedString;
    void op3();
    double derivedDouble;
};

Note that Base and Derived have operations, and that Derived also implements the interface SomeInterface. Because marshaling of classes is concerned with state, not behavior, the operations op1, op2, and op3 are simply ignored during marshaling and the on-the-wire representation is as if the classes had been defined as follows:

class Base {
    int baseInt;
    string baseString;
};

class Derived extends Base {
    bool derivedBool;
    string derivedString;
    double derivedDouble;
};

Suppose the sender marshals two instances of Derived (for example, as two in-parameters in the same request). The member values are as shown in Table 37.4.

Table 37.4. Member values for two instances of class Derived.

Member           Type      Value       Marshaled Size (in bytes)

First instance
baseInt          int       99          4
baseString       string    "Hello"     6
derivedBool      bool      true        1
derivedString    string    "World!"    7
derivedDouble    double    3.14        8

Second instance
baseInt          int       115         4
baseString       string    "Cave"      5
derivedBool      bool      false       1
derivedString    string    "Canem"     6
derivedDouble    double    6.32        8

The sender arbitrarily assigns a non-zero identity (see “Basic Marshaling Format”) to each instance. Typically, the sender will simply consecutively number the instances starting at 1. For this example, assume that the two instances have the identities 1 and 2. The marshaled representation for the two instances (assuming that they are marshaled immediately following each other) is shown in Table 37.5.

Table 37.5. Marshaled representation of the two instances in Table 37.4.

Marshaled Value                     Size in Bytes    Type      Byte Offset
1 (identity)                        4                int       0
0 (marker for class type ID)        1                bool      4
"::Derived" (class type ID)         10               string    5
20 (byte count for slice)           4                int       15
1 (derivedBool)                     1                bool      19
"World!" (derivedString)            7                string    20
3.14 (derivedDouble)                8                double    27
0 (marker for class type ID)        1                bool      35
"::Base" (class type ID)            7                string    36
14 (byte count for slice)           4                int       43
99 (baseInt)                        4                int       47
"Hello" (baseString)                6                string    51
0 (marker for class type ID)        1                bool      57
"::Ice::Object" (class type ID)     14               string    58
5 (byte count for slice)            4                int       72
0 (number of dictionary entries)    1                size      76
2 (identity)                        4                int       77
1 (marker for class type ID)        1                bool      81
1 (class type ID)                   1                size      82
19 (byte count for slice)           4                int       83
0 (derivedBool)                     1                bool      87
"Canem" (derivedString)             6                string    88
6.32 (derivedDouble)                8                double    94
1 (marker for class type ID)        1                bool      102
2 (class type ID)                   1                size      103
13 (byte count for slice)           4                int       104
115 (baseInt)                       4                int       108
"Cave" (baseString)                 5                string    112
1 (marker for class type ID)        1                bool      117
3 (class type ID)                   1                size      118
5 (byte count for slice)            4                int       119
0 (number of dictionary entries)    1                size      123

Note that, because classes (like exceptions) are sent as a sequence of slices, the receiver of a class can slice off any derived parts of a class it does not understand. Also note that (as shown in Table 37.5) each class instance contains three slices. The third slice is for the type ::Ice::Object, which is the base type of all classes. The class type ID ::Ice::Object has the number 3 in this example because it is the third distinct type ID that is marshaled by the sender. (See entries at byte offsets 58 and 118 in Table 37.5.) All class instances have this final slice of type ::Ice::Object.

Marshaling a separate slice for ::Ice::Object dates back to Ice versions 1.3 and earlier. In those versions, classes carried a facet map that was marshaled as if it were defined as follows:

module Ice {
    class Object;

    dictionary<string, Object> FacetMap;

    class Object {
        FacetMap facets; // No longer exists
    };
};

As of Ice version 1.4, this facet map is always empty, that is, the count of entries for the dictionary that is marshaled in the ::Ice::Object slice is always zero. If a receiver receives a class instance with a non-empty facet map, it must throw a MarshalException.

Note that if a class has no data members, a type ID and slice for that class are still marshaled. The byte count of the slice will be 4 in this case, indicating that the slice contains no data.
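To tie the pieces of this example together, the following Python sketch rebuilds the byte stream of Table 37.5 from the member values in Table 37.4. It is an illustration under the rules stated in this section, not the Ice implementation; `marshal_derived` and the other helper names are hypothetical.

```python
import struct

def encode_size(n):
    return bytes([n]) if n < 255 else b"\xff" + struct.pack("<i", n)

def encode_string(s):
    data = s.encode("utf-8")
    return encode_size(len(data)) + data

def encode_slice(data):
    return struct.pack("<i", len(data) + 4) + data

def marshal_derived(identity, type_numbers, base_int, base_string,
                    derived_bool, derived_string, derived_double):
    def type_id(name):
        if name in type_numbers:                       # seen before: true + number
            return b"\x01" + encode_size(type_numbers[name])
        type_numbers[name] = len(type_numbers) + 1     # first use: false + string
        return b"\x00" + encode_string(name)

    out = struct.pack("<i", identity)
    out += type_id("::Derived") + encode_slice(
        struct.pack("<?", derived_bool) + encode_string(derived_string)
        + struct.pack("<d", derived_double))
    out += type_id("::Base") + encode_slice(
        struct.pack("<i", base_int) + encode_string(base_string))
    out += type_id("::Ice::Object") + encode_slice(encode_size(0))   # empty facet map
    return out

numbers = {}   # shared by both instances because they travel in the same encapsulation
first = marshal_derived(1, numbers, 99, "Hello", True, "World!", 3.14)
second = marshal_derived(2, numbers, 115, "Cave", False, "Canem", 6.32)
wire = first + second
print(len(first), len(second), len(wire))   # 77 47 124
```

The second instance is 30 bytes shorter than the first, even though its strings are only two bytes shorter in total; the remaining 28 bytes are saved by the type ID compression.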

Marshaling Pointers

Classes support pointer semantics, that is, you can construct graphs of classes. It follows that classes can arbitrarily point at each other. The class identity (see “Basic Marshaling Format”) is used to distinguish instances and pointers as follows:

• A class identity of 0 denotes a null pointer.

• A class identity > 0 precedes the marshaled contents of an instance.

• A class identity < 0 denotes a pointer to an instance.

Identity values less than zero are pointers. For example, if the receiver receives the identity −57, this means that the corresponding class member that is currently being unmarshaled will eventually point at the instance with identity 57.

For structures, classes, exceptions, sequences, and dictionary members that do not contain class members, the Ice protocol uses a simple depth-first traversal algorithm to marshal the members. For example, structure members are marshaled in the order of their Slice definition; if a structure member itself is of complex type, such as a sequence, the sequence is marshaled in toto where it appears inside its enclosing structure. For complex types that contain class members, this depth-first marshaling is suspended: instead of marshaling the actual class instance at this point, a negative identity is marshaled that indicates which class instance that member must eventually denote. For example, consider the following definitions:

class C {
    // ...
};

struct S {
    int i;
    C firstC;
    C secondC;
    C thirdC;
    int j;
};

Suppose we initialize a structure of type S as follows:

S myS;
myS.i = 99;
myS.firstC = new C;             // New instance
myS.secondC = 0;                // null
myS.thirdC = myS.firstC;        // Same instance as previously
myS.j = 100;

When this structure is marshaled, the contents of the three class members are not marshaled in‑line. Instead, the sender marshals the negative identities of the corresponding instances. Assuming that the sender has assigned the identity 78 to the instance assigned to myS.firstC, myS is marshaled as shown in Table 37.6.

Table 37.6. Marshaled representation of myS.

Marshaled Value      Size in Bytes    Type    Byte Offset
99 (myS.i)           4                int     0
-78 (myS.firstC)     4                int     4
0 (myS.secondC)      4                int     8
-78 (myS.thirdC)     4                int     12
100 (myS.j)          4                int     16

Note that myS.firstC and myS.thirdC both use the identity −78. This allows the receiver to recognize that firstC and thirdC point at the same class instance (rather than at two different instances that happen to have the same contents).
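The following Python sketch reproduces the 20 bytes of Table 37.6. The `IdentityTable` class and `marshal_S` function are hypothetical names; the starting identity of 78 is passed in only to match the value assumed in the text.

```python
import struct

class C:
    """Stand-in for an instance of the Slice class C."""

class IdentityTable:
    def __init__(self, first_identity=1):
        self._next = first_identity
        self._identities = {}     # id(instance) -> identity
        self.pending = []         # instances whose contents are still to be marshaled

    def pointer(self, instance):
        """Return the int that is marshaled in place of a class member."""
        if instance is None:
            return 0                              # null pointer
        key = id(instance)
        if key not in self._identities:           # first time this instance is seen
            self._identities[key] = self._next
            self._next += 1
            self.pending.append(instance)         # keeps the instance alive, too
        return -self._identities[key]             # negative identity = pointer

def marshal_S(table, i, first_c, second_c, third_c, j):
    return struct.pack("<5i", i, table.pointer(first_c), table.pointer(second_c),
                       table.pointer(third_c), j)

table = IdentityTable(first_identity=78)
c = C()
wire = marshal_S(table, 99, c, None, c, 100)
print(struct.unpack("<5i", wire))   # (99, -78, 0, -78, 100)
```

After this call, `table.pending` holds exactly one instance: the contents of the shared `C` object still have to be sent once, in a later pass.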

Marshaling the negative identities instead of the contents of an instance allows the receiver to accurately reconstruct the class graph that was sent by the sender. However, this raises the question of when the actual instances are to be marshaled as described at the beginning of this section. As we will see in Section 37.3, parameters and return values are marshaled as if they were members of a structure. For example, if an operation invocation has five input parameters, the client marshals the five parameters end-to-end as if they were members of a single structure. If any of the five parameters are class instances, or are of complex type (recursively) containing class instances, the sender marshals the parameters in multiple passes: the first pass marshals the parameters end-to-end, using the usual depth-first algorithm:

• If the sender encounters a class member during marshaling, it checks whether it has marshaled the same instance previously for the current request or reply:

    • If the instance has not been marshaled before, the sender assigns a new identity to the instance and marshals the negative identity.

    • Otherwise, if the instance was marshaled previously, the sender sends the same negative identity that it previously sent for that instance.

In effect, during marshaling, the sender builds an identity table that is indexed by the address of each instance; the lookup value for the instance is its identity.

Once the first pass ends, the sender has marshaled all the parameters, but has not yet marshaled any of the class instances that may be pointed at by various parameters or members. The identity table at this point contains all those instances for which negative identities (pointers) were marshaled, so whatever is in the identity table at this point are the classes that the receiver still needs. The sender now marshals those instances in the identity table, but with positive identities and followed by their contents, as described on page 1302. The outstanding instances are marshaled as a sequence, that is, the sender marshals the number of instances as a size (see Section 37.2.1), followed by the actual instances.

In turn, the instances just sent may themselves contain class members; when those class members are marshaled, the sender assigns an identity to new instances or uses a negative identity for previously marshaled instances as usual. This means that, by the end of the second pass, the identity table may have grown, necessitating a third pass. That third pass again marshals the outstanding class instances as a size followed by the actual instances. The third pass contains all those instances that were not marshaled in the second pass. Of course, the third pass may trigger yet more passes until, finally, the sender has sent all outstanding instances, that is, marshaling is complete. At this point, the sender terminates the sequence of passes by marshaling an empty sequence (the value 0 encoded as a size).

To illustrate this with an example, consider the definitions shown in Section 4.11.7 once more:

enum UnaryOp { UnaryPlus, UnaryMinus, Not };
enum BinaryOp { Plus, Minus, Multiply, Divide, And, Or };

class Node {
    idempotent long eval();
};

class UnaryOperator extends Node {
    UnaryOp operator;
    Node operand;
};

class BinaryOperator extends Node {
    BinaryOp op;
    Node operand1;
    Node operand2;
};

class Operand extends Node {
    long val;
};

These definitions allow us to construct expression trees. Suppose the client initializes a tree to the shape shown in Figure 37.3, representing the expression (1 + 6 / 2) ∗ (9 − 3). The values outside the nodes are the identities assigned by the client.

Figure 37.3. Expression tree for the expression (1 + 6 / 2) ∗ (9 − 3). Both p1 and p2 denote the root node.

As shown in Figure 37.3, the client passes the root of the tree in both parameters, p1 and p2, of the following operation (passing the same value twice makes little sense in practice, but serves to illustrate the marshaling):

interface Tree {
    void sendTree(Node p1, Node p2);
};

The client now marshals the two parameters p1 and p2 to the server, resulting in the value ‑1 being sent twice in succession. (The client arbitrarily assigns an identity to each node. The value of the identity does not matter, as long as each node has a unique identity. For simplicity, the Ice implementation numbers instances with a counter that starts counting at 1 and increments by one for each unique instance.) This completes the marshaling of the parameters and results in a single instance with identity 1 in the identity table. The client now marshals a sequence containing a single element, node 1, as described on page 1302. In turn, node 1 results in nodes 2 and 3 being added to the identity table, so the next sequence of nodes contains two elements, nodes 2 and 3. The next sequence of nodes contains nodes 4, 5, 6, and 7, followed by another sequence containing nodes 8 and 9. At this point, no more class instances are outstanding, and the client marshals an empty sequence to indicate to the receiver that the final sequence has been marshaled.

Within each sequence, the order in which class instances are marshaled is irrelevant. For example, the third sequence could equally contain nodes 7, 6, 4, and 5, in that order. What is important here is that each sequence contains nodes that are an equal number of “hops” away from the initial node: the first sequence contains the initial node(s), the second sequence contains all nodes that can be reached by traversing a single link from the initial node(s), the third sequence contains all nodes that can be reached by traversing two links from the initial node(s), and so on.

Now consider the same example once more, but with different parameter values for sendTree: p1 denotes the root of the tree, and p2 denotes the − operator of the right-hand sub-tree, as shown in Figure 37.4.

Figure 37.4. The expression tree of Figure 37.3, with p1 and p2 denoting different nodes.

The graph that is marshaled is exactly the same, but instances are marshaled in a different order and with different identities:

• During the first pass, the client sends the identities ‑1 and ‑2 for the parameter values.

• The second pass marshals a sequence containing nodes 1 and 2.

• The third pass marshals a sequence containing nodes 3, 4, and 5.

• The fourth pass marshals a sequence containing nodes 6 and 7.

• The fifth pass marshals a sequence containing nodes 8 and 9.

• The final pass marshals an empty sequence.
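The pass structure for both parameter choices can be reproduced with a small Python simulation. It tracks only identities, not the marshaled bytes, and assumes (as the text does) that identities are assigned by a counter starting at 1. The `Node` class and `marshal_passes` function are hypothetical names.

```python
class Node:
    def __init__(self, label, *children):
        self.label = label
        self.children = children      # class members that point at other nodes

def marshal_passes(parameters):
    """Return (identities sent in the first pass, list of instance sequences)."""
    identities = {}                   # id(node) -> identity
    pending = []                      # instances referenced but not yet marshaled

    def pointer(node):
        key = id(node)
        if key not in identities:
            identities[key] = len(identities) + 1
            pending.append(node)
        return -identities[key]

    first_pass = [pointer(p) for p in parameters]
    sequences = []
    while pending:
        batch = pending[:]
        del pending[:]
        for node in batch:
            for child in node.children:
                pointer(child)        # marshaling an instance emits pointers to its members
        sequences.append([identities[id(node)] for node in batch])
    sequences.append([])              # the empty sequence ends the series of passes
    return first_pass, sequences

# (1 + 6 / 2) * (9 - 3)
div = Node("/", Node("6"), Node("2"))
plus = Node("+", Node("1"), div)
minus = Node("-", Node("9"), Node("3"))
root = Node("*", plus, minus)

print(marshal_passes([root, root]))
# ([-1, -1], [[1], [2, 3], [4, 5, 6, 7], [8, 9], []])
print(marshal_passes([root, minus]))
# ([-1, -2], [[1, 2], [3, 4, 5], [6, 7], [8, 9], []])
```

The loop terminates for cyclic graphs as well, because an instance enters `pending` only the first time it is referenced.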

In this way, any graph of nodes can be transmitted (including graphs that contain cycles). The receiver reconstructs the graph by filling in a patch table during unmarshaling:

• Whenever the receiver unmarshals a negative identity, it adds that identity to a patch table; the lookup value is the memory address of the parameter or member that eventually will point at the corresponding instance.

• Whenever the receiver unmarshals an actual instance, it adds the instance to an unmarshaled table; the lookup value is the memory address of the instantiated class. The receiver then uses the address of the instance to patch any parameters or members with the actual memory address.

Note that the receiver may receive negative identities that denote class instances that have been unmarshaled already (that is, point “backward” in the unmarshaling stream), as well as instances that are yet to be unmarshaled (that is, point “forward” in the unmarshaling stream). Both scenarios are possible, depending on the order in which instances are marshaled, as well as their in‑degree.
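A receiver-side sketch in Python is shown below. Python has no raw memory addresses, so this example stands in for "the address of the parameter or member" with a pair of a dictionary and a key; that substitution, and the names `Unmarshaler`, `read_pointer`, and `read_instance`, are assumptions made for illustration.

```python
class Instance:
    def __init__(self):
        self.members = {}             # member name -> pointed-at instance (or None)

class Unmarshaler:
    def __init__(self):
        self.patch_table = {}         # identity -> list of (container, key) still to patch
        self.unmarshaled = {}         # identity -> instance already received

    def read_pointer(self, container, key, value):
        """value is the int read from the stream: 0 or a negative identity."""
        if value == 0:
            container[key] = None                             # null pointer
        elif -value in self.unmarshaled:
            container[key] = self.unmarshaled[-value]         # points "backward"
        else:
            container[key] = None                             # points "forward": patch later
            self.patch_table.setdefault(-value, []).append((container, key))

    def read_instance(self, identity, instance):
        self.unmarshaled[identity] = instance
        for container, key in self.patch_table.pop(identity, []):
            container[key] = instance                         # fill in outstanding pointers

u = Unmarshaler()
params = {}
u.read_pointer(params, "p1", -1)      # forward references to instance 1
u.read_pointer(params, "p2", -1)
node = Instance()
u.read_instance(1, node)              # both parameters are patched here
u.read_pointer(node.members, "next", -1)   # backward reference that forms a cycle
print(params["p1"] is params["p2"], node.members["next"] is node)   # True True
```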

To provide another example, consider the following definition:

class C {
    // ...
};

sequence<C> CSeq;

Suppose the client marshals a sequence of 100 C instances to the server, with each instance being distinct. (That is, the sequence contains 100 pointers to 100 different instances, not 100 pointers to the same single instance.) In that case, the sequence is marshaled as a size of 100, followed by 100 negative identities, ‑1 to ‑100. Following that, the client marshals a single sequence containing the 100 instances, each instance with its positive identity in the range 1 to 100, and completes by marshaling an empty sequence.

On the other hand, if the client sends a sequence of 100 elements that all point to the same single class instance, the client marshals the sequence as a size of 100, followed by 100 negative identities, all with the value ‑1. The client then marshals a sequence containing a single element, namely instance 1, and completes by marshaling an empty sequence.

Class Graphs and Slicing

It is important to note that when a graph of class instances is sent, it always forms a connected graph. However, when the receiver rebuilds the graph, it may end up with a disconnected graph, due to slicing. Consider:

class Base {
    // ...
};

class Derived extends Base {
    // ...
    Base b;
};

interface Example {
    void op(Base p);
};

Suppose the client has complete type knowledge, that is, understands both types Base and Derived, but the server only understands type Base, so the derived part of a Derived instance is sliced. The client can instantiate classes to be sent as parameter p as follows:

DerivedPtr p = new Derived;
p->b = new Derived;
ExamplePrx e = ...;
e->op(p);

As far as the client is concerned, the graph looks like the one shown in Figure 37.5.

Figure 37.5. Sender-side view of a graph containing derived instances.

However, the server does not understand the derived part of the instances and slices them. Yet, the server unmarshals all the class instances, leading to the situation where the class graph has become disconnected, as shown in Figure 37.6.

Figure 37.6. Receiver-side view of the graph in Figure 37.5.

Of course, more complex situations are possible, such that the receiver ends up with multiple disconnected graphs, each containing many instances.

Exceptions with Class Members

If an exception contains class members, its header byte (see Section 37.2.10) is 1 and the exception members are followed by the outstanding class instances as described earlier in this section, that is, the actual exception members are followed by one or more sequences that contain the outstanding class instances, followed by an empty sequence that serves as an end marker.

37.2.12 Interfaces

Interfaces can be marshaled by value (see Section 4.11.12). For an interface marshaled by value (as opposed to a class instance derived from that interface), only the type ID of the most-derived interface is encoded. Here are the Slice definitions once more:

interface Base { /* ... */ };

interface Derived extends Base { /* ... */ };

interface Example {
    void doSomething(Base b);
};

If the client passes to doSomething an instance of a class that has no Slice definition of its own (but derives from Derived), the on-the-wire representation of the interface is as follows:

Table 37.7. Marshaled representation of a Derived instance.

Marshaled Value	Size in Bytes	Type	Byte offset
1 (identity)	4	int	0
0 (marker for class type ID)	1	bool	4
"::Derived" (class type ID)	10	string	5
4 (byte count for slice)	4	int	15
0 (marker for class type ID)	1	bool	19
"::Ice::Object" (class type ID)	14	string	20
5 (byte count for slice)	4	int	34
0 (number of dictionary entries)	1	size	38
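The offsets in Table 37.7 can be checked with a short Python sketch. It is an illustration rather than the Ice implementation: the function names are hypothetical, and it assumes that a string is encoded as a size followed by its UTF-8 bytes and that a slice's byte count includes the four bytes of the count itself, which is what the sizes in the table imply.

```python
import struct


def encode_size(n: int) -> bytes:
    """Size encoding of Section 37.2.1."""
    if n < 255:
        return bytes([n])
    return b"\xff" + struct.pack("<i", n)


def encode_string(s: str) -> bytes:
    """Assumption: a string is a size followed by its UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_size(len(data)) + data


def encode_slice(type_id: str, members: bytes) -> bytes:
    """Marker byte 0, type ID, then a byte count that includes its own 4 bytes."""
    count = struct.pack("<i", 4 + len(members))
    return b"\x00" + encode_string(type_id) + count + members


def encode_interface_by_value(identity: int, type_id: str) -> bytes:
    """Identity, an empty slice for the interface, then the Ice::Object slice."""
    out = struct.pack("<i", identity)
    out += encode_slice(type_id, b"")
    out += encode_slice("::Ice::Object", encode_size(0))  # empty dictionary
    return out


data = encode_interface_by_value(1, "::Derived")
print(len(data))  # 39
```

The total of 39 bytes matches the table: the last entry starts at offset 38 and occupies one byte.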

37.2.13 Proxies

The first component of an encoded proxy is a value of type Ice::Identity. If the proxy is a nil value, the category and name members are empty strings, and no additional data is encoded. The encoding for a non-nil proxy consists of general parameters followed by endpoint parameters.

General Proxy Parameters

The general proxy parameters are encoded as if they were members of the following structure:

struct ProxyData {
    Ice::Identity id;
    Ice::StringSeq facet;
    byte mode;
    bool secure;
};

The general proxy parameters are described in Table 37.8.

Table 37.8. General proxy parameters.

Parameter	Description
id	The object identity
facet	The facet name (zero- or one-element sequence)
mode	The proxy mode (0=twoway, 1=oneway, 2=batch oneway, 3=datagram, 4=batch datagram)
secure	true if secure endpoints are required, otherwise false

The facet field has either zero elements or one element. An empty sequence denotes the default facet, and a one-element sequence provides the facet name in its first member. If a receiver receives a proxy with a facet field with more than one element, it must throw a ProxyUnmarshalException.
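A receiver's handling of the general parameters might look like the following Python sketch. It is not the Ice run time's code. It assumes that Ice::Identity is encoded as the string name followed by the string category, and that strings are a size followed by UTF-8 bytes. The ProxyUnmarshalException class here is a local stand-in for the Ice exception, and the function names are hypothetical.

```python
import struct


class ProxyUnmarshalException(Exception):
    """Local stand-in for the Ice exception named in the text."""


def read_size(buf: bytes, pos: int):
    """Decode a size (Section 37.2.1) and return (value, next position)."""
    n = buf[pos]
    pos += 1
    if n == 255:
        (n,) = struct.unpack_from("<i", buf, pos)
        pos += 4
    return n, pos


def read_string(buf: bytes, pos: int):
    n, pos = read_size(buf, pos)
    return buf[pos:pos + n].decode("utf-8"), pos + n


def read_general_parameters(buf: bytes, pos: int = 0):
    """Return (parameters or None for a nil proxy, next position)."""
    name, pos = read_string(buf, pos)       # assumed member order: name, category
    category, pos = read_string(buf, pos)
    if not name and not category:
        return None, pos                    # nil proxy: nothing else is encoded
    count, pos = read_size(buf, pos)
    if count > 1:
        raise ProxyUnmarshalException("facet sequence has %d elements" % count)
    facet = ""
    if count == 1:
        facet, pos = read_string(buf, pos)
    mode = buf[pos]
    secure = buf[pos + 1] != 0
    params = {"name": name, "category": category, "facet": facet,
              "mode": mode, "secure": secure}
    return params, pos + 2


wire = b"\x05hello" + b"\x00" + b"\x00" + b"\x00" + b"\x00"
params, end = read_general_parameters(wire)
print(params["name"], params["mode"], params["secure"], end)  # hello 0 False 10
```

The returned position is where the endpoint parameters begin.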

Endpoint Parameters

A proxy optionally contains an endpoint list (see Appendix E) or an adapter identifier, but not both.

• If a proxy contains endpoints, they are encoded immediately following the general parameters. A size specifying the number of endpoints is encoded first (see Section 37.2.1), followed by the endpoints. Each endpoint is encoded as a short specifying the endpoint type (1=TCP, 2=SSL, 3=UDP), followed by an encapsulation (see Section 37.2.2) of type-specific parameters. The type-specific parameters for TCP, UDP, and SSL are presented in the sections that follow.

• If a proxy does not have endpoints, a single byte with value 0 immediately follows the general parameters and a string representing the object adapter identifier is encoded immediately following the zero byte.

Type-specific endpoint parameters are encapsulated because a receiver may not be capable of decoding them. For example, a receiver can only decode SSL endpoint parameters if it is configured with the SSL plug‑in (see Chapter 41). However, the receiver must be able to re-encode the proxy with all of its original endpoints, in the order they were received, even if the receiver does not understand the type-specific parameters for an endpoint. Encapsulation of the parameters allows the receiver to do this.
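The following Python sketch illustrates how a receiver can preserve endpoints it cannot interpret. It is a sketch under stated assumptions, not the Ice implementation: it assumes the encapsulation layout of Section 37.2.2 is an int size (which counts itself and the two version bytes) followed by the encoding major and minor version bytes and the payload. All function names are hypothetical.

```python
import struct


def write_size(n: int) -> bytes:
    if n < 255:
        return bytes([n])
    return b"\xff" + struct.pack("<i", n)


def read_size(buf: bytes, pos: int):
    n = buf[pos]
    pos += 1
    if n == 255:
        (n,) = struct.unpack_from("<i", buf, pos)
        pos += 4
    return n, pos


def make_encapsulation(payload: bytes, major: int = 1, minor: int = 0) -> bytes:
    """Assumed layout: int size (includes the 6-byte header), major, minor, payload."""
    return struct.pack("<iBB", 6 + len(payload), major, minor) + payload


def read_endpoints(buf: bytes, pos: int = 0):
    """Split an endpoint list into (type, raw encapsulation) pairs without decoding them."""
    count, pos = read_size(buf, pos)
    endpoints = []
    for _ in range(count):
        (kind,) = struct.unpack_from("<h", buf, pos)
        (encaps_size,) = struct.unpack_from("<i", buf, pos + 2)
        end = pos + 2 + encaps_size
        endpoints.append((kind, bytes(buf[pos + 2:end])))
        pos = end
    return endpoints, pos


def write_endpoints(endpoints) -> bytes:
    """Re-encode the endpoints in their original order, copying each encapsulation."""
    out = write_size(len(endpoints))
    for kind, encapsulation in endpoints:
        out += struct.pack("<h", kind) + encapsulation
    return out


tcp = make_encapsulation(b"\x04host" + struct.pack("<ii", 10000, -1) + b"\x00")
ssl = make_encapsulation(b"\x01\x02\x03")  # opaque to a receiver without the plug-in
wire = write_size(2) + struct.pack("<h", 1) + tcp + struct.pack("<h", 2) + ssl
endpoints, end = read_endpoints(wire)
print(write_endpoints(endpoints) == wire)  # True
```

Because the size at the start of each encapsulation says how many bytes to skip, the receiver never needs to look inside the payload.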

TCP Endpoint Parameters

A TCP endpoint is encoded as an encapsulation containing the following structure:

struct TCPEndpointData {
    string host;
    int port;
    int timeout;
    bool compress;
};

The endpoint parameters are described in Table 37.9.

Table 37.9. TCP endpoint parameters.

Parameter	Description
host	The server host (a host name or IP address)
port	The server port (1-65535)
timeout	The timeout in milliseconds for socket operations
compress	true if compression should be used (if possible), otherwise false

See Section 37.4 for more information on compression.
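As an illustration, a complete TCP endpoint could be assembled as in the Python sketch below. The function names and the sample host, port, and timeout are hypothetical. The sketch assumes encoding version 1.0 in the encapsulation header, an encapsulation size that includes its own six header bytes, and strings encoded as a size followed by UTF-8 bytes.

```python
import struct

TCP_ENDPOINT_TYPE = 1  # endpoint type value for TCP


def encode_size(n: int) -> bytes:
    if n < 255:
        return bytes([n])
    return b"\xff" + struct.pack("<i", n)


def encode_string(s: str) -> bytes:
    data = s.encode("utf-8")
    return encode_size(len(data)) + data


def encode_tcp_endpoint(host: str, port: int, timeout: int, compress: bool) -> bytes:
    """Short endpoint type, then an encapsulation holding TCPEndpointData."""
    body = encode_string(host)
    body += struct.pack("<ii", port, timeout)
    body += b"\x01" if compress else b"\x00"
    # Assumed header: int size (header included), encoding major 1, minor 0.
    encapsulation = struct.pack("<iBB", 6 + len(body), 1, 0) + body
    return struct.pack("<h", TCP_ENDPOINT_TYPE) + encapsulation


wire = encode_tcp_endpoint("localhost", 10000, 5000, False)
print(len(wire))  # 27
```

The 27 bytes break down as 2 for the type, 6 for the encapsulation header, 10 for the host, 8 for the port and timeout, and 1 for the compression flag.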

UDP Endpoint Parameters

A UDP endpoint is encoded as an encapsulation containing the following structure:

struct UDPEndpointData {
    string host;
    int port;
    byte protocolMajor;
    byte protocolMinor;
    byte encodingMajor;
    byte encodingMinor;
    bool compress;
};

The endpoint parameters are described in Table 37.10.

Table 37.10. UDP endpoint parameters.

Parameter	Description
host	The server host (a host name or IP address)
port	The server port (1-65535)
protocolMajor	The major protocol version supported by the endpoint
protocolMinor	The highest minor protocol version supported by the endpoint
encodingMajor	The major encoding version supported by the endpoint
encodingMinor	The highest minor encoding version supported by the endpoint
compress	true if compression should be used (if possible), otherwise false

See Section 37.4 for more information on compression.
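Reading the structure back is equally direct. The Python sketch below decodes the members of UDPEndpointData from the payload of the encapsulation, that is, with the encapsulation header already removed. The type and function names are hypothetical, and the sample values are made up for the example.

```python
import struct
from typing import NamedTuple


class UDPEndpointData(NamedTuple):
    host: str
    port: int
    protocol_major: int
    protocol_minor: int
    encoding_major: int
    encoding_minor: int
    compress: bool


def decode_udp_endpoint_data(body: bytes) -> UDPEndpointData:
    """Decode the members in declaration order from an encapsulation payload."""
    n = body[0]
    pos = 1
    if n == 255:  # long form of the size (Section 37.2.1)
        (n,) = struct.unpack_from("<i", body, pos)
        pos += 4
    host = body[pos:pos + n].decode("utf-8")
    pos += n
    # int port, four version bytes, and a one-byte bool, all unaligned
    fields = struct.unpack_from("<iBBBB?", body, pos)
    return UDPEndpointData(host, *fields)


sample = b"\x09127.0.0.1" + struct.pack("<iBBBB?", 10001, 1, 0, 1, 0, True)
endpoint = decode_udp_endpoint_data(sample)
print(endpoint.host, endpoint.port, endpoint.compress)  # 127.0.0.1 10001 True
```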

SSL Endpoint Parameters

An SSL endpoint is encoded as an encapsulation containing the following structure:

struct SSLEndpointData {
    string host;
    int port;
    int timeout;
    bool compress;
};

The endpoint parameters are described in Table 37.11.

Table 37.11. SSL endpoint parameters.

Parameter	Description
host	The server host (a host name or IP address)
port	The server port (1-65535)
timeout	The timeout in milliseconds for socket operations
compress	true if compression should be used (if possible), otherwise false

See Section 37.4 for more information on compression.
