Xiaozhou Steve Li, Yang Richard Yang, Mohamed G. Gouda, Simon S. Lam
Department of Computer Sciences
University of Texas at Austin
Austin, TX 78712-1188
{xli,yangyang,gouda,lam}@cs.utexas.edu
Copyright is held by the author/owner.
WWW10, May 2-5, 2001, Hong Kong
ACM 1-58113-348-0/01/0005.
Many emerging web and Internet applications are based on a group
communications model. Thus, securing group communications is an
important Internet design issue. The key graph approach has
been proposed for group key management. Key tree and key star are two
important types of key graphs. Previous work has been focused on
individual rekeying, i.e., rekeying after each join or leave request.
In this paper, we first identify two problems with individual
rekeying: inefficiency and an out-of-sync problem between keys and
data. We then propose the use of periodic batch rekeying which
can improve efficiency and alleviate the out-of-sync problem. We
devise a marking algorithm to process a batch of join and leave
requests. We then analyze the key server's processing cost for batch
rekeying. Our results show that batch rekeying, compared to individual
rekeying, saves server cost substantially. We also show that when the
number of requests in a batch is not large, the best key tree degree
is four; otherwise, key star (a special key tree with root degree
equal to group size) outperforms small-degree key trees.
Keywords: Secure group communications, group key management, rekeying.
Many emerging web applications are based on a group communications model [12,4,15]. In the Internet, multicast has been used successfully to provide an efficient, best-effort delivery service from a sender to a large group of receivers [6]. Thus, securing group communications (i.e., providing confidentiality, authenticity, and integrity of messages delivered between group members) will become an important Internet design issue.
One way to achieve secure group communications is to have a symmetric key, called group key, shared only by group members (also called users in this paper). The group key is distributed by a key server which provides group key management service. Messages sent by a member to the group are encrypted with the group key, so that only members of the group can decrypt and read the messages.
Compared to two-party communications, a unique characteristic of group communications is that group membership can change over time: new users can join and current users can leave or be expelled. If a user wants to join the group, it sends a join request to the key server. The user and key server mutually authenticate each other using a protocol such as SSL [7]. If authenticated and accepted into the group, the user shares with the key server a symmetric key, called the user's individual key.
For a group of $N$ users, initially distributing the group key to all users requires $N$ messages, each encrypted with an individual key (the computation and communication costs are proportional to the group size $N$). To prevent a new user from reading past communications (called backward access control) and a departed user from reading future communications (called forward access control), the key server may rekey (change the group key) whenever group membership changes. For large groups, join and leave requests can happen frequently. Thus, a group key management service should be scalable with respect to frequent key changes.
It is easier to rekey after a join than after a leave. After a join, the new group key can be sent via unicast to the new member (encrypted with its individual key) and via multicast to existing members (encrypted with the previous group key). After a leave, however, since the previous group key cannot be used, the new group key may be securely distributed by encrypting it with individual keys. This straightforward approach, however, is not scalable. In particular, rekeying costs 2 encryptions for a join and $N - 1$ encryptions for a leave, where $N$ is the current group size (see Section 2.2).
The key graph approach [16,17] has been proposed for scalable rekeying. In this approach, besides the group key and its individual key, each user is given several auxiliary keys, which are used to facilitate rekeying. A key graph is a data structure that models user-key and key-key relationships. A key tree is an important type of key graph in which the key-key relationships form a tree. For a single leave request, a key tree reduces the server processing cost to $O(\log N)$. Key star is a special case of key tree. We provide more details of the key graph approach in Section 2.
A thorough performance analysis of individual rekeying (rekeying after each join or leave) is given in [17]. Individual rekeying, however, has two problems. First, it is relatively inefficient. Second, there is an out-of-sync problem between keys and data (see Section 4).
In this paper, we propose the use of periodic batch rekeying that can improve efficiency and alleviate the out-of-sync problem. In batch rekeying, the key server collects join and leave requests in a period of time and rekeys after a batch has been collected. We devise a marking algorithm to process a batch of requests. We then analyze the worst case and average case server processing costs for batch rekeying. Our results show that batch rekeying, compared to individual rekeying, saves server cost substantially. We also show that when the size of a batch is not large (roughly, when the number of joins is less than half of current group size, and the number of leaves is less than a quarter of current group size), four is the best key tree degree; otherwise, key star outperforms small-degree key trees. Our results provide guidelines for a key server to choose an appropriate data structure for group key management.
This paper is organized as follows. Section 2 briefly reviews the key graph approach. Section 3 identifies two problems with individual rekeying. Section 4 presents the batch rekeying idea. Section 5 presents the marking algorithm. Section 6 analyzes the server's processing cost for batch rekeying. Section 7 shows how to minimize server cost. Section 8 discusses related work. Section 9 concludes the paper.
The key graph approach [17] assumes that there is a single trusted and secure key server, and that the key server uses a key graph for group key management. A key graph is a directed acyclic graph with two types of nodes: u-nodes, which represent users, and k-nodes, which represent keys. A user $u$ is given a key $k$ if and only if there is a directed path from the u-node for $u$ to the k-node for $k$ in the key graph. Key tree and key star are two important types of key graph. In a key tree, the k-nodes and u-nodes are organized as a tree. Key star is a special key tree whose tree degree equals the group size. To avoid confusion, from now on we use key tree to mean a small-degree key tree. We consider only key tree and key star in this paper.
In a key tree, the root is the group key, the leaf nodes are individual keys, and the other nodes are auxiliary keys. Consider a group of 9 users $u_1, \ldots, u_9$. A key tree of degree 3 is shown in Figure 2.1(a). In this figure, boxes represent u-nodes and circles represent k-nodes. User $u_9$ is given the 3 keys on the path from $u_9$ to the root: $k_9$, $k_{789}$, and $k_{1\text{-}9}$. $k_{1\text{-}9}$ is the group key. $k_9$ is $u_9$'s individual key, which is shared only by $u_9$ and the key server. $k_{789}$ is an auxiliary key shared by $u_7$, $u_8$, and $u_9$.
Suppose $u_9$ wants to leave the group (from Figure 2.1(a) to 2.1(b)). The key server then needs to change the keys that $u_9$ holds, that is, change $k_{789}$ to a new key $k_{78}$ and $k_{1\text{-}9}$ to a new key $k_{1\text{-}8}$. There are three strategies for distributing the new keys to the remaining users: user-oriented, key-oriented, and group-oriented [17]. For simplicity, we consider only group-oriented rekeying in this paper. Using group-oriented rekeying, the key server constructs the following rekey message and multicasts it to the whole group:
$s \rightarrow \{u_1, \ldots, u_8\}$: $\{k_{1\text{-}8}\}_{k_{123}}$, $\{k_{1\text{-}8}\}_{k_{456}}$, $\{k_{1\text{-}8}\}_{k_{78}}$, $\{k_{78}\}_{k_7}$, $\{k_{78}\}_{k_8}$
Here $\{k'\}_{k}$ denotes key $k'$ encrypted with key $k$. We call each item in the rekey message an encrypted key. Upon receiving the rekey message, a user extracts the encrypted keys that it needs. For example, $u_7$ needs only $\{k_{78}\}_{k_7}$ and $\{k_{1\text{-}8}\}_{k_{78}}$.
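To make the key assignment concrete, here is a small illustrative sketch (our code, not the paper's implementation) that enumerates the keys a user holds in a complete $d$-ary key tree; the key names follow the example above, and the function name is ours.

```python
# Illustrative sketch (not the paper's implementation): list the keys a user
# holds in a complete d-ary key tree, from its individual key up to the group
# key. Key names follow the 9-user, degree-3 example above.

def key_path(user, degree, num_users):
    """Return the names of the keys on the path from the user's leaf to the root."""
    path = [f"k{user}"]                        # the user's individual key
    covered = degree                           # users covered by the next key up
    while covered < num_users:
        start = ((user - 1) // covered) * covered + 1
        path.append("k" + "".join(str(m) for m in range(start, start + covered)))
        covered *= degree
    path.append(f"k1-{num_users}")             # the group key, shared by everyone
    return path

print(key_path(9, degree=3, num_users=9))      # ['k9', 'k789', 'k1-9']
print(key_path(4, degree=3, num_users=9))      # ['k4', 'k456', 'k1-9']
```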
Similarly, suppose $u_9$ wants to join a group of 8 users (from Figure 2.1(b) to 2.1(a)). The key server finds a joining point ($k_{78}$ in this example) for the new user, constructs the rekey message, multicasts it to the whole group, and unicasts the keys needed by $u_9$:
$s \rightarrow \{u_1, \ldots, u_8\}$: $\{k_{1\text{-}9}\}_{k_{1\text{-}8}}$, $\{k_{789}\}_{k_{78}}$
$s \rightarrow u_9$: $\{k_{1\text{-}9}\}_{k_9}$, $\{k_{789}\}_{k_9}$
From the above example, we can see that both the server's computation and communication costs are proportional to the number of encryptions to be performed (5 for the first example and 4 for the second). Thus, we use server cost to mean the number of encryptions the key server has to perform. If the key tree degree is $d$ and there are $N$ users, then assuming the key tree is completely balanced with height $h = \log_d N$, the server cost is $2h$ for a join and $dh - 1$ for a leave. [17] showed that $d = 4$ is the optimal degree for a leave.
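As a sanity check of these expressions against the example above (degree $d = 3$, $N = 9$ users, hence $h = \log_3 9 = 2$):
\[
  \text{join cost} = 2h = 4, \qquad
  \text{leave cost} = dh - 1 = 3 \cdot 2 - 1 = 5,
\]
which matches the 4 and 5 encryptions counted in the two rekey messages.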
Key star is a special case of key tree in which the root degree equals the group size; it models the straightforward approach. In a key star, every user has two keys: its individual key and the group key. There are no auxiliary keys. Figure 2(a) shows the key star for 4 users. Suppose $u_4$ wants to leave the group (from Figure 2(a) to 2(b)). The key server encrypts the new group key with every remaining user's individual key, puts the encrypted keys in a message, and multicasts it to the whole group:
$s \rightarrow \{u_1, u_2, u_3\}$: $\{k_{123}\}_{k_1}$, $\{k_{123}\}_{k_2}$, $\{k_{123}\}_{k_3}$
Suppose $u_4$ wants to join the group (from Figure 2(b) to 2(a)). The key server sends out the following messages:
$s \rightarrow \{u_1, u_2, u_3\}$: $\{k_{1234}\}_{k_{123}}$
$s \rightarrow u_4$: $\{k_{1234}\}_{k_4}$
Clearly, using a key star, the server cost is 2 for a join and $N - 1$ for a leave, where $N$ is the group size before the leave.
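For intuition about how these per-request costs scale, here is a small illustrative sketch (our code, with made-up parameter values) that evaluates the expressions quoted in this section:

```python
# Per-request server costs (number of encryptions) for individual rekeying,
# using the cost expressions quoted in this section. N and d are illustrative.

def star_costs(n):
    return {"join": 2, "leave": n - 1}

def tree_costs(n, d):
    h, size = 0, 1
    while size < n:                  # height of a balanced d-ary key tree
        size *= d
        h += 1
    return {"join": 2 * h, "leave": d * h - 1}

n = 4096
print("key star:", star_costs(n))    # leave cost grows linearly in N
print("key tree:", tree_costs(n, 4)) # leave cost grows only logarithmically
```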
Ideally, a departed user should be expelled from the group, and a new user accepted into the group, as early as possible. Thus, the key server should rekey immediately after receiving a join or leave request. We call this individual rekeying. Individual rekeying, however, has two problems: inefficiency and an out-of-sync problem between keys and data.
Individual rekeying is relatively inefficient for two reasons. First, the rekey message has to be signed for authentication purposes; otherwise, a compromised group user could send out bogus rekey messages and disrupt the whole system. Signing is computationally expensive [18]. If the key server has to generate and sign a rekey message for every single request, the signing operations alone will place a heavy burden on the key server, especially when requests are frequent.
Second, consider two leaves that happen one after another. The key server generates two sets of new keys (group key and auxiliary keys) for these two leaves. The two leaves, however, might happen so close together in time that the first set of new keys is never actually used and is immediately replaced by the second set. When requests are frequent, such as during the startup or teardown of a multicast session, many new keys may be generated and distributed without ever being used, which wastes server resources.
Individual rekeying also has the following out-of-sync problem between keys and data: a user might receive a data message encrypted with an old group key, or it might receive a data message encrypted with a group key that it has not received yet. Figure 3 shows an example of this problem: one user receives a data message encrypted with group key GK(2) before it has received GK(2), while another user receives a data message encrypted with GK(0) although its current group key is already GK(2). The delay of reliable rekey message delivery can be large and variable. Thus, this out-of-sync problem may require a user to keep many old group keys, and/or to buffer a large amount of data encrypted with group keys that it has not yet received.
To address the above two problems, we propose the use of periodic batch rekeying. In batch rekeying, the key server waits for a period
of time, called a rekey interval, collects all the join and
leave requests during the interval, generates new keys, constructs a
rekey message, and multicasts the rekey message.
Batch rekeying improves efficiency because it reduces the number of rekey messages to be signed: one for a batch of requests, instead of one for each. Batch rekeying also takes advantage of the possible overlap of new keys for multiple rekey requests, and thus reduces the possibility of generating new keys that will not be used. In Section 6, we will quantify the performance benefit of batch rekeying.
Batch rekeying alleviates the out-of-sync problem. To see this, let the maximum delay of reliable delivery of rekey messages and the maximum delay of data messages be bounded, let $T$ be the length of the rekey interval, and let GK($i$) denote the $i$-th group key; that is, if a user's current group key is GK($i$), then when it receives a rekey message its group key becomes GK($i+1$). It is not hard to show (we skip the proof for simplicity) that if $T$ is made relatively large compared to these two delays, then a user only needs to keep GK($i$), GK($i-1$), and its current auxiliary keys. If the user receives data messages encrypted with a group key it has not yet received, it keeps them in a buffer and waits until the next rekey message arrives. Since the rekey interval is likely to be much larger than the message delays, the condition is easy to satisfy.
Batch rekeying is a tradeoff between performance and security. Since the key server does not rekey immediately, a departed user will remain in the group longer, and a new user has to wait longer to be accepted to the group.
We define the vulnerability window to be the period of time from when the key server receives a join or leave request to when all users have received the corresponding rekey message. We use the maximum size of the vulnerability window to measure security, because any group traffic sent within the vulnerability window is susceptible to being read by a departed user. The smaller this window, the more secure the system. We note that the maximum vulnerability window also measures how soon a new user can be accepted into the group.
Figure 4(a) shows an example that illustrates the vulnerability window for individual rekeying. In this example, a user requests to leave the group; the key server receives the request and rekeys immediately. Before the rekey message reaches all members, however, another member sends out a message encrypted with the old group key GK(0). Although the departing user has already left the group, it is still able to read this message. The period from the leave request until the rekey message is received by all users is the vulnerability window. Figure 4(b) shows the corresponding notion for batch rekeying.
It is not hard to see that the maximum vulnerability window is roughly the rekey message delivery delay for individual rekeying, and at most one rekey interval longer for batch rekeying. So even in individual rekeying, a departed user is able to continue reading group communications for some period of time; batch rekeying only makes this period longer and adjustable. Many applications can tolerate such a compromise. For example, a pay-per-view application can tolerate a viewer remaining in the group for a few more minutes after its subscription expires. An application can choose an acceptable vulnerability window based on its own requirements.
As noted in the Introduction section, it is easier to rekey after a join than a leave. Thus, the access delay of a new user may be reduced by processing each new join immediately without rekeying the entire group, i.e., each new user is given the current group key and its auxiliary keys encrypted with its individual key, but leave requests are processed periodically in a batch. This approach offers new users faster access to the group at the cost of also giving new users a small amount of backward access. For simplicity, we will not discuss this approach in detail in the balance of this paper.
In this section, we present a marking algorithm for the key server to process a batch of requests. Obviously, if the key server uses key star, batch rekeying is a straightforward extension to individual rekeying. Thus, the marking algorithm applies to key tree only. In Section 6, we will analyze the resulting server processing cost for batch rekeying.
We use $J$ to denote the number of joins in a batch and $L$ to denote the number of leaves in a batch. We assume that within a batch a user does not first join and then leave, or first leave and then join.
Given a batch of requests, the main task for the key server is to identify which keys should be added, deleted, or changed. In individual rekeying, all the keys on the path from the request location to the root of the key tree have to be changed. When there are multiple requests, there are multiple paths. These paths form a subtree, called rekey subtree, which includes all the keys to be added or changed. The rekey subtree does not include individual keys.
The key server cannot control which users leave, but it can control where in the key tree to place new users. Thus, the key server should place the new users (if there are any) carefully, so that the number of encryptions it has to perform is minimized. The following marking algorithm uses heuristics to achieve this objective; Figure 5 illustrates the idea. We use the words key and node interchangeably.
The algorithm distinguishes four cases, depending on the numbers of joins and leaves in the batch.
If $J = L$, the newly joined u-nodes replace the departed u-nodes, and all the k-nodes on the paths from the replaced locations to the root are marked UPDATE.
If $J < L$, $J$ of the departed u-nodes are replaced by the newly joined u-nodes, and the k-nodes on the paths from the departed locations to the root are marked UPDATE or DELETE. Those leaving nodes without joining replacements are marked DELETE. A non-leaf node is marked DELETE if and only if all of its children are marked DELETE.
If $L = 0$ (there are only joins), the key server splits the shallowest leaf node $u$: it creates a new k-node whose children are $u$ and the newly joined u-nodes, marks the new k-node NEW, and marks all the nodes from the root to the parent of $u$'s old location UPDATE.
If $J > L > 0$, the departed u-nodes are replaced by $L$ of the newly joined u-nodes, and the extra joins are inserted by splitting a node as in the previous case. The key server marks the newly created k-nodes NEW and marks all the keys from the replacement locations (except the old location of the split node) to the root UPDATE.
After marking the key tree, the key server removes all nodes that are marked DELETE. The nodes marked UPDATE or NEW form the rekey subtree. The key server then traverses the rekey subtree, generates the new keys, encrypts every new key with the key of each of its children, and constructs and multicasts the rekey message. It is not hard to see that the marking algorithm is efficient; its measured running time is reported with the simulation results in Section 6.3.
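The following is a simplified sketch of the marking computation (illustrative code, not the paper's implementation). It assumes a complete $d$-ary tree over $d^h$ user slots, lets joining users simply refill departed slots one for one, and omits the tree-growing cases ($J > L$ and $L = 0$); it returns the number of encryptions in the resulting rekey message.

```python
# Simplified sketch of the marking computation (not the paper's code).
# Assumptions: complete d-ary tree over d**h user slots; joins refill departed
# slots one for one; the tree-growing cases (J > L, L = 0) are omitted.

def batch_rekey_cost(d, h, leaving, num_joins):
    """Number of encryptions in the rekey message for one batch."""
    leaving = set(leaving)                        # departed user slots, in range(d**h)
    replaced = set(sorted(leaving)[:num_joins])   # slots refilled by joining users
    deleted = leaving - replaced                  # slots left empty (DELETE leaves)

    def slots(t, i):                              # user slots under k-node i at height t
        return range(i * d ** t, (i + 1) * d ** t)

    def is_delete(t, i):                          # DELETE iff all children are DELETE
        return all(s in deleted for s in slots(t, i))

    def is_update(t, i):                          # UPDATE iff a key below it changed
        return any(s in leaving for s in slots(t, i)) and not is_delete(t, i)

    cost = 0
    for t in range(1, h + 1):                     # every k-node, level by level
        for i in range(d ** (h - t)):
            if not is_update(t, i):
                continue
            if t == 1:                            # children are individual keys
                cost += sum(1 for s in slots(1, i) if s not in deleted)
            else:                                 # children are lower-level k-nodes
                cost += sum(1 for j in range(i * d, (i + 1) * d)
                            if not is_delete(t - 1, j))
    return cost

# A single leave in the 9-user, degree-3 example costs d*h - 1 = 5 encryptions.
print(batch_rekey_cost(d=3, h=2, leaving={8}, num_joins=0))     # 5
# A batch with two leaves, one of them refilled by a join.
print(batch_rekey_cost(d=3, h=2, leaving={4, 8}, num_joins=1))  # 8
```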
Figure 6 shows an example that illustrates the marking algorithm. In this example, two users leave the group and one user joins. The rekey subtree is marked as shown in the figure. Using group-oriented rekeying, the key server multicasts the corresponding rekey message, containing the encrypted new keys of the rekey subtree, to the group.
Figure 7 shows another example. In this example, one user leaves the group and two users join. The rekey subtree is marked as shown in the figure. Again, using group-oriented rekeying, the key server multicasts the corresponding rekey message to the group.
To achieve best performance, a key tree should be kept more or less balanced. Our marking algorithm aims to keep the tree balanced across multiple batches, by adding extra joins to the shallowest leaf node of the tree in each batch.
However, depending on the actual locations of the requests, even if the key tree starts complete and balanced, it is possible that the key tree may grow unbalanced after some number of batches. For example, many users located close to each other in the key tree may decide to leave at the same time. It is impossible to keep the key tree balanced all the time, without incurring extra cost. We refer the interested reader to [14] for discussions on this topic.
In this section, we analyze the server processing cost for batch rekeying. We consider the worst case and average case and compare batch rekeying against individual rekeying.
Key star's batch rekeying server cost, which we denote $C_{\text{star}}$, is straightforward: when $L > 0$, the new group key must be encrypted with the individual key of each of the $N - L$ remaining users and of each of the $J$ new users, so $C_{\text{star}} = N - L + J$; when $L = 0$, the new group key is encrypted with the old group key and with each new user's individual key, so $C_{\text{star}} = J + 1$.
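For example (illustrative numbers, using the expression above): with $N = 1000$ current users, $J = 50$ joins, and $L = 100$ leaves in a batch,
\[
  C_{\text{star}} = N - L + J = 1000 - 100 + 50 = 950
\]
encryptions are needed for the single rekey message of that batch.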
Thus, our analysis mainly focuses on key trees. For the purpose of analysis, we assume that the key tree is a complete and balanced tree at the beginning of a batch, and that each current user has equal probability of leaving.
Let $d$ be the key tree degree, $N$ the group size at the beginning of a batch, and $h$ the height of the key tree ($h = \log_d N$); let $C_{\text{worst}}$ and $C_{\text{avg}}$ denote the worst case and average case batch rekeying server costs, respectively.
The marking algorithm can control where to place joins, but it cannot control where leaves happen. Thus, the worst case analysis mainly considers how the locations of leaves affect the server cost. Since the marking algorithm performs different operations in the four cases above, the worst case analysis is also divided into four cases.
If the number of affected leaf nodes is not an exact power of $d$, then in the worst case each additional leaf node contributes a further term to the total cost (see Figure 6.1(b)), and $C_{\text{worst}}$ is obtained by summing these contributions.
Figure 9 shows $C_{\text{worst}}$ over a wide range of $J$ and $L$ values. We can see that, for fixed $L$, $C_{\text{worst}}$ is an increasing function of $J$ (because more joins help the key tree to ``grow back''). For fixed $J$, as $L$ becomes larger, $C_{\text{worst}}$ first increases (because more leaves mean more keys to be changed) and then decreases (because some keys are now pruned from the tree). Clearly, $C_{\text{worst}}$ is an increasing function of $N$. We will investigate the effect of $d$ in Section 7.
The server cost depends on the number of nodes belonging to the rekey subtree, and the number of children each node has. Thus, our technique for average case analysis is to consider the probability that an individual node belongs to the rekey subtree, and to consider the node's expected number of children (some children might be pruned). Again, we consider the following four cases.
In the cases with leaves, some nodes are marked DELETE and pruned from the tree. A node is pruned if and only if all the u-nodes in its subtree are leaving users and none of them are replaced by joins. Using a technique similar to the previous case, we can compute the probability that a node at a given level is pruned.
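As an illustration of the kind of computation involved (a sketch under the stated assumptions, not necessarily the paper's exact expression): a k-node whose subtree contains $m = d^{t}$ users is pruned only if all of them leave; if the $L$ leaving users are drawn uniformly from the $N$ current users, this event has probability
\[
  \frac{\binom{N - m}{L - m}}{\binom{N}{L}}, \qquad m \le L,
\]
and the node is pruned only if, in addition, none of these departed slots is refilled by a join.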
We built a simulator for the marking algorithm. The simulator first constructs a key tree and randomly picks users to leave; it then runs the marking algorithm and calculates the server cost. We ran the simulation many times and calculated the mean server cost. Simulations were done on a Sun Ultra Sparc I with a 167MHz CPU and 128MB of memory. The marking algorithm took at most a few milliseconds per batch for the group sizes we simulated, for all values of $J$ and $L$.
Figure 11 shows $C_{\text{avg}}$ over a wide range of $J$ and $L$ values. Our analysis and simulation results match so closely that the difference is indistinguishable in the figure.
Figure 12 compares the average case and worst case server costs. The two costs are rather close to each other when $J$ and $L$ are relatively small compared to $N$. Thus, the worst case is a good approximation to the average case in this regime.
In this section, we show that batch rekeying saves server cost substantially over individual rekeying. The actual savings depend on whether the key server uses a key star or a key tree.
Consider key star first. Since individual rekeying costs 2 encryptions per join and about $N - 1$ encryptions per leave, processing $J$ joins and $L$ leaves individually costs roughly $2J + (N - 1)L$ encryptions.
We have mentioned that, using a key tree for individual rekeying, the server cost is $dh - 1$ for a leave and $2h$ for a join. Thus, rekeying a batch of $J$ joins and $L$ leaves individually costs roughly $2hJ + (dh - 1)L$ encryptions.
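For concreteness (parameter values are illustrative, not taken from the paper): with $N = 4096$, $d = 4$ (so $h = 6$), and a batch of $J = L = 100$ requests processed individually, the key tree cost is
\[
  2hJ + (dh - 1)L = 2 \cdot 6 \cdot 100 + (4 \cdot 6 - 1) \cdot 100 = 3500
\]
encryptions, along with 200 separately signed rekey messages, whereas batch rekeying signs a single rekey message for the whole batch.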
Figure 13 compares batch rekeying against individual rekeying. The left figure shows a narrower range of $J$ and $L$ values, and the right figure shows a wider range. We can see that batch rekeying has a substantial benefit over individual rekeying.
In the previous section, we showed that batch rekeying saves server cost substantially. When a key server uses batch rekeying, however, there are still two questions to answer: should the key server use a key tree or a key star, and if it uses a key tree, what should the tree degree be? In this section, we show that the choice of data structure has a big influence on server cost, and we provide guidelines for the key server to choose an appropriate one.
First, we compare the key star cost against key tree costs for two representative tree degrees. Figure 14 shows the comparison for two group sizes; the shaded area is where key star is better. We observe that key star is better when $L = 0$ or when $J$ and $L$ are large.
Figure 15 compares the key star cost against key tree costs over a range of group sizes and tree degrees; we temporarily ignore the case where $L = 0$. Each curve in the figure corresponds to one group size, and along a curve the key star cost equals the key tree cost. Above a curve key star is better, while below a curve key tree is better. From this figure we draw a similar conclusion: key star is better than key tree when $J$ and $L$ are large. Roughly, when $J < N/2$ and $L < N/4$, key tree is better than key star; otherwise, key star is better than key tree.
[17] showed that $d = 4$ is optimal for individual rekeying. We show in this section that $d = 4$ is still optimal for batch rekeying.
Since the server cost depends on both $J$ and $L$, we first look at two special cases. Figure 16 shows the key tree costs for various values of $d$ in the first case, and Figure 17 shows them in the second case; in both, we observe that $d = 4$ is better than the other degrees when $J$ and $L$ are small. These two figures also show that $d$ has a big influence on the resulting server cost. Figure 18 shows the area where $d = 4$ is better than the other values of $d$. We can see that, roughly, in the area where key tree is better than key star, $d = 4$ is the best choice.
From the above discussion, we have seen that choosing an appropriate data structure has a big influence on server cost. An application usually has an estimate of $J$ and $L$, based on the expected number of users, the length of the rekey interval, and the application's other characteristics. Thus, the guideline for a key server is: if $J < N/2$ and $L < N/4$, use a 4-ary key tree; otherwise, use a key star.
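The guideline can be encoded as a simple rule of thumb; the following helper is illustrative (the function is ours, and the thresholds are the rough ones stated above):

```python
def choose_key_structure(n, expected_joins, expected_leaves):
    """Rough guideline from this section: small batches favor a 4-ary key tree,
    large batches favor a key star."""
    if expected_joins < n / 2 and expected_leaves < n / 4:
        return "key tree (degree 4)"
    return "key star"

print(choose_key_structure(10000, expected_joins=300, expected_leaves=200))
# -> key tree (degree 4)
print(choose_key_structure(10000, expected_joins=6000, expected_leaves=3000))
# -> key star
```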
The topic of secure group communications has been investigated in [2,3,8,10,9,13]. [13] addressed the need for frequent group key changes and the associated scalability problem. The approach proposed in [13] decomposes a large group of users into many subgroups and employs a hierarchy of group security agents. This approach, however, requires many trusted entities.
Approaches that assume only a single trusted key server are proposed in [17,16,1]. In these approaches, a user is given some auxiliary keys, besides the group key and the individual key, to reduce the server cost for rekeying. These approaches have been mainly focused on reducing the server cost for individual rekeying.
[5] addressed the problem of batch rekeying and proposed using boolean function minimization technique to facilitate batch rekeying. Their approach, however, has the collusion problem, namely, two users can combine their knowledge of keys to continue reading group communications, even after they leave the group.
[14] addressed the problem of keeping the key tree balanced. Their approach essentially is to add joins at the shallowest leaf nodes of the tree, and re-structure the tree periodically. [14] also briefly described an algorithm for batch rekeying, in which joins replace leaves one by one, and if there are still extra joins, they are added to the shallowest leaf nodes of the tree one by one. [14]'s objective is to keep the key tree balanced, while our marking algorithm aims to reduce the server cost for batch rekeying. [14] did not provide any quantitative analysis for the benefit of batch rekeying.
This paper addressed the scalability problem of group key management. We identified two problems with individual rekeying: inefficiency and an out-of-sync problem between keys and data. We proposed the use of periodic batch rekeying to improve the key server's performance and alleviate the out-of-sync problem. We devised a marking algorithm for the key server to process a batch of join and leave requests, and we analyzed the key server's processing cost for batch rekeying. Our results show that batch rekeying, compared to individual rekeying, saves server cost substantially. We also show that when the number of requests is not large in a batch, four is the best key tree degree; otherwise, key star outperforms small-degree key trees.
We thank the anonymous reviewers for helpful comments.