MCMP: A Transport/Session Level Distributed Protocol for Desktop Conference Setup

Share Embed


Descripción

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 14, NO. I, SEPTEMBER 1996

1404

Mai-Huong Nguyen, Member, IEEE, and Mischa Schwartz, Life Fellow, IEEE

Abstract- The Multiparty Conference Management Protocol (MCMP) is an end-to-enddistributed sessiodtransport level protocol intended for group management of desktop conferencing applications. This paper describes the MCMP conference setup algorithms and proves that they meet specified correctness properties. MCMP sets up control channels for use in exchanging control information while the conference is in progress. The logical topology used is a completely interconnectedmesh. MCMP always provides best effort services in cases of failures. Correctness properties guaranteed upon termination of the protocol include connectivity, validity, uniqueness, and consistency of local views. Application control information that can be exchanged using the control channels include resource negotiation and subconferencing.

I. INTRODUCTION OST existing networkmg protocol paradigms are based on the open systems interconnect (OSI) reference model, which supports only point-to-point communication involving the transfer of a single traffic type. Multiparty multimedia conferencing (MMC) applications extend this model in two dimensions: multiparty and multimedia. One paradigm in particular that needs revision concerns the role of session control in the exchange of information. The issues involved can be classified into those dealing with the multiparty aspects, such as conference membership management, and those dealing with the multimedia aspects, such as preserving timing relationship among the multiple information streams. This paper addresses the conference membership management issues at conference setup time. It describes and proves correctness for a set of distributed control algorithms intended for multiparty conference setup. The algorithms form a component of the multiparty conference management protocol (MCMP), which is an end-to-end session/transport level protocol consisting of a set of distributed control algorithms for conference configuration and membership management for desktop MMC. Specifically, the connections set up by MCMP are control channels used only for the purpose of exchanging control information in managing the conference. This framework remains in place for the duration of the conference. Once it is established, then separate connections may be set up Manuscnpt received May 1, 1995; revised March 29, 1996 M Schwartz was supported by ONR Grant N00014-90-5-1289 and National Science Foundation Grant CDR-188-11111. M-H. Nguyen is with the AT&T Bell Laboratories, Middletown, NJ 07748 USA. M. Schwartz is with Columbia University, New York, NY 10027-6699 USA Publisher Item Identifier S 0733-8716(96)06344-5

for information exchange; MCMP is not involved in these tasks. Conceptually, MCMP resides at the session level in that it establishes the infrastructure to enable the transfer of information among the conference participants. Functionally, MCMP encompasses both the session and transport levels in that it directly makes use of network level services in setting up the control channels, and hence must accommodate network level unreliability. MCMP algorithms establish and maintain complete interconnection among all current conference participants, guaranteeing that upon completion of any operation, local views at all participants are identical and accurately reflect the current conference membership. The algorithms are designed to provide best effort services, subject to the complete interconnection constraint. A general design principle that is used throughout is that although low probability events are not ignored, all algorithms are designed for the high probability events, i.e., these events incur lower complexity than the low probability events. A conferencing application may be characterized roughly in two dimensions, by the degree of interaction among the conference members and the conference size. For example, in a lecturing situation, the degree of interaction among members is low and the size of the conference is usually large; whereas in desktop conferencing, the degree of interaction among members is high and the size of the conference is usually small. The information flow requirements in the two applications and their implication on the control algorithms would be very different. Thus, it is important to note that the requirements and their impact on the algorithm designs discussed here are with respect to desktop conferencing applications only. Two conference control models can be used in a conferencing system, centralized and distributed. In a centralized system, a central site, usually predesignated, called the master, is responsible for maintaining state information about the conference participants as well as controlling information flow among the Participants.' The master may or may not be a participant in the conference that it is regulating. If the master is disconnected while the conference is in progress, the conference is terminated since no state or flow information exist elsewhere to allow either recovery or continuation. Conferencing protocols utilizing a centralized approach include 'Although the master is normally a single user, one or more special sites may also cooperate to provide the necessary control functions. The distinguishing factor of centralized systems is that the conference control functions, regardless of the initiating member, must be done through these special nodes.

0733-8716/96$05.00 0 1996 IEEE

NGUYEN AND SCHWARTZ: MCMP: A TRANSPORTISESSION LEVEL DISTRIBUTED PROTOCOL FOR DESKTOP CONFERENCE SETUP

[5], [25], and [35]. In a distributed system, every participant involved maintains local information about the conference. Participants also have equal status in the sense that any member has the capability to change the conference membership and the information flow during the conference. In addition, if any participant fails, the remaining members are still able to communicate. Multimedia conferencing protocols utilizing a distributed approach include [7], [12], and [29]. As the MMC applications being supported may use either centralized or distributed control, the approach considered by MCMP should be flexible enough to support either type of control. With respect to this, the flexibility and additional reliability associated with the distributed approach make it more attractive as a framework upon which to build a set of mechanisms. The conference application should also accommodate user heterogeneity, i.e., the different resources available at each participant. For example, those with audio capabilities are able to form voice connections; those without monitor the conference with whatever capabilities available. This implies that MCMP should have the ability to support the application resource negotiation process at the connection setup time as well as throughout the duration of the conference. Another aspect of conference control involves information flow policies. These refer to controlling the flow of each information stream to and from each participant during the conference, e.g., allowed input sources and output sinks. An example of flow policy is application input control, where only users in a specified set are allowed to send input to an application at any given time. An example of a more general information flow policy is an intra-stream subconference, where a subgroup of participants in the conference want to exchange information privately among themselves. A. Assumptions About the Operating Environment and Overview

In designing MCMP, the following general assumptions are made about the capabilities of a conference member participating in a desktop MMC, which may be either an end user or an information server. 1) All participants in the conference have equal status in that any user can set up the conference and that there is not a master in the conference. 2 ) Potential participants in the conference may have dissimilar capabilities, but all have at least data processing ability to permit the exchange of data messages. The following assumptions are also made about the underlying network and the network layer protocols. The network layer supports multicast addresses and allows any host to join or leave a multicast group dynamically. An example of such capabilities is the IP multicast address support and the Internet group management protocol (IGMP) [9]. The network searches for all possible routes in establishing a requested connection, and automatically switches routes in case of failures. Therefore, if a transport connection cannot be made between any two

1405

members, then no physical paths exists between those members, either due to intermediate node failures, end node failures, link failures, or resource limitation. 3 ) The underlying network has a low bit-error rate (BER). The implication is that the emphasis of the protocol design will be on optimizing the error free cases. Since the multicast services provided by the network may be unreliable, MCMP must deal with issues such as lost data, data arriving out of sequence, and duplicate data arriving at the receiver. These are the transport layer aspects of MCMP. MCMP is an end-to-end protocol, which means that the protocol entity resides at the end system. Each user that participates in a conference has a local MCMP entity that cooperates with other entities in a distributed manner to provide best effort services in setting up the requested conference. MCMP forms a conference with as many potential members as possible, subject to the constraints that all members in the resulting conference be able to communicate with each other and have consistent views of the conference membership. In cases where a single conference cannot be established with all participants, MCMP finds all completely interconnected subsets of nodes that involve the initiator(s) and commits them. During the setup, resource information may also be exchanged to facilitate the resource negotiation process at the application level. Since the setup is unordered, priority-based conference setup policies can be supported as follows. The user responsible for setting up the conference divides participants into one or more groups depending on the number of priority levels, and makes separate requests to MCMP for each of these groups. The highest priority groups will be set up first, and if MCMP notifies the user that one or more of these participants cannot be connected, then the user needs not issue further requests for the remaining groups. To support flexible policies with regard to information flow, MCMP locally associates an input set and an output set with each information stream in the conference. An MCMP entity only passes information received from participants listed in the input set to its user, and can request a restricted set of recipients by specifying an output set to its peers. Membership in each set is dynamic and is user-specified for each stream, with the exception of the control channels, which always have all participants in both the input and output sets. As such, flexible information flow policy can be easily supported. Intra-stream subconferencing can be implemented simply by changing membership of the input and output sets. B. Related Work There exists a large body of work addressing control issues in multiparty multimedia conferencing [2], [7], [ l l l , 1121, [171, [19], [25], [29]. Except for [2] and [25], all utilize a distributed control approach. Of the remaining, the Touring machine, the Connection Control Protocol (CCP), the Distributed Collaborative Environment (DEE), and Sticky concentrate on conference setup. Of these, only Sticky explicitly describes protocol behavior in cases of failures. None of the work adequately addresses correctness and formal properties of the resulting protocol. For work addressing these issues, one needs to turn to existing literature in distributed systems.

1406

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 14, NO. I, SEPTEMBER 1996

With the services of distributed conference setup provided by MCMP as described in the previous section, the attainment of a consistent view aspects of the problem combine some elements of three related problems in distributed systems: distributed consensus, group membership, and stopology maintenance. Distributed consensus (DC) is concerned with the sequence of message exchanges required to get each nonfaulty node in a group of nodes to arrive at a single value (either zero or one) in the presence of some node failures. Algorithms for DC cannot accommodate link failures since then each node may not have access to all other nodes, hence be unable to gather all other nodes’ inputs. Consensus in DC is achievable only in some systems, even with reliable communication links [lo], [ W , WI: i) synchronous processors with synchronous communication; ii) synchronous processors with synchronous message order; iii) synchronous message order with broadcast transmission. Within the boundaries of the above possibility conditions, Byzantine failures can be overcome only in synchronous systems that meet certain connectivity or fault conditions. In all cases, a number of rounds of message exchanges among all nodes are needed to overcome the failures. The group membership problem (GMP) is an application of DC [1], [8], and [28]. Instead of a single value, a single view of the current group membership is to be maintained across changes, which may occur due to network partitions, processor failures, or new processors joining the group. As above, since only a single view is allowed, link failures that do not lead to distinct network partitions cannot be accommodated. Again, this serves as a distinction between this class of problems and the conference setup problem. The last class of related problems is topology maintenance (TM), which is concerned with maintaining a consistent view of the topology across all changes [21], [221, [261, [301, [321. Compared to GMP, the view is here expanded to include both the nodes and the links between the nodes. As such, there is no problem with coming to a consensus in cases with link failures. A node fail-stop failure is modeled as a failure of all the connecting links. The topology update messages are either sent out in waves, e.g., [22], or flooded over the topology, e.g., [26]. Usually, upon receipt of a message, a node forwards the new information to a selected set of nodes (most often its neighbors) if the message received results in a local view update. The updates propagate throughout the network in this manner until no further changes are detected. As can be seen, the problem of setting up a conference for MCMP combines elements of all three problems discussed above. It is similar to DC and GMP in that each node in a group needs to arrive at a consistent group consensus in a distributed manner. It differs from them in that the group consensus is a view of the group that must consist of completely interconnected nodes; furthermore, in some cases of link failures, there may be multiple groups and

nodes in different groups would have different views. In this sense, conference setup is more similar to TM in that enough information needs to be exchanged to determine the existing topology. But conference setup requires an extra step in that once the existing topology is established, all possible subsets of nodes that are completely interconnected need to be identified. At termination of the setup, nodes in the same conference must have consistent local views on the conference membership. With regards to the relevance of the approaches and solutions summarized here, the following results are deemed applicable. From DC, the achievable results obtained can be used to refine the operating environment for MCMP. From GMP, it is learned that handling simultaneous changes as a single event and using a single node to coordinate events can reduce the message complexity of an algorithm. From TM, the mechanism of passing on information until there are no more changes is demonstrated as an effective way for propagating changes in the topology. The rest of this paper is organized as follows. Section I1 defines the problem of multiparty conference setup and examines desirable behavior required of any proposed solution. The MCMP conference setup protocol is detailed in Section 111, followed by its proofs of correctness in Section IV. Section V describes a specification and verification of the protocol, and Section VI presents two examples of application control infomation that can be transported over the control channels set up by MCMP. The notation used in the paper is as follows. Each member in the conference is referred to as U,, where 2 E { 1, . . . , N } , N is the total number of participants in the final conference, which may be less than the number of participants M in the original proposed conference membership list. The user at v, is U,; the MCMP entity at U, is MCMP,. The originating node is UO, with its corresponding U 0 and MCMPo. U, is also used as an abbreviated notation to refer the node in general. The connectivity topology realizable in the conference is modeled as a directed graph with conference members as nodes, and possible connections between members as directed edges. A directed edge exists from node u3 to node U, only if U, can receive messages from vJ.If both v, and v3 can receive each other’s messages, then ut is said to be completely connected to vj, and a bidirectional edge exists connecting v, to u3. If more than two nodes are involved and all pairs of nodes are completely connected, then the set of nodes is completely interconnected. 11. PROBLEM DEFINITION

A. Rejining Assumptions for the Operating Environment Using the achievable results for DC, the assumptions made in Section I-A are further refined as below: 1) synchronous processors, i.e., processor delay can be bounded; 2) synchronous communication, i.e., message delay can be bounded; 3 ) asynchronous message order, i.e., the channel can deliver messages out of order;

NGUYEN AND SCHWARTZ:MCMP A TRANSPORTISESSION LEVEL DISTRIBUTED PROTOCOL FOR DESKTOP CONFERENCE SETUP

a ) node failure:

1407

b) network partitioning:

physical level connections: (

a switch)

Fig. 1. Different failure conditions and desirable behavior of a conference setup algorithm.

4) unreliable communication links, i.e., messages can be lost and delivered in duplicates; 5 ) point-to-point communication links, although it is assumed that support for multicast addressing and routing exist. The multicast service is assumed unreliable. Although the above assumptions are made to ensure that the resulting operating environment falls within the theoretical boundaries of solvable conditions, i.e., a consensus can be achieved, they are also chosen so as to reflect a realistic networking environment. Therefore, the most general but realistic set of network services possible have been assumed to enable MCMP to function in as many situations as possible. With respect to failure modes, it is assumed that both links and nodes may fail, in particular, links may fail in only one direction. However, only fail-stop node failures are considered, eliminating situations where different nodes can receive different messages from the same sender. One implication of this is that errored packets received are discarded before MCMP processing begins. In addition, eventual network quiescence is assumed, i.e., there may be many failures occurring simultaneously, but eventually, the failure conditions stabilize and no additional failures occur for a given time interval, allowing messages to be exchanged. The initiator of the conference setup is given some priorities over other nodes; in particular, nodes that cannot establish complete connectivity to the initiator cannot be included in the final conference.

Note that since MCMP operates at the sessiordtransport level, a physical link failure may not lead to a transport level connection failure if the network can reroute the connection over other links available (see assumption in Section I-A). As such, if a MCMP connection fails, it is most likely due to an end system failure or an intermediate switch failure, which would lead to the network partitioning. The former case is shown in Fig. l(a), where a conference is setup with the remaining nodes (A, C, 0).In the latter case, when a switch fails, then all connections going through it to and from a particular set of nodes also fail, leading to a partitioning of the network, as shown in Fig. l(b). In this case, a conference can only be set up with (A, C). Note that although ( B , D ) is another completely connected subset, it is not identified because neither B nor D is completely connected to the initiator A. Although Fig. l(a) and (b) are the most likely failures, Fig. l(c) can also happen, possibly due to lack of resources on the path between C and D. Even though these situations are possible, they cannot be common since it is rare indeed that information can be routed at the sessiodtransport levels from C to D via A and B,but that a route cannot be established between C and D directly. In these cases, as shown in Fig. l(c), the conference setup algorithm must identify all completely interconnected subsets of nodes that are also completely connected to the initiator, i.e., (A, B,C) and (A, B,0). Note that in Fig. l(c), it is also possible that the connection failure is one-way only. In reality, existence of such configuraB. Desirable Behavior tions may depend on the multicast routing service provided by This section prescribes the desirable behavior of a con- the network layer. If multicast is supported by nodes joining ference setup algorithm in different failure scenarios. In all into a group address, such as with the Internet Group Multicast cases, what is strived for is a best effort approach, i.e., Protocol (IGMP), then links between two nodes are more likely if the requested conference cannot be set up due to lack to use the same physical path [9]. If multicast is supported by of connectivity, then all the remaining eligible nodes and building multicast trees rooted at the sender, such as with the completely interconnected configurations are identified for the Internet Experimental Stream Protocol I1 (ST-IQ then each application to make a decision. In this manner, the application node is the root of a multicast tree and a leaf of N - 1 other is provided with complete information about the connectivity trees, where N is the number of nodes in the graph [33]. It is possible instead of having the MCMP entities decide on a then possible that for a given node, links to and from the node single view automatically. This is in contrast to the voting use different physical paths since those links are part of two approach of DC and GMP algorithms, where only a single different trees. As such, it is possible that only one direction of the connection has failed. view is allowed, arrived at by a majority voting procedure. From the above discussion then, although configurations (a) Different failure conditions and the associated desirable behavior are shown in Fig. 1, where the user at node A and (b) of Fig. 1 are more common, the algorithm must also has requested a conference setup with nodes ( A , B , C, 0). accommodate configuration (c). When a connection failure

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 14, NO. 7, SEPTEMBER 1996

1408

3

A -

B

resultant conferences with A as sole initiator

1

resultant conferences withAandBas simultaneous initiators

C

*’

1

A-B

D

C Fig. 2. Simultaneous conference setup requests with incomplete connectivity.

does not lead to a network partition, it is referred to in discussion as an isolated link or connection failure. When more than one node initiates the setup for the same conference, it may be a case of simultaneous setups. Multiple conference setup requests received during the conference setup process are considered simultaneous only if their respective proposed conference membership lists are identical. Otherwise, two different conferences are formed. In the case of simultaneous requests, only a single conference is formed if the possible connectivity topologies resulting from the requests are identical. If the connection topologies realizable are different, then more than one conference may be setup, depending on the subsets of nodes that are completely interconnected. The resulting conference(s) must accommodate all possible connection topologies generated by all the requests, with the constraint that the same group of nodes cannot be committed twice. For example, refer to Fig. 2, where A and B are simultaneously requesting a conference setup with (A, B, C , 0). Separately, a conference request from A would have generated (A, B) and (A, C ) ;the same request from B would have generated (A, B) and ( B , 0).If the requests are made simultaneously, then the following subsets must be found: (A, B), ( A , C ) ,and ( B ,0); and (A,B) is committed exactly once. The behavior of the conference setup protocol for simultaneous setups is consistent with the best effort approach in the case of connectiodnode failures discussed earlier. C. Correctness Conditions This section further refines the behavior of any setup algorithms with correctness conditions that must be satisfied once the conference is set up. Algorithms that meet all conditions are considered correct. Connectivity: nodes are in the same conference only if they are completely interconnected. Furthermore, all sets of nodes completely interconnected to the initiator(s) of the setup request(s) must be identified and committed. Validity: the agreed upon configuration is a result of some valid input as determined by the conference initiator, i.e., the final conference membership is a subset of the initial proposed conference membership. Uniqueness: the same group of nodes is committed exactly once. Consistency: all users in the same conference have identical views of the conference configuration, i.e., local-view, = local-view,, i # j , for all U, and v3 in the same conference.

5 ) Termination: the conference setup algorithm terminates in finite time. 111. MCMP CONFERENCE SETUPPROTOCOL

Two parameters of interest for a conference setup protocol are its message complexity (C) and the time latency ( L )from receipt of the setup request by the local MCMP entity to completion of the setup process. The minimum latency Lmin required to complete a conference setup is max, {RTT,}, where RTT, is the round trip propagation delay from the conference initiator WO to a node U%. L,,, can only be achieved when the logical conference topology is a completely interconnected mesh. However, this implies a total message complexity C of O ( N 2 ) .Approaches which decrease C have focused on using special topologies such as a tree or a ring, which would result in O(log, N ) or O(1) message complexity [3],[14], [22];another alternative is to have special nodes within the topology that can act on behalf of the remaining nodes [6].The trade-off in using special topologies is an increase in latency L and additional delay to set them up. With both alternatives, additional management protocols are required to maintain and configure the logical network topology. As a first step, the decision was made to minimize L at some cost to C since this would decouple the protocol services issue from the topology issue. Thus, the conference topology used at the sessiodtransport levels is a completely interconnected mesh. A.

The following services are provided by MCMP. 1) Given a proposed list of M users from U,, MCMPo establishes a completely interconnected mesh among all users that are completely connected to the initiator and to each other. The request is treated as an atomic request. 2) The setup is unordered. 3 ) The conference is only set up with participants that have enough protocol resources, e.g., buffer and memory requirements, to accommodate all participants on the user list. Note that this is different from the application level resources such as audio or video capabilities. 4) Application level resource negotiation at conference setup time is facilitated by MCMP entities exchanging resource information with conference setup messages. This information is then passed on to the application. Once the control channels are set up, then MCMP allows

NGUYEN AND SCHWARTZ MCMP A TRANSPORTISESSION LEVEL DISTRIBUTED PROTOCOL FOR DESKTOP CONFERENCE SETUP

these negotiations to be carried out over the control channels. 5 ) At termination of the conference setup, all correctness conditions are satisfied. B. Algorithm Overview Since the conference setup request is atomic, MCMP treats all messages associated with the request as a single transaction with the same transaction ID composed of the tuple trans-id = (node-address, user-id, sequencenu)

where node-address is the network address of the local end system, user-id is the ID of the local user, and sequence-nu is a locally generated number which is incremented by one for every new transaction. It is assumed that a node-address is globally unique, a user-id is unique within the local system, and that both can be ordered in a monotonic manner. Within the conference, each user is uniquely identified by a member-id composed of the tuple member-id

=

(nodeaddress, user-id).

The relationship max ( ) of multiple member-id’s has the following result: Definition: Let m, denote the member-id of node v, in a given conference with N members, i E { 1, . . . , N } . Then m, = max ( m l , . . . , m N ) when m, satisfies exactly one of the following properties: a) w, has the highest numbered address, i.e., nodeaddress, > nodeaddress, V j E (1, ..., N } , U,

#

1409

two-phase message exchange in MCMP for conference setup is as follows. 1) The initiator notifies all nodes involved of the conference setup request and determines whether potential participants have enough resources and can establish the full connectivity required to support a conference call with M members. 2 ) Among participants remaining after Phase 1, the initiator commits all conference topologies that consist of completely interconnected members that include the initiator. Phase 1 resolves the possibility of membership inconsistencies by determining the conference topologies realizable among conference members. For example, A wants to set up a conference for (A,B , C). However, only connections (A,B ) and (A,C) succeed; ( B ,C) cannot be set up. An inconsistency results if A thinks that the conference is set up, but B and C disagree. Phase 1 is used to resolve this problem. Once the appropriate topology can be determined, Phase 2 commits the members involved to the conference being setup. C. The Basic Conference Setup Algorithm

The basic algorithm for conference setup is shown pictorially in Fig. 3. Phase 1 consists of Steps 1, 2, and 3 ; Phase 2 consists of Step 4. The basic two-phase commit, such as that used in [28] and [29], normally consists of just Steps 1, 3 , and 4. Step 2 has been added here to enable each potential participant in the conference to determine its reachability with respect to other potential participants in the conference; this is necessary for the establishment of the completely interconnected mesh.

U,;

b) if there exists a group of nodes V which has the same Phase 1: node-address, then v, has the highest ordered user-id, MCMPo initiates the setup by notifying all MCMP,, i E i.e., user-id, > user-id, Vu, E V , U, # v,. { 1, . , M } , of the request from U 0 for a conference The definition is valid for any number of member-id’s. In with M participants. The connect message contains 0 all cases, m, is unique, i.e., exactly one m, exists. a conf-id which uniquely identifies the conference, and a ulist which lists the member-id’s of all invited parThis definition is used when it is necessary for a group of ticipants. nodes to select a unique node among them to act as a “leader” Each MCMP, in the ulist acknowledges receipt of the and perform a specified task. connect message by multicasting an ack, to the The MCMP conference setup algorithm uses the initiator conference address. These acknowledgments serve two of the conference setup as the special node responsible for purposes: determining and committing the resulting conference topology. a) receipt of ackz at MCMPo indicates that In the case of multiple initiators, an algorithm is used to select MCMP, is completely connected to MCMPo, a unique node among the simultaneous initiators to be this i.e., M C M P o cs MCMP,, and special node; this is discussed later in Section 111-D. It was b) receipt of ack, at MCMP, indicates that chosen to use a special node throughout the setup process for MCMP, can be reached from MCMP,, i.e., two reasons: lower the message complexity of the resulting MCMP, c MCMP,. (Note that this does not algorithm, and with just one node responsible for the setup, imply M C M P , -+ MCMP,.) it is simpler to guarantee uniqueness and consistency for the resulting conference topology. This will be clarified below. These ack messages are used by each node to build up As connectivity among the potential conference members its local view, i.e., each MCMPi keeps a reachability is unknown at startup, the conference setup uses a twovector (reach-vectori), with an entry for each of the phase protocol. From the discussion on achievable results nodes in ulist as below. When an ack is received from for distributed consensus, it is known that multiple rounds node vj, the associated entry reach-vectori[vj]is marked of information exchanges are necessary to determine faulty with an 1; reach-vectori[i] is always 1. When a nak nodes before a common view can be established. The basic +

-

TEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 14, NO. 7, SEPTEMBER 1996

1410

_____

(@ ...

L

.................

_ _ I . -

.. ................................................

...

1i

”4.j

\

”\

/’”’--”--.

/’

...

I qk

”’

G 6 iJ (

’L

,. member-idB, A sends out a commitA for the same conference, as the commitB is on its way. The same group of nodes ( A ,B, C) is thus committed twice, forming two separate conferences.

1411

Observation c) thus requires a synchronization of all initiators so that they are aware of each other’s request. This can be done as follows. If an initiator I, receives a simultaneous request from one or more initiators I, after I, has sent out its own connect, then I, includes the following information in all its subsequent ack’s and reply’s to any I,: (con.-id,, member-id,, trans-id,), where trans-id, is the transaction ID of the connect from I,. Since any initiator needs a r e p l y from all the conference members to calculate possible topologies, any initiator would have become aware of all existing requests, and can determine if it is responsible for committing a topology. Therefore, at the time of commitment, it is clear that there are simultaneous requests, and the identities of the simultaneous initiators are also known. This is then the final modified algorithm SU-4 for handling simultaneous conference setups. Fig. 5 shows the message exchange with SU-4 for the example in Fig. 2. Two new responses are introduced with SU-4, a s im-connec t in place of the ack, and a sim-reply in place of the r e p l y , generated by the simultaneous initiators A and B with the additional information as discussed above to inform each other of their respective requests. Generalized FSM’s for the final algorithm are included in the Appendix. E. Calculating the Topology

This section describes algorithms’for the initiator to use in calculating the committed topology and determining if it is the initiator for any of the topologies discovered. The topology shown in Fig. 6 is used as an example throughout the discussion. In the following section, G = {VI, . . . , V M } refers to the set of potential conference participants. In addition, the subscript “0” is used to refer to the current local node, e.g., reach-vector, refers to the reach-vector of the local node currently under discussion. The topology calculation starts with the assumption that all U, E G can be included in the final conference, and shrinks the list as nodes are identified that cannot establish complete interconnection with other nodes. This is done by examining each reach-vector in the topology-table for unreachable nodes. If there are no unreachable nodes, then the resultant conference consists of all nodes in G. Otherwise, a base list of nodes, base-con. is established which consists of nodes that are completely interconnected to all other nodes in G, i.e., base-conf = {w,Ireach~ector,Ij] = 1 V j } . Applying this condition to the original topology-table of Fig. 6 yields base-con. = { A , B } even though B is not completely interconnected. This is due to the unidirectional failure between B and D . Therefore, in identifying base-con$ it is necessary to transform the topology-table to ensure that all reach-vectors are symmetric with respect to failures, i.e., (reach-vector,[j] = 0) + (reach-vector,[i]= 0). This transformed topology-table is shown at the right of Fig. 6, where the affected entries have been italicized. The resulting base-con. in this case only consists of node A. All subsequent topology calculations use the transformed table. To search for completely interconnected subsets of G, we start with all the remaining nodes which are unreachable

1412

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 14, NO. I, SEFFEMBER 1996

A sim-comecg (conf_idA,A)

A-B

X!

hf7J

B sim connectA (COG-idB,B)

Ct---,D

all possible connections

.

!C.-....................

DJ

possible conferences committed

0 Fig. 5. Message exchange with simultaneous conference setup requests.

-

G {A,B,C,D,E,F}

original topoZogy_tabZe:

::

i: 9

9

P

3' 2!

%

transform

pm

5'

F

i

i

i

o

i

i

Resultant conferences: (A,B,C,F) (A,B ,E,F) (A,D,E) Fig. 6. An example topology.

from at least one other node, e.g., B , C , D , E , and F in Fig. 6. Selecting a particular node w, with which to start a search round, e.g., B , its reach-vector is examined to identify unreachable nodes; the reach-vectors of such nodes are then omitted from further consideration in the current round of calculation, e.g., D. We keep on identifying and deleting unreachable nodes until all the remaining reach-vectors have been examined for the current search. The set of nodes remaining in the table forms a subset which is completely interconnected. This process is repeated successively until all nodes have been selected as the starting point for a search round. Note after the first time that a subset S of G is found to be completely interconnected, there are another I S 1 - 1 redundant search rounds that will result in the same

subset. Refer to [24] for a complete topology calculation algorithm that also incorporates an algorithm for identifying such redundant searches to cut down on the number of search rounds. When a calculated topology contains more than one initiator, then it is necessary to determine whether the local node is the appropriate initiator for any of the conference(s) calculated. The only information that the local initiator Io has about another initiator I, is reach-vectorI%.From this, Io must determine whether it is committing a conference that I, will also discover and commit. The problem can be stated as follows. Referring to Fig. 7(a), So is a set of completely interconnected nodes calculated by 1, that includes both Io and one or more initiators I,; R, is the set of nodes from

NGUYEN AND SCHWARTZ MCMP A TRANSPORTISESSION LEVEL DISTRIBUTED PROTOCOL FOR DESKTOP CONFERENCE SETUP

(a)

(b)

1413

(c)

Fig. 7. Different scenarios in committing conference topologies with simultaneous conference requests. (a) So and R, . (b)

So C R,. (c) So 3 R,

We examine each situation and show that it leads to a which I, is reachable, i.e., R, = {v,Ireach-vector~~[j] = l}. The issue is then to identify circumstances under which I, contradiction. In each case, there are four possibilities: neither would also obtain So in its topology calculation. There are v, nor v, is an initiator, v, is an initiator, v k is an initiator, and both are initiators. four possible cases as follows. a) No connection exists between U; and U,: i) So c R,: Referring to Fig. 7(b), 3v, E Rzlv,$ So. Any single conference calculated by I, cannot contain Neither v, nor v, is an initiator-given the topoli) all nodes in R,, but one of the conferences that can be ogy, the respective reach-vectors for both nodes obtained is So. Therefore, So should be committed by in the topoZogy-table at the conference initiator(s) max (I,,, I,);this can be readily determined by I,,. are as follows: ii) So = R,: The set of nodes that are completely connected U1 U ', vlc to I, are also completely interconnected, thus I, would also calculate a conference with all nodes in So. As reach-vector, above, S,, should be committed by max (I,,, I,); this can be readily determined by I,. reach-vector, iii) So 3 R,: 3v, E R,(v,$ So, with connectivity as shown in Fig. 7(c). By the definition of So above, all During the topology calculation, since all nodes in So are completely interconnected. Since I,, I,, reach-vectors # 1, it is necessary to start rounds and U, are all nodes in So, the connectivity as shown is to search for subsets of completely interconnected not possible. Therefore, the case So 3 R, cannot arise. nodes. If the search starts with reach-vector,, iv) (So g' R,) A (R, g' So):By the same reasoning as in U, is removed from further consideration since iii) above, this situation cannot arise. reach-vector,[j] = 0. Similarly, if the search Therefore, the only cases where Io needs to determine if starts with reach-vector,, U, is removed from it is the appropriate initiator for a calculated topology So is further consideration since reach-vector, [i] = 0. either (So C R,) or (So = R,). Io then multicasts a commit if Therefore, w, and v, cannot be in the same it is max(I,,, I,); otherwise, it waits for a commit to arrive conference. for that set of nodes. U, is an initiator-since there is no connection ii) between U, and U,, eventually w, detects that vJ is unreachable, and marks reach-vector, as all 0's. This precludes v, from including v, in any Iv. PROOFS OF CORRECTNESS conference. This section establishes the correctness conditions discussed iii) U, is an initiator-this is similar to ii) above, with in Section 11-C for the final MCMP conference setup algorithm v, and v, exchanging roles. su-4. iv) Both are initiators: this is the combination of ii) and iii), with the same result. --r. In all cases, if a connection cannot be established A. Connectivity between U, and v, , then w, and v, cannot be in the Lemma 1 (Connectivity Condition): Nodes end up in the same conference, contradicting the assumption. same conference only if they are completely interconnected. b) v ~ jis reachable from U;, but not vice versa (U, + U,): Proof: We show this by contradiction. Assume there Neither v, nor U, is an initiator-the untransi) exists a node vi which is not completely connected to just one formed and transformed reach-vectors for both other node w, in the same conference. There are three possible nodes in the topology-table at the initiator(s) are scenarios as below, where vh represents the remaining nodes as follows, respectively: After the reach-vectors in the conference (if any):

vk

vk

vk

are transformed as discussed in Section 111-E,

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 14, NO I, SEPTEMBER 1996

1414

ii)

this is the same case as in i) above, with the same result that v, and v, cannot be in the same Conference. U, is an initiator-since there is no connection from v, to U,, U, eventually detects that v, is unreachable, marks reachvector, as all O’s, thus precluding v, from any conferences calculated by

v, is an initiator-since U, did not receive the conference setup request from U‘ , v, cannot return a reply. Thus, even though the connectivity exists from v, to vg, since there is no information forthcoming from v,, v, eventually concludes that v, is unreachable. As in ii) above, this precludes vugfrom including v, in any conference topology calculated by v,. iv) Both are initiators-this is a combination of ii) and iii), with the same result. In all cases, vz and v, cannot be in the same conference. c) U, is reachable from U,, but not vice versa (w, +- vug): This is the same as b), with U, and U, exchanging roles. The result is thus the same, i.e., U , and v, cannot be in the same conference. Since all cases lead to a contradiction in the assumption that U, and v, are in the same conference, no such nodes U, and U, can exist. Therefore, nodes are in the same conference only if they are completely interconnected to each other. iii)

-

***

Corollary 1.1: The algorithm for topology calculation produces all sets of nodes that are completely interconnected to each other and to the initiator in the realizable connection topology. Proofi From Lemma 1, nodes end up in the same conference only if they are completely interconnected. Each search round of the algorithm thus produces a set of completely interconnected nodes. Except for the redundant searches, each node has its turn in starting a new search, resulting in a new set of nodes. Therefore, at the end of the algorithm, when each node has had its turn, all completely interconnected subsets are identified.

***

B. Validity

Lemma 2 (Validity Condition): The agreed upon configuration is the result of valid input as determined by the conference initiator(s), i.e., the final conference membership is a subset of the initial proposed conference membership from Uo. Proof Denote the final conference membership as Mf and the original proposed membership as M,. Assume that Mf @ M,. Then 3v,l(uz E M f ) A ( U % $ M,). To be included as part of M f , one of the two situations below must have occurred: a) v, would have to send a r e p l y message to the initiator vo as well as be included as part of the reach-vector in v3’s r e p l y message for all j E {I, . . . , M } . Therefore,

during Phase 1, v, must have sent ack messages to all potential participants. However, since M, is part of the initial connect message from the initiator in setting up the conference, v3 would recognize that v, is not an invited member of the conference and would ignore all messages from 21., Therefore, v, cannot be included in the reach-vector in the r e p l y sent by v,, and thus cannot be a potential candidate in M f . b) vo only received a r e p l y from v,, and did not receive a r e p l y from any v,, j E (1, M } . However, as v, is not a member of M,, VO also ignores all messages from U,. Therefore, v, cannot be considered as a potential candidate in Mf by VO. Both of the above situations lead to contradictions to the assumption that v, is a member of Mf but not a member of M,. Therefore, v, cannot exist, which implies that Mf C M,. s . ’ ,

***

C. Uniqueness Lemma 3 (Uniqueness Condition): Each set of completely interconnected nodes in the proposed conference membership is committed exactly once. Proof If there is only one initiator, then it is clear that any conference topology is committed only once. Thus, the case of interest is that involving simultaneous initiators and showing that more than one initiator cannot commit the same conference topology. Assume that both I, and I, comnlit the same topology S.Let R, and R, denote the set of reachable nodes in the reach-vectors of I, and I,, respectively. From Lemma 1 (Connectivity) and Lemma 2 (Validity), note that any conference topology S calculated must be a subset of both R, and Rug. From the algorithm to determine the appropriate initiator in cases of simultaneous setups discussed in Section 111-E, the initiator of any S that satisfies the condition (S C_ R,) or (S C R,) is max (member-il%, member-id13), which by definition gives a unique answer. Therefore, it is not possible for both I, and I, to decide that each is the initiator for a conference topology S .

***

D. Consistency

Lemma 4 (Consistency Condition): All nodes in the same conference have consistent views of the resulting conference configuration, i.e., local-view, = local-view, for each node U,, v, in the same conference. The final committed conference configuration is that included in the commit from the ultimate initiator of the conference. Proof We show this by contradiction. Assume that at the end of Phase 2, v,, v, are in the same conference, but local-viewvt # local-viewV3for the same conference. This is possible in the following cases: a) U, and U, receive different comrni t messages from the same initiator: the commit is multicast to all nodes in the original proposed conference membership, i.e., all nodes that receive the conference setup request also re-

NGUYEN AND SCHWARTZ MCMP: A TRANSPORTISESSION LEVEL DISTRIBUTED PROTOCOL FOR DESKTOP CONFERENCE SETUP

1415

ceive the commit if they are completely interconnected A connection exists from vo to all nodes V I , , and a to the initiator. Making use of the assumption that there connection exists from all nodes v k to v;:as soon as v k are no Byzantine failures and that all errored messages receives the setup request from V O , V I , sends an ack are dlscarded before algorithm processing begins, it is to all nodes in the proposed membership list, which not possible for v, and vJ to receive different commit includes U,. With the synchronous assumptions made on messages from the same initiator if v, and vJ are in the the operating environment, it is guaranteed that the ack same conference. arrives at U, in finite time. Whenever v, has received v, receives a commit and v J did not receive a commit: ack's from all other nodes, its local reach-vector is if v3 did not receive a commit,it must not be comcomplete and v, can send off a r e p l y to V O , which will pletely connected to the initiator. From the Connectivity arrive at vo in finite time, again using the synchronous Condition, this means that v, and vJ cannot be in the assumptions. same conference, contradicting the assumption above. No connection exists from vo to a node VI,, but a v, and vJ receive different commit messages from connection exists from V I , to v,:since V I , does not receive difSerent initiators: from the Uniqueness Condition, each the setup request from vo,v k does not send out any ack. set of completely interconnected nodes is committed Invoking the synchronous assumptions, v, concludes that exactly once. With simultaneous initiators, each initiav k is unreachable in finite time. Although this conclusion tor sends a commit message at the end of Phase 2. may not be true, since vk cannot be included in any Therefore, exactly one of these messages contain the conferences set up by vo,the conclusion is consistent committed conference; the rest are null messages sent with the desired result. In any case, v, uses the result to for the purpose of completing the setup process. As such, complete its local reach-vector, and send o f f a r e p l y although v, and vJ can receive different commit mesto V O , which will arrive in finite time. sages from different initiators for the same conference, No connection exists from V I , to 'U,: although the conthe messages are consistent in content, thus violating nection topology here is different than in b) above, the the assumption that local-viewVt # local-view,J for the resulting effect is the same, i.e., v, eventually concludes that v k is unreachable, marks the local reach-vector, as same conference. such, and sends a r e p l y off to vo. Since all cases lead to a contradiction, it is not possible for v; and v j to be in the same conference and yet have different Therefore, vo always receives a r e p l y from v, if vo and local views at the end of Phase 2. Therefore, all nodes must U, are completely connected. This means that in all cases, agree on the final configuration at the end of Phase 2. whether v, is completely connected to vo or note, vo will * * * either eventually receive a r e p l y from v, or can decide that U, is unreachable in finite time. Phase 1 thus always terminates in finite time. Phase 2 terminates when all participants receive a commit E. Termination from V O . Assume that a node v, does not receive a commit Lemma 5 (Termination Condition): The conference setup message. This can happen in two situations: algorithm terminates in finite time. a) vo never sent the commit message: vo sends a comProof: The conference setup protocol consists of two mi t after it has completed its topology-table, which phases. It is shown that each phase completes in finite time. from above, happens in finite time. Therefore, vo always Phase 1 is dedicated to discovering the degree of connectivsends out a commit message if it is up. If vo has failed, ity possible. It terminates when the initiator vo has completed v, can detect this in finite time, and aborts the conference its topology-table, i.e., for every node v, in the proposed setup. conference membership, either a reach-vector, is received or b) The commit is lost due to connection failure from vo to vo has determined that v, is unreachable. In proving the v,: the resulting effect here is as above, i.e., v, detects Connectivity Condition above, it was shown that if v, is not this in finite time, and aborts the conference setup. completely connected to V O , W O can detect that v, is unreachIn both cases, v, eventually either receives a commit able in finite time by invoking the synchronous assumptions. Therefore, it remains to be shown that when v, is completely and the conference is established, or v, decides to abort the connected to V O , a reach-vector is always received. There are conference setup. Therefore, Phase 2 always terminates. :. Since both Phases 1 and 2 terminate in finite time, the three possible cases as shown below, where V I , represents the remaining nodes in the conference (if any)-connections that algorithm terminates in finite time. *** are not shown are don't care connections, i.e., their existence or lack thereof makes no difference in the proof Theorem I : The conference setup algorithm is correct.

Vk

Pruofi From Section 11-C, it is stated that any algorithm that can satisfy all the correctness conditions is correct. Therefore, the theorem follows directly from Lemma 1 (Connectivity), Lemma 2 (Validity), Lemma 3 (Uniqueness), Lemma 4 (Consistency), and Lemma 5 (Termination).

***

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 14, NO. I, SEPTEMBER 1996

1416

v.

SPECIFICATION AND VERIFICATION CONFERENCE SETUP PROTOCOL

OF THE

This section describes the design, implementation, and verification of the conference setup protocol. The main purpose of the verification is to show that the message exchange achieves the results desired and that the protocol executes free of livelocks and deadlocks. It focuses on establishing correctness of the sequence of messages exchanged, showing that the initiator will always be able to gather information from a given node or detect its failure, under different channel conditions. The approach used is based on a functional decomposition of the protocol services into simpler interacting components, which may then be easily designed and verified. The protocol design is thus reduced to the specification of a collection of relatively simple finite state machines. These FSM’s are then executed in parallel for the protocol verification. Due to their simplicity, these modules may also be readily implemented in hardware, resulting in a straightforward translation of the protocol specification into a protocol implementation for either a mixed hardware-software environment or a multiprocessor environment. The FSM model used involves a collection of extended FSM’s communicating synchronously via inputloutput (YO) messages. Once specified, the FSM’s are verified using probabilistic random verification [20]. A software utility composes the states of all interacting FSM’s involved to form the set of possible states. The resulting state space is then explored randomly, until either a deadlock is encountered, a predetermined number of transitions is reached, or the program is stopped. The state exploration proceeds as follows. At every step, the set of all possible transitions is determined; a transition is then selected at random from this set. The effect obtained is that of a loosely coupled parallel processing environment, where each FSM is a processor executing independently at its own unpredictable pace; the processors can only be synchronized by the messages exchanged. Depending on the complexity of the FSM’s involved, the state space can be quite large, rendering complete state exploration impossible and keeping records of the states visited impractical. As such, unless a deadlock is reached, the following are used as indicators of the state space coverage: i) states visited in each FSM; ii) U 0 messages transmitted and received by each FSM; iii) percentage of rendezvous points (rpoznt) encountered; an rpoint is a composed inputloutput pair Metric i) indicates that the random exploration has visited a state m in some FSM A at least once; but not that all states in the state space involving state m in FSM A have been traversed. Similarly, ii) gives an indication that an U 0 message msg has been exchanged, but not necessarily that all compositions of U 0 pairs involving msg have been explored. Metric iii) is intended to give an indication of this coverage.

j

channellk channelh

Fig. 8. FSM model for each node in the conference.

1) user,: The application entity requesting service of the local MCMP entity to set up a conference.

2) ccm, (conference control and management at node

U,):

Interfaces to the user and implements the conference setup algorithm. The services of ccmz include controlling and maintaining the state during the setup process, and determining when to proceed to the next phase of the setup. This module is the “session” part of MCMP. 3) mcast, (multicast entity at node U,): Reliably multicasts messages from node U, to all other nodes in the conference, and receives messages from all other nodes. mcast, interfaces with the multicast service assumed from the network to provide a reliable multicast transport level service. This is the “transport” aspect of MCMP. 4) channel,,: Characterizes the channel used to exchange messages between node U, and node U,. The channel can be reliable or unreliable; if unreliable, messages can be lost, reordered, or duplicated. By separating out the reliable multicast data delivery service, different reliable transport level multicast protocols can be experimented with or substituted for. Since the service specification for the module is clearly defined, any protocol entities that provide the same services may be used. The final configuration used to model each node in the conference is as shown in Fig. 8; the modules shown in solid lines form the local MCMP entity at each node. user, and ccm, are as described above. Each mcast, has been implemented as ( N - 1) modules of ucastz3,each of which is responsible for the reliable exchange of messages between the local node U, and another node w, in the conference. The bidirectional channelt3 is modeled as two separate unidirectional channels, channel,, and channel3,, to allow for a more flexible characterization of the data transfer in each direction, as well as for simultaneous transmit and receive. The specification of chan,, allows for experimenting with the effect of a limited buffer size or limited network capacity, and the reliability of the channel. lmkx,, and lznkr,, send messages into channel,, and receive messages from channel,%,respectively.

A. SpeciJcation of the Conference Setup Protocol

B. Verification of the FSM’s

A first level functional partitioning of the entities involved in a conference setup results in the following.

In the design process, each FSM is verified individually for the services provided. The FSM’s for each node are

NGUYEN AND SCHWARTZ MCMP: A TRANSPORTISESSION LEVEL DISTRIBUTED PROTOCOL FOR DESKTOP CONFERENCE SETUP

MCMPo COMect (confjd,ulist)

McMpI..M-I

\

MCMPo connect

MCMP,

1417

MCMPj

1 -

ack

~

/)ak

/ from wait to etanack alfother participants commit(u1ist)

timer expires on replyfrom vi ProwrePlY 1

timer e v i r e s on ack from v ; probe( ackj

1 1 replyeusy) /

\

have heardfrom all wrticinants

Fig. 9. Message exchanges in the ideal case.

then interfaced together as shown in Fig. 8, and verified for correct node behavior. After the partial verification is successfully completed, then the originator and the participants are interfaced together, and verified using different channel characteristics. For each verification run, if a deadlock is encountered, then the transitions leading up to the deadlock are traced and the FSM’s involved corrected. Otherwise, the following steps are taken. 1) The number of rpoints covered is monitored until there is no change in the last 5 transitions, where 2 is determined experimentally as a function of the number of participants in the conference. 2) Analyze the unexplored states to detect possible livelocks. 3) Analyze the unexplored rpoints to determine if one of the following is the cause: redundant or unnecessary messages that can be eliminated, the rpoint is impossible to exercise due to the combination of states involved, not enough transitions have been taken, or a livelock. 4) Take random traces of the FSM’s interacting to ensure correct behavior, i.e., the FSM’s are making useful progress and exchanging messages as expected. Verification has been carried out successfully for a conference setup involving two, three, four, and eight participants. The number of transitions involved in all cases is in the order of lo8. Some examples of the message exchanges encountered during the verification are as follows: Ideal environment: In the ideal case, i.e., no message losses, no timeouts, and all participants are reachable, the message exchanges is as shown in Fig. 9. In a two-party conference, this exchange collapses down to a three-way handshake. MCMPl would only send a r e p l y as it does not need to wait for an ack from any other participants. The commit from MCMPo forms the third part of the handshake. No Reply is received from a node U,: There are two possibilities in this case, either a node U, is waiting for an ack from another node, or U, is unreachable. Fig. 10 shows the message exchanges in the first case. After receiving the ack from U,, if uo does not get a r e p l y within a fixed interval, it probes U, with a probe ( r e p l y ) .If U, is still waiting for other acks to arrive, it returns a r e p l y ( b u s y ) ,and then sends a r e p l y when U, completes its local reach-vector. If U, does not respond to a probe ( r e p l y ) after ma-rxmt retransmissions of the probe, vo marks vE as being

Fig. 10. No r e p l y received at vo because v 2 is waiting for an ack from v j .

unreachable. c) Message exchanges over unreliable channels: Fig. 11 shows the beginning of a typical scenario traced during a verification run for a conference setup of three members where the channel can be unreliable. As can be seen, if a request is duplicated, the participant returns another response since there is no way of knowing if the second request is a duplicate or a retransmission sent due to timeout. Messages can also cross in the channel, such as the connect and the probe ( c o n n e c t ) from 212. VI. CONTROL INFORMATION EXCHANGE This section briefly discusses two different types of information that can be exchanged over the control channels established by MCMP: resource negotiation and information flow policies.

A. Resource Negotiation At the applicationlevel, a resource is a capability of the user, or a service that can be provided, e.g., video and voice. At the session level, a resource translates into a type of stream that can be supported, e.g., a particular transport protocol such as RTP [31]. Ideally, the result of a negotiation is a solution that can accommodate all parties involved in terms of the number and types of streams to be used in the conference. Although a protocol entity at this level does not have enough information to achieve this for the more complicated instances, MCMP can act on behalf of the users and set up the streams in some cases if the application can specify the following as part of the init-conf ( ) request: init-conf (

conf-id, ulist,

(stream-id-1, t y p e , s t - u l i s t - 1 , o p t i o n , stream-QOS, group-QOS) ,

(stream-id-K,

type, st-ulist-K,

stream-QOS,

option,

group-QOS) )

where conf - i d and uli s t are as from before; the remaining fields are as follows: stream-id tme s t -ulis t

ID of the stream to be established type of stream or protocol used members in the stream (this is always a subset of the conference uli s t)

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 14, NO. I, SEPTEMBER 1996

1418

MCMP,

MCMPl

MCMP?

connect connect

........................................................ .e

ack

timeout on connect

._________--________-----------------probe(connect)

timeout on ackfrom v2

timeout on replyfrom V I ProWrePW 6 -3

timeout on ack probe(ack) -

w

63 message lost W message

Fig 11

I S duplicated

Start of a message exchange in a typical verification for a conference setup with three participants over unreliable channels

condition under which the stream should be established CI all users must be completely interconnected, otherwise abort, BE best effort s t ream-QOS stream dependent quality of servlce specifications group-QOS group quality of service, e g , synchronization with other streams option

Input/Output Set Membership

le a) Data flow for conference with no subconferencing

Then the streams can be setup, if possible, upon completion of the conference setup procedure; the actual setting up of a stream is dependent upon the protocol used for the type of stream requested. Some efforts are made at automating the common cases with the option field. Note that the quality of service specification should be changeable at any time during the conference, and not just at conference setup time. If none of the above fields are specified, MCMP can still aid in the user level negotiation process by exchanging a list of resources supported locally and passing this information up to the application. B. A Subconferencing Paradigm

Subconferencing refers to the ability of a subset of the conference members to communicate privately, without being overheard by other members of the conference while the conference is in progress. There are different alternatives to implementing subconferencing: i) request the creation of new streams with the desirable participants as members, and ii) modify the flow of information of existing streams so that information is exchanged only among desirable

expected to be in existence for some time, e.g., the duration of

InpuVOutput Set Membership

Users B

I,O

I,O

b) Data flow when subconference formed between A and B Fig. 12. Input/output set membership for subconferencing.

better use of existing resources. Implementing a subconference with the first approach is simply a request to set up a new stream with the specified members. This section discusses the second approach. Once a conference is setup, messages can be exchanged to modify the flow of information in some streams. Each stream that is set up has as its default each member being able to accept input from and send output to all other members of the conference. Selectively changing this flow can create subgroups of users within the stream. To support this, each node in a stream keeps a record of the input and output sets associated with the information flow for each node in the stream as below:

streamid

I, 0

1, 0

NGUYEN AND SCHWARTZ MCMP A TRANSPORTISESSION LEVEL DISTRIBUTED PROTOCOL FOR DESKTOP CONFERENCE SETUP

8

1419

U,?init-conf(conf-id,ulist)

PHASE-1

ulist!ack

ulist!connect (conf-id,ulist)

ulist?ack

(conf-id,commit-conf) (commit-conf)

(a)

(b)

Fig. 13. Generalized FSM for the basic conference setup algorithm SU-1. (a) Conference setup FSM for initiator FSM for initiator U " .

'6

U,?init-conf(conf_id,ulist)

and (b) conference setup

(cod-iQulist)

PHASE-I ulist!connect (conf-idplist)

ulist?ack

WO

ulist!acka

reach:vector reach-vector

reach vector corn Etefor con[idi

update all relevant

U

ulist!commit(commi-conf)

(commi-conf)

a

only send this message if1, is a new simult-init must also make sure that the reach-vector has been completedfor every setup request received, and the corresponding sim-reply has been sent

(b)

(a)

Fig. 14. Generalized conference FSM, including simultaneous setups. (a) Generalized conference setup FSM for conference initiator and (b) generalized conference setup FSM for other nodes.

from the associated node, and 0 denotes that the associated node is a member of the output set of the local node, i.e., the associated node can receive output from the local node. If a node w, desires to communicate privately with a set of nodes P , it simply multicasts a message to all nodes in P to request that the input/output flow be set to ( I , for

a)

each node vj P. Referring to Fig. 12(a), assume that the conference currently consists of (A, B, C , 0). With the input and output sets at each participant as shown for the traffic stream of interest, all participants can send and receive data from all other participants. Now, suppose that A wants to establish a subconference with B while still being able to

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL 14, NO 7, SEPTEMBER 1996

1420

monitor what C and D are discussing. Then A and B would use the control channels to request a change in their respective input and output sets as shown in Fig. 12(b) to achieve the desired conference interaction for this particular stream. A and B would still be able to receive information from C and D ; however, C and D cannot see the information exchanged between A and B . The flow for a stream can be modified at any point during the conference. VII. CONCLUSION This paper addresses control issues in setting up a multiparty conference for desktop conferencing applications. The paper presents and shows correctness for the MCMP conference setup algorithms as a solution to the problem of distributed multiparty conference setup. Given a proposed conference membership list, the algorithms provide best effort services and form one or more conferences with as many potential members as possible, subject to the constraint that all members are completely interconnected to each other and to the conference initiator. The algorithms guarantee consistency of local views, uniqueness and validity of the resulting conference(s), and proper termination conditions. The paper then presents a verification of the protocol specified as a collection of communicating finite state machines. Resource negotiation and subconferencing are presented as examples of control information that can be exchanged using the control channels set up by MCMP. In comparison with existing work on multiparty multimedia conferencing protocols, MCMP is unique in providing a set of algorithms with specified correctness properties. Furthermore, the MCMP conference setup behavior is completely specified under different combinations of node and link failures. In all cases, MCMP sets up the conference and provides the application with complete topology information. Other MCMP multiparty conferencing algorithms that have been developed include the additioddeletion of conference participants, new members joining, splivmerge of existing conferences, and conference reconfiguration when failures occur. For all algorithms, MCMP guarantees all the correctness properties specified here. APPENDIXA BASIC CONFERENCE SETUPALGORITHM

The notation used in the FSM’s is based on a CSPlike syntax: a!msg(args) denotes that the message m s g with arguments args, if any, is transmitted to FSM a, and a?msg(args) denotes that the message msg with arguments args is received from FSM a. Complete pseudo-codes can be found in [24]. The basic conference setup algorithm is shown in Fig. 13. APPENDIXB THE FINALALGORITHM

Incorporating the modifications from Section 111-D for SU-4 and the topology calculations discussed above, generalized

FSM’s of the final conference setup algorithm SU-4 are as shown in Fig. 14.

REFERENCES [l] Y. Amir et al., “Membership algorithms for multicast communication groups,” in Distributed Algorithms, A. Segall and S. Zaks, Eds. Haifa, Israel: Springer-Verlag, 1992, pp. 292-3 12. [2] M. Arango et al., “The touring machine system,” Commun. ACM, vol. 36, no. 1, pp. 68-77, Jan. 1993. [3] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation-Numerical Methods. Englewood Cliffs, NJ: Prentice-Hall, 1989. [4] G. Blair et aL, “Toward new transport services to support distributed multimedia applications,” in Proc. Multimedia’Y2, Monterey, CA, Apr. 1 4 , 1992, pp. 250-259. [5] W.-T. Chang, N. Chang, and D. Messerschmitt, “Call processing and signaling in a desktop multimedia conferencing system,” in Proc. GLOBECOM’Y2, Orlando, FL, Dec. 6-9, 1992, pp. 225-229. [6] J.-M. Chang and N. F. Maxemcbuk, “Reliable broadcast protocols,” ACM Trans. Comp. Syst., vol. 2, no. 3 , pp. 251-273, Aug. 1984. [7] M.-S. Chen et al., “A multimedia desktop collaboration system,” Comp. Sci., IBM Res. Rep. RC 17951 (#78882), May 1, 1992. [8] F. Cristian, “Reaching agreement on processor-group membership in synchronous distributed systems,” Distributed Computing, vol. 4, no. 4, pp. 175-187, 1991. [9] S. Deering, “Host extensions for IP multicasting,” RFC 1112, Aug. 1989. [lo] D. Dolev, C. Dwork, and L. Stockmeyer, “On the minimal synchronism needed for distributed consensus,” J. Assoc. Commtinn .~ Mach., vol. 34, no. 1, pp. 77-97, Jan. 1987. and perfor1111 A. Eleftheriadis, S. Peihan, and D. Anastassiou, “Algorithms mance evaluation of the Xphone multimedia communication system,” in Proc. ACMMultimedia 93. Anaheim, CA, Aug. 1-6, 1993, pp. 311-320. [ 121 C. Elliott, “High-quality multimedia conferencing through a long-haul packet network,” in Proc. ACM Multimedia 93, Anaheim, CA, Aug. 1-6, 1993, pp. 91-98. [13] M. Fischer, N.Lynch, and M. Paterson, “Impossibility of distributed consensus with one faulty process,” J. Assoc. Computing Mach., vol. 32, no. 2, pp. 374-382, Apr. 1985. [14] A. Ghafoor and P. B. Berra, “An efficient communication structure for distributed commit orotocols.” IEEE J. Select. Areas Commun., vol. 7, no. 3, pp. 375-389: Apr. 1989. [151 . . F. Hartanto and H. Sirisena, “A framework for B-ISDN CPE and signalling,” in Proc. GLOBECOM’Y3, Houston, TX, Nov. 29-Dec. 2, 1993, pp. 1511-1515. [16] C. A. R. Hoare, Communicating Sequential Processes. New York: Prentice-Hall, 1985. [17] R. Keller and W. Effelsherg, “MCAM: An application layer protocol for movie control, access, and management,” in Proc. ACM Multimedia 93, Anaheim, CA, Aug. 1-6, 1993, pp. 21-29. [IS] T. F. LaPorta and M. Veeraraghavan, “Description of a functional signaling architecture for broadband networks,” in Proc. GLOBECOM’93, Houston, TX, Nov. 29-Dec. 2, 1993, pp. 1012-1016. [19] M. R. Macedonia and D. P. Brutzman, “MBone provides audio and video across the internet,” Computer, vol. 27, no. 4, pp. 30-36, Apr. 1994. [20] N. F. Maxemchuk and K. K. Sabnani, “Probabilistic verification of communication protocols,” Dzstribufed Computing, vol. 3, no. 3, pp. 118-129, 1989. [21] J. McQuillan, I. Richer, and E. Rosen, “The new routing algorithm for the ARPANET,” IEEE Trans. Commun., vol. COM-28, no. 5, oo. 711-719, May 1980. 1221. P. Merlin and A. Segall, “A failsafe distributed routing protocol.” IEEE . Trans. Commun., voi. COM-27, no. 9, pp. 1280-1287, Sept. 1979. [23] S. Minzer, “A signaling protocol for complex multimedia services,” IEEE J. Select. Areas Commun., vol. 9, no. 9, pp. 1383-1394, Dec. 1991. [24] M-H. Nguyen, “Addressing multiparty issues in desktop conferencing applications: A multiparty conference management protocol,” Ph.D. dissertation, Columbia University, NY, 1994. [25] L. Y. Ong and M. Schwartz, “Centralized and distributed control for multimedia conferencing,” in Proc. ZCC’Y3, Geneva, June 1993. [26] R. Perlman, “Faul-tolerant broadcast of routing information,” Comp. Net.. vol. 7, no. 6, pp. 3955405, Dec. 1983. [27] J. Postel, Ed., “Transmission control protocol,” RFC793, Sept. 1981. [28] A. Ricciardi, “The group membership problem in asynchronous systems,” Ph.D. dissertation, Cornel1 University, Nov. 1992. ~~

_ I

NGUYEN AND SCHWARTZ MCMP: A TRANSPORTISESSION LEVEL DISTRIBUTED PROTOCOL FOR DESKTOP CONFERENCE SETUP

[29] E. Schooler and S. Casner, “An architecture for multimedia connection management,” in Proc. Multimedia’92, Monterey, CA, Apr. 1-4, 1992, pp. 271-274. [30] M. Schwartz, Telecommunication Networks: Protocols, Modeling, and Analysis. Reading, MA: Addison-Wesley, 1987. [31] H. Schulzrinne and S. Casner, “RTP: A transport protocol for real-time applications,” Internet Engineering Task Force Draft, Oct. 20, 1993. [32] W. Tajibnapis, “A correctness of a topology information maintenance protocol for a distributed computer network,” Commun. ACM, vol. 20, no. 7, pp. 477485, July 1977. [33] C. Topolcic, Ed., “Experimental internet stream protocol, Version 2 (ST-II),” RFC 1190, Oct. 1990. [34] J. Turek and D. Shasha, “The many faces of consensus in distributed systems,” Computer, vol. 25, no. 6, pp. 8-17, June 1992. [35] G. Vonderweidt et al., “A multipoint communication service for interactive applications,” IEEE Trans. Commun., vol. 39, no. 12, pp. 1875-1885, Dec. 1991.

Mai-Huong Nguyen (M’88), received the B.S.E. degree from Princeton University, Princeton, NJ, in 1985, the M.Eng. degree from Cornel1 University, Ithaca, NY, in 1986, and the Ph.D. degree from Columbia University, NY, in 1994, all in electncal engineenng. Since 1985, she has been with AT&T, working on internetworking issues in local-area networks, high performance networking protocols, distnbuted algorithms, and protocol designs to support multiparty multimedia conference applications, and high speed data transmission over copper for broadband applications such as ATM-to-the-desktop and fiber-to-the-curb. She is currently Technical Manager of the Multimedia Networking group in AT&T Bells Laboratories (Lucent Technologies), involved in the design and implementation of distributed platforms to support multimedia contents, performance issues for ATM over shared media, and system level issues related to VDSL.

1421

Mischa Schwartz (S’46-M’54-SM’54-F’66-LF‘92) received the B.E.E. degree from Cooper Union, in 1947, the M.E.E. degree from the Polytechnic Institute of Brooklyn, in 1949, and the Ph.D. degree in applied physics from Harvard University, in 1951. From 1947 to 1952, he was a Project Engineer with the Sperry Gyroscope Company, working in the fields of statistical communications theory, radar detection, and radar system design. From 1952 to 1974, he was Professor of Electrical Engineering at the Polytechnical Institute of Brooklyn, serving as Head of the Electrical Engineering Department from 1961 to 1965. During the year 1965-66, he was an NSF Science Faculty Fellow at the Laboratoire de Physique, Ecole Normale Superieure, Paris, France. During the academic year 1973-74, he was a Visiting Professor at Columbia University, joining the institution in September 1974, as Professor of Electrical Engineering and Computer Science. He is currently Charles Batchelor Professor of Electrical Engineering. From the 1980 calendar year, he was on leave as a Visiting Scientist with IBM Research. During 1986, he was on leave as a Resident Consultant with NYNEX Science and Technology. During 1994, he spent half-time at IBM Research, working in the field of wireless communications systems. During the first six months of 1995 he was, first, a Visiting Professor at University College London, under an EPSRC Visiting Fellowship, and then a half-time consultant at AT&T Bell Laboratories. Dr. Schwa& is a member of the National Academy of Engineering. He is a Fellow and former Director of the IEEE, formerly Chairman of Its Information Theory Group, and past President of the Communications Society. He is author or co-author of nine blocks and over 150 technical publications on communication theory and systems, signal processing, and computer communication networks. He is on the editorial boards of Networks, Computer Networks and ISDN Systems, the Japanese journal IEICE Transactions on Communications, and the Journal on Wireless Networks. He was a recipient of the Distinguished Visitor Award in 1975 from the Australian-American Education Foundation. He is a Fellow of the AAAS, and past Chairman, Commission C., U S . National CommitteelLiRSl. In 1983, he received the IEEE Education Medal. Columbia awarded him its Great Teacher medal in 1983. In 1984, the IEEE Centennial year, he was cited as one of the the ten alltime outstanding Electrical Engineering educators. He served from 1985-88 as Director of the Columbia University Center for Telecommunications Research, one of six national Engineering Research Centers established in 1985 under major grants from the National Science Foundation. In 1986, he was the recipient of the Copper Union Gano Dunn Award, given annually for outstanding achievement in Science and Technology. In 1989, he received the IEEE Region I Award for outstanding engineering management and leadership. In 1994, he received the IEEE Communications Society Edwin H. Armstrong Achievement Award for outstanding contributions to Communications Technology. In 1995, he received the Mayor’s Award for Excellence in Technology, NY.

Lihat lebih banyak...

Comentarios

Copyright © 2017 DATOSPDF Inc.