Computer Networks 50 (2006) 29–45 www.elsevier.com/locate/comnet
Optimizing multimedia transcoding multicast trees Thijs Lambrecht *, Bart Duysburgh, Tim Wauters, Filip De Turck, Bart Dhoedt, Piet Demeester Department of Information Technology (INTEC), Ghent University-IMEC, Sint-Pietersnieuwstraat 41, B-9000 Gent, Belgium Received 28 March 2003; received in revised form 28 November 2003; accepted 18 October 2004 Available online 31 May 2005 Responsible Editor: S. Kutten
Abstract In this paper, network hosted transcoding is studied for the optimization of resources in multicast applications. The configuration is as follows: one source node is sending a signal (e.g. a video stream) and several destinations request that signal but at a bit rate adjusted to their access link. Instead of transcoding at the source node, the nodes inside the network will take care of the reduction in bit rate. The advantages are that the source load and overall bandwidth usage are reduced while end-users get the service at a customized quality. The focus of this paper is on the calculation of the optimal multicast tree, including transcoding location optimization and estimation of the resource usage gain. 2005 Published by Elsevier B.V. Keywords: Multicast trees; Transcoding; Active networks; Optimization
1. Introduction The current internet is very heterogeneous: Users are connected to the net in different ways *
Corresponding author. Tel.: +32 9 264 99 86; fax: +32 9 264 99 60. E-mail addresses:
[email protected] (T. Lambrecht),
[email protected] (B. Duysburgh),
[email protected] (T. Wauters), filip.deturck@ intec.ugent.be (F. De Turck),
[email protected] (B. Dhoedt),
[email protected] (P. Demeester). 1389-1286/$ - see front matter 2005 Published by Elsevier B.V. doi:10.1016/j.comnet.2004.10.019
with a wide variety in access capabilities. Nevertheless they all want the best quality possible and in case of multimedia streams, each user expects to receive the stream at the best quality possible. If the provider sends out only one multicast stream, there are currently two possibilities: either a high quality stream is sent and consequently users with a slow connection will not be able to receive the stream or either a low quality stream is transmitted, resulting in displeased broadband access users.
30
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
signal is injected in the tree and that in two nodes a transcoding takes place, thereby optimizing the overall bandwidth usage. The problem at hand has been described by Parnes [7] and the active approach has been studied in a.o. [8,9,1,10,11]. Pasquale et al. [8] use the concept of the relocatable continuous media filter: filters propagate upstream towards the source to decrease bandwidth consumption. In [9], a video gateway is used to perform bandwidth adaptation to match the transmission quality to the heterogeneous bandwidth constraints of distinct regions of a single logical multicast session. In [1], Kouvelas et al. propose a scheme that uses self-organization to form groups out of colocated receivers with bad reception and that provides local repair through the use of transcoders. In [10], a centralized algorithm is proposed that builds a hierarchy of multicast groups to tackle the problem. Singh et al. [11] propose the use of proxies that can cache and transcode Web content. In the work proposed here we focus on finding the optimal transcoding locations, which has not been addressed yet. The Transcoding Multicast Tree problem calculates the minimum cost multicast tree from source to all destinations and it determines the optimal transcoding locations in this multicast tree. The cost is related to the bandwidth usage on the edges and the resource usage in the nodes. An optimal solution and two heuristics
A solution to this problem consists of sending multiple formats of the same stream towards the requesting users (simulcasting [1]). This implies that the provider is responsible for all necessary transcodings and should send out all different formats in separate multicast trees. Another possibility is to use layered media streams and let the users subscribe to multiple layers to get better quality as proposed by McCanne in [2]. This method can only be used with formats supporting this layering of data. The solution discussed in this paper consists of extending (some of) the network nodes with transcoding capabilities making them active nodes ([3– 6]). In this way, only one multicast tree has to be set up and the transcodings can be done inside the network, at optimal locations i.e. at nodes that have enough processing power to do the transcoding and chosen such that the overall bandwidth usage is minimized. As a result, the provider only needs to inject a single format into the active (overlay) network while the users still get a customized service. Fig. 1 illustrates this solution: a video multicast server sends a video stream at 6 Mbps into the network. Three users request the video: user A requests high quality and asks for the video at full rate. User B also has a cable modem but requires only a 3 Mbps signal. User C has a T1 connection and requests the video at 1.2 Mbps. The active solution presented in Fig. 1 shows that the 6 Mbps
Transcoding nodes
6 Mbps
6 Mbps
6 Mbps
user A 6 Mbps 3 Mbps
Video Multicast Server Data classes offered
3 Mbps
3 Mbps
user B
1.2 Mbps
- Class 1: 6 Mbps - Class 2: 3 Mbps - Class 3: 1.2 Mbps
User requests
user C
User A, Cable modem, requests Class 1 User B, Cable modem, requests Class 2 User C, T1 connection, requests Class 3
Fig. 1. Example of a Transcoding Multicast Tree.
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
are presented. The optimal solution will be used as a reference for the heuristics. The remainder of this paper is organized as follows. The ILP-formulation of the problem is given in Section 2. An optimal solution can be calculated using this formulation. In Section 3, two heuristics that solve the problem in a non-optimal but faster way are presented. In Section 4, simulation results to evaluate the proposed algorithms and the influence of different parameters (number of destinations, number of types, number of active nodes, etc.) are shown. Finally, the conclusions are given in Section 5.
2. ILP-formulation of the problem In order to find an optimal solution to the Transcoding Multicast Tree (TMT) problem and also to present a precise view on the problem, an ILP-formulation is given here. The problem under investigation deals with calculating a minimum cost tree, taking into account both bandwidth and node resources. It is clear that this is related to and is even harder than the Minimal Steiner Tree problem [13]. Since the Minimal Steiner Tree problem is NP-complete, it is obvious that our problem is also NP-complete. Consequently, the ILP solution technique will not be useful when dealing with large networks, a large number of destination nodes or a large number of types. Therefore, two heuristics will be presented further on. The ILP-formulation will be used to evaluate the heuristics for relatively small problems. 2.1. Model description The problem can be characterized by the network topology, a list of available stream formats, the source and the destinations with their required stream format or type. First, some symbols and constants describing these objects are presented. 2.1.1. The network The network is a directed graph G, characterized by a set of nodes V of size jVj and a set of directed edges E of size jEj. The graph is presumed
31
to be bi-directional but the edge properties can be asymmetric. With every edge e from E a cost per unit of bandwidth ce and a bandwidth capacity ue is associated. The capacity of a node n from V is vn. This capacity is related to the ability of that node to transcode. ce, ue and vn are constants P0. In is the set of ingoing edges of node n. On is the set of edges leaving node n. 2.1.2. The types There is an ordered list of types T of size jTj, containing all formats the user can choose from. With each type t from T a bit rate bt corresponds. The types are ordered in decreasing order of bit rate requirements. The type with the highest bit rate will be referred to by t1. The cost associated with the transcoding from type t to type t 0 is denoted as kt,t 0 . In order to simplify the ILP-formulation, it is supposed that if a node just passes on the data, it also does a transcoding from type t to type t. The cost associated with this transcoding is set to zero. Note that the formulation given here is general and that a type can be anything (speech, audio, video, text, etc.). Also the transcoding process can stand for more general media transformations (transcoding, transrating, . . .). We only assume a transcoding hierarchy: a type t 2 T can only be transcoded to a type further in the ordered list T. 2.1.3. Source and destination nodes The source node s is the root of the multicast tree. The computing capacity of the source node is set to infinity. In other words, if needed, all necessary transcodings can be done at the source. This assumption ensures that the ILP-formulation always finds a feasible solution if the edge capacity is set high enough. With each type t from T corresponds a set of destination nodes Dt V of size jDtj. Hereafter, such a set will be referred to by the term class. The source node s does not belong to any of the classes. D is the P set of all destinations (D = ¨t 2 T Dt with jDj ¼ t2T jDt j). td is the type associated with destination d. A single node can belong to multiple classes and then there will be a corresponding destination node in set D for each requested format.
32
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
2.2. Variables In this subsection e is an edge from set E, n is a node from set V, t is a type from the ordered list T and d belongs to the set of destinations D. All variables are binary. • fe,t is 1 iff edge e is used to carry traffic of type t. (jEj Æ jTj f-variables.) • he,d,t is 1 iff edge e carries traffic of type t destined for d. (jEj Æ jDj Æ jTj h-variables.) • zn,t,t 0 is 1 iff node n does a transcoding from type t to type t 0 . Note that transcoding is only possible if type t 0 is ordered not before type t in the ordered list T (This will be noted as t 0 P t). (jVj Æ jTj Æ (jTj + 1)/2 z-variables.) • xn,d,t,t 0 is 1 iff node n does transcoding from type t to type t 0 (with t 0 P t) on the route towards destination d (jVj Æ jDj Æ jTj Æ (jTj + 1)/2 xvariables). The f- and z-variables allow the calculation of the bandwidth carried by the edges and the transcodings done in the different nodes. The h- and x-variables are auxiliary variables facilitating the ILP-formulation of the problem. 2.3. Formulation 2.3.1. Objective function Now that all symbols and variables have been presented, the objective function to minimize can be formulated as follows: XX X X X ce bt fe;t þ a k t;t0 zn;t;t0 . ð1Þ t2T
e2E
t2T t0 2T ;t0 Pt n2V
The objective function is the sum of two parts, the tree cost and the transcoding cost:
a is zero, only the tree cost is considered. By increasing a, the transcoding contribution to the total cost becomes more important, leading to a solution with less transcodings and more bandwidth usage. 2.3.2. Constraints Capacity constraints: X bt fe;t 6 ue 8e 2 E; X X
In order to be able to weigh the importance of one of these costs, a parameter a is introduced. If
k t;t0 zn;t;t0 6 vn
8n 2 V .
ð3Þ
t2T t0 2T ;t0 Pt
Constraint (2) imposes a restriction on the total flow through the edges. This flow may not exceed the capacity of the edge. Constraint (3) imposes a restriction on the transcodings performed in a node. The total cost of all transcodings done in a node may not exceed the capacity of that node. There are jEj (2) + jVj (3) capacity constraints. Auxiliary constraints: he;d;t 6 fe;t 8e 2 E; 8d 2 D; 8t 2 T ; xn;d;t;t0 6 zn;t;t0 8n 2 V ; 8d 2 D; 8t 2 T ; 8t0 2 T ; t0 P t.
ð4Þ ð5Þ
The constraints (4) and (5) take care of the relationship between he,d,t and fe,t and between xn,d,t,t 0 and zn,t,t 0 , respectively. The auxiliary variables h and x in fact allow to formulate the multicast problem as a unicast problem. There are jEj Æ jTj Æ jDj (4) + jVj Æ jTj Æ (jTj + 1)/2 Æ jDj(5) auxiliary constraints. Flow constraints: The are four series of flow constraints. Ingoing constraints X X he;d;t
xn;d;t;t0 ¼ 0 e2I n
• The first part of Eq. (1) is the tree cost which is related to the bandwidth usage. Transporting type t over edge e (i.e. fe,t = 1) costs ce Æ bt. • The second part is the transcoding cost. It defines the cost related to the transcodings done in the nodes. Transcoding from type t to type t 0 in node n (i.e. zn,t,t 0 = 1) costs kt,t 0 .
ð2Þ
t2T
t0 2T ;t0 Pt
8n 2 V n fsg; d 2 D; t 2 T .
ð6Þ
Constraint (6) regulates the flow towards a node other than the source node s: if an edge carries flow of type t towards destination d then the target node of that edge should do a transcoding for destination d from type t to another type t 0 , t 0 P t. The source node is dealt with separately in Section 2.3.2. There are (jVj 1) Æ jDj Æ jTj (6) incoming constraints.
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
Outgoing constraints X X he;d;t0
xn;d;t;t0 ¼ 0 8d 2 D; e2On
Binary constraints:
t2T ;t0 Pt
8n 2 V n fdg.
ð7Þ
Constraint (7) is similar to constraint (6) but regulates the flow leaving a node: if there is flow of type t 0 leaving a node towards destination d then that node has to do a transcoding to type t 0 for destination d. This time there are no leaving constraints when the node n in question is the destination node d. The destination nodes are handled separately for this constraint (see Section 2.3.2). There are (jVj 1) Æ jDj Æ jTj outgoing constraints. Source constraints X xs;d;t1 ;t ¼ 1 8d 2 D; ð8Þ t2T
xs;d;t;t0 ¼ 0 8t 2 T n ft1 g; 8t0 P t; 8d 2 D; he;d;t ¼ 0
8e 2 I s ; 8t 2 T ; 8d 2 D.
ð9Þ ð10Þ
Constraints (8)–(10) take care of the source node. Constraint (8) imposes that the source supplies exactly one format towards every destination. It is assumed that only type t1 can be used as type to start with. Consequently transcodings in the source must be from type t1 to another type t 2 T. Constraint (9) prevents transcodings in the source node starting from a type different than t1. Constraint (10) prevents any flow from entering the source. Destination constraints X xd;d;t;td ¼ 1 8d 2 D; ð11Þ t2T ;t6td
xd;d;t;t0 ¼ 0 8d 2 D; 8t 2 T 8t0 2 T ; t0 P t; t0 6¼ td ;
ð12Þ
he;d;t ¼ 0
ð13Þ
8d 2 D; 8e 2 Od ; 8t 2 T .
33
Constraints (11)–(13) handle the destination nodes. Constraint (11) makes sure that the requested type td arrives in destination node d: there has to be exactly one transcoding in d from a certain type t (t 6 td) to the requested type td. Constraint (12) states that any transcoding, done for destination d, towards another type t5td, should not take place in d. Constraint (13) ensures that no edge leaving the destination node carries flow towards that destination, preventing loops.
fe;t bin 8e 2 E; 8t 2 T ;
ð14Þ
he;d;t bin 8e 2 E; 8d 2 D; 8t 2 T ;
ð15Þ
zn;t;t0 bin 8n 2 V ; 8t 2 T ; 8t0 2 T ; t0 P t;
ð16Þ
xn;d;t;t0 bin 8n 2 V ; 8d 2 D; 8t 2 T ; 8t0 2 T ; t0 P t.
ð17Þ
Constraints (14)–(17) impose that all variables are binary. This concludes the ILP-formulation. Based on this ILP-formulation, a tool using Cplex [12], was written to solve the TMT problem.
3. Heuristics As mentioned before, the TMT problem is NP complete, so finding the optimal solution will not be feasible for large networks or for a large number of types or destinations. Therefore, two heuristics are proposed here, offering a fast but sub-optimal solution. 3.1. Skinned Steiner Tree heuristic This heuristic algorithm can be divided in two steps: in the first step, a tree with sufficient bandwidth is calculated. In the second step, the exact bandwidth and transcodings needed are determined. 3.1.1. Calculating the tree First, we attempt to calculate a minimum cost tree (using the shortest path tree [14]) connecting the source and every destination. On this preliminary tree, sufficient bandwidth is available to carry type t1. This is done by temporarily eliminating the edges that do not have enough bandwidth when the minimum cost tree is calculated. If every destination is in this tree, step 2 is taken. If this is not the case, one or more destinations cannot be reached, due to the fact that at least one link towards these destinations has insufficient bandwidth to carry t1 traffic. In this case, the algorithm continues step 1 by making a list of the destinations that were not reached and by sorting this list in order of decreasing requested
34
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
bandwidth. The algorithm then tries to sequentially add the destinations from this list to the preliminary tree. A path from the existing tree towards this particular destination d is calculated: first all edges with less bandwidth than required by type td, are temporarily eliminated. Then, the cost of the edges already belonging to the current tree are temporarily set to zero. Finally, the shortest path connecting source and destination is calculated. The edges of this path not yet belonging to the tree, are added to it. Destinations that still cannot be reached after this step are marked unreachable. The program notifies the user and the destinations in question are discarded for the rest of the calculations. 3.1.2. Skinning the tree At this stage, a tree connecting all the reachable destinations and the source is found, but no bandwidth optimization has been done and transcoding locations are unknown. This can be achieved by skinning the tree: this is done by walking up the tree from leaves to source, reserving bandwidth and determining transcodings on the way up. Fig. 2 illustrates this procedure. In Fig. 2a, a tree from source S to two destinations (C & G) A
B
?
C 2
A
B
? S
?
D 1-2
E
S
?
H
F
S
D
2
C 2
A
B
E
S
D 1-2
G 1
c
3.2. Branch attach heuristic 2
C 2
2
1 F
H
b
2 ?
E
G 1
a B
D 1
G 1
A
C 2
?
? F
?
is shown. G belongs to class 1 while C is a class 2 destination. The skinning algorithm starts with a randomly chosen leaf. In the leaves, no transcoding needs to be done. (Unless multiple types are requested by that leaf. In that case, transcodings from the requested type with the highest bit rate to the other requested types is needed.) Consequently, on the incoming edge, the (highest) requested bandwidth is reserved. This can be seen in Fig. 2b: in this case, leaf G was the chosen leaf to start with. On the edge towards G, enough bandwidth to carry type t1 traffic is reserved. The algorithm takes a step up the tree, to node D. Since there is an unvisited outgoing edge, nothing can be said yet for this node. The algorithm therefore walks down this unchecked edge (edge D–B) and then further down the following edge (edge B–C) until it reaches leaf C. Here, it can again determine the necessary incoming bandwidth. Now, it can also determine the necessary bandwidth on the incoming edge of node B. Fig. 2c shows the situation after both C and B have been visited. The algorithm is now back in node D and since all outgoing edges have been handled, the transcodings needed in this node can be determined, as well as the bandwidth needed on the incoming edge. It is clear that enough bandwidth has to be reserved for a type t1 flow on the incoming edge (the largest bandwidth flow needed by the outgoing edges) and that transcoding from type t1 to type t2 is needed in that node. In Fig. 2d, finally the source is reached. When all edges and nodes of the tree have been checked and edge bandwidth and node transcodings have been determined, the algorithm ends.
1
E
1 H
F
G 1
In contrast to the Skinned Steiner Tree Heuristic, this heuristic does not calculate a single big minimum cost tree, but calculates several smaller ones (one for each class). The procedure again consists of two major steps: an initialization step and a step where new branches are attached. This second step is done for every class except the first.
H
d
Fig. 2. Skinned Steiner Tree Heuristic—Skinning the Tree.
3.2.1. Initialization The algorithm starts by calculating the minimum cost tree (using the shortest path tree [14])
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
towards all destinations of class 1. Edges that do not have enough bandwidth to carry t1 data, are left out of the calculation. The edges of this tree will belong to the final tree and enough bandwidth to carry t1 traffic is immediately reserved on these edges. 3.2.2. Attaching the branches Before calculating the minimum cost tree for the next class, the costs of the edges already in the tree are set to zero. By doing this, edges already in the tree will be favored to be reused. The minimum cost tree for that class is then calculated. Again edges that do not have enough bandwidth to carry the current type of data, are left out in the calculation of the tree. Edges not belonging to the transcoding multicast tree found so far are added and the requested bandwidth is reserved. The algorithm also determines in which nodes a transcoding is needed. This is done by looking at the incoming and outgoing flows in the nodes belonging to the tree. This attaching step is repeated until all classes have been added. Classes are added in order of descending bandwidth requirement.
35
another intermediate node or until the source node is reached. This node will then send the new type towards node A, replacing the stream with the previous type to node A. In node A the signal is sent towards the new user and transcodings towards the other types leaving node A are performed. If no nodes of the tree are encountered on the way up, the user will get the data directly from the source. 3.3. Post optimization
3.2.3. Distributed implementation This heuristic can easily be modified to make it suited for a distributed implementation, since it does not require a global topology view. Suppose, a new user wanting to join an existing multicast stream, sends a request towards the source. If on the way up, a node is encountered that already belongs to the tree (referred to by node A), we can have three situations:
Both heuristics have one major drawback: they do not check whether the nodes are able to host the requested transcodings. This checking is done in a post-optimization step: first, all problem nodes are put in a set. Nodes are sequentially taken from this set and the post-optimization algorithm attempts to alleviate the node at hand, by trying to move the transcoding to another node. Three methods have been developed to do this. The algorithm tries all of these methods and calculates the additional cost. The cheapest method is subsequently applied and the next node in the set of problem nodes is investigated. If none of the methods comes to a solution, the nodes that cannot be reached are marked unreachable and reserved bandwidth is released. The user is notified of the problem encountered. The three methods in question are referred to as the upstream method, the downstream method and the active neighbor method. All three obviously cause an increase of the overall cost.
1. The requested type is the same as a type already present in node A. Then, node A just multicasts the data also towards the new user. 2. The requested type is ordered after a type already present node A. Now node A will perform a transcoding towards the new type and send this data towards the new user. 3. The requested type is ordered before all types already present in node A. This means that a higher bit rate type will have to be sent towards node A. This request is sent further up the tree until the requested new type is encountered at
3.3.1. Upstream method In this method, the transcodings are moved towards the source: the algorithm will go one step up in the tree and check whether that node has enough processing capacity left for the transcoding. If this is the case, the transcoding is moved to this node. If not, the algorithm checks the next node on the path to the source until eventually the source is reached. Note that by moving the transcoding towards a node closer to the source, the edges on the path from the new transcoding node to the problem
36
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
node will have to carry more traffic, because these edges will have to carry multiple formats. The algorithm has to check whether this is possible. The pseudo-code of the core of the algorithm is given below. The function takes as arguments the problem node N and the requested type t. First it checks the capacity of the edge and then it checks the capacity of the upstream-node. If the edge capacity is insufficient, the upstream method cannot be used. If the node capacity is insufficient, we will check nodes further upstream by using the upstream function recursively. The implemented algorithm also calculates the additional cost of upstream pushing (not shown in the pseudo-code). function upstream(N,t) if (enough capacity on upstream edge) add t-traffic to upstream edge else Stop:upstream method not feasible, undo reservations made S = source node of upstream edge if (transcoding to t present in S) Stop:upstream method finished else if (enough capacity left to perform transcoding to t) set transcoding to t in S Stop:upstream method finished else upstream(S,t) end function By using this method, the amount of transcodings usually stays the same (in rare cases, where we can re-use an existing transcoding, the number of transcodings decreases). The overall bandwidth usage always increases. 3.3.2. Downstream method In this method, the transcoding is done in node(s) further down the tree: the algorithm checks if it can push the transcoding towards one or more nodes further along the tree, ensuring that all nodes eventually get their requested type. Fig. 3a shows the solution before the post-optimizing step. The edge label shows the type of traffic the edge carries. The node label shows the type
A
B
2
C 2
A
B 1-2
2 S
1
D 1-2
G 1
C 2
1 E
S
H
F
1
1 F
2
D
E
1
a
G 1
H
b Fig. 3. Downstream method.
requested by that node and/or the transcodings that take place in that node. Node D is assumed unable to transcode to the required type. A possible solution is given in Fig. 3b: the transcoding in node D is moved to node B and the incoming edge of B carries type t1 traffic. The pseudo-code of the core of the algorithm is shown below. The function takes as arguments the problem node N, the high type th and the low type tl (th is chosen such that the transcoding cost to tl is minimal). On all edges of the tree leaving N and carrying type tl, the algorithm tries to replace the tl-stream with the th stream and checks whether the destination node can perform the necessary transcoding. The downstream method fails if there is insufficient capacity on any of the edges encountered or when the method has reached a leaf and is not able to transcode there. The implemented algorithm also calculates the additional cost to perform the downstream method. function downstream(N,th,tl) for all downstream edges carrying type tl if (enough capacity on edge) replace tl with th on edge else Stop:downstream method not feasible, undo reservations made D = destination node of downstream edge if (enough capacity left to perform transcoding) set transcoding in D
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
Continue with next edge else if (D requested format tl or D needs tl to transcode to another type t) Stop:downstream method not feasible, undo reservations made else downstream(D,th,tl) end for end function Using this method possibly increases the amount of transcodings and the overall bandwidth usage will rise. 3.3.3. Active neighbor method The last method searches for a transcoding node close to the problem node. The method checks the processing capabilities of the active node as well as the bandwidth limitations on the paths to and from this active node. Since it is possible that the transcoding is moved towards a node not in the tree, extra edges might have to be included. Nodes are investigated with increasing hop-count and the maximum hop count will be limited (maxHC). The example shown in Fig. 4 starts with the same configuration as in the previous sections. First active nodes on a single hop distance, are inspected. For each of these nodes, the additional cost is calculated. If node E is able to transcode from type t1 to type t2, the solution shown in Fig. 4b will be found. Notice that type t1 is sent towards node E, where a transcoding takes place to type t2 and that type is eventually sent back to node D. A
B
2
C 2
A
B
2 S
1
D 1-2
G 1
a
C 2
2 E
S
2 1
D 1
1 F
2
E 1-2
1 H
F
G 1
b
Fig. 4. Active neighbor method.
37
The pseudo-code of the core of the algorithm is presented below. The function takes as arguments the problem node N, the high type th and the low type tl (th is chosen such that the transcoding cost to tl is minimal). First active nodes that are one hop away and that are able to do the transcoding, are investigated. The algorithm checks whether the edges towards the node can carry the high type th and that the edges back towards the problem node can carry the low type tl. The algorithm calculates the additional cost and compares this cost with other solutions already found with this method. If no solutions are found the hop count (hc) is incremented. function activeNeighbor(N,th,tl) hc = 1 while no solution found and hc < maxHC for all active nodes able to transcode at hop count hc if (enough capacity on edges to the active node and enough capacity on edges from the active node) set transcoding and reserve bandwidth Stop:solution found, compare with possible other solutions else continue with next active node end for increment hc end while end function By using this method the amount of transcodings will remain the same, the overall bandwidth usage will increase and new edges might be added to the tree. 3.4. Favoring active nodes
H
The algorithm as explained so far works very well when all nodes in the network are active and capable of transcoding. If only a few of the nodes
38
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
are active, there is a chance that these will not belong to the calculated tree before the problem nodes are handled. The active neighbor method tries to add these active nodes to the final tree but the additional cost may be very high because the active nodes are far away from the problem nodes. In this case, we do not fully benefit from the processing power of the active nodes in the network. This problem is solved by performing a pre-calculation step: the edge cost of the edges entering active nodes will be multiplied with a factor b (0 6 b 6 1). b is set equal to (number of active nodes)/(number of nodes). If all nodes are active, b is one and the costs stay the same. If only one node is active, b is equal to 1/(number of nodes). The edges entering the active node will have a lower cost and thus a bigger chance of belonging to the multicast tree or in other words the active node is favored to belong to the tree. Note that the costs are only changed for calculating the multicast tree. When the tree cost is determined, the normal costs are used.
4. Simulations Several simulations were done to illustrate the usefulness of the active solution and to compare the performance of the different solutions discussed in Sections 2 and 3. We also implemented the simulcasting [1] strategy to compare the active solution with a nonactive solution. Exact least cost multicast trees where calculated for implementing this method, yielding the optimal non-active solution. For this solution, the tree cost is the sum of the costs of all the separate multicast trees. Since all transcodings have to be done in the source, the transcoding cost is the sum of theP transcoding costs from t1 to all other types t 2 T ( t2T ;t6t1 k t1 ;t ). We examined two cases: video and speech multicast. In the simulations done for this paper edge and node capacity will not be restricted (i.e. edges can carry all required traffic and nodes can do all required transcodings) unless mentioned otherwise. The edge cost per unit of bandwidth ce was set to 1 for all the edges in the network i.e. the total edge cost (=tree cost) will indicate the overall bandwidth
needed summed over all links in the network. All the problems were solved using the Non-Active solution (NAS), the two heuristics (SST and BA) and the Fully Active ILP-solution (FAS). Multiple similar simulations were done per configuration: the topology, the number of types and the number of destinations stayed the same. Given these values, the source and destination nodes were chosen randomly and the destinations were distributed uniformly over the classes, allowing us to calculate an average cost per configuration. 4.1. Video multicast 4.1.1. Video transcoding Before discussing the setup of the video multicast simulations, we will briefly describe the different forms of video transcoding. A distinction needs to be made between transrating and transcoding. Transcoding is the general term for converting content from one format to another. It usually requests the decoding and encoding of the signal. This is usually a processing power consuming operation. Transrating is a special case of transcoding and reduces the bit rate within a single compression format. Lastein and Reul [15] explain how this can be done while Reul and Lastein [16] state that transrating can in some cases achieve acceptable rate-reductions of up to 50%, thereby consuming quite some processing power, but not as much as the general transcoding. Transrating could be done in the active nodes of an active (overlay) network. Because of the complexity of the algorithms, hardware is needed to perform general transcodings in real time. A number of real time hardware transcoders have been developed ([17–19]), but these hardware components are rather expensive and have to be manually installed on all routers. Therefore, general transcoding will not be an option in the near future. Another solution to alleviate the transcoding effort in the case of MPEG-streams is to just drop specific frame types. An MPEG-stream [20] is composed of a series of Groups of Pictures (GoPs) and a GoP is composed of a sequence of pictures (frames). There are three different sorts of frames: I-, P- and B-frames. An I-frame is encoded as a
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
39
Fig. 5. The test network used for the video multicast simulations.
single image, with no reference to any past or future frame. A P-frame is encoded relative to the past reference frame. A reference frame is either a P- or I-frame. A B-frame is encoded relative to the past reference frame, the future reference frame, or both frames. It is clear that, because of the GoP-structure of the MPEG-stream, we can in principle leave out some of the frames to reduce the bit rate. B-frames may be omitted without restriction. Since P-frames depend on the previous reference frame (which may be a I- or a P-frame), only the last P-frame(s) of the GoP may be omitted (all B-frames depending on this P-frame should be omitted as well). Leaving out frames reduces the quality of the movie since the frame rate drops and the movie will start to stutter. Other solutions for low complexity transcoding of video-streams are being studied a.o. [21,22] and especially [23], where secure scalable streaming enabling transcoding without decryption is being investigated. 4.1.2. Configuration We will look at the case of an e-learning service where multiple users at different locations wish to
Table 1 Video stream GoP-structure and bit rate by leaving out B- and P-frames
join a video multicast session. Some users are in the lecture rooms of different universities and request a high bandwidth signal for screen projection. Others take the class at home and are satisfied with a lower bandwidth stream. The test network shown in Fig. 5 with 39 nodes and 58 edges, was used in these simulations. We used the frame dropping strategy to get some realistic bit rate values, starting from an MPEG-stream with a traditional 15-3 GoP-structure with a bit rate of 1150 kbps, B- and P-frames were left out to reduce the bandwidth needed (Table 1).
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
The transcoding costs kt,t 0 are set to 1 for all possible transcodings since dropping frames is equally expensive in all cases. The transcoding cost will indicate how often frames had to be dropped. Parameter a was set to 0 i.e. the tree was optimized to limit the overall bandwidth usage. Two series of simulations were done. In the first series (Broadband Video, first Series or short BV1) we kept the number of types jTj fixed at 4 with respective bit rates of 1150, 941, 732 and 525 kbps. We gradually increased the number of destinations from 4 up to 36 (in steps of 8). The destinations were uniformly distributed over the 4 types and over the network nodes of Fig. 5. In the second series (BV2), we kept the number of destinations fixed at 12. The number of types jTj was gradually increased from 1 up to 6 with respective bit rates of 1150, 941, 732, 525, 314 and 157 kbps. The destinations were evenly distributed over the available types.
Tree Cost (kbps)
BV1 - Tree Cost 900000 800000 700000 600000 500000 400000 300000 200000 100000
NAS SST BA FAS
500000 450000 400000 350000 300000 250000 200000
12
20
28
36
Number of Destinations Fig. 6. BV1—Tree cost for increasing number of destinations.
SST FAS
1
2
3
4
6
Fig. 7. BV2—Tree cost for increasing number of types.
benefit from the active nodes and needs less bandwidth than the non-active solution. There is already a large potential gain when only two types are requested (on average 30%). The non-active curve drops for a large number of types because the number of destinations remains the same, while types needing less bandwidth are introduced. Transcoding cost: Figs. 8 and 9 show the transcoding cost or, since the cost per transcoding is set to one for BV1 and BV2, the total number of transcodings needed. Note that in the Non-Active case, we also take into account the transcodings that have to be done in the source node. For BV1, this indicates that as the number of destinations increases, the number of transcodings required for an active solution increases linearly. In the non-active case, the number of transcodings only depends on the number of types (if there are N types, the source node has to do N 1 transcodings). Similar conclusions can be drawn for BV2.
30
BV1 - Node Cost NAS
25
SST
20
BA
15
FAS
10 5 0 4
4
NAS BA
Number of Types
Node Cost (# transc.)
4.1.3. Results Tree cost: Fig. 6 shows that for BV1, the active solution only needs about 53% of the bandwidth needed by the non-active solution. We see that once the number of destinations is large enough (exceeding 12 destinations), the relative gain becomes rather independent of the number of destinations. As the number of destinations increases, only a small amount of additional bandwidth is needed in both the active and non-active case since the active multicast tree and the multicast tree of the requested type will have spread throughout the network. Fig. 7 shows the tree cost for BV2. As the number of types increases the active solution can
BV2 - Tree Cost
Tree Cost (kbps)
40
12
20
28
36
Number of Destinations Fig. 8. BV1—Transcoding cost for increasing number of destinations.
10 8 6
NAS SST BA FAS ( =0)
4 2 0 1
2
3
4
41
BV2 - Flow Leaving Source
BV2 - Node Cost 12
Source Flow (kbps)
Node Cost (# transc.)
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
6
5000 4500 4000 3500 3000 2500 2000 1500
NAS BA
1
Number of Types
SST FAS
2
3
4
6
Number of Types
Fig. 9. BV2—Transcoding cost for increasing number of types.
Heuristics: For both BV1 and BV2, the heuristics perform rather well. Fig. 10 shows the relative difference (defined by: ((TreeCost)Heuristic/(TreeCost)FAS) 1*100). The heuristics stay within 10% of the optimal solution. This means that in real networks these heuristics could be used to approximate the optimal solution. The Branch Attach heuristic seems to outperform the Skinned Steiner Tree heuristic and, on average, stays within 5% of the optimal solution. Server load: The amount of flow leaving the source was calculated for BV1 and BV2. For BV1, the flow leaving the source in the active case is only half of the flow leaving the source in the non-active case, irrespective of the number of destinations. Fig. 11 shows that for BV2 the flow leaving the source in the active solution is almost independent of the number of types, while in the non-active solution this flow increases with the number of available types. Note that a certain type can leave the source node via multiple edges. This explains why in the active case the flow leaving the source is higher than just the highest bit rate.
Fig. 11. BV2—Total flow leaving the source node.
Influence of a: Also the influence of parameter a of the objective function (Eq. (1)) was investigated. As a increases, the transcoding cost becomes more important. Therefore, the corresponding solution will contain less transcodings but consequently it will use more bandwidth. By changing the value of a, the trade-off between bandwidth and transcoding can be steered. Note that when a is set high enough, the active solution will use the same amount of transcodings as the non-active solution. This is the minimum amount of transcodings that have to be made in order to have a feasible solution (=jTj 1). Figs. 12 and 13 show the tree and transcoding costs of the active solution for different values of a. If we look at the transcodings, we see that as a increases, the amount of transcodings done in the active case eventually reaches its lower limit, namely jTj 1. Influence of the number of active nodes: We will now look at the influence of the amount of active nodes. It is obvious that this does not change anything for the non-active solution. The active solutions however will need more bandwidth when the
BV2 - Heuristics - Tree Cost 8
SST
BV2 - Parameter
BA
7 6 5 4 3 1
2
3
4
5
Number of Classes Fig. 10. BV2—Relative difference in tree cost between FAS and the Heuristics.
Tree Cost (kbps)
% diff. w/ FAS
9
450000 400000 350000 300000 250000 200000
FAS (=0) = 50 k
12
= 10 k = 100 k
34
6
Number of Classes Fig. 12. BV2—Influence of parameter a on the tree cost.
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
BV2 - Parameter 10
Node Cost (# transc.)
Node Cost (#transc.)
42
FAS (=0)
8
= 10 k
6
= 50 k = 100 k
4 2 0 2
1
4
3
6
Number of Classes
number of active nodes decreases. The number of needed transcodings is expected to decrease: since less transcoding nodes are available, more transcodings will be reused. Fig. 14 shows the tree cost when only 25% of the nodes in the network are active. It shows that the Fully Active solution still needs less bandwidth than the non-active solution but does need more bandwidth compared to the fully active situation (see Fig. 6). The heuristics now need to do some post-optimizations (see Section 3.3) and thus they become sub optimal: on average the Skinned Steiner Tree Heuristic needs 12% more bandwidth than the Fully Active Solution and the Branch Attach Heuristic needs almost 16% more bandwidth. Fig. 15 shows the transcoding cost when only 25% of the nodes in the network are active. Compared to the Fully Active Network (see Fig. 8), less transcodings are needed. Since the number of transcoding nodes is limited, these nodes will transcode for more destination nodes.
Tree Cost (kbps)
Tree Cost - 25% Active NAS SST
50000
BA
40000
FAS
15
Node Cost - 25% Active
SST BA FAS
10 5 0 4
12
20
28
36
Number of Destinations
Fig. 13. BV2—Influence of parameter a on the transcoding cost.
60000
NAS
20
30000
Fig. 15. BV1—Influence of the number of active nodes (25%) on the transcoding cost.
We did the same simulation, but now with only 10% of the nodes being Active. The bandwidth gain for the Fully Active Solution was further reduced, while the heuristics even performed worse than the Non-Active Solution. This can be explained by the fact that the heuristics use a shortest path tree and not the exact minimal Steiner Trees used by the non-active solution. The number of transcodings needed by the active solutions further decreases. A way to partially circumvent the post optimization step is to use active overlay networks in which all nodes are active. Limiting capacities: Limiting the node-capacity to one (i.e. the active nodes can only do one transcoding), only had a minor influence on the active solutions: bandwidth increase with 4% and number of transcodings decreased with (7%). This is due to the fact that the number of transcodings done in one node already was low. This does however become more important when the number of Active Nodes is limited. Limiting the edge-capacity, caused the nonactive solution to fail, since it could no longer deliver the streams to the destinations. As long as the edges could carry the biggest format t1, the active solutions were able to deliver all requested streams.
20000 10000 4
12
20
28
36
Number of Destinations Fig. 14. BV1—Influence of the number of active nodes (25%) on the tree cost.
4.2. Speech multicast 4.2.1. Configuration We now look at speech multicasting. Multiple users spread over Europe wish to listen to the same
Table 2 Transcoding costs between the different Codecs %
G.711
G.726-7
G.728
G.729
PCM16 G.711 G.726-7 G.728
1
14 15
55 66 79
364 365 377 407
Node Cost (# transc.)
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
43
AM - Node Cost - varying 1400
=0 =0.1 =100
1200 1000 800 600 400 5
6
7
8
9 10 11 12 13 14 15 16 17 18
Number of Destinations
speech stream, but not all users have the same access facilities. Therefore, they request different types of the same stream. A small network with 19 nodes and 37 edges was used. In case of speech, transcoding can be done in real-time. We have studied five speech codecs: PCM 16bit (128 kbps), G.711 law (64 kbps), G.726-7 4bit (32 kbps), G.728 (8 kbps) and G.729 (4 kbps). We measured the cpu-load needed to perform the transcoding on a 10ms voice fragment (processor clock speed 1000 MHz). The corresponding transcoding costs kt,t 0 used are based on these measurements an are stated in Table 2. The cpu time to transcode from PCM 16bit to G.711 was used as reference (value 1). We gradually increased the number of destinations from 5 up to 18. The destinations were evenly distributed over the five types and over the nodes of the network: we start off with 5 users each requesting a different type. The next user we add will request type t1. The following will request t2 and so on. When there are 10 users, there are 2 users for every type. We will refer to this configuration as AM (Audio Multicast). 4.2.2. Results The general outcome of these simulations yielded similar conclusions as in the video multicast section. We will take a closer look at the influence of parameter a on the transcoding cost. Fig. 16 shows the transcoding costs for different values of a. For a = 0, we clearly see a step when the number of destinations becomes 10 and 15 i.e. when there are respectively two and three destinations requesting the last type (G.729): destinations requesting this type will be grafted on the multicast tree and transcoding will be done at different intermediate nodes. Since transcoding to
Fig. 16. AM—Influence of parameter a on the transcoding cost.
this type is very expensive, the solution found as we increase a (e.g. a = 0.1), will contain only one such transcoding. We now see a step when the number of destinations becomes 9 and 14 (type 4 requests). If a is set high enough (e.g. a = 100), every transcoding is only done once.
5. Conclusions In this paper, solutions for the transcoding multicast tree problem in an active network were presented. The optimal solution was found using the ILP-formulation of the problem. Two heuristics were proposed to solve the problem for realistic problem sizes. The Skinned Steiner Tree heuristic, tries to skin a multicast tree with enough bandwidth in order to determine the bandwidth and transcodings needed. The Branch Attach heuristic starts with a multicast tree for the highest quality class and then sequentially attaches branches towards the other types. A non-active optimal solution (simulcasting) was also implemented for benchmarking purposes. A video and speech multicast session was examined to determine the usefulness of the method. For the video multicast session, two series of simulations were done. For the speech multicast we only increased the number of destinations while maintaining the number of types. From the simulations, following conclusions can be drawn: the active solutions on average need much less bandwidth than the non-active solution for increasing numbers of classes (up to 50% less).
44
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45
The relative improvement is rather independent of the amount of destinations. The active solutions however need more transcodings, especially as the amount of destinations increases. The server load decreases substantially in the active case and becomes independent of the number of types. The heuristics presented approximate the optimal solution and can be used for real-time multicast tree optimizations.
[9]
[10]
[11]
Acknowledgements This work was carried out within the framework of the project CoDiNet sponsored by the Flemish Institute for the promotion of Scientific and Technological Research in the Industry (IWT). It is in part supported by the Ghent University GOA-project Programmable and Active Networks.
References [1] I. Kouvelas, V. Hardman, J. Crowcroft, Network adaptive continuous media applications through self organised transcoding, in: Proceedings of the Network and Operating Systems Support for Digital Audio and Video (NOSSDAV 98), July 1998. [2] S. McCanne, Van Jacobson, M. Vetterli, Receiver-Driven Layered Multicast, SIGCOMM, Stanfort, CA, August 1996, pp. 117–130. [3] D. Tennenhouse et al., A survey of active network research, IEEE Commun. Mag. 35 (January) (1997) 80–86. [4] J.M. Smith, K.L. Calvert, S.L. Murphy, H.K. Orman, L.L. Peterson, Activating networks: a progress report, IEEE Computer Mag (April) (1999) 32–41. [5] K.L. Calvert, S. Bhattacharjee, E. Zegura, J. Sterbenz, Directions in active networks, IEEE Commun. Mag. 36 (October) (1998) 72–78. [6] K. Psounis, Active networks: applications, security, safety, and architectures, IEEE Communications Surveys, First Quarter, 1999. Available from: . [7] P. Parnes, Media distribution in heterogeneous environments using IP-Multicast, 1998. Available from: . [8] J.C. Pasquale, G.C. Polyzos, E.W. Anderson, V.P. Kompella, Filter propagation in dissemination trees: trading off bandwidth and processing in continuous media networks, International Workshop on Network and Operating Sys-
[12] [13] [14] [15]
[16]
[17] [18] [19] [20] [21]
[22]
[23]
tem Support for Digital Audio and Video (NOSSDAV), November 1993, pp. 259–268. E. Amir, S. McCanne, H. Zhang, An Application Level Video Gateway, in: Proceedings of the ACM Multimedia 95, San Francisco, CA, 1995. H. Akamine, N. Wakamiya, M. Murata, H. Miyahara, On the construction of heterogeneous multicast distribution trees using filtering in an active network, in: Proceedings of the Quality of future Internet Services Workshop (QofIS 2000), September 2000. A. Singh, A. Trivedi, K. Ramamritham, P. Shenoy, PTC: proxies that transcode and cache in heterogeneous web client environments, in: Proceedings of the Third International Conference on Web Information Systems Engineering, December 2002. ILOG Inc. Available from: . M.R. Garey, D.S. Johnson, The rectilinear steiner problem is NP-complete, SIAM J. Appl. Math. 32 (1977) 826–834. L. Wei, D. Estrin, A comparison of multicast trees and algorithms, INFOCOM, 1994. L. Lastein, B. Reul, Transrating efficiency—Bit-rate reduction methods to enable more services over existing bandwidth, IBC 2002, September 2002. B. Reul, L. Lastein, Transrating Of MPEG-2 Streams various applications and different techniques, Broadcast Asia 2002, June 2002. http://www.vwebcorp.com/. http://www.vixs.com/. http://www.moonlight.co.il/. http://www.mpeg.org/. P.N. Tudor, O.H. Verner, Real-time transcoding of MPEG2 video bit streams, in: Proceedings of the International Broadcasting Convention, Amsterdam (NL), 1997. G. Keesman, R. Hellinghuizen, F. Hoeksema, G. Heideman, Transcoding MPEG bitstreams, Signal Process: Image Commun. 8 (6) (1996). S. Wee, J. Apostolopoulos, Secure scalable streaming enabling transcoding without decryption, ICIP-2001 International Conference on Image Processing, Poster-Session.
Thijs Lambrecht received his Masters Degree in Computer Science (option Telecommunication Networks) in 1999 from the Ghent University, Belgium. He is now a Ph.D. student at the same University where he is part of the Department of Information Technology (INTEC). He works in the Integrated Broadband Communications Network group (IBCN). He worked on ‘‘Routing with revenue optimalization’’ as part of his graduate thesis. His current research interests include Active Networks where his focus is on caching and multicast routing. He is also interested in optimization problems in general.
T. Lambrecht et al. / Computer Networks 50 (2006) 29–45 Bart Duysburgh received his electronic engineering degree in 1998 from the Ghent University, Belgium. Since September 1998, he is a research assistant for the Inter-University Micro-Electronics Center (IMEC) at the Department of Information Technology of the Ghent University. He worked on a software based IP and ATM traffic generator as part of his master thesis. His current research interests include active and programmable networks, traffic monitoring and network performance measurements, multicast routing, voice and video streaming over IP and assessment of voice and video quality.
Tim Wauters received his Masters Degree in Electrical Engineering in 2001 from the Ghent University, Belgium. He is now a Ph.D. student at the same University where he is part of the Department of Information Technology (INTEC). He works in the Integrated Broadband Communications Network group (IBCN). He worked on ‘‘Multicast Tree Optimalisation for Transcoding in Active Networks’’ as part of his graduate thesis. His current research interests include Peer-to-Peer Networks and Content Distribution Networks where his focus is on caching.
Filip De Turck received his M.Sc. degree in Electronic Engineering from the Ghent University, Belgium, in June 1997. In May 2002, he obtained the Ph.D. degree in Electronic Engineering from the same university. From October 1997 to September 2001, Filip De Turck was research assistant with the Fund for Scientific Research-Flanders, Belgium (F.W.O.-V.). At the moment, he is a part-time professor and a postdoctoral fellow of the F.W.O.-V., affiliated with the Department of Information Technology of the Ghent University. Filip De Turck is author or co-author of approximately 80 papers published in international journals or in the proceedings of international conferences. His main research interests include scalable software architectures for telecommunication network
45
and service management, performance evaluation and optimization of routing, admission control and traffic management in telecommunication systems. Bart Dhoedt received a degree in Engineering from the Ghent University in 1990. In September 1990, he joined the Department of Information Technology of the Faculty of Applied Sciences, University of Ghent. His research, addressing the use of microoptics to realize parallel free space optical interconnects, resulted in a PhD degree in 1995. After a 2 year post-doc in opto-electronics, he became professor at the Faculty of Applied Sciences, Department of Information Technology. Since then, he is responsible for several courses on algorithms, programming and software development. His research interests are software engineering and mobile & wireless communications. Bart Dhoedt is author or co-author of approximately 100 papers published in international journals or in the proceedings of international conferences. His current research addresses software technologies for communication networks, peer-to-peer networks, mobile networks and active networks.
Piet Demeester received the Masters degree in Electro-technical engineering and the Ph.D degree from the Ghent University, Gent, Belgium in 1984 and 1988, respectively. In 1992 he started a new research activity on broadband communication networks resulting in the IBCN-group (INTEC Broadband communications network research group). Since 1993 he became professor at the Ghent University where he is responsible for the research and education on communication networks. The research activities cover various communication networks (IP, ATM, SDH, WDM, access, active, mobile), including network planning, network and service management, telecom software, internetworking, network protocols for QoS support, etc. Piet Demeester is author of more than 400 publications in the area of network design, optimization and management. He is member of the editorial board of several international journals and has been member of several technical program committees (ECOC, OFC, DRCN, ICCCN, IZS, &).