High Quality and Resilient IPTV Backbone Networks André Ricardo Simões Bento Dissertation submitted for obtaining the degree of Master in Electrical and Computer Engineering Jury Supervisor: Prof. João Luís da Costa Campos Gonçalves Sobrinho President: Prof. Nuno Cavaco Gomes Horta Members: Prof. Joao Jose de Oliveira Pires Outubro 2012 To my grandmother ii Acknowledgements Acknowledgements Foremost, I would like to express my sincere gratitude to my supervisor Prof. João Luís Sobrinho for the continuous support of my M.Sc. study and research, for his patience, motivation and enthusiasm. His guidance helped me in all the time of research and writing of this thesis. Besides my advisor, I would like to thank my family, especially my parents, who supported me emotionally and financially all the way through the course. Last but not the least, I would like to thank my beloved girlfriend for her patience to see me writing this thesis throughout the summer and during her vacations. iii iv Abstract Abstract Multicasting of TV (real time video) streams with the IP protocol suite is a challenge. IP was initially designed for unicast not multicast. The routing and transport protocols of IP were not planned for applications requiring a combination of steady high bandwidth, low packet delay and low packet loss. This thesis focuses on the design of broadcast trees for the backbone of IPTV networks. We will present algorithms for the calculation of optimal out-branchings and resilient out-branchings, where in case of link failure, it is possible to substitute the link by a backup path, granting that the other links of the tree are not affected by the failure. The failure recovering is provided by Fast Reroute mechanisms, which are capable of handling failure in sub-second time (around 50ms). Keywords IPTV, Backbone, Minimum Out-Branchings, Resilient Out-Branchings, Fast Reroute, MPLS v vi Abstract Resumo Distribuir fluxos de TV (vídeo em tempo real) usando os protocolos IP para transmissões pontomultiponto é um desafio. O IP foi inicialmente desenhado para transmissões ponto a ponto, não ponto-multiponto. Os protocolos de transporte e encaminhamento do IP não foram planeados para aplicações que requerem-se uma combinação de largura de banda estável, baixo atraso e baixa perda de pacotes. Esta tese foca-se no desenho de arborescências de distribuição para o núcleo das redes IPTV. Apresentamos algoritmos para o cálculo de arborescências óptimas e arborescências resilientes, onde, no caso de falha de uma ligação é possível substituir a ligação que falhou, por um caminho de protecção, garantindo que as outras ligações da arborescência não são afectadas pela mudança. A recuperação das falhas é feita usando Fast Reroute, um mecanismo capaz de lidar com falhas em tempos que rondam os 50 ms. Palavras-Chave IPTV, Núcleo, Arborescências de Custo Mínimo, Arborescências Resilientes, Fast Reroute, MPLS vii viii Table of Contents Table of Contents Acknowledgements ................................................................................ iii Abstract ................................................................................................... v Resumo ................................................................................................ vii Table of Contents ................................................................................... ix List of Figures ........................................................................................ xi List of Tables......................................................................................... xiii List of Algorithms .................................................................................. xv List of Acronyms .................................................................................. xvii 1 Introduction .................................................................................. 1 1.1 Problem Statement .................................................................................. 2 1.2 Contents .................................................................................................. 3 1.3 Contributions ........................................................................................... 3 1.4 State of the Art......................................................................................... 4 1.5 IPTV Architecture .................................................................................... 5 2 The Unicast Routing Problem ....................................................... 9 2.1 Network Service Models ........................................................................ 10 2.2 Routing Protocols .................................................................................. 11 2.3 MPLS ..................................................................................................... 13 2.4 Fast Reroute .......................................................................................... 20 3 The Multicast Routing Problem .................................................. 29 3.1 From Unicast to Multicast ...................................................................... 29 3.2 Protocol Independent Multicast (PIM) .................................................... 31 3.3 Point-to-Multipoint LSPs (P2MP LSPs) ................................................. 35 3.4 Protocol Architecture ............................................................................. 37 3.5 Failure Recovering Process................................................................... 39 ix 4 5 Designing Multicast Out-branchings ........................................... 43 4.1 Definitions .............................................................................................. 44 4.2 Minimum Cost Out-Branchings .............................................................. 46 4.3 Connectivity Theorems .......................................................................... 51 4.4 Link-Disjoint Out-branchings Algorithms ................................................ 56 4.5 Comparison of the Out-branchings ........................................................ 70 4.6 Protection Paths .................................................................................... 73 Conclusions................................................................................ 75 References............................................................................................ 79 x List of Figures List of Figures Fig. 1 – IPTV architecture .........................................................................................................................5 Fig. 2 – Structure of the backbone of an IPTV network ...........................................................................6 Fig. 3 – Changes in case of link failure. The link represented in bold is introduced to substitute the link that failed (represented with a cross) .................................................................6 Fig. 4 – Verizon’s continental US backbone network (2011) [31] ............................................................7 Fig. 5 – T-Systems (aka Deutsche Telekom) germany50 [26] backbone network (2006) .......................8 Fig. 6 – Layer model ...............................................................................................................................10 Fig. 7 – Various Virtual Circuits established over a Physical Link: one per packet stream ...................11 Fig. 8 – MPLS layer model .....................................................................................................................13 Fig. 9 – Label Switching in MPLS. A packet with label 50 arrives at LSR at interface B. The LSR looks at its label switching table a sends the packet with label 40 through interface C. ....................................................................................................................14 Fig. 10 – MPLS’s header structure .........................................................................................................15 Fig. 11 – MPLS tunnel ............................................................................................................................15 Fig. 12 – LDP session establishment .....................................................................................................17 Fig. 13 – Path signalling using RSVP .....................................................................................................19 Fig. 14 – Constrain based routing example that uses bandwidth as a routing metric ...........................20 Fig. 15 – Network Failure Anatomy ........................................................................................................21 Fig. 16 – Local Protection .......................................................................................................................22 Fig. 17 – End-to-end protection ..............................................................................................................23 Fig. 18 – Setting up a facility backup ......................................................................................................24 Fig. 19 – Forwarding traffic using the facility backup .............................................................................24 Fig. 20 – Setting up a one-to-one backup ..............................................................................................25 Fig. 21 – Forwarding traffic over a one-to-one backup ..........................................................................26 Fig. 22 – Packet distribution in unicast. A packet stream is generated for each receiving node. ..........30 Fig. 23 – Packet distribution in multicast. Packet streams are not replicated at the source. .................30 Fig. 24 – Receiver “join” and source “register” in PIM-SM .....................................................................32 Fig. 25 – Source “join” and traffic flow in PIM-SM ..................................................................................32 Fig. 26 – Shortest path reconfiguration in PIM-SM ................................................................................33 Fig. 27 – Shortest Path Tree in PIM-SM after reconfiguration ...............................................................33 Fig. 28 – Source “join” in PIM-SSM. “Join” messages for different group are sent to its respective video source. ...............................................................................................34 Fig. 29 – Data flow in PIM-SSM. Video packets are sent to the reverse packet of the respective “join” message. .............................................................................................................34 Fig. 30 – P2MP LSPs example. The forwarding table of P1 shows the different labels using for the different interfaces. .................................................................................................35 Fig. 31 – P2MP LSP setup .....................................................................................................................36 Fig. 32 – Cross layer architecture ..........................................................................................................38 Fig. 33 – Setup of calculated out-branchings .........................................................................................39 Fig. 34 – Link failure scenario ................................................................................................................40 Fig. 35 – Minimum cost out-branching (bold red arrows) .......................................................................46 Fig. 36 - Algorithm to find MOBs: Step I (link cost update) ....................................................................47 xi Fig. 37 - Algorithm to find MOBs: Step II (0 cost cycle is agglomerated in a supernode) .....................48 Fig. 38 - Algorithm to find MOBs: Step III ...............................................................................................48 Fig. 39 - Algorithm to find MOBs: Step IV (0 cost links are selected as part of the MOB) .....................49 Fig. 40 - Algorithm to find MOBs: Step V (the links of the cycle except link are added to the MOB) .............................................................................................................................49 Fig. 41 – Link-disjoint example part I ......................................................................................................51 Fig. 42 – Link-disjoint example part II .....................................................................................................51 Fig. 43 – Two symmetric cycles. There are two disjoint paths between 0 and 1. ..................................52 Fig. 44 – Closed Path using the 2 links of .................................................................................53 Fig. 45 – Closed path using the 3 links of .................................................................................53 Fig. 46 – Simple path between sets and ......................................................................................54 Fig. 47 – 2-connected directed graph .....................................................................................................55 Fig. 48 – Graph with 2 link-disjoint out-branchings. One represented in normal red lines, the other in dashed blue lines .............................................................................................55 Fig. 49 – Edmond’s theorem application example. Set and have 3 and 2 entering links represented in bold, respectively. .................................................................................56 Fig. 50 – First iteration of algorithm 2: Part I (path is added to out-branching ) ...............................57 Fig. 51– First iteration of algorithm 2: Part II (path is added to out-branching ) ..............................57 Fig. 52 – First iteration of algorithm 2: Part III (the reverse edges of path are added to ) ...............58 Fig. 53 – First iteration of algorithm 2: Part IV (the reverse edges of path are added to ) ...............58 Fig. 54 – Final result of the application of algorithm 2 ............................................................................59 Fig. 55 – Algorithm 2 iteration: Part I (there are always two paths available from the two out-branchings to node ) .............................................................................................61 Fig. 56 – Algorithm 2 iteration: Part II ........................................................................................61 Fig. 57 – Choosing paths in algorithm 2: Part I (path - - is added to out-branching ) .................62 Fig. 58 – Choosing paths in algorithm 2: Part II (path - - - is added to out-branching )...........63 Fig. 59 – Choosing paths in algorithm 2: Part III (the reverse edges of the paths are added to the out-branchings) .......................................................................................................63 Fig. 60 – Choosing paths in algorithm 2: Part IV (the result of not choosing the shortest paths for out-branching ) ...........................................................................................................64 Fig. 61 – Graph and the maximum number of disjoint out-branchings ............................................66 Fig. 62 – First iteration of algorithm 4: Part I ..........................................................................................66 Fig. 63 – First iteration of algorithm 4: Part II .........................................................................................67 Fig. 64 – First iteration of algorithm 4: Part III ........................................................................................67 Fig. 65 – Final result of the application of algorithm 4 ............................................................................68 Fig. 66 – 3 out-branchings rooted at vertex 0 generated by our implementation of algorithm 4 ............70 Fig. 67 – 10 node link-disjoint trees generated using algorithm 4. (Main tree in Red) ...........................72 Fig. 68 – 10 node link-disjoint trees generated using algorithm 2 optimized for the main tree (Red) .............................................................................................................................72 xii List of Tables List of Tables Table 1 – Router S IGP routing table .....................................................................................................40 Table 2 – Algorithm comparison .............................................................................................................71 Table 3 – Protection path cost comparison ............................................................................................74 xiii xiv List of List of Algorithms Algorithm 1 - Finding a MOB rooted at in a given multigraph ........................................................49 Algorithm 2: Finding 2 link disjoint out-branchings rooted at in a 2-connected symmetric graph .........................................................................................................................59 Algorithm 3: Finding the maximum number of link-disjoint paths in from to . ..............................65 Algorithm 4: Finding a maximal set of link disjoint out-branchings in rooted at .............................68 xv xvi List of List of Acronyms AS Autonomous System ATM Asynchronous Transfer Mode BBC British Broadcasting Corporation BFD Bidirectional Forwarding Detection BFS Breadth First Search CBS Columbia Broadcasting System CNN Cable News Network CPU Central Processing Unit DSL Digital Subscriber Line DV Distance Vector ERO Explicit Route Object FEC Forward Equivalence Classes FRR Fast Reroute IGP Interior Gateway Protocols IP Internet Protocol IPTV Internet Protocol Television IPv4 Internet Protocol version 4 IPv6 Internet Protocol version 6 IS-IS Intermediate Systems to Intermediate Systems ISP Internet Service Providers LDP Label Distribution Protocol LER Label Edge Router LIB Label Information Base LS Link State LSA Link State Advertisements LSP Label Switched Path LSR Label Switched Routers MOB Minimum Out-Branching MP Merge Point MPLS Multi Protocol Label Switching MPLS-TE Multi Protocol Label Switching-Traffic Engineering MST minimum spanning tree OSPF Open Shortest Path First P2MP Point to Multipoint xvii PIM Protocol Independent Multicast PIM-DM Protocol Independent Multicast-Dense Mode PIM-SM Protocol Independent Multicast-Sparse Mode PIM-SSM Protocol Independent Multicast-Source Specific Mode PLR Point of Local Repair PON Passive Optical Networks QoS Quality of Service RIP Routing Information Protocol RP Rendezvous Point RRO Record Route Object RSVP Resource Reservation Protocol RSVP-TE Resource Reservation Protocol-Traffic Engineering SHO Super Hub Office SPT Shortest Path Tree TCP Transmission Control Protocol TE Traffic Engineering TTL Time to Live TV Television UDP User Datagram Protocol UK United Kingdom US United States VC Virtual Circuit VHO Video Hub Offices VOD Video On Demand VSO Video Serving Office xviii Chapter 1 Introduction 1 Introduction 1 1.1 Problem Statement In the past few years, we have seen the telecommunication carriers do a global effort in the rapid implementation of Internet Protocol Television (IPTV). IPTV encodes live TV streams in a flow of IP packets and delivers them to users through broadband networks. The key reasons for using IP networks for the transport of TV are: Convergence: The services offered by the telecommunications companies are converging on IP, including the transmission of digital voice and data. The addition of IPTV allows telecommunication carriers to strengthen their competitiveness by offering triple play services (digital voice, data and TV); Cost: Having a uniform platform reduces the cost of maintaining separate networks; Flexibility: IP has the ability to support diverse applications; this offers flexibility opening up opportunities for the introduction of new services like interactive TV and Video on Demand (VOD). telecommunication carriers use their IP networks to distribute the TV streams; this presents challenges because the networks were not designed for this type of service and IPTV requires: Minimal Packet Loss: IP packet loss can represent various levels of image damage, ranging from a single pixelated block in an image, to a long period of degraded, pixelated or unavailable sequence of images. In order to offer the user a satisfactory Quality of Service (QoS) packet loss needs to be minimal. Steady High Bandwidth: The total amount of video-stream data that can be sent is ultimately limited by the bandwidth provisioned over the network. Any increase in bandwidth demand that is above the maximum capacity of the link, will result in video packets being lost. Low Packet Delay: Live TV streams are encoded and sent in IP packets. At the reception, an initial delay is introduced and the first packets are stored in a buffer. After the initial delay, packets are decoded and transmitted in the user’s TV. This mechanism helps in coping with short transmission delays and jitter. However, if a packet fails to arrive at the reception at the time it is supposed to be decoded, it is like it was lost since it will not be used. Sub-second Failure Restoration: In the case of link failure, restoration needs to be done in sub-second time to avoid packet loss, which could cause image damage or even service interruption. 2 1.2 Contents In this dissertation we will evaluate network designs that can cope with the challenging requirements of IPTV. We will start with an overview of the IPTV backbone architecture and its components. In chapter 2: The Unicast Routing Problem, we will discuss the network service models and protocols used to forward traffic sent from a single source to a single location in the network. We will also address the resource reservation problem and refer to a link-failure protection mechanism called Fast Reroute, which is capable of providing sub-second restoration. The emergence of applications like IPTV led to introduction of routing protocols capable of forwarding traffic to a group of recipients. The changes in routing, from a source to multiple destinations, as opposed to routing from a source to a single destination, are addressed in chapter 3: The Multicast Routing Problem. We will see the out-trees that are currently being used for forwarding and analyse mechanisms to calculate them. An out-tree is a directed tree rooted at a specific source vertex. If the out-tree spans all the vertices of a given graph is called out-branching. In the end of chapter 3, we will also discuss possible protocol suites that can cope with the requirements of IPTV and simultaneously accommodate custom out-branchings. We will then describe a methodology to handle link failure. The purpose of this methodology is to avoid packet loss when using the Fast Reroute mechanisms. In chapter 4: Designing Out-branchings, we will present the custom out-branching designs. We will present algorithms to calculate optimal out-branchings and resilient out-branchings. In the end of the chapter we will compare the designs and evaluate its pros and cons. Finally, in chapter 5: Conclusion, we will do a short summary of the topics covered by this thesis and discuss future work to be done in this subject. 1.3 Contributions The objective of this thesis is to compute and implement different out-branching designs, understanding the advantages and disadvantages of these implementations. Given a certain outbranching we purpose a protocol suite that can use this out-branching to transport IPTV packets and, at the same time, is capable of handling the requirements of IPTV. We have studied the interaction between the involved protocols and based on previous work ([1], [2], [3] and [4]), developed a strategy to operate them. We have also studied different out-branching designs: Shortest path out-branchings, where the cost of the paths from the root to the each vertex is minimum; 3 Minimum cost out-branchings, where the sum of the costs of the links in the out-branching is minimum; Resilient out-branchings, where each link of the out-branching can be protected by a path that is disjoint from the out-branching. If the protection paths shared links with the main tree, these links would transport more than packet stream replicas, this could cause link overload. We have implemented algorithms that generate the out-branchings above. Regarding the resilient out-branchings we will present 2 algorithms. Given a certain graph, one of them generates 2 disjoint out-branchings and the other generates the maximum number of link-disjoint out-branchings possible. For the first, we have come up with 2 heuristics that improve the resultant out-branchings. The purpose of the heuristics is the following: Improve the main out-branching: where the objective is to reduce the link cost of the paths between the root and the vertices in the main out-branching; Improve for the backup paths: where the objective is to reduce the total link cost of paths used for backup. As far as the protection paths are concerned, we also purpose a way of calculating them. 1.4 State of the Art A group of authors ([1], [2], [3] and [4]), suggested a protocol suite capable of accommodate custom out-branching designs, as well as a failure handling method used to avoid packet loss during the Fast Reroute recovering process . The authors have also suggested an algorithm to calculate outbranchings, and respective disjoint protection paths. This algorithm took care of link overload problem in case failure, however, the authors were not preoccupied with the cost of the out-branchings. They also did not had in consideration the cost of the protection paths. Other authors came up with algorithms [5] that resulted in similar out-trees, also disregarding the cost optimization aspect. In [6] and [7] simple schemes are proposed to solve single link failures in sub-second time with low packet loss in the process, these schemes introduce minor changes in routing and are easy to implement. However, these schemes disregard the link overload problem during the recovery. Like the others the proposed solutions do not account for the costs of the out-branchings. Another different solution is proposed in [8], this scheme uses fast layer 3 failure detection, this type of schemes is often disregarded by the telecommunications carriers because they require short timers to detect the failures, this can produce false positive cases and as we will see further ahead can compromise the best routing options. Finally in [9], 3 protocol architectures are discussed with the objective of distributing IPTV streams, the advantages and disadvantages of the architectures are briefly explained. 4 1.5 IPTV Architecture IPTV networks are usually built on top of existent IP networks and are separated in three areas: access, metro and backbone. These areas have different components and philosophies. Fig. 1 shows a typical IPTV architecture. National Channel Servers Local Channel Servers Cable Access SHO VHO PON Access VSO DSL Access Super Hub Office Video Hub Office Video Serving Office Backbone Metro Acess Fig. 1 – IPTV architecture In the backbone, there are two essential components: the Super Hub Office (SHO) and the Video Hub Offices (VHOs). The SHO gathers the TV streams of the national channels (channels that are not specific to a region or state, like CNN in the US, or BBC One in the UK) and distributes them to a large set of receiving locations, the VHOs. Each VHO gathers the TV streams of the local channels (channels that are only reproduced in a specific region like WCBS the CBS channel of New York) and feeds both national and local streams to a metro area. IP routers are used to transport the IPTV content from the SHO to the VHOs. In addition, both SHOs and VHOs are enabled with powerful buffer servers to enable retransmissions and some other Video-On-Demand (VOD) and interactive services. Most of them have a satellite used to receive local/national contents. Is at the VHO, that traffic leaves the core network. It distributes the contents to Video Serving Offices (VSOs) via the routers in the metro area. The VSOs establish the border between metro and access networks, than can either be Cable, Digital Line Subscriber (DSL), or Passive Optical Networks (PON). The focus of this thesis is on the backbone of the IPTV networks, which are supported by the telecommunication carriers’ already existent IP networks. Fig. 2 shows an example of the backbone of an IP network that supports IPTV. The bold green circle represent de SHO and dotted and dashed green circles represent de VHOs. Both SHOs and VHOs incorporate two routers so that in case of router failure, the network is connected in a way, which grants that there is always one router available to feed the VSOs. The black lines are undirected edges; every edge has an upstream and 5 downstream link. The normal black lines represent edges that have a link used in the out-branching. This out-branching, rooted at the SHO, is depicted by the red arrows that represent links. The black dashed lines are unused edges. They are used to substitute a link of the out-branching in case of failure. SHO VHO SHO VHO VHO VHO Router VHO Used Edge Unused Edge VHO Link of the broadcast out-tree VHO Fig. 2 – Structure of the backbone of an IPTV network SHO VHO SHO VHO VHO VHO Router VHO Used Edge Unused Edge VHO Link of the broadcast out-tree VHO Fig. 3 – Changes in case of link failure. The link represented in bold is introduced to substitute the link that failed (represented with a cross) Most developed countries have their own IP backbone networks. In general, there are few big telecommunication carriers per country, each one supporting its own national backbone network. The 6 main difference between these networks is size. When compared to the US’s, the European networks are generally smaller. Nevertheless major IPTV carriers in US and in Deutschland, Verizon and T-Systems, respectively, support IP backbone networks that are about the same size. These networks are represented in Fig. 4 and Fig. 5. Fig. 4 – Verizon’s continental US backbone network (2011) [10] 7 Fig. 5 – T-Systems (aka Deutsche Telekom) germany50 [11] backbone network (2006) As we can see Verizon’s network has 47 nodes and T-Systems’ has 50 nodes. We will focus our work in this kind of networks: big IPTV backbone networks. Instead of analysing European countries individually, we could also picture it as a whole, but mind that in Europe, carriers distribute IPTV contents in and for its own country. This explains the existence of an IPTV backbone area per country per telecommunication carrier, to where video contents should be derived and then distributed. 8 Chapter 2 The Unicast Routing Problem 2 The Unicast Routing Problem 9 This chapter overviews the network service models and routing protocols used in unicast routing. Unicast is the term that designates the sending of messages from a single source to a single destination in the network. We will also talk about Multi Protocol Label Switching and Fast Reroute. 2.1 Network Service Models Perhaps the most important abstraction provided by the network layer to the upper layers is whether it uses virtual circuits (VCs) or datagrams [12]. Sender Receiver ... ... Layer 4 Transport Layer 4 Transport Layer 3 Network Layer 3 Network Layer 2 Data-link Layer 2 Data-link Layer 1 Physical Layer 1 Physical Fig. 6 – Layer model VCs provide a connection oriented service through the establishment and teardown of pseudo (virtual) circuits. This mechanism enables the reservation of link resources, providing steady high bandwidth, low delay and low packet loss. Nevertheless, VCs require state information to be recorded in the routers for each packet stream. 10 Physical Link Virtual Circuit Physical Link Physical Link Fig. 7 – Various Virtual Circuits established over a Physical Link: one per packet stream In contrast, datagrams do not require state information to be recorded in the routers for each flow of data, since data transmission is not preceded by circuit establishment. IP uses datagrams; it has low complexity and therefore handles heterogeneous networks best. IP uses resources evenly through any number of users, giving them no guarantees whatsoever. Due to its lack of guarantees, IP service is often called a best-effort service. As far as IPTV is concerned, we can see that VCs would be suitable for video transmission, however IP uses datagrams. Multi Protocol Label Switching is an answer to this dichotomy. MPLS is a label switching mechanism that allows the traffic to be divided into forwarding classes which may be forwarded the same way. The use of MPLS together with the Resource Reservation Protocol (RSVP) enables the reservation of resources to a specific data flow type (eg. Video), keeping only state information relative to the flow type, not the flow itself. MPLS is fully compatible with IP and in fact RSVP uses the IP routing protocol information to do the reservation of resources. We will talk about MPLS and RSVP later, for now, we will explore the IP routing protocols, from which RSVP depends on. 2.2 Routing Protocols In this dissertation we are particularly interested on the best way of using the resources of a particular operator. Given so, we are then only interested in routing within a given autonomous system (AS). An intra-AS routing protocol is used to maintain the forwarding tables in the routers of an AS. Once the routing tables are configured, datagrams are routed within the AS. Intra-AS routing protocols 11 are also known as Interior Gateway Protocols (IGP). IGP can be divided into two categories: DistanceVector (DV) protocols and Link-State (LS) protocols. The name distance-vector is derived from the fact that routes are advertised as vectors of (distance, direction), where distance is defined in terms of link cost and direction is defined in terms of the next-hop router. Each router learns routes from its neighbouring routers and then advertises the paths from its own perspective. A typical distance vector routing protocol is a routing algorithm in which routers periodically send routing updates to all neighbours by broadcasting their entire routing tables. The algorithm used to calculate the shortest paths is distributed and because of that routing loops may occur when using a DV protocol [13]. In a LS protocol, the network topology and all link costs are known in each router. This is accomplished by having each node broadcast the identities and costs of its attached links to all other routers in the network. This link state broadcast can be accomplished without the routers having to initially know the topology of the network. A node only needs to know the identities and costs to its directly-attached neighbours; it will then learn about the topology of the rest of the network by receiving link state broadcast messages from the other nodes. The result of the nodes' broadcast is that all nodes have an identical and complete view of the network. Each router can then run a Dijkstra algorithm to calculate the shortest paths to the other routers. Any link cost changes or failure occurrences are communicated through the network using Link State Advertisements (LSAs). The key differences between distance vector and link state routing algorithms are the following: Convergence time: the computation time to calculate the shortest paths in DV protocols take usually take longer than in LS protocols. The distributed computations can result in cycles that can extend the computation time; State information: LS protocols require the routers to store state information of the whole network, in other words routers need to know the complete topology of the network. DV protocols only need to store information related to their neighbours. The fact that in LS routing protocols, routers have complete knowledge of the network topology is useful if we want to design custom out-branchings. If the routers know the topology they can use it to calculate the out-branchings independently, reaching the same results without having the necessity of exchanging messages between the routers. The most popular LS protocols are Open Shortest Path First (OSPF) and Intermediate Systems to Intermediate Systems (IS-IS). They are similar; the operator’s choice is usually based on convenience issues, not technical aspects. Another advantage of using LS protocols is its low convergence time. This is especially useful upon failure, in order to recalculate the shortest paths in the new topology (after failure). IS-IS and OSPF use a message exchange mechanism to detect failures. They keep HELLO messages between nodes to evaluate their condition; they use timers to detect the non-reception of a scheduled HELLO packet. When a failure is detected, routers distribute this info through the network using LSAs. Note that the choice of the timers is a tricky question, short timers may cause false positive failure detection, on the other hand, long timers may cause slow failure detection and in consequence, 12 packet loss. The failure detection times depend on the timers, but in common implementations range from 1 to 10s. Late extensions of OSPF and IS-IS offer a Fast-Hello mechanism that is able to detect failure in less than 1s. After the application of the LS protocol the resultant shortest path may be used to forward data packets. However if we use MPLS we can choose alternate paths, other than the shortest. 2.3 MPLS Multi-Protocol Label Switching (MPLS) [14] began in the mid-1990s and appeared mainly to solve problems on IP over Asynchronous Transfer Mode (ATM) networks, which used VCs to forward the IP datagrams. MPLS has generally four main objectives: Improving scalability using labels to aggregate state information and to do routing hierarchies; Improving routing flexibility, using the labels to identify traffic with specific necessities to construct personalised paths; Optimizing network performance if used globally; Simplifying router integration with switching technology by forcing switches to act like routers, reporting physical topology information to the network layer and by unifying the routing, control and addressing of layers 2 and 3. Multi protocol label switching operates at a layer that is generally considered to lie between traditional definitions of layer 2 (data link layer) and layer 3 (network layer), and thus is often referred to as a "layer 2.5" protocol. It was designed to provide a unified data-carrying service for both circuitbased clients and packet-switching clients which provide a datagram service model. It can be used to carry many different kinds of traffic, including IP packets. ... IP Layer 3 - Network Layer Layer “2,5" - MPLS MPLS ATM / Frame Relay / Ethernet/ ... Layer 2 - Data-link layer DSL / 10BASE-T / OTN/ ... Layer 1 - Physical Layer Fig. 8 – MPLS layer model This label switching mechanism brought by MPLS consists on the addition of a short extra label 13 in each IP packet. In a MPLS network, like the one represented in Fig. 9, a packet enters a MPLS domain by a Label Edge Router (LER), which introduces a MPLS header in the packet and sends it to the core of the network. This header contains the label that has forwarding information to be used by the core routers named Label Switched Routers (LSR). When the packet reaches the end of a path (to an exit LER), the label is removed. This path is called Label Switched Path (LSP). The labels are stored in a label switching table, where each entry contains a pair incoming label, incoming interface and the corresponding pair containing the out-coming label and its destination interface. IN Int OUT Int OUT Label IN B 50 C 40 Label 40 50 A IP C B LSR IP LSP D LER (Egress) LER (Ingress) D Fig. 9 – Label Switching in MPLS. A packet with label 50 arrives at LSR at interface B. The LSR looks at its label switching table a sends the packet with label 40 through interface C. Labels have a local character and can be distributed using the Label Switching Protocol (LDP) or the Resource Reservation Protocol (RSVP), which we will refer to later in this chapter. Fig. 10 indicates the structure of MPLS’s header. It is formed by: Label – Packets are expedited based on this field. This field will be used to index MPLS’s expedition table, the Label Information Base (LIB); EXP – originally meant to be experimental, are now used to classify packets into service classes; S – Labels can be stacked. S is a bit used to identify the bottom of the stack. TTL – Time to Live field is used to avoid cycles and infinite packet redistribution. 14 Data-link Layer Header MPLS Header Label 20 bit IP Header EXP 3 bit S 1 bit TTL 8 bit Fig. 10 – MPLS’s header structure Since it is not possible to have a “personalized” treatment to each packet, they need to be grouped in classes. These are called Forward Equivalence Classes (FEC). Each class contains packets with same transport requirements. This concept is inherited from IP. However, in MPLS, it is possible to have much more traffic classes. The most common factors used to classify the packet into FECs are the packet stream destination, packet stream origin and content type (e.g. video). All packets of a given FEC may be bound to the same MPLS label, which are expedited the same way. In contrast with IP expedition, the attribution of FEC to a packet is only done once (you can change a packets FEC by adding a second label to the packet). Labels are the essential element for packet expedition. A LSR just needs to examine the label to know which will be the next node, to where the packet needs to be expedited to. Labels are local, this means they only identify univocally connections between LSRs. A. Label Stacking It is possible to insert more than one label per IP packet. This enables the existence of hierarchies in the LSPs. R1 R6 6 17 6 17 13 R3 21 6 R4 R5 21 13 13 R2 R7 Fig. 11 – MPLS tunnel 15 Fig. 11 shows an example of this hierarchy. In this example R1 sends a packet with label 6 to R3, which is the entry router of an MPLS tunnel. R3 receives the packet and instead of switching label 6, it adds another label: label 17. R3 then sends the packet to R4 that only evaluates the first label of the label stack, 17. R4 switches the label 17 with label 21 and forwards the packet to R5. This is an egress LSR, it removes label 21 from the stack and evaluates the inner labels. R5 then sends the packet without MPLS labels to the respective router, according to its forwarding table. With label stacking, R1 does not need to know the configuration of the MPLS tunnel. It just needs to know the entry and exit routers. On the other hand, R4 does not need to know the configuration outside the MPLS tunnel. There are two protocols that usually used to attribute labels to the packet streams: the Label Distribution Protocol (LDP) and the Resource Reservation Protocol (RSVP). B. LDP: Label Distribution Protocol LDP is the result of the MPLS Working Group in the IETF. LDP was made for MPLS unlike RSVP, which existed before and was extended to handle label distribution. Nevertheless, LDP is much used, especially when optimizing the network is not a vital issue. Typically it is used in metro or local networks. In other cases RSVP is the one to go with. LDP has the advantage of not requiring any manual configuration; nodes interact between each other in order to discover new LSPs automatically. Each LSR establishes a TCP session with its neighbours to run the protocol, this grants reliable message delivery. With this setup we can see that each router has only information from its neighbours, this limits the labelling options but enhances scalability. The operation of LDP is driven by message exchanges between peers. Potential peers, also known as neighbours, are automatically discovered via hello messages multicast to a well-known UDP port. The protocol enables the discovery of remote peers using targeted hello messages. Once a potential peer is discovered, a TCP connection is established to it and an LDP session is set up. At session initialization time, the peers exchange information regarding the features and mode of operation they support. After session setup, the peers exchange information regarding the binding between labels and FECs over the TCP connection. As we have referred, the use of TCP ensures reliable delivery of the information, but it also allows the use of incremental updates, rather than periodic refreshes. LDP uses the regular receipt of protocol messages to monitor the health of the session. In the absence of any new information that needs to be communicated between the peers, keep-alive messages are sent. Fig. 12 shows LDP session establishment. 16 Fig. 12 – LDP session establishment LDP does its routing based on the information it gets from the running IGP protocol. Having an IGP running below LDP is therefore essential. The association between an FEC and a label is advertised via label messages: label mapping messages for advertising new labels, label withdraw messages for withdrawing previously advertised labels. The fundamental LDP rule states that LSR A that receives a mapping for label L for FEC F from its LDP peer LSR B will use label L for forwarding if and only if B is on the IGP shortest path for destination F from A’s point of view. This means that LSPs setup via LDP always follows the IGP shortest path and that LDP uses the IGP to avoid loops. Given our necessities, LDP by its own, does not seem to be the way to go. It is a highly scalable and simple protocol but it fails when we try to get optimization out of it. It is not to make full path reservations or have a good traffic engineering process, since it does its routing automatically and based on IGP information. Therefore we shall evaluate our other choice, RSVP, which gives us better tools to treat video flows. C. RSVP Another scheme for distributing labels for transport LSPs is based on the Resource Reservation Protocol (RSVP). RSVP was invented before MPLS. It was originally developed as a scheme to create bandwidth reservations for individual traffic flows in networks (e.g. a video telephony session between a particular pair of hosts) in an attempt to bring QoS to IP. RSVP includes mechanisms for reserving bandwidth along each hop of a network for an end-to-end session. However, the original QoS application of RSVP has fallen out of favor because of concerns about its scalability: the number of end-to-end host sessions passing across a service provider network would be extremely large, and it would not be desirable for the routers within the network to have to create, maintain and tear down state as sessions come and go. In the context of MPLS, however, RSVP has been extended to allow it to be used for the creation and maintenance of LSPs and to create associated bandwidth reservations. When used in this context, the number of RSVP sessions in the network is much smaller because of the way in 17 which traffic is aggregated into an LSP. A single LSP requires only one RSVP session, yet can carry all the traffic between two LER, containing many end-to-end packet streams. MPLS also allows the traffic to be aggregated into FECs, through the simple label switching mechanism, reducing the number of existent packet streams. In contrast to LDP, a RSVP-signalled LSP may not necessarily follow the path dictated by the IGP and, in its extended form, it is possible for the ingress router to specify the entire end-to-end path that the LSP must follow, having full control over the path. This property is very important in the context of traffic protection schemes such as Fast Reroute, discussed in detail lately in this chapter. RSVP requests resources for a traffic stream in only one direction from sender to one or more receivers and maintains soft state (the reservation at each node needs a periodic refresh) of the host and routers' resource reservations, hence supporting dynamic automatic adaptation to network changes. Paths may be computed online by the router or offline using a path computation tool. In the case of online computation, typically only the ingress router needs to be aware of any constraints to be applied to the LSP. Moreover, use of the explicit routes eliminates the need for all the routers along the path to have a consistent routing information database and a consistent route calculation algorithm. The creation of an RSVP-signalled LSP is initiated by the ingress LER. The ingress LER sends an RSVP Path message. The destination address of the Path message is the egress LER. The Path message includes the IP address of the previous node and some data objects, the most relevant are: Label Request Object - Requests a MPLS label for the path. As a consequence, the egress and transit routers allocate a label for their section of the LSP. Explicit Route Object (ERO) - The ERO contains the addresses of nodes through which the LSP must pass. If required, the ERO can contain the entire path that the LSP must follow from the ingress to the egress. Record Route Object (RRO) - RRO requests that the path followed by the Path message (and hence by the LSP itself once it is created) be recorded. Each router through which the Path message passes adds its address to the list within the RRO. A router can detect routing loops if it sees its own address in the RRO. Sender TSpec - TSpec enables the ingress router to request a bandwidth reservation for the LSP in question. In response to the Path message, the egress router sends a Resv message. Note that the egress router addresses the message to the adjacent router upstream, rather than addressing it directly to the source. This triggers the upstream router to send a Resv message to its upstream neighbour and so on. As far as each router in the path is concerned, the upstream neighbour is the router from which it received the Path message. This scheme ensures that the Resv message follows the exact reverse path of the Path message. Here are some of the objects contained in a Resv message: 18 Label Object - Contains the label to be used for that section of the LSP. Record Route Object (RRO) - Records the path taken by the Resv message, in a similar way to the RRO carried by the Path message. Again, a router can detect routing loops if it sees its own address in the Record Route Object. As can be seen, RSVP Path and Resv messages need to travel hop-by-hop because they need to establish the state at each node they cross, e.g. bandwidth reservations and label setup. R2 R5 R1 100 300 200 R3 R4 Path messages Resv messages Fig. 13 – Path signalling using RSVP Fig. 13 shows an example of the establishment of a path. R1 needs to setup a data packet stream with specific QoS to R5. To accomplish it, R1 sends a Path message to R3, asking for a label for a given data flow. R3 then redirects the Path message to router R4 and R4 redirects it to R5. R1 needs to resend this Path message every 30 seconds to refresh the state. When the message reaches its destination, router R5, it allocates a label for that flow locally and responds with a Resv message to R4. In this case label 100 was chosen. As soon as the Resv message reaches R4, the router must first make a reservation based on the request parameters and then forward the request upstream (in the direction of the sender). This process is repeated in the upstream routers. The routers store the characteristics of the packet stream, and also police it to grant the specified QoS. This is all done in soft state, so if nothing is heard for a certain length of time, then the reservation will be cancelled. This solves the problem if either the sender or the receiver crash or are shut down incorrectly without first cancelling the reservation. D. Traffic Engineering Traffic engineering is the process of routing data traffic to balance the traffic load on the various links, routers, and switches in the network and is most applicable in networks where multiple parallel or alternate paths are available. Fundamentally, the objective of traffic engineering is to ensure sufficient capacity exists to handle the forecast demand from the different service classes while 19 meeting their respective QoS objectives. Along the years many protocols got Traffic Engineering (TE) extensions to better cope with nowadays traffic demands, MPLS was no different. MPLS-TE brought with it the concept of constrain based routing, a change in shortest path concept. If all the traffic is forwarded through the same “shortest paths”, paths may become congested. Perhaps we can get better results distribute the traffic wisely between the paths available. Constraint-based routing is a mechanism used to meet TE requirements, which can take in to account more than one metric to define its paths. Besides link cost, it can use bandwidth, delay and jitter. 1 R1 R2 R3 1 1 1 R4 R5 40Mbit/s traffic 1 70Mbit/s traffic R6 R7 40Mbit/s bandwidth 100Mbit/s bandwidth Fig. 14 – Constrain based routing example that uses bandwidth as a routing metric Fig. 14 shows an example of constrain based routing that uses bandwidth as a routing metric. If short path first routing was used in this case, both traffic flows from R1 to R5 would pass through path R1-R2-R3-R4-R5, which has the lowest total combined cost. This is the worst path in terms of bandwidth and will have difficulties in handling the traffic. 2.4 Fast Reroute Fast Reroute (FRR) [14] is a technology that offers resiliency to MPLS networks. With FRR it is possible to quickly recover from link failure. The medium time down time when a single fails is around 50 ms, which is far better than convergence times of the IGP protocols. FRR uses pre-programmed backup paths to detour the packets from the affected links, offering a safe path that can quickly be introduce once failure is detected. These procedures are transparent to the IGP protocols and therefore will leave the layer 3 routing tables intact. 20 R1 PLR MP R2 R5 R3 R4 R6 Fig. 15 – Network Failure Anatomy Fig. 15 shows the anatomy of a failure in a given network. If connection R2-R5 fails, traffic will be redirected through path R2-R3-R4-R5. The point that is upstream of the failure is called Point of Local Repair (PLR); on the other hand, the point downstream of the failure is called Merge Point (MP). FRR gives us essentially two schemes for protecting a given LSP. We can protect a full path (end-to-end) or just do a local protection detour (hop-by-hop). Both have its advantages and disadvantages and although local protection is the most popular, they can be used complementarily. A. Local Protection The idea of local protection is simple. Instead of protecting the full path, the traffic is only detoured at the PLR, protecting only one link. This is pretty similar to what happens when a freeway between two cities closes down somewhere between exits A and B. Instead of redirecting the traffic through other national routes, vehicles are redirected to a detour at exit A, which leads them to point B where they enter the freeway once again. Following the freeway’s philosophy, it is necessary to create what we call detour path as we can see in Fig. 16. 21 R8 R9 Detour S D R1 R2 LSP from S to D R3 R4 R5 Fig. 16 – Local Protection: If link R1R2 fails traffic is detoured at the PLR (R1), returning to the LSP at the MP (R2) B. End-to end Protection End-to end protection consists on having a secondary LSP protecting a primary LSP (Fig. 17). This LSP is an end-to-end path that redirects traffic between two nodes that are typically a few hops apart in the network. This is usually done offline given that it is normally necessary that the involved nodes have full network knowledge, which may not be available to all routers (which may be in different Link State areas or just not running any Link State protocol at all). On the other hand, this scheme of protection offers advantages as well. Since the backup paths normally are callibrated by the network manager, it is possible to have better QoS guarantees. Another advantage is that no matter what link(s) fail in the primary LSP, it is always possible to protect it. Path signaling is usually done by RSVP. 22 LSP2 secondary R4 LSP1 primary D S R1 R2 R3 Fig. 17 – End-to-end protection Both in local and end-to-end protection, the protection paths must be computed and signalled before a failure happens. We have seen in MPLS that flag S indicated the end of a stack of labels. We stack an extra label on top of another usually to prevent the following router to access the inner label. This encapsulation mechanism forms a LSP tunnel; these tunnels can be used for protection. There are two methods for doing protection LSPs, which differ in the label with which the traffic arrives at the MP. This in turn influences the number of LSPs that can be protected by a single backup tunnel, yielding either N:1 (facility backup) or 1:1 (one-to-one backup). C. Facility Backup In facility backup, a single protection LSP is used to protect many primary LSPs. Traffic arrives over the backup tunnel with the same label as it would if it arrived over the failed link. The only difference from the point of view of forwarding is that, when arriving to the MP, it may come from different interfaces. Normally, traffic comes from the protected LSP, but in failure cases, traffic may travel through the protection tunnel. In both cases, traffic reaches the MP with the same label. To ensure that traffic arrives at the MP with the correct label, all that needs to be done is to tunnel it into the backup by pushing the backup tunnel label on top of the protected LSP label at the PLR (label stacking) and remove the first label on the stack at the penultimate node before reaching the MP (penultimate hop popping). The ability for several LSPs to share the same protection path is an important scaling property of facility backup. 23 Swap 201→200 Pop 200 C D 200 Protection tunnel for link A-B 201 102 X Push 102 Pop 101 A Swap 102→101 B 100 Y Swap 101→100 Pop 100 Protected LSP from X to Y Fig. 18 – Setting up a facility backup Fig. 18 shows the setup of the backup tunnel before the failure and the forwarding state that is installed at every hop in the path. No new forwarding state is installed at the MP (B). At the PLR (A), the forwarding state must be set in place to push the label of the backup path (label 201 in the example). Any number of LSPs crossing link A–B can be protected by the backup shown in the figure. There is no extra forwarding state for each LSP protected either at the MP or at any of the routers in the path. The label that is advertised by the MP is an implicit null label this triggers penultimate hop popping to be performed for the backup tunnel. Thus, traffic arrives at the MP with the same label with which it would have arrived over the main LSP. Swap 201→200 200 Pop 200 101 D IP 201 102 101 101 IP IP C IP 100 IP X A B Y Push 102 Swap 102→101 Swap 101→100 Pop 100 Push 201 Fig. 19 – Forwarding traffic using the facility backup Fig. 19 depicts an example of forwarding IP traffic using facility backup. To better understand the process of recovering from a link failure, let us analyse each step of the fixing procedure: 24 Router X is not aware of the failure of link A-B, it pushes the attributed 102 label as usually; Router A has already detected that link A-B is down, it swaps label 102 to label to label 101, issued by router B, and pushes label 201 issued by router C to redirect the packet to the protection tunnel; Router C acts like if it was a normal LSP and swaps label 201 to label 200 issued by router D. Label 101 remains untouched; Router B advertised an implicit null label to router D. Therefore D knows that it needs to pop the last label for B, to know the origin of that packet. D then pops label 200 and sends the packet, with the untouched label 101, to router B. D. One-to-one Backup In One-to-one backup traffic arrives at the MP with a different label than the one used by the main (protected) LSP. Fig. 20 shows the setup of a one-to-one backup for the LSP from the previous example and Fig. 21 shows forwarding over the backup following a failure. Traffic arrives at the MP with label 300, the backup tunnel label and is forwarded using label 100, the protected LSP label. Thus, the MP must maintain the forwarding state that associates the backup tunnel label with the correct label of the protected LSP. If a second LSP were to be protected in this figure, a separate backup tunnel would be required for it, and a separate forwarding state would be installed at the MP. Swap 302→301 Swap 301→300 C D 200 Protection tunnel for link A-B 302 X Push 102 102 101 A 300 Swap 300→100 B 100 Swap 101→100 Swap 102→101 Y Pop 100 Protected LSP from X to Y Fig. 20 – Setting up a one-to-one backup Similar to facility backup, the forwarding state must be set up to map traffic from the protected LSP into the backup. For example, traffic arriving with label 102, the label of the protected LSP, is forwarded over the backup using label 302, the backup tunnel label. Note that, using this approach, the depth of the label stack does not increase when packets are forwarded over the backup path, because the top label is simply swapped to the backup tunnel label. 25 Swap 201→200 Pop 200 C D 300 302 300 IP IP 102 IP IP 100 IP X A B Y Push 102 Swap 102→302 Swap 300→100 Pop 100 Fig. 21 – Forwarding traffic over a one-to-one backup Let us once again go through the link failure recovering steps, but now using one-to-one backup: Router X unaware of the failure pushes label 102 to an IP packet; Router A had already detected the failure in link A-B and swaps label 102 to label 302, previously advertised by router C; Router C and D act like if it was a normal LSP and remove the arriving label. After that they add the label advertised by the downstream router; Finally the packet reaches router B with a label that has been negotiated between B and D, instead of being negotiated between B and A. E. Failure Detection In order to accomplish the restoration goal of 50 ms, we need a way of quickly detecting a failure, which is out of FRR scope. Some transmission media provide hardware indications of connectivity loss. One example is SDH/SONET, which is, as we have seen in the first chapter, widely used in the network backbones. These technologies have the capability of detecting a failure in the physical layer within seconds. Some gigabit Ethernet routers have also that capability they use pulse detection to evaluate if the link is up. When failure detection is not provided in the hardware, this task can be accomplished by an entity at a higher layer in the network. As we have seen IGP protocols have a hello exchange mechanism to detect link failure. The original versions of OSPF and IS-IS had architectural time limits in the detecting link failure. The minimum time interval used to them was 3 seconds for OSPF and 1 second for IS-IS. Recent versions of the protocols included a fast-hello mechanism that that can be used to do sub-second failure detection. Nevertheless it is still CPU consuming and causes some overhead. 26 Based on this realization, the BFD protocol was developed jointly by Juniper and Cisco, having rapidly gained acceptance. It is a simple hello protocol designed to do rapid failure detection. Its goal is to provide a low-overhead mechanism that can quickly detect faults in the bidirectional path between two forwarding engines, whether they are due to problems with the physical interfaces, with the forwarding engines themselves or with any other component. 27 28 Chapter 3 The Multicast Routing Problem 1 The Multicast Routing Problem 29 3.1 From Unicast to Multicast If unicast transmissions were used to distribute TV streams, copies of the same video packets would pass through the links (Fig. 22). This would cause congestion. Packet Stream Fig. 22 – Packet distribution in unicast. A packet stream is generated for each receiving node. The problem of having copies of the same packets passing in a single link is solved with the introduction of multicast. Multicast [15] is the delivery of information to a group of destinations simultaneously in a transmission from a given source. In a multicast transmission links do not carry copies of the same packet streams, instead, packets are replicated only when they are needed in order to reach all the destinations (Fig. 23). Packet Stream Fig. 23 – Packet distribution in multicast. In this chapter we will talk about some novelties brought by multicast. The most common 30 concept of an IP address is in unicast addressing, available in both IPv4 and IPv6. It normally refers to a single sender or a single receiver, and can be used for both sending and receiving. A multicast address is associated with a group of interested receivers. In IPv4, addresses 224.0.0.0 through 239.255.255.255 (the former Class D addresses) are designated as multicast addresses. IPv6 uses the address block with the prefix ff00::/8 for multicast applications. In either case, the sender sends a single datagram from its unicast address to the multicast group address and the intermediary routers take care of making copies and sending them to all receivers that have joined the corresponding multicast group. 3.2 Protocol Independent Multicast (PIM) Routing had also changes in multicast. Protocol-Independent Multicast (PIM) [16] is a family of multicast routing protocols for networks that provide one-to-many and many-to-many distribution of data over the Internet. PIM routes multicast packets to multicast groups, and is designed to efficiently establish distribution out-trees. It is termed protocol-independent because PIM does not include its own topology discovery mechanism, much like RSVP and LDP. Instead it uses the unicast routing information supplied by other traditional routing protocols, such as IS-IS or OSPF, to perform the multicast forwarding. Since it relies only on the network layer routing protocols, PIM does not depend on the IP transport protocols, like TCP and UDP. Nevertheless IP datagrams are still used in data transmission. There are many PIM variants, the most popular are PIM Dense Mode (PIM-DM), PIM Sparse Mode (PIM-SM) and PIM Source-Specific Multicast (PIM-SSM). As far as PIM-DM is concerned, it builds source out-trees by flooding multicast traffic domain wide, then pruning back branches of the out-tree where no receivers are present. PIM-DM is straightforward to implement but generally has poor scaling properties and that is why we will not explore it in this dissertation, since it is not really used for IPTV purposes. We will then be focused on PIM-SM and PIM-SSM. A. PIM Sparse Mode (PIM-SM) In sparse-mode PIM, a router in a group is designated as a rendezvous point (RP). Its choice can be done statically or automatically. The RP collects information about multicast senders and makes that information available to potential receivers. The purpose of RP is to allow a receiver to find out the IP address of the source for a particular group. Multicast traffic flows from the sender back down the path created by the PIM messages. All received multicast traffic is checked to verify if the incoming multicast traffic is being received via the interface, on which PIM request was sent. This prevents loops and duplicate packets. Each router forms a neighbour relationship with adjacent PIM routers in a group using PIM 31 “hello” messages. When a router in a group wants to receive a multicast stream, it sends a PIM “join” message towards the RP. The source sends a PIM “register” message to the RP with multicast traffic already encapsulated on it, as we can see on Fig. 24. RP 192.168.0.1 PIM – “register 234.1.1.1” PIM – “join 234.1.1.1” B 10 20 Encapsulated multicast traffic Multicast group address 234.1.1.1 R A 10 10 PIM – “join 234.1.1.1” D Video source 10.0.1.1 C Fig. 24 – Receiver “join” and source “register” in PIM-SM The RP then sends a “join” message to the source. As soon as the message is received, the multicast traffic flows from the source to the receiver (Fig. 25). RP 192.168.0.1 PIM – “join 234.1.1.1” B 10 20 Multicast group address 234.1.1.1 A D 10 10 R1 Video source 10.0.1.1 C Fig. 25 – Source “join” and traffic flow in PIM-SM When a router wants to stop receiving a multicast stream, it sends a PIM “prune” message towards the IP address of the multicast source. Multicast traffic initially travels from the sender to the receiver via the RP. Once the downstream router starts receiving the multicast traffic (and knows the senders IP), it is possible to build path directly back to the sender. This is triggered by the receiver that sends join messages directly to the 32 source. In the example traffic would then flow through links D-C and C-A, that are part of the shortest path. After this reconfiguration the path between D and A will be the shortest path (Fig. 26). RP 192.168.0.1 B 10 15 Multicast group address 234.1.1.1 A D 10 10 R1 Video source 10.0.1.1 C Fig. 26 – Shortest path reconfiguration in PIM-SM If another router wanted to join the multicast group, a path would be added, forming an out-tree. The added path would also be the shortest path between the source and the referred router. An outtree formed by concatenation of the shortest paths between the root and the nodes in the out-tree is designated Shortest Path Tree (SPT) (Fig. 27). RP 192.168.0.1 10 R2 B 10 15 Multicast group address 234.1.1.1 A D 10 10 R1 C Fig. 27 – Shortest Path Tree in PIM-SM after reconfiguration If the receiver knew the sender’s address, the RP would not be necessary. 33 Video source 10.0.1.1 B. PIM Source Specific Multicast (PIM-SSM) PIM-SSM [17] requires the edge router to know the address of the multicast source for each group, avoiding the need of an RP. In PIM-SSM the edge router sends a PIM “join” message directly towards the sender using the unicast routing table. This “join” request will be forwarded along the shortest path (in terms of the IGP) to the multicast source until it reaches a router which is already aware of the multicast group. Video source 2 192.168.1.2 Group 234.1.1.2 PIM – “join 234.1.1.2” PIM – “join 234.1.1.2” PIM – “join 234.1.1.1” A 10 B 20 Group 234.1.1.1 10 D 10 PIM – “join 234.1.1.1” Video source 1 192.168.1.1 C Fig. 28 – Source “join” in PIM-SSM. “Join” messages for different group are sent to its respective video source. Video source 2 192.168.1.2 Group 234.1.1.2 10 A B 10 20 Group 234.1.1.1 10 D Video source 1 192.168.1.1 C Fig. 29 – Data flow in PIM-SSM. Video packets are sent to the reverse packet of the respective “join” message. Fig. 28 and Fig. 29 show PIM-SSM in operation, as we can see there is no shared out-tree sourced at a RP, as in PIM-SM, and there is also no need for any source discovering mechanism. This is an advantage of PIM-SSM over PIM-SM; the complexity of the multicast routing for PIM-SSM is 34 lower than PIM-SM and therefore easier to configure and maintain. PIM-SSM has also a more efficient network usage, since traffic is not routed temporarily via the RP and the most direct path from source to receiver is always used. Despite the wide use of PIM-SM in IPTV networks, the advantages of the fairly new PIM-SSM make it the one to be chosen in IP backbone, especially when the range of potential multicast group addresses is not much large, which is typically the case. 3.3 Point-to-Multipoint LSPs (P2MP LSPs) The LSPs generated in MPLS for unicast are also different in the multicast case [18]. So far we have seen how MPLS is used to establish point-to-point LSPs, now we will see the tools that MPLS has to offer in order to handle multicast connections. Independently of the signalling protocol that is used, the forwarding mechanisms are similar. PE2 PE3 L2 L4 A B P1 L3 C Data Source PE1 P3 L1 P2 L5 PE5 Forwarding table on P1: L6 L7 PE4 IN IN OUT OUT Label Interface Label Interface L1 C L2 A L1 C L3 B Fig. 30 – P2MP LSPs example. The forwarding table of P1 shows the different labels using for the different interfaces. Fig. 30 shows an MPLS multicast scenario, as we can see these LSPs have only one ingress router but several egress routers. PE1, the ingress router, replicates the packets and sends them to the downstream nodes, always with different labels. The downstream routers act in a similar way; they 35 also replicate the packet and send the copies to the next downstream routers, again, always with different labels. These routers are called branch nodes. P3 is however a special case, since instead of replicating the packets, it just transmits them with a new MPLS label. This forwarding mechanism is quite useful, since it is far more efficient than using unicast LSPs. RSVP was not prepared to P2MP LSPs as well. However, RSVP got a traffic engineering (RSVP-TE) extension, allowing it to handle this new type of LSPs. Nevertheless it still uses routing information of layer 3 protocols; therefore it is still out of the RSVP ambit to construct the outbranchings. PE2 PE3 L2 L4 P1 P3 L1 L3 PE1 L1 L5 P2 PE5 L7 L5 L6 Path messages PE4 Resv messages Fig. 31 – P2MP LSP setup Fig. 31 shows the setup of multicast LSPs using RSVP-TE. We can see that like in unicast, routers running RSVP-TE use Path messages to communicate downstream, and Resv messages to communicate upstream. In the point of view of MPLS’s control plan a P2MP LSP is just a sum of pointto-point LSPs. Path messages have an Explicit Route Object (ERO) which determines the route followed by the LSP. On the other hand, Resv messages incorporate, for each step, the label that will be used for forwarding in that step. In the case P2MP networks, each sub-LSP is signalled using its own Path and Resv messages. For this to happen both type of messages contain a new P2MP Session Object that identifies the sub-LSP. This mechanism grants the replication of the packets in forwarding. Let us see how this works in the example network shown in Fig. 31. The represented P2MP LSP has four egress routers, it is composed of four sub-LSPs, one from PE1 to PE2, another from PE1 to PE3, and so on. Because each sub-LSP has its own associated Path and Resv messages, on some links multiple Path and Resv messages are exchanged. For example, the link from PE1 to P1 has 36 Path messages corresponding to the sub-LSPs to PE2 and to PE3. Let us examine in more detail how the P2MP LSP in Fig. 31 is signalled; looking at sub-LSP PE1 to PE3 whose egress is PE3: A Path message is sent by PE1, the ingress router, containing the ERO {PE1, P1, P3, PE3}. This can contain a bandwidth reservation for the P2MP LSP if required. PE3 responds with a Resv message that contains the label value, L4, that P3 should use when forwarding packets to PE3.Similarly, the Resv message sent on by P3 to P1 contains the label value, L3, that P1 should use when forwarding packets to P3. In a similar way, for the sub-LSP whose egress is PE2, P1 receives a Resv message from PE2 containing the label value, L2, that P1 should use when forwarding packets to PE2. P1 knows that the Resv messages from PE2 and P3 refer to the same P2MP LSP, as a consequence of the P2MP Session Object contained in each. P1 sends a separate Resv message to PE1 corresponding to each of the two sub-LSPs, but deliberately uses the same label value for each, L1, because the two sub-LSPs belong to the same P2MP LSP. P1 installs an entry in its forwarding table such that when a packet arrives with label L1, one copy is sent on the link to PE2 with label L2 and another copy on the link to P3 with label L3. This ensures that when the Resv messages are sent from P1 to PE1 corresponding to the two sub-LSPs, no double-counting occurs of the bandwidth reservation. PE1, knowing that the two Resv messages received from P1 refer to the same P2MP LSP, a consequence of the P2MP session object contained in each, forwards only one copy of each packet in the flow to P1, with the label value L1 that had been dictated by PE1 in those two Resv messages. 3.4 Protocol Architecture In order to implement the custom out-branchings presented in chapter 4, we will present a set of protocols that enable custom designs, can cope with resource reservation and handle fast reroute restoration. This protocol scheme is based on [3], and uses PIM-SSM combined with P2MP LSPs. 37 Multicast coordination PIM-SSM Routing info changes IGP routing IS-IS/ OSPF Route Signaling RSVP-TE FRR support Link restoration MPLS Fig. 32 – Cross layer architecture PIM-SSM is used to coordinate the multicast groups and generate the multicast out-branchings. IGP routing is assured by either IS-IS or OSPF. P2MP LSPs are established using RSVP to enable resource reservation and FRR is used for rapid link restoration. Routes are established following the next steps: 1. The network topology and its current link costs are fed to an out-branching calculation algorithm that will be presented in the next chapter. The routers can run the programmed algorithm since they are running an LS protocol which provides them the complete network topology. The out-branching calculation can also be done by the network operator; 2. After finding the out-branching, we need to re-set the IGP link costs. The cost of the physical links in the main out-branching remains the same. The cost of the physical links that are not in the main out-branching are set to (Fig. 33) The new IGP costs are then downloaded to the routers; 3. This way PIM-SSM will generate the same out-branchings that we have calculated; 4. RSVP signals the out-branchings generated by PIM-SSM, making the proper resource reservations, forming a P2MP LSP. 5. The backup paths for each link of out-branching, can be calculated by the network operator and downloaded to the routers. Alternatively, RSVP can setup them automatically using the IGP routing information. To do such, IGP costs need to be re-set again. The cost of the physical links of the main out-branching are set to and the costs of the other physical links are set to its original value. Doing such grants that the backup paths that are calculated do not share any link with main out-branching. 6. If the backup paths were calculated using RSVP, the IGP costs need to be set to the values of step 2; 38 3.5 Failure Recovering Process In point 5.1 we have described a procedure to assure the setup of the pre calculated out- branchings. We have also presented a way of calculating and setting up protection paths. Fig. 33 shows a network that has already been set. A S R1 B 3 A A B 1 B C R7 5 A C R2 2 B A C A R5 C B R3 6 Physical bidirectional links A Links used in FRR backup paths 3 A B 4 R4 B Out-branching links R6 Fig. 33 – Setup of calculated out-branchings The normal black lines represent physical links, which provide connectivity in both directions. The red arrows represent the links used by the out-branching. The dashed black lines represent FRR backup paths and the bold red arrows represent the links that are protected. For simplicity only two backup paths are presented path R1-R3-R2 is protecting link R1R2 and path R4-R5-R7-S-R1 is protecting link R4R1. Note that the protection paths are disjoint from the main tree. They are built this way to prevent congestion in main tree links. If the protection paths shared links with the main tree, these links would transport more than packet stream replicas. For example, if we used path R4-R3-R1 to protect link R4R1, link R4R3 would transport the packets destined to R3 as well as the packets destined to R1. Let us assume that link R1R2 fails. Once the failure is detected the respective FRR backup is activated. Fig. 34 shows the label switching table of node R1 and the activation of the backup path. Before the failure packets arrived at R1 through interface C with label 50, R1 switched the packet label to label 40 and would send it to R2 through interface A. When the failure is detected, R1 uses the FRR columns of the switching table. It sends the packet using interface B with label 30. 39 IN Int Label IN C 50 OUT OUT FRR FRR Int Label Int Label A 40 B A 30 S R1 A B 3 A D B 1 B C R7 5 A C R2 2 A R5 6 Physical bidirectional links C B R3 Active FRR backup path B A C A 3 A 4 B Links used in FRR backup paths B R4 Out-branching links R6 Fig. 34 – Link failure scenario The activated protection path is transparent to the IGP protocol, which is not yet aware of the link failure. The process of detecting the failure and activating the protection path takes about 50 ms, which is lower than the failure detection times of the IGP and PIM protocols. Both OSPF and IS-IS use an HELLO based protocol that take about 1-10s to detect link failure. Normally when the failure is detected, LSAs are broadcasted to inform the routers. After the broadcast, the IGP routing table of S would be like the one represented below. Router Path Cost Interface R1 20 B R2 X B R3 21 B R4 18 B R5 8 B R6 14 B R7 3 B Table 1 – Router S IGP routing table S is the head-end of the sub-LSP that S-R7-R6-R4-R1-R2, which was signalled with RSVP and needs to be refreshed periodically. When RSVP tries to refresh the LSP it uses the IGP routing information. This would generate an error since S is not aware that the backup path is being used. This error would cause the LSP to be teardown. 40 The purpose of the backup is to protect traffic while the LSP head end looks for an alternate path for the LSP, avoiding the failed link. For this to happen, the head end must first find out about the failure. The PLR takes care of this by notifying the head using an RSVP Path Error message with a ‘Notify’ error code and ‘Tunnel locally repaired’ subcode. In addition to this, a new flag indicating that the path is locally repaired is turned on in the Record Route Object. If an LSA is received by the head end the RSVP Path Error is received, the LSA must be suppressed. After the broadcast of the link failure, IGP costs are recalculated and a new topology is found. During this process the backup path is at use. The new topology and link costs need to be fed to the out-branching calculation algorithm again, so that a new out-branching can be calculated. The IGP costs are then downloaded to the routers that use PIM to generate the out-branchings and RSVP to signal them. After the establishment of the new LSPs, the old are torn down. 41 42 Chapter 4 Designing Multicast Outbranchings 4 Designing Multicast Out-branchings 43 In this chapter we will present algorithms to calculate out-branchings with different characteristics. We have implemented the algorithms to compare the resultant out-branchings. The results of this comparison are presented in the end of the chapter. 4.1 Definitions The notation used in this thesis is the same used in [19]. The definitions that we will present next can be seen in more detail in [19] and [20]. A. Directed Multigraphs and digraphs A directed multigraph is a pair elements of of finite sets together with a map are called the vertices or nodes of vertex set of a directed multigraph and the those of is referred to as The map associated with directed multigraph these vertices is its tail, denoted by also say that link leaves vertex . The . a pair of vertices. The first of , and the second is its head, denoted by and that it enters vertex . We . To express that are the tail and head of link , respectively, we sometimes write not determine are called the links of and its link set as assigns to link . The and , although this notation does uniquely. Another characteristic that may distinguish a link is its cost represented by We denote the sets of links leaving and entering vertex by and respectively. The out-degree and in-degree of , in a directed multigraph (1) , i.e. the number of links entering, or leaving node , are defined as and , (2) respectively. We generalize the out- and in-degrees to sets of vertices. For be the number of links leaving and entering and , we let be the number of links entering and leaving . As a result, we have . We let be the number of links leaving number of links entering and leaving (3) and entering and be the . As result we have . If link leaves vertex an in-neighbour of and enters vertex , we say that (4) is an out-neighbour of .The sets of out-neighbours and in-neighbours of 44 and that is in directed multigraph G are defined and (5) respectively. For convenience, define and The reverse of directed multigraph such that for every link in symmetric if from . is the directed muligraph to there is a link in with the same vertex set and from to . A directed multigraph is . A digraph is a directed multigraph where the mapping from the set of links to the set of ordered pairs of vertices is one-to-one. In this case, each link is uniquely identified by its ordered pair of vertices. B. Paths, cycles and walks A path P from to is a digraph of the form . (6) A path is simple if it does not contain repeated vertices. Adding link yields a cycle. Two paths each from independent if and to to the path above are link-disjoint if they have disjoint link sets. They are are the only vertices they have in common. Clearly, any two independent paths are link-disjoint, but the converse is not true in general. A walk from to in directed multigraph of vertices and links of such that is an alternating sequence and for all . If a walk begins and ends in the same vertex, we designate it as closed walk. A closed walk may contain one or more cycles. C. Undirected graphs An undirected graph is one in which links have no direction. The elements of called vertices of , but undirected links elements in are called the edges of . It is equivalent to say that a graph is undirected or symmetric. Therefore if we have an edge between that there is a link from to and another from are also to and , we assume . D. Trees, out-trees and out-branchings A tree is an undirected graph in which any two vertices are connected by exactly one simple path. In other words, any connected graph without cycles is a tree. If a tree spans all vertices of a graph, it is called a spanning tree. In opposition, out-trees are directed graphs. An out-tree ( rooted at is a digraph where every node , but , have only one in-neighbour and there is a path from the root to each node. The root 45 has no in-neighbours ( . An out-tree that spans all vertices of a given graph is called an out-branching. 4.2 Minimum Cost Out-Branchings We have seen that PIM ends up using SPTs to forward traffic flows. The key feature of the SPT is that the paths from the root to a vertex are the shortest. A SPT that spans all vertices of a given graph is called shortest path out-branching. Minimum cost out-branchings (MOB) are another type of spanning out-trees. Given a certain graph , the MOB of branchings in is the out-branching that has the lowest cost, among the existent out- . The strategy used to calculate a minimum cost out-branching is different to the one used to calculate a minimum cost spanning tree. A minimum cost spanning tree, or minimum spanning tree (MST), of an undirected connected graph is a spanning tree with cost less than or equal to the cost of every other spanning tree on that graph. A MST always contains the minimum cost edge of the graph it spans. s 2 10 10 a 1 8 4 2 2 4 Fig. 35 – Minimum cost out-branching (bold red arrows) Fig. 35 shows a MOB, where is the minimum cost link and is not part of the MOB. Starting with the cheapest links is not a correct strategy; hence the algorithms used to calculate MSTs cannot be used to calculate the MOBs. A. Algorithm to find MOBs in a multigraph A possible algorithm to calculate a MOB is presented in [21]. Given a multigraph with attributed link costs, the algorithm recursively changes the graph until every vertex has an entering link that has cost equal to 0 and there are no cycles in the graph with total link cost equal to 0. At this point, the 46 algorithm recursively selects links to be part of the MOB and returns the graph to its original form. The first step is to reattribute the costs of the links. For each vertex we to find the cost of the minimum cost link, among the links that are entering vertex (7) After that we need to reattribute the costs of the links entering , by subtracting its cost with (8) where represents the reattributed costs. After this step, for every out-branching in graph rooted at root the sum of the reattributed costs added to the sum of costs of the minimum links entering each vertex, will be equal to the sum of the original link costs. (9) Thus, is a MOB in relation to costs if, and only if, it is a MOB in relation to costs . This step is depicted in Fig. 36. s s 2 1 8 4 8 4 0 2 2 0 0 0 4 0 10 6 10 1 Fig. 36 - Algorithm to find MOBs: Step I (link cost update) The second step of algorithm is to find cycles with links that have cost equal to 0. After that the algorithm agglomerates them into a single super node, as represented in Fig. 37. 47 s s 1 8 6 1 8 6 0 0 4 4 0 0 0 Supernode 0 Fig. 37 - Algorithm to find MOBs: Step II (0 cost cycle is agglomerated in a supernode) We then need to repeat the first two steps until the following two conditions apply: There are no cycle with 0 cost links left; Every vertex has an entering link that has cost equal to 0. After this we will end up with graph on the right of Fig. 38. s s 1 8 0 6 7 Supernode 4 5 Supernode 0 4 0 Fig. 38 - Algorithm to find MOBs: Step III Next we will start to the links of the out-branching. The first that are going to be selected are the ones that have cost equal to 0 (bold red links Fig. 39). 48 s 0 7 5 Supernode 4 0 Fig. 39 - Algorithm to find MOBs: Step IV (0 cost links are selected as part of the MOB) Finally we need to expand the supernodes and select a special link is link contained in the cycle that has head in , 1 2 10 10 2 0 0 1 8 4 Fig. 40 - Algorithm to find MOBs: Step V (the links of the cycle except link The complexity of this algorithm is 1 are added to the MOB) The pseudo code of this algorithm is presented below. Algorithm 1 - Finding a MOB rooted at 4 2 0 0 f . The special . s 6 0 as s 8 link as the entering link of a supernode and 4 out-branching. Let us define that will not be included in the in a given multigraph Define minEnteringLinkCost[ ] 49 2 3 Repeat for each minEnteringLinkCost[ ]:=+ 4 5 do for each do for each 6 do 7 if minEnteringLinkCost[ ] < cost( ) do 8 minEnteringLinkCost[ ] := cost( ) 9 for each do for each 10 do cost( ):= cost(e) - minEnteringLinkCost[ ] 11 12 Find a directed cycle 13 Contract where any edge has cost( )=0 in a single super node , yielding 14 until no cycle 15 Repeat is found in step 12 16 find a minimum out-branching (V’,F’) in G’ rooted at s 17 extend 18 by substituting for and deleting the special link of until there are no super nodes left We have implemented this algorithm in order to compare the total link cost of the out-branchings calculated using this algorithm, with the total link cost of the SPTs in a given group of graphs. This algorithm will also be compared with other algorithms that will be presented further ahead. This comparison is presented in the end of the chapter. Algorithm 1 calculates an optimal out-branching, given a specific network. However, in case of failure, it may not always be possible to use FRR to protect every link of the out-branching. This happens due to the simple fact that an alternate path may not exist. Our network has then requirements that should be taken in consideration. Obviously we should guarantee that every link is protected. The process of choosing both the out-branching and its protecting paths must be done carefully to avoid overlap problems. When protecting a specific out-branching link we may be obliged to use another link as part of the backup path, this may result in using a link to transmit the same video packets more than once, which may cause link congestion. To solve this problem we need to guarantee that the backup of each link is disjoint from the out-branching. In other words, we need to guarantee that the protection paths do not share any link with the out-branching. 50 4.3 Connectivity Theorems To better understand the disjoint paths problem we need to focus on some theorems and propositions. Menger’s theorem is a fundamental result on connectivity of finite graphs. Menger’s Theorem (link version): In a directed multigraph form , there are to , if and only if after deleting any -1 links, there is still a path left from link-disjoint paths to . Menger’s theorem had a later reformulation in the Max Flow Min Cut theorem. Menger’s Theorem (Max Flow Min Cut theorem application): A directed multigraph link-disjoint paths from vertex to vertex for all has if and only if such that and , (10) The Menger’s theorem gives us a way to prove the (in)existence of link-disjoint paths; however we still need to know how to find them. A. Finding Disjoint Paths A possible first approach to this problem could be finding an arbitrary path between two vertices, eliminate it from the graph and repeat the process until there is no path left. This procedure is inappropriate to find all the disjoint paths, as we can see in Fig. 41 and Fig. 42. The graph is the same in both figures; we can see in Fig. 42 that there are two disjoint paths from vertex 0 to vertex 3 (0-1-2-3 and 0-2-3). However if to find them, we chose path 0-1-2-3 represented in red in Fig. 41, we fail, since there is no other path left in the graph that is disjoint from the first one. Fig. 41 – Link-disjoint example part I Fig. 42 – Link-disjoint example part II In general we cannot choose and remove an arbitrary path between two nodes, expecting to have another path left between them. However, there is an exception. B. Finding Disjoint Paths in 2-Connected Symmetric Graphs A 2-connected graph is a graph where between any pair of vertices there are always 2 disjoint 51 paths. Given the definition of 2-connected graphs, Menger’s theorem gives us that the minimum cut between any two sets of a given connected symmetric graph for all such that is always 2 and therefore we get: (11) , for any and These graphs have some special properties and therefore finding disjoint paths in them may not be that restrictive as we have seen before. This presumption may not be that hard to believe. Let be a graph composed by two symmetric cycles. In such graph we can eliminate a simple path between two points, knowing that connectivity between those two points still exists. There are always two disjoint paths available between two nodes, no matter what is the path that we choose first, a second one is always available. Note that if we had chosen and eliminated a path where one of the vertices appeared more than one time, another path would not be available. 3 4 2 0 Unused links Path 1 1 Path 2 Fig. 43 – Two symmetric cycles. There are two disjoint paths between 0 and 1. Proposition 1: Let vertices be a 2-connected symmetric graph and an arbitrary simple path, between and . We can remove from that there will still be a path left between and . Proof: Let us consider two disjoint sets, and . In a 2-connected graph, . Let us suppose by absurd that we can find a simple path between two arbitrary vertices, and , given that such path uses all links of . Where represents the set of entering links of Fig. 44 depicts sets uses all the links of and , each having two entering links , but is not a simple path. 52 . In this case path G e P s f g Z t h Y Used links Unused links Fig. 44 – Closed Path using the 2 links of In fact, in order to use links and in the same path, we must use either link , or link . Given so, the resultant path will not be a simple path. If we had links between the sets, we need two add at least a pair, since the graph is two symmetric. Fig. 45 depicts sets and , each having three entering links . G e P s f g h i Z t j Y Used links Unused links Fig. 45 – Closed path using the 3 links of In this case we have the same problem, in order to use links , either link , or , ending up with closed path again. The only way that and still be a simple path, was to add a link from in that case, and in the same path, we must use could use all entering of to , without adding another form to . However, would not be symmetric. It is then impossible to choose a simple path that uses all the entering link of a set, on a given 2- 53 connected symmetric graph . If such we can choose to remove a simple path , between two vertices of , that there will still be a path left between them. This result is specific to 2-connected symmetric graphs and it does not apply to other types of graphs. However it is possible to generalize proposition 2 to -connected symmetric multigraphs. At this point, we could think that in a arbitrary path that -connected symmetric graph we could choose to remove one paths will still be available at its endpoints. This is not true. G P s t Z Y Used links Unused links Fig. 46 – Simple path between sets and Fig. 46 shows an example that proves our impossibility. Path we remove it, only one path remains from remains, not uses 2 of the 3 links of , if to . The only guarantee that we get is that only one path paths as would maybe be expectable. Given so, the only generalization we can do is the following. Proposition 1 (Generalization): Let arbitrary simple path, between vertices left between be a connected undirected graph and and . We can remove from an that there will still be a path and . As a result we will have: for all such that and , for any (12) Unfortunately, this result is not applicable to directed multigraphs. Let us consider a 2-connected directed multigraph. In this case links may not have its symmetrical, a determinant point in proposition’s 1 proof. Proposition 2: Let between vertices be a 2-connected directed multigraph and and . If we remove from an arbitrary simple path, there may not be another path left between and . Proof: We can use a construction similar to the one we have used to prove the trueness of Proposition 54 2. We can try to find a -connected multigraph where the removal of a path cuts connectivity between its endpoints. Fig. 47 – 2-connected directed graph Fig. 47 shows a simple -connected directed multigraph, if we choose to remove path graph, sets and from the will be disconnected and therefore there will be no path left between s and t. This proves proposition 3. C. Edmonds’ Theorem Another important result, as far IPTV resilience is concerned, would be to know the number of possible disjoint out-branchings in a given graph. Such result is given by Edmond’s Theorem. It states that a directed multigraph has link-disjoint out-branchings rooted at for all such that and if and only if: (13) We can see an application of the theorem in the next figures. s Fig. 48 – Graph with 2 link-disjoint out-branchings. One represented in normal red lines, the other in dashed blue lines 55 Fig. 48 shows a graph that has 2 link-disjoint out-branchings. We can see that in this graph any set of vertices that does not include the source has 2 or more entering links. An example of such sets is depicted in Fig. 49. X s Y s Fig. 49 – Edmond’s theorem application example. Set and have 3 and 2 entering links represented in bold, respectively. Other sets could be defined and all of them would have at least 2 entering links. 4.4 Link-Disjoint Out-branchings Algorithms Instead of calculate backup paths to a given out-branching, we can pre-calculate link-disjoint out-branchings and alternate between them in case of failure. We present two algorithms that perform this task. A. Algorithm for 2 link-disjoint out-branchings in 2-connected symmetric graphs In this algorithm, we have made use of the above results on 2-connected symmetric graphs to adapt the edge-disjoint algorithm in [5]. Let be a 2-connected symmetric graph, a source vertex, and the two resultant disjoint out-branchings. This algorithm finds two link-disjoint out-branchings rooted at . In the first step not simultaneous in and are empty sets and and First it finds a path ( in . For each vertex in that is ), the algorithm does the following: . The algorithm then concatenates path branching . 56 with out- s s P u u Fig. 50 – First iteration of algorithm 2: Part I (path After that the algorithm finds a path in R B is added to out-branching ) . The algorithm then concatenates path with out-branching . s s u u P Fig. 51– First iteration of algorithm 2: Part II (path The union of reverse links of and is a closed walk. Given that R B is added to out-branching ) is symmetric, we can make use of the and add them to the other out-branching ( ), excluding obviously the links that lead to vertices already in out-branching (this does not happen in the first iteration) (Fig. 52). 57 s u s R B R B u Fig. 52 – First iteration of algorithm 2: Part III (the reverse edges of path are added to ) To finish the iteration, the algorithm adds the reverse links of path excluding, as before, the links that lead to vertices already in out-branching (Fig. 53). s s u to out-branching R B u Fig. 53 – First iteration of algorithm 2: Part IV (the reverse edges of path This process is repeated until all vertices are part of both out-branchings. 58 R B are added to ) , s u R B Fig. 54 – Final result of the application of algorithm 2 The complexity of this algorithm is . Its pseudo-code is presented below Algorithm 2: Finding 2 link disjoint out-branchings rooted at 1 Define 2 Define 3 for each ( 4 find a path 5 for each 6 from ), do to in do if then 7 8 9 10 11 find a find a path from for each do if to in then 12 13 14 for each do 59 in a 2-connected symmetric graph if 15 then 16 17 for each 18 do if 19 then 20 21 Proof: The proof of correctness of the theorem can be done by induction. We must assure that, for each iteration , a node not included in one of the out-branchings has a path from that out-branching to itself. As far as the first iteration ( to an arbitrary vertex ) is concerned, we find two disjoint paths and add them to out-branchings and and , from the root respectively. The second path is available due to the properties of the 2-connected symmetric graphs we have described above. The union of and is a closed walk. Given that the graph is symmetric, we can make use of the reverse edges in both paths and add them to the according out-branching, ending up with all the vertexes of both disjoint paths found, being part of both out-branchings After the and . iteration of the algorithm, we have two non-spanning disjoint out-branchings. By induction, we assume that a node not included in both out-branchings has two disjoint paths from the out-branchings two itself. Now we add the paths from to a vertex two paths. Since the graph is 2-connected, there must be two disjoint not yet covered by the out-branchings. Since all the vertices already present in the out-branchings are part of both and , the two disjoint paths that lead to may start at any two arbitrary vertices and can be added at any order to the respective out-branching. Doing so, grants the introduction of node in both out-branchings (Fig. 55). 60 B B u u s s R Fig. 55 – Algorithm 2 R iteration: Part I (there are always two paths available from the two outbranchings to node ) . Again it is also possible to form a closed walk using the latest two disjoint paths found (reversing one of them) and part of the out-branching. Given this, we can again make use of the reverse links of the new paths to add any uncovered nodes to the respective out-branching. B B u u s s R R Fig. 56 – Algorithm 2 At the end of iteration: Part II iteration all the nodes in one out-branching are also part of the other out- branching. If a node is not part of any out-branching, it has 2 disjoint paths from the out-branching to itself. 61 B. Improving the results of Algorithm 2 After proving the correctness of the algorithm, we can now focus on improving it. As we have seen, we can choose any path between any two vertices of a 2-connected symmetric graph, knowing that there is still one left. Algorithm 2 finds paths that are added to the out-branchings, a simple breadth-first search (BFS) could be used to calculate the paths. However, if we have the freedom to choose any path, we could use the shortest paths. The algorithm alternately adds a path to out-branching cannot be added to and . When a link is added to it and vice-versa since they are supposed to be disjoint. Given so, if we choose the shortest path available from the root to a vertex that might be part of the shortest path between and add the links of this path to and another vertex , those links , will not be available to be inserted in . Such scenario is represented in the following figures. s 2 2 5 1 1 3 u5 u3 2 4 1 5 4 1 3 14 u6 3 1 u4 u0 u1 2 5 2 3 1 R B u7 Fig. 57 – Choosing paths in algorithm 2: Part I (path - - is added to out-branching ) Fig. 57 depicts the first execution step of algorithm 2 in graph and was added to . The shortest path between . After that the algorithm adds the shortest path between , and adds it to . 62 and , available in s 2 2 5 1 1 3 u5 u3 u0 2 4 1 5 4 1 3 14 u6 3 1 u4 u1 2 5 2 3 1 R B u7 Fig. 58 – Choosing paths in algorithm 2: Part II (path - - - is added to out-branching ) Later on, the algorithm uses the reverse links of both paths to include all vertices in both outbranchings. s 2 2 5 1 1 3 u5 u3 u0 2 4 1 5 4 1 3 14 u6 3 1 u4 u1 2 5 2 3 1 u7 R B Fig. 59 – Choosing paths in algorithm 2: Part III (the reverse edges of the paths are added to the outbranchings) Next the algorithm finds the shortest path from ) is . If links and . The shortest path available in were not in , they could be added to . In that case the total cost of and the vertices in to to form the path would be lower and the cost of the paths between would also be lower. Links , paths were not added to . Making them available to . 63 and would be available if the shortest s 2 2 5 1 1 3 u5 u3 2 4 1 5 4 1 3 14 u6 3 1 u4 u0 u1 2 5 4 3 1 u7 R B Fig. 60 – Choosing paths in algorithm 2: Part IV (the result of not choosing the shortest paths for outbranching ) Assuming that we are going to use one of these out/branchings to distribute TV streams and the other for backup, we can play with the chosen paths in order to: Reduce the total link cost of the paths from the root to the vertices in main out-branching, used for TV distribution; Reduce the total link cost of the paths from the root to the vertices in backup out-branching. We have used heuristics to test 3 different implementations of algorithm 2: Not improved: where the paths added to both out-branchings are calculated using a BFS algorithm; Improved for the main out-branching: where the objective is to reduce the link cost of the paths between the root and the vertices in the main out-branching. In this case, the paths of the main out-branching are calculated using a Dijkstra shortest path algorithm and the paths of the backup out-branching are calculated using a BFS algorithm; Improved for the backup paths: where the objective is to reduce the total link cost of paths used for backup. In this case, the paths added to the main and backup out-branchings are both calculated using a Dijkstra shortest path algorithm. The results of the tested implementations will be presented in the end of the chapter. C. Algorithm for link-disjoint out-branchings This algorithm is presented in [19] and makes use of a result of Edmond’s theorem to calculate the maximum number of link-disjoint out-branchings in a given graph. The algorithm begins calculating the maximum number of possible link disjoint out-branchings. To do this the algorithm finds the minimum number of link disjoint paths between the root and every node of the graph. 64 The algorithm presented below is part of the link-disjoint out-branchings algorithm (Algorithm 4) and is used to calculate the number of disjoint paths between the root and a given node. Let be a directed multigraph, and respectively. A -link-disjoint subgraph A, from and a source and destination vertex, to encloses k and not more link-disjoint paths from to , which can be found with a path-finding algorithm at complexity . Let (14) . Then, A is recovered from and by (15) Algorithm 3 is based on this recursion and finds the maximum number of link-disjoint paths from s to t. The algorithm has complexity where k is the maximum number of arc-disjoint paths from s to t. Note that, in order to find, for example 2 link-disjoint out-branchings, we do not need to have a 2-connected algorithm as we would need in Algorithm 2. We just need to have two link-disjoint paths between the root and every node of the graph. Algorithm 3: Finding the maximum number of link-disjoint paths in from to . 1 2 3 while there is a path in 4 find a path from in graph from to do to 5 6 Once the algorithm determines the maximum number of link-disjoint out-branchings rooted at s, it builds them one at a time. In turn, each of these out-branchings is build one link at a time starting from and stopping when we get an out-branching, a spanning out-tree. Before adding the link to an out-branching, the algorithm verifies if link-disjoint paths, from to , will still be available after its addition. In case of success the link is added to the out-branching, otherwise it is marked and other link is chosen to be verified again. This process is repeated for every link to be in an out-branching for each out-branching to be calculated. Let us see an execution example. 65 s u0 u1 u2 u3 k=2 u4 Fig. 61 – Graph and the maximum number of disjoint out-branchings First algorithm 3 is used to calculate the maximum number of disjoint out-branchings , that are available in graph above. In this case . After this the algorithm starts building the out-branchings, one at a time. The first out-branching (out-branching[1]) is initiated with the root . The algorithm then finds a link with out-branching[ ] and out-branching[i] . An example of such link is represented in green in the figure below. s u0 u1 u2 u3 Chosen link Disjoint path u4 Fig. 62 – First iteration of algorithm 4: Part I After choosing a link, the algorithm evaluates if there are at least 66 paths from to , that are simultaneously disjoint from , where and is the maximum number of out-branchings in graph is the variable that represents the current iteration. In this case so, the algorithm needs to check if there is at least 1 path from to G . Given , that is disjoint from . In case of success the path is added to out-branching[ ] and removed from and therefore added to set , and , otherwise is rejected, . s out-branching[1] s u0 u1 u2 u2 u3 Chosen link Disjoint path u4 Fig. 63 – First iteration of algorithm 4: Part II In this case, there was a disjoint path available as we can see in the above figure. Nevertheless, such is not always possible, as we can see in Fig. 64. G s out-branching[1] s u0 u1 u2 u2 u3 Rejected link u4 Fig. 64 – First iteration of algorithm 4: Part III 67 The algorithm continues to add links to the out-branching, until it spans all the vertices of . After that the iteration will be , and second out-branching is constructed with the links left. The two obtained out-branchings are presented below. s s out-branching[2] out-branching[1] u0 u0 u1 u2 u1 u2 u3 u3 u4 u4 Fig. 65 – Final result of the application of algorithm 4 The complexity of this algorithm is , where is the maximum number or link- disjoint out-branchings rooted at . The pseudo-code of the algorithm is presented below. Algorithm 4: Finding a maximal set of link disjoint out-branchings in 1 2 for each do 3 4 5 6 for to do 7 8 9 10 while do Repeat 68 rooted at 11 take with 12 if there are at least ) and link-disjoint paths in from to then 13 else 14 15 16 until 17 18 19 Proof: Suppose that T is a non-spanning out-tree for which condition: for all such that (16) and is satisfied. From Edmonds’ theorem, we can be assured of the existence of a link such that the condition above remains satisfied for the non- and spanning out tree from tail in with or, equivalently said, such that there are at least to and head in from link-disjoint paths in . Algorithm 4 looks for such a link. If, in the process, it finds a link with but such that there are less than to its head, then this link cannot be added to construction of this out-branching, the link is stored in variable link-disjoint paths in now or at any other time during the . We have also implemented this algorithm; it will be compared with algorithms 1 and 2 in the next point. An example of the creation of 3 out-branchings generated by our implementation is presented below. The diferent out-branchings are represented in blue, green and red. 69 Fig. 66 – 3 out-branchings rooted at vertex 0 generated by our implementation of algorithm 4 4.5 Comparison of the Out-branchings To better understand the advantages and disadvantages of all the out-branchings generated by the algorithms we have seen, we have generated 100 random 2-connected symmetric graphs with 50 vertices each. To each link was attributed a random cost between 1 and 10. Note that two symmetrical links can have different costs in our simulation. We have then applied the different algorithms to the generated set of graphs and analysed the resultant out-branchings. The results of the presented algorithms will be compared with spanning SPTs that were also calculated for the generated graphs. In the case of algorithms 2 and 4, which generate two outbranchings, only the out-branching that has the lowest total link cost is contemplated The following table shows the results of the application of the algorithms that we have seen. We have focused our evaluation in 4 important points: Medium out-branching cost – the arithmetic mean of the sum of the costs of all the links in the out-branchings; Medium path cost – the arithmetic mean of the cost of the paths between the root and all the nodes in the out-branchings; Medium percentage of non-minimum paths – the arithmetic mean of the percentage of paths (between the root and each vertex) that have total link cost that is higher than the one provided by a SPT, on each out-branching; Resiliency – the granted possibility of having protection paths, that are disjoint from the main 70 trees. Algorithm 2 Algorithm 2 Algorithm 2 (Not improved) (improved for the backup paths) (improved for the main outbranching) Algorithm 4 SPT MOB 201.67 161.6 236.45 224.66 207.22 256.51 6.66 9.80 8.94 8.31 7.45 8.16 X 62.02% 41.54% 24.04% 12.70% 34.62% NO NO YES YES YES YES Medium Outbranching Cost Medium Path Cost Medium Percentage of Non-Minimum Paths Resiliency Table 2 – Algorithm comparison The different algorithms offer advantages and disadvantages in different points. The minimum out-branchings (MOBs) offered of course the lowest total link cost. However the price to pay in terms of medium path cost is high. The total link cost of the path that goes from the root to each vertex is higher. In the point of view carrier, MOBs are the most efficient design in terms of link usage. MOBs use the design that offers the minimum total link cost. In the point of view of the users SPTs offer a lower root to vertex path cost. Algorithms 2 and 4 grant resiliency. This means that it is always possible to find a link-disjoint backup path to any link of the main tree. We can see that the version of algorithm 2, that is optimized for the main tree, has in fact much better results. When compared to algorithm 4, the first seem to take advantage in all the compared points. When optimized for backup paths, algorithm 2 has an allround performance. The cost of the backup paths is presented bellow. As we have seen, algorithm 2 requires a 2-connected undirected graph. In practice, backbone networks assume this configurations, since they are in fact at least 2-connected and physical links offer communication in any direction, hence they are also symmetric. When compared with the SPT, algorithm 2 shows almost no difference, only 12.70% of the paths are non-minimum paths. The total tree cost is 2,72% higher. Algorithm 2 is actually an excelent choice since it grants resillience with only liminar extra cost. 71 Fig. 67 – 10 node link-disjoint trees generated using algorithm 4. (Main tree in Red) Fig. 68 – 10 node link-disjoint trees generated using algorithm 2 optimized for the main tree (Red) 72 4.6 Protection Paths Using multiple out-branchings is a way of protecting the network in case of link failure. We can choose to alternate packet distribution through the available out-branchings. However, we can also calculate backup paths to each link of the main out-branching, avoiding the need of a second outbranching. If we do this we can easily introduce FRR in the network, ending up with the fast resiliency advantages that FRR grants. Both algorithms 2 and 4 offer resiliency because they calculate 2 or more link-disjoint outbranchings. If when finding the first out-branching, they grant that, at least another out-branching is available in the graph, they also grant paths from the root to each node, all of them disjoint from the main out-branching. To make better use of FRR, every link of the out-branching should be protected and the beginning of the protection path should be located at the tail of the failed link (Point of Local Repair Fig. 15 – Network Failure Anatomy). As far as algorithm 2 is concerned, the requirement was having a 2-connected symmetrical graph . In such graph, every link has its symmetrical; therefore each link of an out-branching that leads to a given node has a symmetrical link leading to the root. If every node has a disjoint path leading back to the root, then the union of this path with the branches of a secondary out-branching, that lead to the respective node, form a backup path that is disjoint from the main out-branching . Given this, we know that after finding the main out-branching in algorithm 2, we will still have at least one path non-overlapping disjoint path left Fig. 68. We can easily calculate the minimum path that grants these features by using a minimum paths algorithm in graph . Regarding algorithm 4, the requirements were different; we just needed to have -disjoint paths from the root the each node, in order for the algorithm to work. With this requirement we do not guarantee the existence of a path from the nodes to the root. If we want assure that the main tree calculated in algorithm 4 has disjoint protection paths available, we need to strict the requirements and ask for a symmetrical graph with -disjoint paths from the root to each vertex. We can use the symmetrical links to make paths to the root Fig. 67. In this case, like in algorithm 2, we have guaranteed the existence of the disjoint protection paths. We have compared our two algorithms in terms of cost of the protection paths. We have used the already generated 2-connected symmetric graphs and respective main out-branchings. We have calculated the minimum protection path of each link of the out-branching, i.e. given a link branching, we have calculated the cheapest path that goes from to in the outand is simultaneously disjoint from the out-branching. The medium costs of these paths are shown below. 73 Algorithm 2 Algorithm 2 (Improved for the backup paths) (Improved for the main out- 14.58 branching) 16.15 Algorithm 4 19.45 Table 3 – Protection path cost comparison Algorithm 2, when optimized for the main tree, presents slightly lower values in comparison to algorithm 4 and again has points in its favour. Results are even better when it is optimized to obtain better backup paths; nevertheless, the price to pay to these paths is high. Since the use of this paths is mostly temporary (few seconds at most). 74 Chapter 5 Conclusions 5 Conclusions 75 In this thesis we presented ways of computing and implementing different out-branching designs, understanding the advantages and disadvantages of these implementations. Given a certain out-branching we purposed a protocol suite that could use this out-branching to transport IPTV packets and, at the same time, was capable of handling the requirements of IPTV. We have studied the interaction between the involved protocols and developed a strategy to operate them. We have presented 3 different algorithms, which given a specific network, calculate different types of out-branchings. The algorithms had different objectives. The first algorithm (algorithm 1) calculated the minimum cost out-branching (MOB). We have seen that a MOB is different from a shortest path out-branching and could not be calculated using a minimum cost spanning tree algorithm. In the given networks, MOBs are the out-branchings with lowest total link cost. However, they do not offer the shortest paths between the root and each vertex. In fact, they offered the highest path costs among the algorithms that we have compared. Algorithm 1 did not also guaranteed the existence disjoint paths that could substitute the links of the out-branching in link failure occasions. The second algorithm (algorithm 2) generated two disjoint out-branchings given a 2-connected symmetric network. We have seen that in this type of network, after the calculation of the two outbranchings, it was possible to calculate local protection paths, for all the links in main out-branching, using the links left on the graph in association with the links of the backup out-branching. The protection paths were disjoint from the main out-branching. Given so, in case of link failure, traffic would not pass in the same link more than once, overloading the links. We have also used a heuristic to reduce the total link cost and the cost of paths that go from the root to each vertex in the outbranchings that result from algorithm 2. We have also calculated the protection paths for the main outbranching links and evaluated their costs. Finally the last algorithm (algorithm 4) enabled the calculation of the maximal number of disjoint out-branchings in a given graph. We have demonstrated that if the network was symmetric, it was possible to calculate protection paths for main out-branching with the links left on graph. When improved for the main out-branching algorithm 2 presented total links costs and root to vertex path costs that were similar to the ones obtained by an SPT. The costs were lower than the ones obtained by algorithm 4. As far as protection paths are concerned, its costs were also lower in algorithm 2 in comparison with algorithm 4. One of the main problems of our implementations and that could lead to future work is that we are not using the Traffic Engineering properties of MPLS in its full extent. IP carriers use their backbone to support IPTV, but they also use it to support many other types of traffic. We are making our out-branching calculations on IGP costs which do not characterize the network properly. IGP costs, are usually based on the debit of the links and usually do not account for the current traffic of that link, as well as, the bandwidth that is being used at a given time. The paths that we assumed good (because of their low IGP cost) might not be that good if the links are being extensively used for other types of services like unicast data transmission or virtual private networks, for example. We are 76 also not making use of constrained based routing offered by MPLS, which we could benefit from, if we wanted to balance the traffic over an IP backbone. Nevertheless, the graph properties that led to development of algorithm show that in symmetrical networks we have some flexibility when choosing links to be part of the main out-branching, and also benefit from the resiliency of the protection paths that the algorithm accounts for. If we could find a set of metrics that dynamically characterise the network in all its extent, and in same time we could feed this data to algorithm 2 to calculate reconfigured versions of network, on the fly, we would certainly offer an interesting tool to enhance IP backbone networks. 77 6 References 79 References [1] R. Doverspike, G. Li, K. Oikonomou, K. K. Ramakrishnan and D. Wang, “IP Backbone Design for Multimedia Distribution: Architecture and Performance”, Proc. of IEEE INFOCOM, April 2007, p. 1523- 80 1531 [2] G. Li, D. Wang, and R. Doverspike, “Efficient distributed MPLS P2MP fast reroute”, Proc. of IEEE INFOCOM, April 2006, p. 1-11 [3] M. Yuksel, K. K. Ramakrishnan, R. Doverspike, R. Sinha, G. Li, K. Oikonomou, and D. Wang, “Cross-Layer Techniques for Failure Restoration of IP Multicast with Applications to IPTV”, Communication Systems and Networks (COMSNETS), 2010 Second International Conference, p. 223-232 [4] R. Doverspike, Guangzhi Li, K.N. Oikonomou, K.K. Ramakrishnan, R.K. Sinha, Dongmei Wang, C. Chase, “Designing a Reliable IPTV Network”, Internet Computing, IEEE, volume: 13 , issue: 3, June 2009, p. 15 - 22 [5] Muriel Medard, Steven G. FinnRichard A. Barry,and Robert G. Gallager,” Redundant Trees for Preplanned Recovery in Arbitrary Vertex-Redundant or Edge-Redundant Graphs”, IEEE/ACM Transactions on network, VOL. 7, NO. 5, October 1999, p. 641- 652 [6] Adrian S.-W. Tam Kang Xi H. Jonathan Chao, “A Fast Reroute Scheme for IP Multicast” IEEE "GLOBECOM" 2009 proceedings p. 1-7 [7] Bhed Bahadur Bista, “A Fault-Tolerant Scheme for Multicast Communication Protocols”, 2005 AsiaPacific Conference on Communications p. 289-293 [8] Yong Liu, A. L. Narasimha Reddy, “A Fast Rerouting Scheme for OSPF/IS-IS Networks”, Computer Communications and Networks, 2004. ICCCN 2004. Proceedings. 13th International Conference, 1113 Oct. 2004, p. 47- 52 [9] Nokia Siemens Networks, “High-quality and resilient IPTV multicast architecture: An overview of ResIP multicast architecture design guidelines”, Technical White Paper, Available: http://www.nokiasiemensnetworks.com/system/files/document/High-quality_and_resilient_IPTV_multic ast_architecture.pdf [10] “The Verizon Global Network”, U.S. Backbone Network, Image September 2012, Available: http://www.verizonbusiness.com/about/network/maps/map.xml [11] “Survivable fixed telecommunication Network Design Library”, T-Systems Deutsch Backbone Network, Available: http://sndlib.zib.de/home.action [12] James F. Kurose and Keith W. Ross, “Computer Networking: A Top-Down Approach”, 5th Edition, Addison-Wesley, 2009 [13] Nitin Vaidya, “Notes on Distance Vector Routing”, Course Distributed Algorithms for Wired and Wireless Networks, University of Illinois Fall 2009 Available: http://users.crhc.illinois.edu/nhv/09fall.598/routing_dv.pdf [14] Ina Minei, Julian Lucek, “MPLS-Enabled Applications: Emerging Developments and New Technologies”, Part One, John Wiley & Sons Ltd, 2005 [15] Cisco Systems, “IP Multicast Concepts, Design and Troubleshooting”, Cisco Systems 81 presentation, 2011 Available: http://home.komsys.org/~jocke/ciscolivemelbourne2011/BRKMPL- 1261_IP_Multicast_-_Concepts,_Design_and_Troubleshooting.pdf [16] Rajiv Chaudhuri, “End to End IPTV Design and Implementation, How to avoid Pitfalls”, Ericsson presentation, Available: http://www.networks2008.org/data/upload/file/Tutorial/T6_Chaudhuri.pdf [17] A. Quadir, M. T. Arefin, and H. E. Sandström, “Reliable IPTV Service Delivery Using PIM-SSM Routing”, J. Sci. Res. 1 (3), 495-507 (2009). [18] Pierre-Henri Bourdiol, “Point to Multipoint Traffic Engineering with MPLS” Juniper Networks presentation, 2006, Available: http://media.frnog.org/FRnOG_8/FRnOG_8-2.pdf [19] Jorgen Bang-Jensen, Gregory Gutin, “Digraphs Theory, Algorithms and Applications”, Springer London, Limited, 1 edition, 2002 [20] João Luís Sobrinho, “Connectivity in directed multigraphs” [21] Jon Kleinberg, Eva Tardos, “Algorithm Design”, Addison-Wesley, 1 edition (March 26, 2005) [22] Peter Arberg, Torbjörn Cagenius, Olle V. Tidblad, Mats Ullerstig and Phil Winterbottom, “Network infrastructure for IPTV”, Ericsson Review http://www.ebu.ch/fr/technical/trev/trev_312-kernen_QoE.pdf 82 No. 3, 2007, Available: 83 High Quality and Resilient IPTV Backbone Network André Ricardo Simões Bento