Problemas de otimização em network design e biologia: •Ajuste de pesos em roteamento sob OSPF • Correspondência de grafos em reconhecimento de cenas Celso Carneiro Ribeiro Seminário do Departamento de Informática PUC-Rio, Junho 2002 June 14, 2002 Page 1/67 Problemas de otimização em network design e biologia Weight setting in OSPF routing • • • • • Weight setting in OSPF routing Genetic algorithm for OSPF routing Population dynamics Parallel GA for OSPF routing Numerical results and conclusions June 14, 2002 Page 2/67 Problemas de otimização em network design e biologia Weight setting in OSPF routing • Internet traffic has been doubling each year Coffman & Odlyzko (2001): in the 1995-96 period (introduction of web browsers), traffic doubled every three months! • Increasingly heavy traffic (due to video, voice, etc.) is raising the requirements of the Internet of tomorrow. June 14, 2002 Page 3/67 Problemas de otimização em network design e biologia Weight setting in OSPF routing • Objective of traffic engineering: make more efficient use of existing network resources • Routing of traffic can have a major impact on the efficiency of network resource utilization June 14, 2002 Page 4/67 Problemas de otimização em network design e biologia Packets of information Contains necessary routing information, such as IP destination address. body header Information sent over the Internet is broken into chunks, called packets or datagrams. June 14, 2002 Page 5/67 Problemas de otimização em network design e biologia Packet routing When packet arrives at router, router must decide where to router send it next. router D1 D2 Packet’s final destination. router router R1 D3 D4 R2 R3 R4 Routing table June 14, 2002 Page 6/67 router Routing consists in finding a path from source to destinatio Problemas de otimização em network design e biologia Autonomous systems To decrease the complexity of routing, the Internet is divided into smaller domains, called Autonomous Systems. Routing within an AS is done via Interior Gateway Protocols (IGP), while between AS’s Exterior Gateway Protocols (EGP) are used. June 14, 2002 Page 7/67 AS2 AS1 AS4 AS3 Problemas de otimização em network design e biologia OSPF (Open Shortest Path First) • OSPF is the most commonly used intradomain routing protocol (IGP). • It requires routers to exchange routing information with all other routers in the AS. – Complete network topology knowledge is available to all routers, i.e. state of all routers and links in the AS. June 14, 2002 Page 8/67 Problemas de otimização em network design e biologia Weight setting in OSPF routing • Each link in the AS is assigned an integer weight [1,65535=2161] – Smaller weights may be used: MAX • Each router computes tree of shortest weight paths to all other routers in the AS, with itself as the root, using Bottom: Cisco 7000 router Top: ForeRunner ASX-200 ATM switch Dijkstra’s algorithm. June 14, 2002 Page 9/67 Problemas de otimização em network design e biologia Weight setting in OSPF routing Routing table D1 R1 D2 R1 D3 R2 D4 R3 D5 R1 D6 R3 root 1 1 2 3 5 3 4 2 June 14, 2002 Routing table is filled with first hop routers for each possible destination In case of multiple shortest paths, flow is evenly split. Page 10/67 First hop routers. 6 Destination routers Cisco 12400 routers Problemas de otimização em network design e biologia Weight setting in OSPF routing • OSPF weights are assigned by network operator – CISCO assigns, by default, a weight proportional to the inverse of the available link bandwidth. – If all weights are unit, the cost of a path is the number of hops in the path. • Fortz & Thorup (2000): weight setting by using local search on large networks with up to 100 nodes and 503 links • Ericsson, Pardalos, & Resende (2001): genetic algorithm June 14, 2002 Page 11/67 Problemas de otimização em network design e biologia Minimization of congestion • Directed capacitated network G = (N,A,c), where N are routers, A are links, and ca is the capacity of link a A. • Same measure of Fortz & Thorup (2000) to compute congestion (also used for PVC routing): = 1(L1) + 2(L2) + … + |A|(L|A|) La is the load on link a A, a(La) is piecewise linear and convex, and a(0) = 0, for all a A. June 14, 2002 Page 12/67 Problemas de otimização em network design e biologia Piecewise linear and convex a(La) link congestion measure 70 slope = 5000 60 50 slope = 500 40 30 20 slope = 3 slope = 10 slope = 1 10 slope = 70 0 0 0.2 0.4 0.6 0.8 1 1.2 (Laca ) trunk utilization rate June 14, 2002 Page 13/67 Problemas de otimização em network design e biologia Weight setting in OSPF routing • Given a directed network G = (N, A ) with link capacities ca A and demand matrix D = (ds,t ) specifying a demand to be sent from node s to node t : – Assign weights wa [1,65535] to each link a A, such that the objective function is minimized when demand is routed according to the OSPF protocol. • Weights are computed off-line and do not change often June 14, 2002 Page 14/67 Problemas de otimização em network design e biologia Genetic algorithms done Initialize and evaluate P (0); Set t = 1 Test termination P (t ) is population of solutions at generation t. t=t+1 June 14, 2002 Page 15/67 Generate P (t ) from P (t1) crossover Alter P (t ) mutation Evaluate P (t ) Problemas de otimização em network design e biologia GA for OSPF: solution encoding • Ericsson, Pardalos, & Resende (2001) • A population consists of nPop integer weight arrays: w = (w1, w2 ,…, w|A| ), where wa [1,MAX] • All possible weight arrays correspond to feasible solutions, i.e., every weight setting is feasible – nice problem feature for application of a GA June 14, 2002 Page 16/67 Problemas de otimização em network design e biologia GA for OSPF: fitness evaluation • Route each demand pair (s,t ) using OSPF • Compute loads Las,t on each link a A • Add up loads on each link a A, yielding total load La on link • Compute link congestion cost a(La) for each link a A • Add up costs: = 1(L1) + 2(L2) + … + |A|(L|A|) June 14, 2002 Page 17/67 Problemas de otimização em network design e biologia Initial population • nPop 10 solutions with randomly generated arc weights, uniformly in the interval [1,MAX] • Weight settings of two other common heuristics: – OSPF (unit): all weights set to 1 – OSPF (invCap): each arc weight is set inversely proportional to its arc capacity – OSPF (fractions): all weights set to .MAX, with = 1/8, 1/4, 3/8, 1/2, 5/8, 3/4, 7/8, 1 all but invCap lead to the same routing decisions (all weights are equal) June 14, 2002 Page 18/67 Problemas de otimização em network design e biologia Population partitioning Class A 20% most fit Population is sorted according to fitness (solution value) and solutions are classified into three categories. Class B Class C June 14, 2002 10% least fit Page 19/67 Problemas de otimização em network design e biologia Population dynamics generation t + 1 generation t Class A Class A is promoted unchanged Class B is replaced by crossover of: one Class A parent and Class A Class B Class B one Class B or C parent. Class C June 14, 2002 Class C is replaced by randomly generated solutions. Page 20/67 Class C Problemas de otimização em network design e biologia Parent selection • Parents are chosen at random: – one parent from Class A (elite) – one parent from Class B or C (non-elite) • Reselection is allowed, i.e. parents can breed more than once per generation • Better individuals are more likely to reproduce June 14, 2002 Page 21/67 Problemas de otimização em network design e biologia Crossover with random keys Bean (1994): crossover combines elite parent p1 with non-elite parent p2 to produce child c : for all genes i = 1,2,…,|A | do With small probability child if rrandom[0,1] < 0.01 then has single gene mutation. c [i ] = irandom[1,MAX] else if rrandom[0,1] < 0.7 then c [i ] = p1[i ] Child is more likely to inherit else c [i ] = p2[i ] end gene of elite parent. June 14, 2002 Page 22/67 Problemas de otimização em network design e biologia Parallel GA: local search • Combine GA with local search • LS with cost recomputations from scratch: – For each arc e with current weight we do: • Temporarily replace arc weight by (1+ we)/2 • Evaluate fitness • If new improved solution, update weight and go to next arc • Otherwise, temporarily replace arc weight by (MAX+ we)/2 • Evaluate fitness • If new improved solution, update weight • Go to next arc Problemas de otimização em network design e biologia Page 23/67 – Until no further improvement is possible June 14, 2002 Parallel GA: local search • Variants: – V-1: at each processor, apply LS to the best solution whenever it is improved – V-2: at each processor, always apply LS to the best non-locally optimal solution June 14, 2002 Page 24/67 Problemas de otimização em network design e biologia Parallel GA: cooperation • P processors • Whenever a processor improves its incumbent, the latter is broadcasted to: – all other processors – all closest log P processors (logical organization) • At the beginning of each generation, every processor replaces its worst solutions by those sent by other processors June 14, 2002 Page 25/67 Problemas de otimização em network design e biologia Numerical results • Work-in-progress, preliminary results: GA, LS – Combine GA+LS? Cooperative // GA? Scatter search? • One real world network: AT&T Worldnet backbone with 90 nodes, 274 links, and 272 pairs • Compared with cost and maximum utilizations of the LB lower bound and several heuristics: – OSPF(invCap) – Local search of Fortz and Thorup (2000) – OriginalPage sequential GA of Ericsson et al. June 14, 2002 Problemas de otimização em network design e biologia 26/67 (2001) Collaborative vs. noncollaborative versions 2.85 collab_v1 nocollab_v1 collab_v2 nocollab_v2 F&T seqGA-500itr LPLB 2.8 2.75 2.7 2.65 2.6 2.55 2.5 2.45 2.4 2.35 0 2 4 6 8 10 12 14 processors June 14, 2002 Page 27/67 Problemas de otimização em network design e biologia 16 cost Number of generations: 500 and 8000 2.85 collab_v1-500itr nocollab_v1-500itr collab_v1-8000itr nocollab_v1-8000itr F&T seqGA-500itr LPLB 2.8 2.75 2.7 2.65 2.6 2.55 2.5 2.45 2.4 2.35 0 2 4 6 8 10 12 14 processors June 14, 2002 Page 28/67 Problemas de otimização em network design e biologia 16 cost 12 10 8 InvCap GA // LPLB 6 4 2 0 0 5000 100001500020000250003000035000400004500050000 scaled internet traffic June 14, 2002 Page 29/67 Problemas de otimização em network design e biologia 2 InvCap GA // LPLB 1.5 1 0.5 0 0 5000100001500020000250003000035000400004500050000 s caled internet traffic June 14, 2002 Page 30/67 Problemas de otimização em network design e biologia Concluding remarks • Sequential GAOSPF produced as good solutions as LS for most instances, even better in some cases. • GA generally finds good solutions close to the LP lower bound. • //GA+LS works very well on real-world AT&T Worldnet backbone network, significantly increasing traffic and Internet capacity over CISCO’s recommended weight setting strategy. • Extensions: speedup LS, improve cooperation, June evaluate 14, 2002 Problemas de otimização em network design e biologia Page 31/67 scatter effects, search Graph matching for scene recognition • • • • Scene recognition Graph matching Problem formulation GRASP heuristic June 14, 2002 Page 32/67 Problemas de otimização em network design e biologia Scene recognition • The recognition of complex scenes requires not only a detailed description of the objects involved, but also of the spatial relationships between them. • The diversity of the forms of the same object in different instantiations of a scene, and also the similarities of different objects in the same scene make relationships between objects of prime importance in order to remove June 14, 2002 Problemas de otimização em network design e biologia ambiguitiesPagein33/67the recognition of objects with Scene recognition • Graph based representations often used for scene representation in image processing: – Vertices represent the objects in the scene – Edges represent the relationships between the objects • Relevant information for the recognition is extracted from the scene and represented by relational attributed graphs. • Model-based recognition: both the model and the scene are represented by graphs – graph matching June 14, 2002 Page 34/67 Problemas de otimização em network design e biologia Motivation • Medical imaging: recognition of brain structures from magnetic resonance images • Model: each node represents one anatomical structure, while edges represent spatial relationships. • Inaccuracies are one of the main problem characteristics: – – – – Unexpected objects (pathologies, for instance) Expected objects not found Deformations of objects Attributes in the image and in the model may be imprecise June 14, 2002 Page 35/67 Problemas de otimização em network design e biologia Motivation (a) normal brain June 14, 2002 (b) pathological brain with a tumor Page 36/67 (c) representation of a brain atlas (each grey level corresponds to a unique connected structure) Problemas de otimização em network design e biologia Scene recognition • Similar problems: character recognition, aerial or satellite image interpretation using a map, face recognition, etc. • Search for the best correspondence formulated as a combinatorial optimization problem defined by the relational attributed graphs representing the model and the image, together with node and edge between them. June similarities 14, 2002 Problemas de otimização em network design e biologia Page 37/67 Graph matching • Attributed graphs widely used in pattern recognition problems: construction depends on the imperfections of the scene or of the model, and on the attributes of the object relations • One single vertex for each region of each image • Once the graph has been constructed, vertex and edge attributes are computed. • Finally, vertex-similarity and edge-similarity matrices are computed from the values of the attributed graphs, relating each pair of vertices and each pair of edges (one from the June 14, 2002 Problemas de otimização em network design e biologia Page 38/67 model, the other from the segmented image). Graph matching • G1 = (N1,E1): model • G2 = (N2,E2): over-segmented image • Each solution of the graph matching problem is a correspondence between the vertex and edges of the graphs representing the model and the image June 14, 2002 Page 39/67 Problemas de otimização em network design e biologia Example: two solutions June 14, 2002 Page 40/67 Problemas de otimização em network design e biologia Graph matching • Each node of G2 is associated with one node of G1: – xij = 1, if nodes i N1and j N2 are associated – xij = 0, otherwise. • Each solution may be represented by a bipartite graph G' = (N1 N2,E'): – each edge (i,j) E‘ = {(i,j), i N1, j N2 | xij = 1} corresponds to the association of i N1 with j N 2. – A(i) = { j N2 | (i,j) E'} is the subset of nodes of N2 associated with i N1 June 14, 2002 Page 41/67 Problemas de otimização em network design e biologia Objective function • Similarity matrices constructed from similarity values calculated from graph attributes. • sv(i,j) [0,1] represents the vertex-similarity between vertices i N1 and j N2 • se(u,v) [0,1] represents the edge-similarity between edges u E1 and v E2 June 14, 2002 Page 42/67 Problemas de otimização em network design e biologia Objective function Maximize G’ June 14, 2002 Page 43/67 Problemas de otimização em network design e biologia Constraints 1. exactly one node i N1 such that (i,j) E‘ (i.e. xij = 1), j N2 2. The subgraph induced in G2 by A(i) is connected, i N1 3. at least one node j N2 such that (i,j) E', i N1 4. xij = 1 sv(i,j) > 0, i N1, j N2 June 14, 2002 Page 44/67 Problemas de otimização em network design e biologia GRASP For i = 1,…,MaxIterations Build a feasible correspondence; Apply local search to improve the correspondence; Update the best solution found; End; June 14, 2002 Page 45/67 Problemas de otimização em network design e biologia Construction For k = 1,…,MaxTrials Generate a random permutation of N2; For each node j N2 Randomly search all nodes in N1; Associate to j a node i N1 satisfying the constraints; Remove i from N1; End; If a feasible solution was built, then return; End;June 14, 2002 Problemas de otimização em network design e biologia Page 46/67 Local search • Iterative improvement (first improvement) • Neighborhood: – Consider every pair of nodes j1 N2, j2 N2 – Let i1 = A-1(j1) and i2 = A-1(j2) – Exchange the associations: A(i1) A(i1) – {j1} {j2} A(i2) A(i2) – {j2} {j1} June 14, 2002 Page 47/67 Problemas de otimização em network design e biologia Example Cut of a muscle: (a) original 2-D image (b) model 14 vertices 26 edges June 14, 2002 Page 48/67 (c) over-segmented image 28 vertices 62 edges Problemas de otimização em network design e biologia June 14, 2002 Page 49/67 Problemas de otimização em network design e biologia Summary • Data: – Image to be recognized – Atlas of images • Examples: – Identification of tumors – Facial recognition • Images represented by graphs: – Nodes: image regions – Edges: relations among regions (boundaries) – Attributes: colors, intensities, adjacencies, etc. June 14, 2002 Page 50/67 Problemas de otimização em network design e biologia Summary • Similarities between pairs of vertices and pairs of edges (model-image) • Problem: – Determine the most similar model in the atlas with respect to the image to be recognized – Determine the optimal correspondence between the regions of the model and those in the image • Other applications June 14, 2002 Page 51/67 Problemas de otimização em network design e biologia Problema da filogenia • Uma filogenia é uma árvore que relaciona espécies ou genes homólogos de espécies distintas (taxons) • Dado um conjunto de taxons caracterizados por um conjunto de caracteres, obter uma filogenia com o menor número ade passos b c evolucionários d e f g (mudança de um caracter) O 0 0 0 0 0 0 0 • Exemplo: A 1 1 0 0 1 1 1 B C June 14, 2002 Page 52/67 1 1 1 1 1 1 1 1 1 0 1 0 Problemas de otimização em network design e biologia 1 0 Problema da filogenia Exemplo: O A B C a 0 1 1 1 Solução com 10 passos: June 14, 2002 Page 53/67 b 0 1 1 1 c 0 0 1 1 d 0 0 1 1 e 0 1 1 0 f 0 1 1 0 g 0 1 1 0 Solução com 9 passos: Problemas de otimização em network design e biologia Roteamento de circuitos virtuais privados • Circuitos virtuais privados (PVCs) entre os pares origem e destino de cada cliente em um backbone • Roteamento de cada cliente feito pelo projetista sem conhecimento de demandas futuras • Ineficiência e necessidade ocasional de rotear os PVCs offline • Problema: minimizar atrasos e congestão, satisfazendo as demandas e restrições de capacidade • Aplicação desenvolvida para a AT&T: melhoria sensível do desempenho e dedo aproveitamento June 14, 2002 Problemas otimização em network design e biologia de Page 54/67 redes reais Roteamento de PVCs June 14, 2002 Page 55/67 Problemas de otimização em network design e biologia Roteamento de PVCs June 14, 2002 Page 56/67 Problemas de otimização em network design e biologia Roteamento de PVCs June 14, 2002 Page 57/67 Problemas de otimização em network design e biologia Roteamento de PVCs June 14, 2002 Page 58/67 Problemas de otimização em network design e biologia Roteamento de PVCs capacidade máxima = 3 June 14, 2002 Page 59/67 Problemas de otimização em network design e biologia Roteamento de PVCs capacidade máxima = 3 rerotear Caminho longo! June 14, 2002 Page 60/67 Problemas de otimização em network design e biologia Roteamento de PVCs capacidade máxima = 3 June 14, 2002 Page 61/67 Problemas de otimização em network design e biologia Roteamento de PVCs Viável e ótima! June 14, 2002 Page 62/67 capacidade máxima = 3 Problemas de otimização em network design e biologia Projeto de redes locais de acesso • Construir uma rede de fibras óticas para prover serviços de banda larga a consumidores comerciais e residenciais, levando em consideração o balanceamento entre os custos de construção (colocação das fibras) e os rendimentos potenciais dos clientes atendidos (perda de receita no caso do cliente não ser atendido). • Problema de Steiner com prêmios: aplicação June desenvolvida 14, 2002 Problemas de otimização em network design e biologia Page 63/67 para a AT&T Projeto de redes locais de acesso rua penalidade nula cliente (prêmio) raiz June 14, 2002 Page 64/67 Problemas de otimização em network design e biologia Projeto de redes locais de acesso June 14, 2002 Page 65/67 Problemas de otimização em network design e biologia Projeto de redes a k-caminhos • Dados: – Grafo suporte para a construção de uma rede – Conjunto de pares origem-destino – Custo de construção de cada arco • Problema: construir uma rede de custo mínimo, na qual todos os pares origem-destino são conectados por caminhos usando no máximo k arcos • Para que a confiabilidade da rede seja alta, os caminhos devem formados k June todos 14, 2002 Problemas deser otimização em network designpor e biologia Page 66/67 Slides and publications • Slides of this talk can be downloaded from: http://www.inf.puc-rio/~celso/talks/curico.ppt • Paper about sequential GA for OSPF setting available at: http://www.research.att.com/~mgcr/doc/gaospf.pdf • Paper about parallel GA for OSPF setting in preparation (with L. Buriol, M.G.C. Resende, and M. Thorup) • Paper about GRASP for graph matching also in June 14, 2002 Problemas de otimização em network design e biologia 67/67 preparation Page (with I. Bloch and M.C. Boeres)