Randomized Algorithms
Eduardo Laber, Loana T. Nogueira

Quicksort

Objective
– Sort a list of n elements.

An Idea
Imagine we could find an element y ∈ S such that half the members of S are smaller than y. Then we could use the following scheme:
– Partition S \ {y} into two sets S1 and S2:
  – S1: elements of S that are smaller than y;
  – S2: elements of S that are greater than y.
– Recursively sort S1 and S2.

Suppose we know how to find y
– Time to find y: cn steps, for some constant c.
– We could then partition S \ {y} into S1 and S2 in n − 1 additional steps.
– The total number of steps of our sorting procedure would be given by the recurrence
  T(n) ≤ 2T(n/2) + (c + 1)n,
  whose solution is T(n) ≤ c'n log n.

Quicksort
What is the problem with the scheme above? How do we find y?
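The idealized scheme above can be sketched in Python. This is a minimal illustration, not the slides' algorithm: `statistics.median_low` stands in for the hypothetical cn-step routine that finds y (it is not actually linear-time), and the elements are assumed distinct.

```python
import statistics

def idealized_sort(S):
    """Sort S by always partitioning around a median element y."""
    if len(S) <= 1:
        return list(S)
    y = statistics.median_low(S)       # stand-in for the cn-step median finder
    S1 = [x for x in S if x < y]       # elements smaller than y
    S2 = [x for x in S if x > y]       # elements greater than y
    return idealized_sort(S1) + [y] + idealized_sort(S2)
```

Since y is always a median, each of S1 and S2 has at most half the elements, which is what yields the recurrence T(n) ≤ 2T(n/2) + (c + 1)n.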
Deterministic Quicksort
– Let y be the first element of S.
– Split S into two sets: S< (elements smaller than y) and S> (elements greater than y).
– Qsort(S<), Qsort(S>).

Performance
– Worst case: O(n²) (e.g., when the set is already sorted).
– Average case: O(n log n).

A Randomized Algorithm
An algorithm that makes random choices during its execution.

Randomized Quicksort (RandQS)
– Choose an element y of S uniformly at random: every element of S has equal probability of being chosen.
– By comparing each element of S with y, determine S< and S>.
– Recursively sort S< and S>.
– OUTPUT: sorted S<, followed by y, and then sorted S>.

Intuition
On some instances deterministic Quicksort performs very badly: O(n²). Randomization produces different executions for the same input.
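RandQS as just described, in a minimal Python sketch (assuming distinct elements, as the analysis does; the function and variable names are ours):

```python
import random

def rand_qs(S):
    """Randomized Quicksort: pick the pivot y uniformly at random from S."""
    if len(S) <= 1:
        return list(S)
    y = random.choice(S)               # every element has probability 1/|S|
    S_lt = [x for x in S if x < y]     # S<: elements smaller than y
    S_gt = [x for x in S if x > y]     # S>: elements greater than y
    return rand_qs(S_lt) + [y] + rand_qs(S_gt)   # S<, then y, then S>
```

The output is always sorted; only the number of comparisons varies from one execution to the next.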
There is no instance on which RandQS performs badly on average.

Analysis
For sorting algorithms, we measure the running time of RandQS by the number of comparisons it performs; this is the dominant cost in any reasonable implementation.
Our goal: analyze the expected number of comparisons in an execution of RandQS.

Notation
– S(i): the i-th smallest element of S, so S(1) is the smallest element of S and S(n) is the largest.
– (Recall: given a random experiment with sample space Ω, a random variable is a function that assigns a real number to each element of the sample space.)
– Define the random variable X_ij = 1 if S(i) and S(j) are compared, and X_ij = 0 otherwise.

X_ij counts the comparisons between S(i) and S(j), so the total number of comparisons is Σ_{i=1}^{n} Σ_{j>i} X_ij. We are interested in its expectation, which by the linearity of E[·] is
  E[ Σ_{i=1}^{n} Σ_{j>i} X_ij ] = Σ_{i=1}^{n} Σ_{j>i} E[X_ij].

Let p_ij denote the probability that S(i) and S(j) are compared in an
execution. Since X_ij only assumes the values 0 and 1,
  E[X_ij] = 1 · p_ij + 0 · (1 − p_ij) = p_ij.

Analysis – the binary tree T of RandQS
Each node of T is labeled with a distinct element of S: the pivot y is the root, its left subtree holds S< and its right subtree holds S>. The root of T is compared to the elements of its two subtrees, but no comparison is performed between an element of the left subtree and an element of the right subtree. Consequently, there is a comparison between S(i) and S(j) if and only if one of these elements is an ancestor of the other.

Consider the permutation π obtained by visiting the nodes of T in increasing order of level, left to right within each level.

Example: S = (3, 6, 2, 5, 4, 1). Suppose 2 is chosen as the first pivot, splitting S into {1} and {3, 6, 5, 4}; within {3, 6, 5, 4}, 5 is chosen, splitting it into {3, 4} and {6}; finally 4 splits {3, 4}, leaving {3}. Reading the tree level by level gives π = (2, 1, 5, 4, 6, 3).

Back to the Analysis
To compute p_ij we make two observations:
– There is a comparison between S(i) and S(j) if and only if S(i) or S(j) occurs earlier in the permutation π than every element S(l) with i < l < j.
– Each of the elements S(i), S(i+1), ..., S(j) is equally likely to be the first of these elements to be chosen as a partitioning element, and hence to appear first in π. The probability that this
first element is either S(i) or S(j) is exactly 2/(j − i + 1).

Analysis
Therefore p_ij = 2/(j − i + 1), and the expected number of comparisons is
  Σ_{i=1}^{n} Σ_{j>i} E[X_ij]
  = Σ_{i=1}^{n} Σ_{j>i} p_ij = Σ_{i=1}^{n} Σ_{j>i} 2/(j − i + 1)
  ≤ Σ_{i=1}^{n} Σ_{k=1}^{n−i+1} 2/k ≤ 2 Σ_{i=1}^{n} Σ_{k=1}^{n} 1/k   (the inner sum is the harmonic series H_n)
  ≈ 2n ln n.

RandQS vs. DetQS
The expected time of RandQS is O(n log n). A certain expected value, however, may not guarantee a reasonable probability of success. We could have, for example, the following distribution:
– with probability (log n)/n, the algorithm executes O(n²) operations;
– with probability 1 − (log n)/n, it executes O(n log n) operations.
For n = 100 this would mean that in about 7% of the executions the algorithm runs in O(n²) time. Sometimes we want to guarantee that the algorithm's performance will not be far from its average.
Objective: prove that with high probability RandQS works well.

High Probability Bound
The previous analysis only says that the expected running time is O(n log n); this leaves open the possibility of large deviations from this expected value.

RECALL: Quicksort chooses a pivot at random from the
input array, splits it into smaller and larger elements, and recurses on both subarrays.

Fix an element x in the input. x belongs to a sequence of subarrays, and x's contribution to the running time is proportional to the number of different subarrays it belongs to. Every time x is compared to a pivot, its current subarray is split and x goes into one of the resulting subarrays. We will show that with high probability, namely with probability 1 − n^(−c) for some positive constant c, x is compared to only O(log n) pivots.

GOOD and BAD Splits
We say that a pivot is good if each of the two subarrays has size at most 3/4 (equivalently, at least 1/4) of the size of the split subarray; otherwise it is bad. The probability of a good split is 1/2, and so is the probability of a bad split. Note that x can participate in at most log_{4/3} n good splits, since each good split shrinks its subarray by a factor of at least 4/3.

High Probability Bound
We upper bound the probability of seeing fewer than M/4 good splits among M splits, choosing M so that log_{4/3} n ≤ M/4. Let M = 32 ln n.
Since log_{4/3} n ≤ 8 ln n = M/4, this is a good choice. By a Chernoff-type bound, the probability of fewer than M/4 good splits among M splits is at most exp(−M/8) ≤ 1/n⁴. Hence the probability that a fixed element x participates in more than M = 32 ln n splits is less than 1/n⁴, and by a union bound over the n elements, the probability that any element participates in more than 32 ln n splits is less than 1/n³. Therefore, with probability at least 1 − 1/n³, the running time of quicksort is O(n log n).

Advantages of Randomized Algorithms
– For many problems, randomized algorithms run faster than the best known deterministic algorithms.
– Many randomized algorithms are simpler to describe and implement than deterministic algorithms of comparable performance.

Minimum Cut Problem
Input:
– a graph G = (V, E).
Output:
– a set S ⊆ V that minimizes e(S, S̄), i.e., the number of edges with one endpoint in S and the other in S̄.
(The slide shows an example cut with e(S, S̄) = 3.)

Notation
– d(v): degree of vertex v in the graph.
– N(v): neighborhood of v.

Minimum Cut Problem – Applications
– Network reliability: if a graph has a small
min cut, then it is poorly connected.
– Clustering: web pages are nodes and hyperlinks are edges; divide the graph into clusters with few connections between different clusters.

Edge Contraction
Given a graph G = (V, E) and an edge e = (u, v), the contraction of e produces the graph G/e = (V', E'), where
  V' = (V \ {u, v}) ∪ {uv},
  E' = (E \ { (x, y) ∈ E : x ∈ {u, v} or y ∈ {u, v} }) ∪ { (x, uv) : x ∉ {u, v} and ((x, u) ∈ E or (x, v) ∈ E) }.

Edge Contraction – Example
(The slide shows a graph G with vertices a, b, c, d, e, f, g, u, v; contracting e = (u, v) merges u and v into a single supervertex uv in G/e, keeping all their other edges.)

Lemma 1: the size of the minimum cut in G/e is greater than or equal to the size of the minimum cut in G.
Proof: we can associate each cut in G/e with a cut in G of the same size; simply replace in S the nodes obtained by contraction with the original nodes.

Lemma 2: if the minimum cut in a graph has size k, then d(v) ≥ k for every v ∈ V.
Proof: otherwise S = {v} would be a cut of size smaller than k.

Corollary 1: if the minimum cut in a graph has size k, then |E| ≥ k·n/2.
Proof: follows from Lemma 2 and from Σ_{v∈V} d(v) = 2|E|.

Randomized MinCut
– G0 ← G.
– For i = 1 to |V| − 2:
  – select an edge e_i of G_{i−1} uniformly at random;
  – set G_i = G_{i−1} / e_i.
– Return the vertices of one of the two resulting supervertices.

Probability
Lemma 3: let t_1, t_2, ..., t_k be a collection of events. Then
  Pr[ ∩_{i=1}^{k} t_i ] = Π_{i=1}^{k} Pr[ t_i | ∩_{j<i} t_j ].
Proof sketch: base case: Pr[t_1 ∩ t_2] = Pr[t_1] · Pr[t_2 | t_1]; then assume the identity holds for k and prove it for k + 1.

Theorem: let C = {e_1, e_2, ..., e_k} be a minimum cut in the graph G = (V, E). If no edge of C is chosen by RandMinCut, then the edges remaining in the final graph are exactly the edges of C.

Theorem – proof
Let A and B be the two supervertices obtained, and let A_C and B_C be the connected components obtained by removing C.
Since no edge of C was chosen, A ⊆ A_C or A ⊆ B_C, and B ⊆ A_C or B ⊆ B_C. Hence A = A_C and B = B_C. Indeed, assume there exist a ∈ A_C and b ∈ B_C with a, b ∈ A. Then there is a path in A between a and b that uses only edges chosen by the algorithm; but any path between a and b must use an edge of C, so an edge of C would have been chosen. Contradiction!

Analysis
Let C be the edge set of a minimum cut in G. We compute the probability that the algorithm never chooses an edge of C for contraction; if this happens, the algorithm returns a minimum cut.

Let lucky_i be the event that the algorithm does not pick an edge of C in the i-th iteration. Then
  Pr[find a minimum cut] ≥ Pr[find C] = Pr[ ∩_{i=1}^{n−2} lucky_i ] = Π_{i=1}^{n−2} Pr[ lucky_i | lucky_1, ..., lucky_{i−1} ].
From the relation between the cut size and the number of edges (Corollary 1),
  Pr[lucky_1] = 1 − |C|/|E| ≥ 1 − |C|/(n|C|/2) = (n − 2)/n.
In the second iteration the graph G_1 has n − 1 vertices and its minimum cut still has size |C|, so
  Pr[lucky_2 | lucky_1] ≥ 1 − |C|/((n − 1)|C|/2) = (n − 3)/(n − 1).
In general,
  Pr[ lucky_i | lucky_1, ..., lucky_{i−1} ] ≥ 1 − 2/(n − i + 1).
It follows that
  Pr[ ∩_{i=1}^{n−2} lucky_i ] ≥ Π_{i=1}^{n−2} (1 − 2/(n − i + 1)) = 2/(n(n − 1)).

So we find a min cut with probability at least 2/(n(n − 1)). For n = 100 this is about 0.02% (BAD). What can we do?

Randomized MinCut 2
Repeat the RandMinCut process several times and return the best cut found. Repeating K times, the probability of finding the minimum cut is at least
  1 − (1 − 2/(n(n − 1)))^K.

Analysis
Repeating n²/2 times, the success probability is at least (e − 1)/e ≈ 63%. K repetitions take O(K·m) time.

Complexity of Karger's fastest algorithm
– Running time: O(n² log n); space: O(n²).
– This algorithm finds a min cut with probability Ω(1/log n) [D. Karger and C. Stein, STOC 1993]. Any takers?

Minimum Cut Problem – Deterministic Algorithm
Complexity O(nm log(n²/m)), by J. Hao and J. B.
Orlin [1994] (based on network flows).

Two Types of Randomized Algorithms
Las Vegas algorithms
– Always produce the correct answer.
– The running time is a random variable.
– Example: RandQS always produces a sorted sequence, but its finishing time varies from execution to execution on a given instance.

Monte Carlo algorithms
– May produce incorrect answers.
– The error probability can be bounded.
– By running the algorithm several times, we can make the error probability as small as we wish.
– Example: Min-Cut.

MAX SAT
Input:
– n boolean variables x1, ..., xn;
– m clauses C1, ..., Cm;
– a weight w_j ≥ 0 for each clause C_j.
Objective: find a true/false assignment to the x_i that maximizes the total weight of the satisfied clauses.

MAX SAT – Randomized Algorithm
For i = 1, ..., n:
– if random(1/2) = 1 then x_i ← true, else x_i ← false.
That is, each variable is set to true or false with probability 1/2.

Theorem: the algorithm is a 1/2-approximation.
Proof: consider the random variable X_j with X_j = 1 if clause j is satisfied and X_j = 0 otherwise. The weight of the satisfied clauses is W = Σ_j w_j X_j, so
  E[W] = E[ Σ_j w_j X_j ] = Σ_j w_j E[X_j],
and E[X_j] = Pr[clause j is satisfied]. Let L_j be the number of literals in clause j. Clause j fails only if all of its literals evaluate to false: for C_j = (x1 ∨ x̄3 ∨ x5 ∨ x6) we would need x1 = 0, x3 = 1, x5 = 0 and x6 = 0, which happens with probability (1/2)⁴; in general the probability is (1/2)^{L_j}.
Hence the probability that clause j is satisfied is 1 − (1/2)^{L_j}, and since L_j ≥ 1,
  E[W] = Σ_j w_j (1 − (1/2)^{L_j}) ≥ (1/2) Σ_j w_j ≥ OPT/2,
a 0.5-approximation. (Note: Σ_j w_j is an upper bound on OPT.)

MAX SAT
What would happen if every clause had exactly three literals?
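The randomized algorithm above can be sketched in Python. The clause encoding, a clause as a list of (variable index, sign) pairs with sign True for a positive literal, is our own choice, not the slides':

```python
import random

def random_assignment(n):
    """Set each of the n variables to True with probability 1/2, independently."""
    return [random.random() < 0.5 for _ in range(n)]

def satisfied_weight(clauses, weights, assignment):
    """Total weight of the satisfied clauses under the given assignment.
    A clause is satisfied if at least one of its literals is true."""
    return sum(w for clause, w in zip(clauses, weights)
               if any(assignment[i] == sign for i, sign in clause))
```

For instance, C_j = (x1 ∨ x̄3) would be encoded (0-based) as [(0, True), (2, False)].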
In that case,
  E[W] = Σ_j w_j (1 − (1/2)³) = (7/8) Σ_j w_j,
a 7/8-approximation. (Håstad, 1997: if MAX E3SAT has a (7/8 + ε)-approximation for some ε > 0, then P = NP.)

MAX SAT: Derandomization
Example: C1 = (x1 ∨ x2 ∨ x3) with weight w1 and C2 = (x̄2 ∨ x3) with weight w2. Consider the complete binary tree of assignments to x1, x2, x3: each leaf of the tree corresponds to an assignment, and each leaf is associated with a weight (the sum of the weights of the clauses satisfied by the corresponding assignment). Before any variable is fixed, the expected satisfied weight is
  E[c(I)] = (7/8)·w1 + (3/4)·w2,
since a three-literal clause is satisfied with probability 7/8 and a two-literal clause with probability 3/4.
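The derandomization suggested by the tree of assignments is the method of conditional expectations: fix the variables one at a time, always choosing the value that does not decrease the conditional expectation of the satisfied weight. A minimal Python sketch under the same hypothetical clause encoding, a clause as a list of (variable index, sign) pairs:

```python
def cond_expectation(clauses, weights, fixed):
    """Expected satisfied weight when the variables in `fixed`
    (a dict index -> bool) are set and the rest are uniform random."""
    total = 0.0
    for clause, w in zip(clauses, weights):
        free = 0          # literals over still-unset variables
        sat = False       # already satisfied by the fixed variables?
        for i, sign in clause:
            if i in fixed:
                if fixed[i] == sign:
                    sat = True
                    break
            else:
                free += 1
        if sat:
            total += w
        elif free:
            total += w * (1 - 0.5 ** free)   # Pr[some free literal is true]
    return total

def derandomized_max_sat(n, clauses, weights):
    """Fix x_0, ..., x_{n-1} greedily so the conditional expectation
    never decreases; the final weight is therefore at least E[W]."""
    fixed = {}
    for i in range(n):
        e_true = cond_expectation(clauses, weights, {**fixed, i: True})
        e_false = cond_expectation(clauses, weights, {**fixed, i: False})
        fixed[i] = e_true >= e_false
    return [fixed[i] for i in range(n)]
```

Because each choice keeps the conditional expectation from decreasing, the final deterministic assignment satisfies weight at least E[W], e.g. at least (7/8)·w1 + (3/4)·w2 on the two-clause example above.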