Aggregating inconsistent information

Uploaded On 2024-03-13

Presentation Transcript

1. Aggregating inconsistent information
Omer Azoulay, 28/06/2023

2. Agenda
The RANK-AGGREGATION problem
The FAS-TOURNAMENT & weighted FAS-TOURNAMENT problems
Reduction from RANK-AGGREGATION to weighted FAS
Approximating FAS-TOURNAMENT
Approximating weighted FAS-TOURNAMENT
Improving RANK-AGGREGATION approximation bound

3. Rank aggregation
Assume we are given several complete ranked lists of the same elements.
We want to combine those rankings into one ranked list that best describes the preferences expressed in the lists.
The problem dates back to the 18th century, when it arose in the study of voting systems for elections.
It is also useful for databases and statistics, where many different data sources hold inconsistent data.

4. Example of aggregation

5. Rank aggregation
We need to define a criterion for the “best” aggregated list.
We will use Kemeny’s “sum of distances”.

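6. Kemeny optimal ranking
Given $n$ candidates and $k$ permutations $\pi_1,\dots,\pi_k$ of them, let $d(\pi,\sigma)$ denote the number of pairs of items that $\pi$ and $\sigma$ place in different order.
A Kemeny ranking is a permutation $\pi$ that minimizes the total pairwise “disagreement” $\sum_{i=1}^{k} d(\pi,\pi_i)$.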
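The distance and the Kemeny objective above can be sketched in a few lines. This is a minimal illustration, not code from the talk; the function names `kendall_tau` and `kemeny_cost` and the three-voter example are assumptions made here.

```python
from itertools import combinations

def kendall_tau(pi, sigma):
    """Number of item pairs that the two rankings order differently."""
    pos_pi = {x: r for r, x in enumerate(pi)}
    pos_sigma = {x: r for r, x in enumerate(sigma)}
    return sum(
        1
        for a, b in combinations(pi, 2)
        if (pos_pi[a] < pos_pi[b]) != (pos_sigma[a] < pos_sigma[b])
    )

def kemeny_cost(pi, rankings):
    """Kemeny objective: sum of distances from pi to every input ranking."""
    return sum(kendall_tau(pi, r) for r in rankings)

# Hypothetical input: three voters ranking the items a, b, c.
votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
print(kemeny_cost(["a", "b", "c"], votes))  # prints 2
```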

7. Kemeny ranking animation

8. Kemeny ranking animation 

9. Example of aggregation
#Inversions =

10. Kemeny-Young method
This method is an electoral system that uses ranked voting.
Finding Kemeny’s optimal ranking is NP-hard.
It is NP-hard even for only 4 input lists!
It can be computed by brute force, checking all $n!$ possible permutations, which is feasible only for small $n$.
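The brute-force approach mentioned above could look like the following sketch. It tries all $n!$ permutations, so it is only usable for a handful of items. The names `kendall_tau` and `kemeny_brute_force` and the example votes are illustrative assumptions.

```python
from itertools import combinations, permutations

def kendall_tau(pi, sigma):
    """Number of item pairs that the two rankings order differently."""
    pos_pi = {x: r for r, x in enumerate(pi)}
    pos_sigma = {x: r for r, x in enumerate(sigma)}
    return sum(
        1
        for a, b in combinations(pi, 2)
        if (pos_pi[a] < pos_pi[b]) != (pos_sigma[a] < pos_sigma[b])
    )

def kemeny_brute_force(rankings):
    """Return (best ranking, its cost) by checking every permutation."""
    best = min(
        permutations(rankings[0]),
        key=lambda pi: sum(kendall_tau(pi, r) for r in rankings),
    )
    return list(best), sum(kendall_tau(best, r) for r in rankings)

# Hypothetical input: three voters ranking the items a, b, c.
votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
best, cost = kemeny_brute_force(votes)
print(best, cost)
```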

11. Tournaments
A tournament graph is a directed graph obtained by assigning a direction to every edge of an undirected complete graph.
$G=(V,A)$ s.t. exactly one of $(i,j)$ or $(j,i)$ is in $A$ for every pair $i \neq j$ in $V$.

12. Feedback arc set on Tournaments
A feedback arc set in a directed graph is a subset of edges whose removal leaves an acyclic subgraph.
FAS-TOURNAMENT is the problem of finding a minimum feedback arc set in a tournament graph.
RANK-AGGREGATION and FAS-TOURNAMENT are closely related, but FAS-TOURNAMENT is useful on its own, e.g. for finding a ranking in a round-robin sports tournament whose results are possibly inconsistent.

13. Example of tournament and FAS
Beach volleyball at the 2016 Summer Olympics.
Arrows are directed from loser to winner.
USA beat QAT; this match is an “upset”.
Final ranking: QAT, ESP, AUT, USA

14. The plan
We will show how FAS-TOURNAMENT can be used to approximate RANK-AGGREGATION.
We will reduce RANK-AGGREGATION to weighted FAS.
Then we will approximate weighted FAS using the unweighted FAS algorithm.

15. Reminder: Topological order

16. Feedback arc set on Tournaments
Any linear ordering $\pi$ of $V$ induces a feedback arc set: the set of backward edges of $\pi$, $\{(i,j)\in A : j <_\pi i\}$.
The size of this feedback arc set equals the number of backward edges.
We can find a minimum feedback arc set by finding the linear ordering that induces the fewest backward edges.
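As a small illustration of the backward-edge definition, the sketch below lists the backward edges that an ordering induces on a tournament stored as a set of arcs. The function name `backward_edges` and the 3-cycle example are assumptions for illustration.

```python
def backward_edges(arcs, order):
    """Arcs (i, j) whose head j is placed before their tail i in the ordering."""
    pos = {v: r for r, v in enumerate(order)}
    return {(i, j) for (i, j) in arcs if pos[j] < pos[i]}

# Hypothetical tournament: the directed 3-cycle 0 -> 1 -> 2 -> 0.
cycle = {(0, 1), (1, 2), (2, 0)}
print(backward_edges(cycle, [0, 1, 2]))  # prints {(2, 0)}
```

Removing the returned arcs leaves only arcs that point forward in the ordering, so the remaining graph is acyclic.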

17. Weighted FAS
Now there are edges in both directions.
Consider non-negative weights $w_{ij}$ and $w_{ji}$ on the edges of $G$.
The score of a feedback arc set is now the sum of its weights.
In weighted FAS-TOURNAMENT we look for a permutation $\pi$ that minimizes $\sum_{i <_\pi j} w_{ji}$.

18. RANK-AGGREGATION and FAS-TOURNAMENT
RANK-AGGREGATION is a special case of weighted FAS-TOURNAMENT where the weight $w_{ij}$ is the fraction of rankings in which $i$ precedes $j$.
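A minimal sketch of this reduction: build $w_{ij}$ as the fraction of input rankings that place $i$ before $j$, then score an ordering by the weighted FAS objective $\sum_{i <_\pi j} w_{ji}$. The names `pairwise_weights` and `weighted_cost` and the example votes are assumptions; exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import permutations

def pairwise_weights(rankings):
    """w[(i, j)] = fraction of rankings in which i precedes j."""
    k = len(rankings)
    w = {(i, j): Fraction(0) for i, j in permutations(rankings[0], 2)}
    for r in rankings:
        pos = {x: p for p, x in enumerate(r)}
        for (i, j) in w:
            if pos[i] < pos[j]:
                w[(i, j)] += Fraction(1, k)
    return w

def weighted_cost(order, w):
    """Sum of w[(i, j)] over edges (i, j) that point backward in the ordering."""
    pos = {v: r for r, v in enumerate(order)}
    return sum(wt for (i, j), wt in w.items() if pos[j] < pos[i])

# Hypothetical input: three voters ranking the items a, b, c.
votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
w = pairwise_weights(votes)
print(w[("a", "b")], weighted_cost(["a", "b", "c"], w))  # prints 2/3 2/3
```

Multiplying the weighted cost by the number of voters gives back the Kemeny sum of distances, so both objectives are minimized by the same orderings.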

19. Agenda
The RANK-AGGREGATION problem ✔️
The FAS-TOURNAMENT & weighted FAS-TOURNAMENT problems ✔️
Reduction from RANK-AGGREGATION to weighted FAS ✔️
Approximating FAS-TOURNAMENT
Approximating weighted FAS-TOURNAMENT
Improving RANK-AGGREGATION approximation bound

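20. Approximating FAS-TOURNAMENT
The KwikSort algorithm!
KwikSort($G=(V,A)$):
  If $V=\emptyset$, return the empty list
  Pick a random pivot $i \in V$
  Set $V_L = V_R = \emptyset$
  For all $j \in V \setminus \{i\}$:
    If $(j,i) \in A$ then add $j$ to $V_L$
    Else // $(i,j) \in A$
      add $j$ to $V_R$
  Let $G_L$ = tournament induced by $V_L$
  Let $G_R$ = tournament induced by $V_R$
  Return KwikSort($G_L$), $i$, KwikSort($G_R$)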
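A direct Python sketch of KwikSort, assuming the tournament is given as a list of vertices plus a set of arcs `(u, v)` meaning $u \to v$. The function name `kwiksort` and the optional `rng` parameter (used only to make runs reproducible) are choices made for this illustration.

```python
import random

def kwiksort(vertices, arcs, rng=random):
    """Order the vertices of a tournament by random pivoting."""
    if not vertices:
        return []
    pivot = rng.choice(vertices)
    # j goes left of the pivot when the arc points j -> pivot, right otherwise.
    left = [j for j in vertices if j != pivot and (j, pivot) in arcs]
    right = [j for j in vertices if j != pivot and (j, pivot) not in arcs]
    return kwiksort(left, arcs, rng) + [pivot] + kwiksort(right, arcs, rng)

# Hypothetical acyclic tournament on 0..4: every arc goes from smaller to larger.
transitive = {(i, j) for i in range(5) for j in range(5) if i < j}
print(kwiksort(list(range(5)), transitive, random.Random(0)))  # prints [0, 1, 2, 3, 4]
```

On an acyclic tournament every pivot choice gives the same ordering, which is why the output above does not depend on the seed.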

21. Figure: KwikSort pivoting (vertices $i$, $j$; pivot $k$)

22. Figure: pivot

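23. Lemma 1: In KwikSort, an edge $(i,j)$ becomes a backward edge iff there is a recursive call containing $i$ and $j$ whose pivot is a vertex $k$ such that $i$, $j$ and $k$ form a directed triangle in $G$. (Figure: vertices $j$, $i$, $k$.)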

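24. Lemma 1: In KwikSort, an edge $(i,j)$ becomes a backward edge iff there is a recursive call containing $i$ and $j$ whose pivot is a vertex $k$ such that $i$, $j$ and $k$ form a directed triangle in $G$.
Proof: Since $(j,k) \in A$ and $(k,i) \in A$, when we pivot on $k$, $j$ joins $V_L$ and $i$ joins $V_R$.
Thus, in the final permutation, $j$ comes before $i$, and $(i,j)$ is a backward edge.
Conversely, if $i$ and $j$ are separated by a pivot that does not form a directed triangle with them, or if one of them is itself the pivot, then $(i,j)$ ends up as a forward edge.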

25. Let $T$ be the set of all directed triangles in $G$.
For each $t \in T$, we define an event $A_t$: one of the vertices of $t$ is chosen as pivot while all three vertices of $t$ are still in the same recursive call.

26. If $A_t$ happens, we charge the triangle $t$ one cost unit.
In the following recursive calls, the vertices of the triangle are split between the left and right calls, so it won’t be charged again!

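27. From the previous lemma, the cost of KwikSort (its number of backward edges) equals the number of triangles $t \in T$ for which $A_t$ happens.
If $A_t$ happens, we charge the triangle $t$ one cost unit.
In the following recursive calls, the vertices of the triangle are split between the left and right calls, so it can’t be charged again!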

28. Denote $p_t = \Pr[A_t]$ and let $C^{KS}$ be the cost of KwikSort.
Then $E[C^{KS}] = \sum_{t \in T} p_t$.

29. Any feedback arc set contains at least one edge from every triangle $t \in T$.
If the triangles in $T$ were edge-disjoint, $|T|$ would obviously be a lower bound for the optimal cost $C^{OPT}$.
But the triangles can overlap, so we need a new method to find a lower bound.
Linear programming to the rescue!

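30. Linear programming: primal & dual
The primal, for a maximization problem, is: maximize $c^\top x$ subject to $Ax \le b$, $x \ge 0$.
The dual problem is: minimize $b^\top y$ subject to $A^\top y \ge c$, $y \ge 0$.
Construction:
Each primal constraint becomes a dual variable.
Each primal variable becomes a dual constraint.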

31. Reminder: LP’s duality theorem
Weak duality theorem: for every feasible solution $x$ of the primal and every feasible solution $y$ of the dual, $c^\top x \le b^\top y$.
That is, the objective value of every feasible solution of the dual is an upper bound on the objective value of the primal.
Strong duality theorem: if either problem has an optimal solution, the other has one too, and the objective values are equal: $c^\top x^* = b^\top y^*$.

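32. Consider the following minimization problem:
minimize $\sum_{e \in A} x_e$ subject to $x_{e_1} + x_{e_2} + x_{e_3} \ge 1$ for every triangle $t = (e_1,e_2,e_3) \in T$, and $x_e \ge 0$ for every $e \in A$.
Clearly, the minimum feedback arc set is a feasible solution, so any optimal solution to this LP lower bounds $C^{OPT}$.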

33. Note that a feedback arc set also fulfills the analogous condition for any directed cycle $c$: $\sum_{e \in c} x_e \ge 1$.

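34. Now we consider the dual to this problem:
maximize $\sum_{t \in T} \beta_t$ subject to $\sum_{t \ni e} \beta_t \le 1$ for every $e \in A$, and $\beta_t \ge 0$ for every $t \in T$.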

35. Consider any non-negative weights $\beta_t$ on the triangles $t \in T$, s.t. for every $e \in A$, $\sum_{t \ni e} \beta_t \le 1$.
Such weights are a feasible solution to the dual problem (assign $0$ to any cycle that isn’t a triangle).
Now we will construct such a weight assignment using the probabilities $p_t = \Pr[A_t]$.

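36. Denote by $B_e$ the event that $e$ becomes a backward edge.
Assuming $A_t$, each of the $3$ vertices of $t$ is equally likely to be the pivot (with probability $1/3$).
Thus, an edge $e \in t$ becomes a backward edge with probability $1/3$ conditioned on $A_t$.
Summing up: $\Pr[B_e \wedge A_t] = \Pr[B_e \mid A_t] \cdot \Pr[A_t] = p_t/3$.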

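37. For triangles $t, t'$ sharing an edge $e$, the events $B_e \wedge A_t$ and $B_e \wedge A_{t'}$ are disjoint.
For every $e \in A$: $\sum_{t \ni e} p_t/3 \le 1$.
And so $\beta_t = p_t/3$ are valid weights. From the weak duality theorem: $C^{OPT} \ge \sum_{t \in T} p_t/3 = E[C^{KS}]/3$.
(Figure: two triangles on the vertices $i$, $k$, $h$, $j$ sharing an edge.)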
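The bound $E[C^{KS}] \le 3\,C^{OPT}$ can be checked exactly on a small instance. The sketch below averages over every pivot choice instead of sampling, and finds the optimum by brute force. The helper names and the 5-vertex rotational tournament ($i \to i+1$, $i \to i+2$ modulo 5) are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import permutations

def backward_count(arcs, order):
    """Number of arcs (i, j) with j placed before i."""
    pos = {v: r for r, v in enumerate(order)}
    return sum(1 for (i, j) in arcs if pos[j] < pos[i])

def optimal_cost(vertices, arcs):
    """Minimum feedback arc set size, by trying every ordering."""
    return min(backward_count(arcs, p) for p in permutations(vertices))

def expected_kwiksort_cost(vertices, arcs):
    """Exact expected number of backward edges produced by KwikSort."""
    vertices = tuple(vertices)
    if len(vertices) < 3:
        return Fraction(0)  # no directed triangle, hence no backward edge
    total = Fraction(0)
    for pivot in vertices:
        left = tuple(j for j in vertices if (j, pivot) in arcs)
        right = tuple(j for j in vertices if (pivot, j) in arcs)
        # Arcs from the right part to the left part become backward edges.
        crossing = sum(1 for l in left for r in right if (r, l) in arcs)
        total += crossing
        total += expected_kwiksort_cost(left, arcs)
        total += expected_kwiksort_cost(right, arcs)
    return total / len(vertices)

n = 5
vertices = range(n)
arcs = {(i, (i + d) % n) for i in range(n) for d in (1, 2)}
expected = expected_kwiksort_cost(vertices, arcs)
opt = optimal_cost(vertices, arcs)
print(expected, opt, expected <= 3 * opt)
```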

39. Minimum feedback arc set in weighted tournaments
Let’s use KwikSort!

40. Minimum feedback arc set in weighted tournaments
Let’s use KwikSort!
Weighted tournaments aren’t really tournaments.
There are edges in both directions.
Maybe we can decide wisely which edge to keep?

41. 💡 For every pair of opposite edges $(i,j)$, $(j,i)$, one of them will be a backward edge!

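44. Minimum feedback arc set in weighted tournaments
Given a weighted tournament $G=(V,w)$, we construct a majority tournament by taking the heavier edge of each pair.
We construct an unweighted graph $G_w=(V,A_w)$ s.t. $(i,j) \in A_w$ if $w_{ij} > w_{ji}$. If $w_{ij} = w_{ji}$ we choose one arbitrarily.
For any $e=(i,j) \in A_w$, we define $\bar{e}$ to be the reverse edge $(j,i)$.
(Figure: vertices $j$ and $i$ joined by opposite edges of weights 0.7 and 0.3.)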
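A minimal sketch of the majority-tournament construction, assuming the weights are stored in a dict keyed by ordered pairs. The name `majority_tournament` is illustrative, and the tie rule used here (keep whichever direction is seen first) is just one arbitrary choice.

```python
def majority_tournament(w):
    """Keep the heavier arc of every opposite pair; break ties arbitrarily."""
    arcs = set()
    for (i, j), wij in w.items():
        wji = w[(j, i)]
        if wij > wji or (wij == wji and (j, i) not in arcs):
            arcs.add((i, j))
    return arcs

# Hypothetical weights on three vertices; each opposite pair sums to 1.
w = {(0, 1): 0.7, (1, 0): 0.3, (1, 2): 0.6, (2, 1): 0.4, (0, 2): 0.9, (2, 0): 0.1}
print(sorted(majority_tournament(w)))  # prints [(0, 1), (0, 2), (1, 2)]
```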

45. Majority Tournament

46. From KwikSort on the majority tournament $G_w$ to a FAS for the weighted tournament $G$

47. Minimum feedback arc set in weighted tournaments
(Figure: vertices C, B, A.)

48. KwikSort for weighted FAS
We adapt KwikSort to the weighted tournament case.
First, we construct the majority tournament $G_w$.
Then we run KwikSort on this unweighted tournament.
This returns some linear ordering $\pi$.
The feedback arc set consists of: all the backward edges induced by $\pi$ on $G_w$, plus the reverse of all the forward edges induced by $\pi$ on $G_w$.
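The weighted variant can be sketched by pivoting on majority arcs and then scoring the resulting ordering with the original weights. The function names are illustrative, and ties are broken by comparing vertex labels, which assumes the labels are orderable.

```python
import random

def is_majority_arc(i, j, w):
    """True if (i, j) is kept in the majority tournament (ties: smaller label first)."""
    return w[(i, j)] > w[(j, i)] or (w[(i, j)] == w[(j, i)] and i < j)

def kwiksort_weighted(vertices, w, rng=random):
    """KwikSort run on the majority tournament induced by the weights w."""
    if not vertices:
        return []
    pivot = rng.choice(vertices)
    left = [j for j in vertices if j != pivot and is_majority_arc(j, pivot, w)]
    right = [j for j in vertices if j != pivot and not is_majority_arc(j, pivot, w)]
    return kwiksort_weighted(left, w, rng) + [pivot] + kwiksort_weighted(right, w, rng)

def weighted_cost(order, w):
    """Total weight of the edges that point backward in the ordering."""
    pos = {v: r for r, v in enumerate(order)}
    return sum(wt for (i, j), wt in w.items() if pos[j] < pos[i])

# Hypothetical weights on three vertices; each opposite pair sums to 1.
w = {(0, 1): 0.7, (1, 0): 0.3, (1, 2): 0.6, (2, 1): 0.4, (0, 2): 0.9, (2, 0): 0.1}
order = kwiksort_weighted([0, 1, 2], w, random.Random(0))
print(order, weighted_cost(order, w))
```

For every pair, exactly one of the two opposite edges points backward, so the cost always includes either the majority or the minority weight of each pair.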

49. Analyze KwikSort on majority tournament
Fix some optimal solution $\pi^*$ of the weighted tournament $G$.
Denote by $c^*(e)$, for any $e \in A_w$ (the majority tournament), the cost incurred by $e$ and $\bar{e}$ in the optimal solution.
If $e$ is a backward edge in that solution, then the cost is $c^*(e) = w(e)$.
If $e$ isn’t, then $\bar{e}$ is, so $c^*(e) = w(\bar{e})$.

50. 💡 For every pair of opposite edges $(i,j)$, $(j,i)$, one of them will be a backward edge!

51. makes sense     

52. Analyze KwikSort on majority tournament
Fix some optimal solution $\pi^*$, so $C^{OPT} = \sum_{e \in A_w} c^*(e)$.
Here $c^*(e)$ reflects the cost of $e$ in the linear ordering given by $\pi^*$.
If $e$ is a backward edge in $\pi^*$, we take its weight $w(e)$. Otherwise, the reverse edge $\bar{e}$ is a backward edge, and we take $w(\bar{e})$.

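53. Analyze KwikSort on majority tournament
We define $T$ to be the set of directed triangles in the majority tournament $G_w$.
For $t = (e_1,e_2,e_3) \in T$ we denote: $w(t) = w(e_1)+w(e_2)+w(e_3)$ and $c^*(t) = c^*(e_1)+c^*(e_2)+c^*(e_3)$.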

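54. Theorem 5
For a weighted FAS-TOURNAMENT instance $G$, if there’s $\gamma > 0$ s.t. $w(t) \le \gamma c^*(t)$ for all $t \in T$, then $E[C^{KS}] \le \gamma C^{OPT}$.
Intuition: any triangle $t$ will incur a cost of at most $w(t)$ on its edges. $c^*(t)$ is its cost in some optimal solution, so if $w(t) \le \gamma c^*(t)$ then we can’t incur too much over the optimal cost.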

55. Theorem 5 - proof
We call $e \in A_w$ heavily charged if it becomes a backward edge (being charged the majority weight $w(e)$) and lightly charged otherwise (charged the minority weight $w(\bar{e})$).

56. Analyze KwikSort on majority tournament
As before, for $e$ to be heavily charged by KwikSort, there needs to be a directed triangle $t \ni e$ with all three vertices in one recursive call and the pivot on the vertex of $t$ opposite to $e$.
Conditioned on $A_t$, every edge of $t$ is heavily charged with equal probability $1/3$, so $\Pr[B_e \wedge A_t] = p_t/3$.
An edge $e$ is lightly charged with probability $1 - \sum_{t \ni e} p_t/3$.

57.  

58.  

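59. Analyze KwikSort on majority tournament
From linearity of expectation: $E[C^{KS}] = \sum_{t \in T} \frac{p_t}{3} w(t) + \sum_{e \in A_w} \left(1 - \sum_{t \ni e} \frac{p_t}{3}\right) w(\bar{e})$.
And for the optimal solution we can write: $C^{OPT} = \sum_{e \in A_w} c^*(e) = \sum_{t \in T} \frac{p_t}{3} c^*(t) + \sum_{e \in A_w} \left(1 - \sum_{t \ni e} \frac{p_t}{3}\right) c^*(e)$.
Since $w(t) \le \gamma c^*(t)$, the first sum of $E[C^{KS}]$ is at most $\gamma$ times the first sum of $C^{OPT}$.
Since $w(\bar{e}) \le c^*(e)$, the second sum of $E[C^{KS}]$ is at most the second sum of $C^{OPT}$.
Together, for $\gamma \ge 1$, $E[C^{KS}] \le \gamma C^{OPT}$.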

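60. Lemma 4
If the weights satisfy the probability constraints $w_{ij} + w_{ji} = 1$, then $w(t) \le 5 c^*(t)$.
If the weights satisfy the triangle inequality constraints $w_{ij} \le w_{ik} + w_{kj}$, then $w(t) \le 2 c^*(t)$.
(Figure: triangle on the vertices $k$, $i$, $j$.)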

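61. Probability constraint case
We will assume $w_{ij} + w_{ji} = 1$.
For any triangle $t = (e_1,e_2,e_3) \in T$, WLOG: $w(e_1) \le w(e_2) \le w(e_3)$.
In the majority tournament, for every edge, $\tfrac{1}{2} \le w(e) \le 1$.
Then $w(t) = w(e_1) + w(e_2) + w(e_3)$.
Since at least one edge of $t$ would be a backward edge, $c^*(t) \ge w(e_1) + (1 - w(e_2)) + (1 - w(e_3))$, and so $5c^*(t) - w(t) \ge 10 + 4w(e_1) - 6w(e_2) - 6w(e_3) \ge 10 + 2 - 12 = 0$.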

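62. Triangle inequality case
Now we assume the triangle inequality: $w_{ij} \le w_{ik} + w_{kj}$.
By the triangle inequality, for a triangle $t = (e_1,e_2,e_3) \in T$: $w(e_1) \le w(\bar{e}_2) + w(\bar{e}_3)$, and likewise for $e_2$ and $e_3$.
Then $w(t) \le 2\left(w(\bar{e}_1) + w(\bar{e}_2) + w(\bar{e}_3)\right)$.
And since $c^*(e) \ge w(\bar{e})$, we get: $w(t) \le 2 c^*(t)$.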

63. Summing up
Theorem 5: Running the KwikSort algorithm on the majority tournament gives an expected 5-approximation in the probability constraints case and an expected 2-approximation in the triangle inequality case.

64. Back to RANK-AGGREGATION
Let $\pi_1,\dots,\pi_k$ be a rank aggregation instance.
Every $\pi_i$ can be encoded as a FAS-TOURNAMENT instance with binary weights.
(Figure: the rankings 123 and 132 with their binary edge weights.)

65. Back to RANK-AGGREGATION
We consider the reduction again.
The weight $w_{ij}$ is the fraction of input rankings that place $i$ before $j$.
That is a convex combination of the acyclic tournaments from before!
Since the triangle inequality holds for each of them, it also holds for their average.
From the previous theorem, KwikSort achieves a 2-factor approximation!
But... can we do better?

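66. PICK-A-PERM
Consider this novel algorithm: output one of the input permutations $\pi_1,\dots,\pi_k$, chosen uniformly at random.
We will show that combining PICK-A-PERM and weighted KwikSort gives better results.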

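67. PICK-A-PERM
Consider this novel algorithm: output one of the input permutations $\pi_1,\dots,\pi_k$, chosen uniformly at random.
We will show that combining PICK-A-PERM and weighted KwikSort produces an even better approximation.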

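68. PICK-A-PERM
We claim that: $E[C^{PAP}] = \sum_{e \in A_w} z(e)$ with $z(e) = 2w(e)(1-w(e))$,
where $w(e)$ is the weight of $e$ in the reduction from RANK-AGGREGATION to weighted FAS-TOURNAMENT and $A_w$ is the edge set of the majority tournament.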

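69. PICK-A-PERM
In our reduction from RANK-AGGREGATION to weighted FAS-TOURNAMENT, $w(e)$ is the fraction of rankings that agree with the edge $e$.
Picking a ranking at random, an edge $e$ becomes a backward edge w.p. $1 - w(e)$, and its reverse $\bar{e}$ w.p. $w(e)$.
The expected cost per pair of opposite edges = $w(e)(1-w(e)) + (1-w(e))w(e) = 2w(e)(1-w(e))$.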
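The per-pair formula can be checked against a direct average over the input rankings. This sketch uses exact fractions; the helper names and the three-voter example are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations, permutations

def pairwise_weights(rankings):
    """w[(i, j)] = fraction of rankings in which i precedes j."""
    k = len(rankings)
    w = {(i, j): Fraction(0) for i, j in permutations(rankings[0], 2)}
    for r in rankings:
        pos = {x: p for p, x in enumerate(r)}
        for (i, j) in w:
            if pos[i] < pos[j]:
                w[(i, j)] += Fraction(1, k)
    return w

def weighted_cost(order, w):
    """Total weight of the edges that point backward in the ordering."""
    pos = {v: r for r, v in enumerate(order)}
    return sum(wt for (i, j), wt in w.items() if pos[j] < pos[i])

def pick_a_perm_expected_cost(rankings):
    """Average cost of outputting a uniformly random input ranking."""
    w = pairwise_weights(rankings)
    return sum(weighted_cost(r, w) for r in rankings) / len(rankings)

def pick_a_perm_formula(rankings):
    """Sum over unordered pairs of 2 * w_ij * w_ji."""
    w = pairwise_weights(rankings)
    return sum(2 * w[(i, j)] * w[(j, i)] for i, j in combinations(rankings[0], 2))

# Hypothetical input: three voters ranking the items a, b, c.
votes = [["a", "b", "c"], ["a", "c", "b"], ["b", "a", "c"]]
print(pick_a_perm_expected_cost(votes), pick_a_perm_formula(votes))  # prints 8/9 8/9
```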

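70. The best of both worlds
Theorem 6: If there are $\beta \in [0,1]$ and $\gamma > 0$ s.t.:
$\beta w(t) + (1-\beta) z(t) \le \gamma c^*(t)$ for all $t \in T$, and
$\beta w(\bar{e}) + (1-\beta) z(e) \le \gamma c^*(e)$ for all $e \in A_w$,
then the best of KwikSort and PAP is a $\gamma$-approximation for RANK-AGGREGATION.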

71. As before, we rearrange by and  

72. Now we look at the expected cost of  

73. Bounding combination of heavily charged triangles 

74. Bounding combination of the lightly charged 

75. Conclusion:   

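76. $11/7$ approximation
We take $\beta = 3/7$ and $\gamma = 11/7$.
Since $z(e) = 2w(e)(1-w(e)) \le 2w(\bar{e})$ and $c^*(e) \ge w(\bar{e})$, the second inequality is obtained easily: $\beta w(\bar{e}) + (1-\beta) z(e) \le (2-\beta) w(\bar{e}) = \tfrac{11}{7} w(\bar{e}) \le \tfrac{11}{7} c^*(e)$.
We also need to prove the first inequality:

77. $11/7$ approximation
For triangle

78. $11/7$ approximation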

79. Summary
We saw approximation factors for FAS-TOURNAMENT and weighted FAS-TOURNAMENT using fractional packing.
We saw a reduction from RANK-AGGREGATION to weighted FAS-TOURNAMENT.
We saw that combining FAS-TOURNAMENT and PICK-A-PERM achieves an even better factor!

80. Thank you

81. Correlation Clustering example
(Figure: three nodes with edges labeled +, +, -.)

82. Correlation Clustering
We have a complete undirected graph.
The edge between each two nodes is labeled “+” or “-”.
Our goal is to find a clustering that minimizes the number of “-” edges within clusters plus the number of “+” edges crossing clusters.
In weighted correlation clustering, each pair $i,j$ has weights $w^+_{ij}$ and $w^-_{ij}$.
The cost of a clustering is the sum of $w^+_{ij}$ over pairs $i,j$ in different clusters, plus the sum of $w^-_{ij}$ over pairs $i,j$ in the same cluster.
The unweighted problem can be encoded as a weighted case.
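The cost function above can be sketched as follows, assuming the weights are stored in dicts keyed by pairs `(i, j)` with `i < j`. The name `clustering_cost` and the three-node example (two “+” edges and one “-” edge, encoded with 0/1 weights) are illustrative assumptions.

```python
from itertools import combinations

def clustering_cost(clusters, w_plus, w_minus):
    """w_plus over pairs split across clusters plus w_minus over pairs kept together."""
    label = {v: c for c, cluster in enumerate(clusters) for v in cluster}
    cost = 0
    for i, j in combinations(sorted(label), 2):
        if label[i] == label[j]:
            cost += w_minus[(i, j)]
        else:
            cost += w_plus[(i, j)]
    return cost

# Unweighted instance encoded as weights: (0,1) and (0,2) are "+", (1,2) is "-".
w_plus = {(0, 1): 1, (0, 2): 1, (1, 2): 0}
w_minus = {pair: 1 - value for pair, value in w_plus.items()}
print(clustering_cost([[0, 1, 2]], w_plus, w_minus))  # prints 1
```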

83. KwikCluster 
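A minimal sketch of KwikCluster as it is usually described: pick a random pivot, put it in a cluster with every remaining node joined to it by a “+” edge, then repeat on the nodes left over. The function name, the representation of “+” edges as a set of `frozenset` pairs, and the example graph are assumptions made for illustration.

```python
import random

def kwikcluster(vertices, plus, rng=random):
    """Cluster by repeatedly grouping a random pivot with its "+" neighbours."""
    remaining = list(vertices)
    clusters = []
    while remaining:
        pivot = rng.choice(remaining)
        cluster = [pivot] + [
            v for v in remaining if v != pivot and frozenset((pivot, v)) in plus
        ]
        clusters.append(cluster)
        remaining = [v for v in remaining if v not in cluster]
    return clusters

# Hypothetical labels: "+" inside {0, 1, 2} and inside {3, 4}, "-" everywhere else.
plus = {frozenset(p) for p in [(0, 1), (0, 2), (1, 2), (3, 4)]}
result = kwikcluster(range(5), plus, random.Random(0))
print(sorted(sorted(c) for c in result))  # prints [[0, 1, 2], [3, 4]]
```

When the “+” edges already form disjoint cliques, as in this example, every pivot choice recovers the same clustering.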

84. KwikCluster
KwikCluster is an expected 3-approximation.
The proof is analogous to that of KwikSort.
KwikCluster is a 5-approximation on weighted Correlation-Clustering.

85. The best of KwikSort and Pick-A-Perm is an expected $6/5$-approximation for RANK-AGGREGATION when there are $k=3$ voters.

86. KwikCluster is an expected 3-approximation

87. KwikCluster is a 5-approximation on weighted Correlation-Clustering