Top-k Graph Summarization on Hierarchical DAGs

Uploaded On 2024-03-13




Presentation Transcript

1. Top-k Graph Summarization on Hierarchical DAGs
Xuliang Zhu, Xin Huang, Byron Choi, Jianliang Xu
Hong Kong Baptist University, Hong Kong, China

2. Outlines
Motivations; Related Work; kDAG-Problem; Algorithms; Experiments; Conclusions

3. Motivations
Hierarchical DAGs are everywhere (e.g. Disease Ontology, ImageNet, Wikipedia Categories).
Node weights:
  Disease Ontology: the occurrences of diseases
  ImageNet: the number of pictures
  Wikipedia Categories: the views under the categories
[Figure panels: Disease Ontology, Wikipedia Categories, ImageNet]

4. Motivations
Top-k summarization:
  Massive terminologies and complex structures
  Difficult to understand or visualize
An example (k=4):
  disease: general concept
  COVID-19: the most important disease
  cancer & flu: two categories of multiple diseases with large weights
[Figure panels: Disease Ontology, Top-4 Summarization]

5. Related Work
Aggregate method [X Jing et al. 2014]:
  Select top-k vertices with maximum aggregate value.
  Lacks diversity.
GVDO Model: Tree
[Figure panels: Disease Ontology, Our method, Aggregate method]

6. Related Work
Graph summarization:
  Graph stream summarization [X Gou et al. ICDE’19]
  RDF graph summarization [S Cebiric et al. PVLDB’15]
  Summarization for keyword search [G Fakas et al. SIGMOD’15]
Top-k diversification:
  Top-k maximum clique [L Yuan et al. VLDBJ’16]
  Top-k diversified subgraph [Z Yang et al. SIGMOD’16]

7. Outlines
Motivations; Related Work; kDAG-Problem; Algorithms; Experiments; Conclusions

8. kDAG-Problem
Summary Score [formula omitted in transcript]
Here, S is the selected set, V is the set of all vertices, and feq(u) is the weight of vertex u.
The score is determined by the weight and distance from the selected vertices.
kDAG-problem: Find the set [formula omitted in transcript]

9. kDAG-Problem Analysis
Diversity: The selected vertices are not similar.
Small Scale: The number of selected vertices is small.
Large Coverage: The selected vertices can reach many important vertices.
High Correlation: The representative correlation of selected vertices to an important vertex is high.
Our kDAG-problem is NP-hard (reduction from the 3-SAT problem).

10. Outlines
Motivations; Related Work; kDAG-Problem; Algorithms; Experiments; Conclusions

11-16. Greedy+
Marginal Gain [formula omitted in transcript]
Add the vertex with the maximum marginal gain into S greedily.
[Figure, animation frames across slides 11 to 16: running example DAG on vertices 1 to 7; panels: Running Example, Marginal Gain]
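
The greedy step described on these slides can be sketched as follows. This is a minimal illustration, not the authors' implementation: the summary-score formula is not preserved in this transcript, so `make_score` uses a hypothetical score in which each vertex u contributes weight[u] / (1 + d), with d the hop distance from the nearest selected vertex that can reach u. The toy DAG, its weights, and all function names are assumptions made for the example.

```python
from collections import deque


def distances_from(source, children):
    """Hop distance from `source` to every vertex reachable from it (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in children.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def make_score(children, weight):
    """Build a HYPOTHETICAL summary score (assumed form, not from the slides)."""
    cache = {}

    def score(selected):
        best = {}  # best[u] = highest 1 / (1 + distance) over selected vertices
        for s in selected:
            if s not in cache:
                cache[s] = distances_from(s, children)
            for u, d in cache[s].items():
                best[u] = max(best.get(u, 0.0), 1.0 / (1 + d))
        return sum(weight[u] * r for u, r in best.items())

    return score


def greedy_top_k(vertices, k, score):
    """Repeatedly add the vertex with the largest marginal gain."""
    selected = []
    for _ in range(min(k, len(vertices))):
        base = score(selected)
        best_v, best_gain = None, float("-inf")
        for v in vertices:
            if v in selected:
                continue
            gain = score(selected + [v]) - base  # marginal gain of v
            if gain > best_gain:
                best_v, best_gain = v, gain
        selected.append(best_v)
    return selected


# Toy hierarchy (assumed data, unrelated to the slide's running example).
children = {"root": ["x", "y"], "x": ["x1", "x2"], "y": ["y1"]}
weight = {"root": 1, "x": 2, "y": 1, "x1": 10, "x2": 8, "y1": 20}
summary_score = make_score(children, weight)
top2 = greedy_top_k(list(weight), 2, summary_score)
print(top2, summary_score(top2))
```

On this toy input the first pick is the heaviest leaf, and the second pick is the parent covering the two other heavy leaves rather than a second vertex near the first, which is the diversity effect the marginal gain is meant to produce.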

17. Greedy+
The summary score is monotone and submodular.
(1 - 1/e)-approximation guarantee
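
For reference, the guarantee on this slide matches the classical result for greedy maximization of a normalized, monotone, submodular set function under a cardinality constraint. The notation below is assumed for illustration and does not come from the slides: f is the summary score with f(∅) = 0, S_k is the set returned by the greedy method after k picks, and S* is an optimal set of size k.

```latex
\[
  f(S_k) \;\ge\; \left(1 - \frac{1}{e}\right) f(S^{*}),
  \qquad
  S^{*} \in \arg\max_{S \subseteq V,\; |S| = k} f(S),
\]
\[
  \text{where submodularity means}\quad
  f(A \cup \{v\}) - f(A) \;\ge\; f(B \cup \{v\}) - f(B)
  \quad \text{for all } A \subseteq B \subseteq V,\; v \in V \setminus B .
\]
```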

18-22. EXT-Greedy
Extract a subtree based on the Greedy+ answer.
Apply the optimal method [1] on the subtree.
[1] X Zhu et al. Ontology-based Graph Visualization for Summarized View. 2020. arXiv:2008.03053
[Figure, animation frames across slides 18 to 22 on the running example DAG (vertices 1 to 7): Running Example; Greedy+ answer; Subtree extraction; Optimal solution in subtree; Optimal solution in the original DAG]

23-27. k-PCGS
Compress DAG: Compress the DAG into a new graph by removing useless vertices.
Pruning Candidates: Prune the vertices from candidates whose upper bound is smaller than the top-k lower bound.
[Figure, animation frames across slides 23 to 27: Compressed DAG; Pruned Candidates (Black); Candidates (Blue); Lower Bound (Green); Upper Bound (Red)]
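
The candidate-pruning rule can be sketched as below. This is only an illustration under stated assumptions: "top-k lower bound" is read here as the k-th largest lower bound among all candidates, how the bounds themselves are computed is not covered, and the function name and the bound values are hypothetical.

```python
import heapq


def prune_candidates(lower, upper, k):
    """Keep only candidates whose upper bound reaches the k-th largest lower bound.

    lower, upper: dicts mapping each candidate vertex to its bound (assumed given).
    k: number of vertices to select (assumed k >= 1).
    """
    if len(lower) <= k:
        return set(lower)  # too few candidates to prune anything safely
    # Threshold = k-th largest lower bound (assumed reading of "top-k lower bound").
    threshold = heapq.nlargest(k, lower.values())[-1]
    # A vertex whose upper bound is below the threshold cannot be in the top k.
    return {v for v in lower if upper[v] >= threshold}


# Hypothetical bounds for four candidates.
lower = {"a": 22, "b": 0, "c": 12, "d": 0}
upper = {"a": 22, "b": 20, "c": 24, "d": 3}
kept = prune_candidates(lower, upper, k=2)
print(sorted(kept))
```

With k = 2 the threshold is 12, so candidate "d" (upper bound 3) is discarded while "b" survives because its upper bound of 20 still exceeds the threshold.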

28. Complexity Comparison
Method     | Time Complexity               | Space Complexity
Greedy+    | O(nkm)                        | O(m)
EXT-Greedy | O(nkm + nk^3 h)               | O(nk^2 h)
k-PCGS     | O(mh + n log n + k|C|m')      | O(m)
Here, n is the number of vertices, m is the number of edges, and h is the height of the DAG (length of the longest shortest path).
|C| is smaller than n and m' is smaller than m.

29. Outlines
Motivations; Related Work; kDAG-Problem; Algorithms; Experiments; Conclusions

30. Datasets
Dataset | Hierarchical DAG                            | Weight                          | n          | m
LATT    | Medical Entity Dictionary                   | Number of queries from patients | 4,226      | 8,190
LNUR    | Medical Entity Dictionary                   | Number of queries from nurses   | 4,226      | 8,190
ANIM    | "Anime" catalog in Wikipedia                | Number of views                 | 15,135     | 24,498
IMAGE   | ImageNet (WordNet)                          | Number of pictures              | 73,298     | 74,732
YAGO3   | Wikipedia + WordNet                         | Synthetic                       | 5,699,254  | 17,512,754
UCI     | Social Network (only for efficiency test)   | Synthetic                       | 58,790,783 | 92,208,196

31. Evaluation metrics
Summary Score
Average Distance
1-hop Weighted Coverage

32. Methods compared
FEQ [X Jing et al. 2014]: Select k nodes with the highest weight.
CAGG [X Jing et al. 2014]: Aggregate method with a contribution ratio.
LASP [G Fakas et al. SIGMOD’15]: Add the largest averaged score path greedily.
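
As a point of comparison, the FEQ baseline as described on this slide (select the k nodes with the highest weight) can be sketched in a few lines. The function name and the weights are hypothetical and are not taken from the datasets; the sketch only follows the one-line description above.

```python
import heapq


def feq_top_k(weight, k):
    """Return the k vertices with the highest weight, ignoring graph structure."""
    return heapq.nlargest(k, weight, key=weight.get)


# Hypothetical weights for a tiny disease hierarchy.
weight = {"disease": 3, "cancer": 40, "lung cancer": 38, "flu": 25}
picked = feq_top_k(weight, 2)
print(picked)
```

Because the hierarchy is ignored, the two picks here are a category and one of its own subcategories, so a weight-only ranking can return closely related vertices.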

33. Effectiveness Evaluations
[Figure: Average Distance of different models]
Our kDAG model outperforms state-of-the-art models, achieving the smallest average distance.

34. Efficiency Evaluations
[Figure: Efficiency evaluation of Greedy+, EXT-Greedy, k-PCGS and CAGG methods on ImageNet]
k-PCGS runs fastest among all methods, and EXT-Greedy is the slowest.

35. Efficiency Evaluations
[Figure: The size of candidates]
k-PCGS prunes at least 90% of the candidates in the worst case and more than 99% in most cases.

36. Outlines
Motivations; Related Work; kDAG-Problem; Algorithms; Experiments; Conclusions

37. Conclusions
We study the top-k graph summarization problem on large hierarchical DAGs.
We propose a novel model, the kDAG-problem, and prove its NP-hardness.
We develop three algorithms to tackle the kDAG-problem:
  Greedy+: A baseline greedy method.
  EXT-Greedy: A more effective method.
  k-PCGS: A more efficient method.

38. Thank you for watching!
Contact us: csxlzhu@comp.hkbu.edu.hk