
Managing Large Graphs on Multi-Cores With Graph Awareness

Uploaded On 2023-07-27




Presentation Transcript

1. Managing Large Graphs on Multi-Cores With Graph Awareness
   Vijayan, Ming, Xuetian, Frank, Lidong, Maya
   Microsoft Research

2. Motivation
   - Tremendous increase in graph data and applications
     - New class of graph applications that require real-time responses
     - Even batch-processed workloads have strict time constraints
   - Multi-core revolution
     - Default standard on most machines
     - Large-scale multi-cores with terabytes of main memory
     - Run workloads that are traditionally run on distributed systems
   - Existing graph-processing systems lack support for both

3. A High-level Description of Grace
   - Grace is an in-memory graph management and processing system
   - Implements several optimizations
     - Graph-specific
     - Multi-core-specific
   - Supports snapshots and transactional updates on graphs
   - Evaluation shows that optimizations help Grace run several times faster than other alternatives
   Outline
   - Overview
   - Details of optimizations
   - Details on transactions
   - Subset of results

4. An Overview of Grace
   - Keeps an entire graph in memory in smaller parts.
   - Exposes C-style API for writing graph workloads, iterative workloads, and updates.
   - Design driven by two trends
     - Graph-specific locality
     - Partitionable and parallelizable workloads
   Example use of the Grace API:
       v = GetVertex(Id)
       for (i = 0; i < v.degree; i++)
           neigh = v.GetNeighbor(i)
   [Diagram labels: Grace API; Core 0, Core 1; vertices A, B, C, D, E; Iterative Programs (e.g., PageRank); RPC; Net; Graph and Multi-core Optimizations]

5. Data Structures in a Partition
   [Diagram: a partition of an example graph with vertices A, B, C, D. Labeled components: Vertex Index (A: 0, B: 1, C: 2); Vertex Allocation Map (1 1 1 0); Edge Pointer Array; Edges of A, Edges of B, Edges of C; VertexLog; EdgeLog]
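The component names on this slide suggest a layout in which each vertex maps to a slot, and each slot points into a packed array of edges. The following is a minimal sketch of that idea, not Grace's actual implementation: the class `Partition`, its field names, and its methods are hypothetical, and the VertexLog and EdgeLog are left out because the transcript does not describe them.

```python
class Partition:
    """Toy partition: vertex index, allocation map, edge pointer array, packed edges.

    All names here are illustrative assumptions, not Grace's API.
    """

    def __init__(self, adjacency):
        # adjacency: dict mapping vertex id -> list of neighbor ids
        self.vertex_index = {}  # vertex id -> slot number
        self.allocated = []     # allocation map: True if the slot holds a live vertex
        self.edge_ptr = []      # slot -> offset of that vertex's first edge
        self.edges = []         # all edge lists packed back to back
        for vid, neighbors in adjacency.items():
            self.vertex_index[vid] = len(self.allocated)
            self.allocated.append(True)
            self.edge_ptr.append(len(self.edges))
            self.edges.extend(neighbors)
        self.edge_ptr.append(len(self.edges))  # sentinel marks the end of the last list

    def degree(self, vid):
        slot = self.vertex_index[vid]
        return self.edge_ptr[slot + 1] - self.edge_ptr[slot]

    def get_neighbor(self, vid, i):
        if not 0 <= i < self.degree(vid):
            raise IndexError(i)
        return self.edges[self.edge_ptr[self.vertex_index[vid]] + i]

    def neighbors(self, vid):
        # Mirrors the loop from the API slide: for i in 0..degree, GetNeighbor(i)
        return [self.get_neighbor(vid, i) for i in range(self.degree(vid))]


part = Partition({"A": ["B", "C"], "B": ["C"], "C": ["A", "B", "D"]})
print(part.neighbors("C"))
```

Because the edges of one vertex sit next to each other, and vertices with adjacent slots have adjacent edge lists, the order in which vertices are assigned to slots determines how cache-friendly a traversal is.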

6. Graph-Aware Partitioning & Placement
   - Partitioning and placement: are they useful on a single machine?
     - Yes, to take advantage of multi-cores and memory hierarchies
   - Solve them using graph partitioning algorithms
     - Divide a graph into sub-graphs, minimizing edge-cuts
   - Grace provides an extensible library
     - Graph-aware: heuristic-based, spectral partitioning, Metis
     - Graph-agnostic: hash partitioning
   - Achieve better layout by recursive graph partitioning
     - Recursively run graph partitioning until a sub-graph can fit in a cache line
     - Recompose all the sub-graphs to get the vertex layout
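The recursive layout step can be sketched as follows. This is only an illustration under stated assumptions: the splitter `bisect_by_bfs` is a crude breadth-first stand-in for a real partitioner such as Metis, and `max_block` is a hypothetical parameter standing for "how many vertices fit in a cache line".

```python
from collections import deque


def bisect_by_bfs(vertices, adjacency):
    """Split vertices into two halves by BFS order (a stand-in for a real partitioner)."""
    allowed = set(vertices)
    order, seen = [], set()
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            for n in adjacency.get(v, ()):
                if n in allowed and n not in seen:
                    seen.add(n)
                    queue.append(n)
    half = len(order) // 2
    return order[:half], order[half:]


def recursive_layout(vertices, adjacency, max_block):
    """Recursively split until a sub-graph has at most max_block vertices,
    then concatenate the pieces to obtain the final vertex order."""
    if max_block < 1:
        raise ValueError("max_block must be at least 1")
    if len(vertices) <= max_block:
        return list(vertices)
    left, right = bisect_by_bfs(vertices, adjacency)
    return (recursive_layout(left, adjacency, max_block)
            + recursive_layout(right, adjacency, max_block))


# Two triangles joined by the edge 2-3, with vertices given in a scattered order.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(recursive_layout([0, 3, 1, 4, 2, 5], graph, max_block=3))
```

On this toy input the two triangles end up in separate contiguous blocks, which is the effect the slide describes: neighbors are placed close together in memory.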

7. Platform for Parallel Iterative Computations
   - Iterative computation platform implements the “bulk synchronous parallel” model.
   [Diagram labels: Barrier; Parallel computations; Propagate updates; Iteration 1; Iteration 2]
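A bulk synchronous parallel iteration can be sketched with PageRank, the example workload named earlier in the deck. This is an assumption-laden illustration rather than Grace's platform: the function `pagerank_bsp`, the use of one worker thread per partition, and the exact ordering of the barrier relative to update propagation are all choices made for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def pagerank_bsp(adjacency, parts, iterations=20, damping=0.85):
    """PageRank in BSP style: compute per partition, wait at a barrier, apply updates."""
    n = len(adjacency)
    rank = {v: 1.0 / n for v in adjacency}

    def compute(part):
        # Parallel computation phase: only reads ranks from the previous iteration.
        updates = []
        for v in part:
            out = adjacency[v]
            if out:
                share = rank[v] / len(out)
                updates.extend((dst, share) for dst in out)
        return updates

    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        for _ in range(iterations):
            # list() blocks until every partition has finished: this acts as the barrier.
            all_updates = list(pool.map(compute, parts))
            # Propagation phase: apply the collected updates to build the next ranks.
            incoming = {v: 0.0 for v in adjacency}
            for updates in all_updates:
                for dst, share in updates:
                    incoming[dst] += share
            rank = {v: (1 - damping) / n + damping * incoming[v] for v in adjacency}
    return rank


ranks = pagerank_bsp({0: [1], 1: [2], 2: [0]}, parts=[[0, 1], [2]])
print(ranks)
```

Python threads do not give true multi-core parallelism for this kind of work; the sketch only shows the structure of an iteration, in which no partition sees another partition's updates until the barrier has been passed.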

8. Load Balancing and Updates Batching
   - Problem 1: Overloaded partitions can affect performance
   - Solution 1: Load balancing is implemented by sharing a portion of vertices
   - Problem 2: Updates in arbitrary order can increase cache misses
   - Solution 2: Updates batching is implemented by grouping updates by their destination part
     - Issuing updates in a round-robin fashion
   [Diagram labels: Barrier; vertices A, B, C, D; Part0 on Core0, Part1 on Core1, Part2 on Core2; Cache line]
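Updates batching can be sketched as a grouping step before updates are applied. The names `batch_updates` and `round_robin_order` are hypothetical, and the round-robin function reflects one possible reading of the slide (each core starts at a different destination part so that cores do not all target the same part at once); the transcript does not spell out the exact scheme.

```python
from collections import defaultdict


def batch_updates(updates, part_of):
    """Group (destination vertex, value) updates by the part that owns the destination."""
    batches = defaultdict(list)
    for dst, value in updates:
        batches[part_of[dst]].append((dst, value))
    return dict(batches)


def round_robin_order(core_id, num_parts):
    """Order in which a core visits destination parts, starting from its own index.

    This staggering is an assumed interpretation of "round-robin" on the slide.
    """
    return [(core_id + k) % num_parts for k in range(num_parts)]


part_of = {"A": 0, "B": 1, "C": 1, "D": 2}
updates = [("B", 1.0), ("A", 2.0), ("C", 3.0), ("B", 4.0), ("D", 5.0)]
batches = batch_updates(updates, part_of)

# Core 1 sends its batches part by part instead of in arrival order.
for part_id in round_robin_order(core_id=1, num_parts=3):
    print(part_id, batches.get(part_id, []))
```

Sending all updates for one part together means consecutive writes touch memory owned by the same part, which is the cache-miss reduction the slide is after.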

9. Transactions on Graphs
   - Grace supports structural changes to a graph
         BeginTransaction()
         AddVertex(X)
         AddEdge(X, Y)
         EndTransaction()
   - Transactions use snapshot isolation
     - Instantaneous snapshots using copy-on-write (CoW) techniques
     - CoW can affect careful memory layout!
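The combination of snapshots and copy-on-write can be illustrated with a small sketch. Everything here is an assumption for illustration: the classes `CowGraph` and `Transaction`, the first-committer-wins conflict rule, and the per-vertex copy granularity are not taken from Grace, whose mechanism the transcript does not detail.

```python
class CowGraph:
    """Toy graph whose published state is never modified in place."""

    def __init__(self):
        self._adj = {}  # vertex -> tuple of neighbors; immutable once published

    def snapshot(self):
        # Instantaneous: a snapshot is just a reference to the current version.
        return self._adj

    def begin_transaction(self):
        return Transaction(self)


class Transaction:
    def __init__(self, graph):
        self._graph = graph
        self._base = graph.snapshot()  # the version this transaction reads from
        self._writes = {}              # private copies of touched vertices only

    def _exists(self, v):
        return v in self._writes or v in self._base

    def add_vertex(self, v):
        if not self._exists(v):
            self._writes[v] = ()

    def add_edge(self, src, dst):
        if not (self._exists(src) and self._exists(dst)):
            raise KeyError("both endpoints must exist")
        current = self._writes.get(src, self._base.get(src))
        self._writes[src] = current + (dst,)  # copy on write: builds a new tuple

    def end_transaction(self):
        # Simplified conflict rule (an assumption): abort if anyone committed meanwhile.
        if self._graph.snapshot() is not self._base:
            raise RuntimeError("conflict: graph changed since the transaction began")
        new_version = dict(self._base)  # untouched neighbor tuples are shared, not copied
        new_version.update(self._writes)
        self._graph._adj = new_version


g = CowGraph()
setup = g.begin_transaction()
setup.add_vertex("Y")
setup.end_transaction()

old = g.snapshot()
txn = g.begin_transaction()
txn.add_vertex("X")
txn.add_edge("X", "Y")
txn.end_transaction()
print("X" in old, g.snapshot()["X"])
```

A reader holding `old` keeps seeing the graph as it was before the commit. The sketch also hints at the caveat on the slide: the copied data lands wherever the allocator puts it, so a carefully arranged vertex layout may degrade as updates accumulate.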

10. Evaluation
   - Graphs:
     - Web (v: 88M, e: 275M), sparse
     - Orkut (v: 3M, e: 223M), dense
   - Workloads: N-hop-neighbor queries, BFS, DFS, PageRank, Weakly-Connected Components, Shortest Path
   - Architecture:
     - Intel Xeon: 12 cores, 2 chips with 6 cores each
     - AMD Opteron: 48 cores, 4 chips with 12 cores each
   - Questions:
     - How well do partitioning and placement work?
     - How useful are load balancing and updates batching?
     - How does Grace compare to other systems?

11. Partitioning and Placement Performance on Intel
   [Figure labels: PageRank Speedup; Orkut graph partitions; Web graph partitions; callouts 1, 2, 3]
   - Observation: For a smaller number of partitions, the partitioning algorithm didn’t make a big difference
     Reason: All the partitions fit within the cores of a single chip, minimizing communication cost
   - Observation: Placing neighboring vertices close together improves performance significantly
     Reason: L1, L2, and L3 cache misses and data-TLB misses are reduced
   - Observation: Careful vertex arrangement works better when graph partitioning is used for sparse graphs
     Reason: Graph partitioning puts neighbors in the same part, which helps placement

12. Load Balancing and Updates Batching on Intel
   [Figure labels: PageRank Speedup; Orkut graph partitions; Web graph partitions; Retired Load; callouts 1, 2]
   - Observation: Load balancing and updates batching didn’t improve performance for the Web graph
     Reason: Sparse graphs can be partitioned better and there are fewer updates to send
   - Observation: Batching updates gives a larger performance improvement for the Orkut graph
     Reason: Updates batching reduces remote cache accesses

13. Comparing Grace, BDB, and Neo4j
   [Figure label: Running Time (s)]

14. Conclusion
   - Grace explores graph-specific and multi-core-specific optimizations
   - What worked and what didn’t (in our setup; your mileage might differ)
     - Careful vertex placement in memory gave good improvements
     - Partitioning and updates batching worked in most cases, but not always
     - Load balancing wasn’t as useful