/
CSCE-608 Database Systems CSCE-608 Database Systems

CSCE-608 Database Systems - PowerPoint Presentation

fauna
fauna . @fauna
Follow
72 views
Uploaded On 2023-11-04

CSCE-608 Database Systems - PPT Presentation

Spring 2023 Instructor Jianer Chen Office PETR 428 Phone 8454259 Email chencsetamuedu Notes 16 Physical Query Plan and Optimization What Does DBMS Do Prepare a collection C of efficient algorithms for operations in relational algebra ID: 1028816

query scan leaf tree scan query tree leaf operations input lqp physical order algorithms ptr search index node logic

Share:

Link:

Embed:

Download Presentation from below link

Download Presentation The PPT/PDF document "CSCE-608 Database Systems" is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.


Presentation Transcript

1. CSCE-608 Database SystemsSpring 2023Instructor: Jianer ChenOffice: PETR 428Phone: 845-4259Email: chen@cse.tamu.eduNotes 16: Physical Query Plan and Optimization

2. What Does DBMS Do?Prepare a collection C of efficient algorithms for operations in relational algebra;An input database program Pparserparse tree-lqp convertorLqp-pqp convertortake care of issues in optimization and security.Machine executable codephysical query planparse treepreprocessingparse treelogic query planapply logic lawslogic query planlogic query planOptimization via logic and sizeOptimization via algorithms and cost2

3. What Does DBMS Do?Prepare a collection C of efficient algorithms for operations in relational algebra;An input database program Pparserparse tree-lqp convertorLqp-pqp convertortake care of issues in optimization and security.Machine executable codephysical query planparse treepreprocessingparse treelogic query planapply logic lawslogic query planlogic query planOptimization via logic and sizeOptimization via algorithms and cost2SELECT a1, b1, c1FROM A, B, CWHERE a2=1 AND b2=2 AND c2=3×ABC×σπa1, b1, c1a2=1, b2 =2, c2=3×ABC×σπa1, b1, c1a2=1σσb2 =2c2=3A parse tree

4. What Does DBMS Do?Prepare a collection C of efficient algorithms for operations in relational algebra;An input database program Pparserparse tree-lqp convertorLqp-pqp convertortake care of issues in optimization and security.Machine executable codephysical query planparse treepreprocessingparse treelogic query planapply logic lawslogic query planlogic query planOptimization via logic and sizeOptimization via algorithms and cost2SELECT a1, b1, c1FROM A, B, CWHERE a2=1 AND b2=2 AND c2=3×ABC×σπa1, b1, c1a2=1, b2 =2, c2=3×ABC×σπa1, b1, c1a2=1σσb2 =2c2=3A parse tree

5. Input: an optimized LQP T, and main memory constraint M ×∩ πσσσG FEDCBAConstruction of Physical Query Plan

6. Input: an optimized LQP T, and main memory constraint MReplacing each leaf R of T by “scan(R)”; ×∩ πσσσscan(G) scan(F)scan(E)scan(D)scan(C)scan(B)scan(A)Construction of Physical Query Plan

7. Input: an optimized LQP T, and main memory constraint MReplacing each leaf R of T by “scan(R)”;Combining the “scan’s” with other operations; ×∩ πσσσscan(G) scan(F)scan(E)scan(D)scan(C)scan(B)scan(A)index-scanindex-scanConstruction of Physical Query Planindex-scan

8. Input: an optimized LQP T, and main memory constraint MReplacing each leaf R of T by “scan(R)”;Combining the “scan’s” with other operations; Replacing each internal node v of T by a proper algorithm; ×∩ πσσσscan(G) scan(F)scan(E)scan(D)scan(C)scan(B)scan(A)index-scanindex-scanJ2PJ2PJ1PJ1PCJI1PConstruction of Physical Query Planindex-scan

9. Input: an optimized LQP T, and main memory constraint MReplacing each leaf R of T by “scan(R)”;Combining the “scan’s” with other operations; Replacing each internal node v of T by a proper algorithm;For each edge e in T, decide if e should be “materialized”; ×∩ πσσσscan(G) scan(F)scan(E)scan(D)scan(C)scan(B)scan(A)index-scanindex-scanJ2PJ2PJ1PJ1PCJI1PConstruction of Physical Query Planindex-scan

10. Input: an optimized LQP T, and main memory constraint MReplacing each leaf R of T by “scan(R)”;Combining the “scan’s” with other operations; Replacing each internal node v of T by a proper algorithm;For each edge e in T, decide if e should be “materialized”; ×∩ πσσσscan(G) scan(F)scan(E)scan(D)scan(C)scan(B)scan(A)index-scanindex-scanJ2PJ2PJ1PJ1PCJI1PConstruction of Physical Query Planindex-scan

11. Input: an optimized LQP T, and main memory constraint MReplacing each leaf R of T by “scan(R)”;Combining the “scan’s” with other operations; Replacing each internal node v of T by a proper algorithm;For each edge e in T, decide if e should be “materialized”;Cut all materialized edges; ×∩ πσσσscan(G) scan(F)scan(E)scan(D)scan(C)scan(B)scan(A)index-scanindex-scanJ2PJ2PJ1PJ1PCJI1PConstruction of Physical Query Planindex-scan

12. Input: an optimized LQP T, and main memory constraint MReplacing each leaf R of T by “scan(R)”;Combining the “scan’s” with other operations; Replacing each internal node v of T by a proper algorithm;For each edge e in T, decide if e should be “materialized”;Cut all materialized edges;Each subtree is a call to the subroutine at the root of the subtree. The order of the calls follows the bottom-up order in the structure. ×∩ πσσσscan(G) scan(F)scan(E)scan(D)scan(C)scan(B)scan(A)index-scanindex-scanJ2PJ2PJ1PJ1PCJI1P123Construction of Physical Query Planindex-scan

13. Input: an optimized LQP T, and main memory constraint MReplacing each leaf R of T by “scan(R)”;Combining the “scan’s” with other operations; Replacing each internal node v of T by a proper algorithm;For each edge e in T, decide if e should be “materialized”;Cut all materialized edges;Each subtree is a call to the subroutine at the root of the subtree. The order of the calls follows the bottom-up order in the structure. ×∩ πσσσscan(G) scan(F)scan(E)scan(D)scan(C)scan(B)scan(A)index-scanindex-scanJ2PJ2PJ1PJ1PCJI1P123This produces an executable code for the input DB programConstruction of Physical Query Planindex-scan

14. Physical Query Plan: SummaryReplacing internal nodes of an LQP by proper algorithms;Deciding if a subroutine call should be pipelined or materialized;Many optimization techniques are involved here;In practice, heuristic optimization techniques are used to construct good physical query plans;The resulting physical query plan is an executable code.

15. What Does DBMS Do?Prepare a collection C of efficient algorithms for operations in relational algebra;An input database program Pparserparse tree-lqp convertorLqp-pqp convertortake care of issues in optimization and security.Machine executable codephysical query planparse treepreprocessingparse treelogic query planapply logic lawslogic query planlogic query planOptimization via logic and sizeOptimization via algorithms and cost2SELECT a1, b1, c1FROM A, B, CWHERE a2=1 AND b2=2 AND c2=3×ABC×σπa1, b1, c1a2=1, b2 =2, c2=3×ABC×σπa1, b1, c1a2=1σσb2 =2c2=3A parse tree

16. What Does DBMS Do?Prepare a collection C of efficient algorithms for operations in relational algebra;An input database program Pparserparse tree-lqp convertorLqp-pqp convertortake care of issues in optimization and security.Machine executable codephysical query planparse treepreprocessingparse treelogic query planapply logic lawslogic query planlogic query planOptimization via logic and sizeOptimization via algorithms and cost2SELECT a1, b1, c1FROM A, B, CWHERE a2=1 AND b2=2 AND c2=3×ABC×σπa1, b1, c1a2=1, b2 =2, c2=3×ABC×σπa1, b1, c1a2=1σσb2 =2c2=3A parse tree

17. secondarystorage(disks)in tables(relations)databaseadministratorDDLlanguagedatabase programmerDML (query)languageDBMSfile managerbuffermanagermainmemorybuffersindex/file managerDML complierDDL complierquery executionenginetransaction managerconcurrency controllock tablelogging &recovery

18. secondarystorage(disks)in tables(relations)databaseadministratorDDLlanguagedatabase programmerDML (query)languageDBMSfile managerbuffermanagermainmemorybuffersindex/file managerDML complierDDL complierquery executionenginetransaction managerconcurrency controllock tablelogging &recoveryWhat is still missing?

19. secondarystorage(disks)in tables(relations)databaseadministratorDDLlanguagedatabase programmerDML (query)languageDBMSfile managerbuffermanagermainmemorybuffersindex/file managerDML complierDDL complierquery executionenginetransaction managerconcurrency controllock tablelogging &recoveryEfficient Algorithms forRelational algebriac operations

20. secondarystorage(disks)in tables(relations)databaseadministratorDDLlanguagedatabase programmerDML (query)languageDBMSfile managerbuffermanagermainmemorybuffersindex/file managerDML complierDDL complierquery executionenginetransaction managerconcurrency controllock tablelogging &recoveryEfficient Algorithms forRelational algebriac operations

21. secondarystorage(disks)in tables(relations)databaseadministratorDDLlanguagedatabase programmerDML (query)languageDBMSfile managerbuffermanagermainmemorybuffersindex/file managerDML complierDDL complierquery executionenginetransaction managerconcurrency controllock tablelogging &recovery

22. secondarystorage(disks)in tables(relations)databaseadministratorDDLlanguagedatabase programmerDML (query)languageDBMSfile managerbuffermanagermainmemorybuffersindex/file managerDML complierDDL complierquery executionenginetransaction managerconcurrency controllock tablelogging &recovery

23. The Main Purpose of Index StructuresSpeedup the search process23 indexσa=6(R)blockscontainingthe desired tuplesquickly figure out disksotherwise have toscan the entire RBut also need to handle dynamic changes of R

24. Support fast searchSupport range search Support dynamic changesCould be either dense or sparse dense: pointers to all records sparse: one pointer per block24B+Trees

25. A B+tree node of order n where ph are pointers (disk addresses) and kh are search-keys (values of the attributes in the index)25B+Treespn+1knk2p2k1p1p3……

26. A B+tree node of order n where ph are pointers (disk addresses) and kh are search-keys (values of the attributes in the index)How big is n? Basically we want each B+tree node to fit in a disk block so that a B+tree node can be read/written by a single disk I/O. Typically, n ~ 100-200.26pn+1knk2p2k1p1p3……B+Trees

27. B+Tree Example order n = 3271003012015018035113035100101110120130150156179180200root

28. A B+Tree of order nEach node has: n keys and n+1 pointers These are fixedTo keep the nodes not too empty, also for the operations to be applied efficiently: * Non-leaf: at least (n+1)/2 pointers (to children) * Leaf: at least (n+1)/2 pointers to data (plus a “sequence pointer” to the next leaf) Basically: use at least one half of the pointers28

29. Sample non-leaf order n = 329578195To keys k < 57To keys 57 k<81To keys 81 k<95To keys k  95

30. Sample leaf node order n = 3 30From non-leaf nodeTo next leaf in sequence578195To record with key 57To record with key 81To record with key 95

31. Example (B+ tree of order n=3)311201501803035113035Full nodeMin. nodeNon-leafLeaf

32. B+tree rulesRule 1. All leaves are at same lowest level (balanced tree)Rule 2. Pointers in leaves point to records except for “sequence pointer”Rule 3. Number of keys/pointers in nodes:32Max. #pointersMax. # keysMin. #pointersMin. #keysNon-leafn+1n(n+1)/2(n+1)/2 1Leafn+1n(n+1)/2 + 1(n+1)/2Rootn+1n21

33. B+tree rulesRule 1. All leaves are at same lowest level (balanced tree)Rule 2. Pointers in leaves point to records except for “sequence pointer”Rule 3. Number of keys/pointers in nodes:33Max. #pointersMax. # keysMin. #pointersMin. #keysNon-leafn+1n(n+1)/2(n+1)/2 1Leafn+1n(n+1)/2 + 1(n+1)/2Rootn+1n21could be 1

34. Search in a B+treeStart from the rootSearch in a leaf blockMay not have to go to the data file 34 Search(ptr, k); \\ search a record of key value k in the subtree rooted at ptr \\ assume the B+tree is a dense index of order n Case 1. ptr is a leaf \\ pn+1 is the sequence pointer IF (k = ki) for a key ki in *ptr THEN return(pi); ELSE return(Null); Case 2. ptr is not a leaf find a key ki in *ptr such that ki-1 ≤ k < ki; return(Search(pi, k));

35. Search in B+tree351003012015018035113035100101110120130150156179180200rootSearch(*prt, 130)120  130 <150130 =130return100  130 Search(ptr, k); \\ search a record of key value k in the subtree rooted at ptr \\ assume the B+tree is a dense index of order n Case 1. ptr is a leaf \\ pn+1 is the sequence pointer IF (k = ki) for a key ki in *ptr THEN return(pi); ELSE return(Null); Case 2. ptr is not a leaf find a key ki in *ptr such that ki-1 ≤ k < ki; return(Search(pi, k)); A tree nodek1k2‧‧‧‧‧‧knp1p2p3pn+1pn