/
Enhancements to the  B itemporal Enhancements to the  B itemporal

Enhancements to the B itemporal - PowerPoint Presentation

cheryl-pisano
cheryl-pisano . @cheryl-pisano
Follow
345 views
Uploaded On 2018-11-06

Enhancements to the B itemporal - PPT Presentation

M odel S upport Integrity Constraints and More Henrietta Dombrovskaya Braviant Holdings Chicago Boris Novikov Saint Petersburg University Russia Chad Slaughter Scale Genius Inc Chicago ID: 718663

bitemporal 2015 model data 2015 bitemporal data model open 2017 c100 time sep doe john effective asserted update customer

Share:

Link:

Embed:

Download Presentation from below link

Download Presentation The PPT/PDF document "Enhancements to the B itemporal" is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.


Presentation Transcript

Slide1

Enhancements to the Bitemporal Model Support: Integrity Constraints and More

Henrietta Dombrovskaya,

Braviant

Holdings, Chicago

Boris

Novikov

, Saint Petersburg University, Russia

Chad Slaughter, Scale Genius, Inc. ChicagoSlide2

Bitemporal Data: Making It Happen In Postgres

Incorporating time, thereby making the data models temporal, is a hot topic. The practical usage include:

Snapshot queries (This report looked differently last week! I have a print out! Can you run it as it looked at the end of November?)

Time-interval queries (Why this query takes twice longer now than last month? What exactly changed in the data?)Correlated time queries (How many people where fired since X became a CTO? How many people left since the new head of Analytics started?)Many systems are already providing this kind of support (DB2, Oracle) and it is included into the SQL ANSI 2011 standard.However, Postgres never formally included the time dimensions, and we are going to fix this!

2

Sep-17

Bitemporal Data Model - PG Open 2017Slide3

ReferencesC.J. Date, Hugh

Darwen

, and Nikos

Lorentzos. 2014. Time and Relational Theory, Second Edition: Temporal Databases in the Relational Model and SQL (2nd ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.Tom Johnston. 2014. Bitemporal Data: Theory and Practice (1st ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.Tom Johnston and Randall Weis. 2010. Managing Time in Relational Databases: How to Design, Update and Query Temporal Data. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.Krishna Kulkarni and Jan-Eike Michels. 2012. Temporal features in SQL:2011. SIGMOD Rec. 41, 3 (October 2012), 34-43.

Richard Thomas Snodgrass. 1999. Developing Time-Oriented Database Applications in SQL. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.

ISO/IEC 9075-2:2011, Information technology — Database languages — SQL , 2011Martin Kaufmann, Peter M. Fischer, Norman May, Donald Kossmann

: Benchmarking

Bitemporal

Database Systems: Ready for the Future or Stuck in the Past? EDBT 2014: 738-749

3

Sep-17

Bitemporal Data Model - PG Open 2017Slide4

Previous Work OverviewWhat we’ve presented a year ago

4

Sep-17

Bitemporal Data Model - PG Open 2017Slide5

SummaryThe surveys (

Kosman

et al) indicate, that is should be easy to support time in Postgres due to the existence of the GIST indexes. And we’ve utilized this advantage.

Last year we presented the Postgres-based implementation of asserted versioning framework (Johnston and Weis). This model supports effective time and asserted time and differs from the ANSI 2011 standard.5Sep-17

Bitemporal Data Model - PG Open 2017Slide6

Where Our Work Can Be Found

pg_bitemporal

git repohttps://github.com/scalegenius/pg_bitemporalIncludes:Domain definition

Allen relations implementation (including the ones already implemented in Postgres)O

perations:create bitemporal

table

b

itemporal insert

bitemporal updatebitemporal

correctioninactivate

b

itemporal

delete

S

upport of

bitemporal

constraints

6

Sep-17

Bitemporal Data Model - PG Open 2017Slide7

Overview Of Bitemporal Operations

The next several slides present the basic

bitemporal

data manipulation operations and explain the differences between bitemporal and conventional operations.Terminology: effective timeasserted time

the combination of effective and asserted times: time regions

7

Sep-17

Bitemporal Data Model - PG Open 2017Slide8

Bitemporal Insert

#

Effective Interval

Assertive IntervalCustomer No.Name

Type1

[2015-06-01, oo

)

[ 2015-05-01 ,

oo

)

C100

John Doe

Silver

Asserted

1

Effective

now

= 2015-05-01

select

ll_bitemporal_insert

(

'customers’

,

,

$$’

customer_no

’, name

',

’type' $$,

,

$$'C100','John Doe', 'Silver' $$,

timeperiod

('2015-06-01','infinity'),

timeperiod

('2015-05-01','infinity')

8

Sep-17

Bitemporal Data Model - PG Open 2017Slide9

Bitemporal Update

#

Effective Interval

Assertive IntervalCustomer No.Name

Type1

[2015-06-01, oo)

[2015-05-01,2015-09-15)

C100

John Doe

Silver

2

[2015-06-01,2015-09-15)

[2015-09-15,

oo

)

C100

John Doe

Silver

3

[2015-09-15, oo)

[ 2015-09-15,

oo

)

C100

John Doe

Gold

Asserted

1

3

1

2

Effective

now =

2015-09-15

select

ll_bitemporal_update

($$customers$$,

$$

customer_no

$$, $$100$$,

$$type$$, $$Gold$$,

t

imeperiod

(‘2015-09-15’, ‘infinity’),

timeperiod

(‘2015-09-15’,

‘infinity’))

9

Sep-17

Bitemporal Data Model - PG Open 2017Slide10

Bitemporal Correction

#

Effective Interval

Assertive IntervalCustomer No.Name

Type1

[2015-06-01, oo)

[2015-05-01,2015-09-15)

C100

John Doe

Silver

2

[2015-06-01,2015-09-15)

[2015-09-15,

oo

)

C100

John Doe

Silver

3

[2015-09-15, oo)

[2015-09-15, 2015-09-22)

C100

John Doe

Gold

4

[2015-09-15,

oo

)

[ 2015-09-22,

oo

)

C100

John Doe

Platinum

Asserted

3

1

2

4

3

1

2

Effective

now = 2015-09-15

select

ll_bitemporal_correction

($$customers$$,

$$type

$$,

$$

Platinum$$,

$$

customer_no

$$,

$$

C100$$,

timeperiod

('2015-09-15'

,

'infinity

')

,

n

ow())

10

Sep-17

Bitemporal Data Model - PG Open 2017Slide11

Bitemporal Inactivate

#

Effective Interval

Assertive IntervalCustomer No.Name

Type1

[2015-06-01, oo)

[2015-05-01,2015-09-15)

C100

John Doe

Silver

2

[2015-06-01,2015-09-15)

[2015-09-15,

oo

)

C100

John Doe

Silver

3

[2015-09-15, oo)

[2015-09-15, 2015-09-22)

C100

John Doe

Gold

4

[2015-09-15,

oo

)

[2015-09-22, 2015-11-05)

C100

John Doe

Platinum

5

[2015-09-15,2015-12-31 )

[ 2015-11-05,

oo

)

C100

John Doe

Platinum

Asserted

4

3

1

2

5

4

3

1

2

Effective

now =

2015-11-05

select

ll_bitemporal_inactivate

(

$$customers$$,

$$

customer_no

$$,

$$C100$$,

timeperiod

('2015-12-31','infinity

'),

timeperiod

('2015-11-05','infinity'),

11

Sep-17

Bitemporal Data Model - PG Open 2017Slide12

Bitemporal Delete

#

Effective Interval

Assertive IntervalCustomer No.Name

Type1

[2015-06-01, oo)

[2015-05-01,2015-09-15)

C100

John Doe

Silver

2

[2015-06-01,2015-09-15)

[2015-09-15,

oo

)

C100

John Doe

Silver

3

[2015-09-15, oo)

[2015-09-15, 2015-09-22)

C100

John Doe

Gold

4

[2015-09-15,

oo

)

[2015-09-22, 2015-11-05)

C100

John Doe

Platinum

5

[2015-09-15,2015-12-31 )

[2015-11-05, 2015-11-17 )

C100

John Doe

Platinum

Asserted

4

5

3

1

2

5

4

3

1

2

Effective

now =

2015-11-17

select

ll_bitemporal_delete

(

'customers',

$$

customer_no

$$,

$$ 'C100' $$,

timeperiod

('2015-11-17','infinity

'))

12

Sep-17

Bitemporal Data Model - PG Open 2017Slide13

Bitemporal ConstraintsWe need to support the following constraint types:

Primary key

supported by GIST with exclusionUnique - supported by GIST with exclusionCheck – no difference from regular tablesIS/IS NOT NULL – no difference from regular tablesForeign key – most difficult to supportWe use

metacode to record the presence of the bitemporal constraints:

PK: select

bitemporal_internal.pk_constraint

('

postgres_cluster_id');

UQ: select bitemporal_internal.unique_constraint

('port');

13

Sep-17

Bitemporal Data Model - PG Open 2017Slide14

Defining Foreign Key ConstraintThe difficulty of verifying the bi-temporal FK is that the PK/UQ in the parent table should be effective and asserted all the time when a dependent record is effective/asserted

CONSTRAINT "

bitemporal fk postgres_version_database_versionsrelease_version" check (true or 'fk' <> '@

postgres_version ->

database_versions(release_version

)@'));

Function:

select

bitemporal_internal.fk_constraint

(

'

postgres_version

,'

database_versions

,'

release_version

');

14

Sep-17

Bitemporal Data Model - PG Open 2017Slide15

FK CreationCheck whether the referencing field is a PK/UQ validate_bitemporal_pk_uqCreate check constraint

fk_constraint

Check whether the validation on the parent table field already exists

ll_lookup_validation_functionIf not, create itll_generate_fk_validatecreate trigger on insert/update and a trigger function15

Sep-17Bitemporal Data Model - PG Open 2017Slide16

How a Validation Function Should Work?

In order to define the algorithm, we need to take a closer look at how

bitemporal

queries work.16Sep-17Bitemporal Data Model - PG Open 2017Slide17

Understanding Bitemporal Queries

17

Sep-17

Bitemporal Data Model - PG Open 2017Slide18

Types of Bitemporal QueriesSnapshot queries –

how the data looked at a certain point of time

What was the status of

customer_no ’C100’ on Sep 1 2015?May also use one-dimensional conditionsInterval queries in selection criteriaHow did the status change for customer_no ‘C100’ during 2015?May return multiple rows with time regionsOne-dimensional conditions are also hereCorrelated time queriesWhat was the status of customer ‘C200’ when customer ‘C100’ was ’Platinum’?Values are equal only at time when they are valid in both time regions

18

Sep-17

Bitemporal Data Model - PG Open 2017Slide19

How to Define Time Regions For the Query Result? Conceptually each row has associated bitemporal

time region

The strategy is to perform an operation as in non-temporal DB and calculate region for each tuple of resultOperations on time regions: ∩ intersection∪ union − differenceExamples:Interval query: intersect query region with tuple regions: what was the customer type for customer C100 in September 2015?Join: find matching tuples and intersect their regions: what was the status of customer C300 when customer C100 was “Platinum”?19

Sep-17

Bitemporal Data Model - PG Open 2017Slide20

Defining Time Regions For Updates

UPDATE

a_table

SET attr = (expr) WHERE (cond)Time regions for:S – the time region defined by UPDATE function (for new values to be set)P – time region for each row identified by WHERE clause (with

old values to be updated)N-

time region for each already existing row with values, which we are going to assign

(empty of not exists)

P ⋂ N = ⍉

After UPDATE the DB will contain regions:Old values: P ∖ S

New values: N ∪S

20

Sep-17

Bitemporal Data Model - PG Open 2017

P

N

s

P

N

sSlide21

Time Regions as Sets of RectanglesOperations on time regions:

intersection

∪ union − differenceAsserted interval × Effective interval = rectangleUnfortunately, union and difference of rectangles are not rectangularSets of non-intersecting rectangles: all operations produce sets or rectangles

21

Sep-17

Bitemporal Data Model - PG Open 2017Slide22

Operations on Regions (Rectangle Sets)Disjoin (auxiliary)Split both arguments into smaller rectangles such that each pair of rectangles from different arguments either coincide or do not overlap

Set-theoretic operations: union, intersection, set difference:

Calculate the disjoin

Choose rectangles that belong to union/intersection/differenceSep-17Bitemporal Data Model - PG Open 201722Slide23

Using Rectangle setsChecking foreign key constraintsRect_set (FK) ⊊

rect_set

(PK)

Computing time regions (rectangle sets) for query resultsIntersectionUnionDifferenceBulk inserts and updates

Sep-17

Bitemporal Data Model - PG Open 2017

23Slide24

Finding One-Dimensional (Interval) DisjoinSort all left and right ends of source intervals togetherChoose intervals that intersect with or overlap with at least one of source intervals

Sep-17

Bitemporal Data Model - PG Open 2017

24Slide25

Two-Dimensional Disjoin AlgorithmProject both arguments on one dimension (say, on asserted and ignore effective time)Compute on-dimensional disjoin

Put back the second dimension producing stripes

For each stripe, find one-dimensional disjoin.

Sep-17Bitemporal Data Model - PG Open 201725Slide26

Conceptual Representation: Nested Relational. (Do we still need this one?)

Each row contains a cuboid set

DB primary key is the entire non-temporal row

(business key, non-key attributes)DB foreign keys are unusableUpdates affect at most 3 rows(business) keyNon-key

attributesCuboid

dim1

dim2

Alice

10

1,10

1,100

Alice

15

10,20

1,100

Bob

9

1,5

1,30

5,10

50,100

Bob

16

10,20

30,100

26

Sep-17

Bitemporal Data Model - PG Open 2017Slide27

Current PG Implementation: Flat Relational (Do we still need this one?)

DB primary key

(

busines key, rectangle)PK uniqueness can be supported with indexesDB foreign keys are unusableUpdates almost always affect several rows(business) key

Non-key attributes

Cuboid

dim1

dim2

Alice

10

1,10

1,100

Alice

15

10,20

1,100

Bob

9

1,5

1,30

Bob

9

5,10

50,100

Bob

16

10,20

30,100

27

Sep-17

Bitemporal Data Model - PG Open 2017Slide28

Enhancements To The Existing Bitemporal Functions

New features added as a result of

bitemporal

production system implementation28Sep-17Bitemporal Data Model - PG Open 2017Slide29

Bulk Operations

INSERT-SELECT

The power of SQL is operating on sets, but so far we only implemented single-row insert. The new function

ll_bitemporal_insert_select

(

p_table text

,

p_list_of_fields

text

,

p_select

TEXT

,

p_effective

temporal_relationships.timeperiod

,

p_asserted

temporal_relationships.timeperiod

)

Allows to pass a select statement as a parameter.

29

Sep-17

Bitemporal Data Model - PG Open 2017Slide30

UPDATE/CORRECTION AS SELECTOur existing bitemporal_update and bitemporal_correction

support updating multiple records in one call, but the update conditions are reduced to the lists of values

New “bulk” update and correction will allow free-format expressions for both selection criteria and new values.

The same concept will be applied to the DELETE/INACTIVATE functions.30Sep-17

Bitemporal Data Model - PG Open 2017Slide31

Consistent Transactional BehaviorQuestion:What should be recorded in a

bitemporal

table, when several operations happen in one transaction and affect the same record(s)?

Transactions are atomic, thereby the intermediate states should not be visible.Also, if we materialize the record status before (and regardless) of transaction commit/rollback, we might (most likely) get inconsistent results for the snapshot queries.31Sep-17

Bitemporal Data Model - PG Open 2017Slide32

Bitemporal Correction Immediately After Insert

#

Effective Interval

Assertive IntervalCustomer No.

NameType

1

[2015-06-01, oo)

[2015-05-01,2015-09-15)

C100

John Doe

Silver

2

[2015-06-01,2015-09-15)

[2015-09-15,

oo

)

C100

John Doe

Silver

3

[2015-09-15, oo)

[2015-09-15,

2015-09-15)

C100

John Doe

Gold

4

[2015-09-15,

oo

)

[ 2015-09-15,

oo

)

C100

John Doe

Platinum

Asserted

3

1

2

4

3

1

2

Effective

now = 2015-09-15

select

ll_bitemporal_correction

($$customers$$,

$$type

$$,

$$

Platinum$$,

$$

customer_no

$$,

$$

C100$$,

timeperiod

('2015-09-15'

,

'infinity')

,

32

Sep-17

Bitemporal Data Model - PG Open 2017Slide33

Corrected Behavior Of UPDATE/CORRECTIONNow we check the asserted interval of the “last” record, and if the “last” asserted start is the same as the “new” asserted start we do not insert a new record, but update the old one.

This still allows the ”non-transactional” behavior, of the app developers choose to ignore transactions

Bitemporal

correction has a new parameter: asserted_start_time (instead of “now” by default). One can still use the old version33

Sep-17Bitemporal Data Model - PG Open 2017Slide34

Bitemporal Correction – Correct Behavior

#

Effective Interval

Assertive IntervalCustomer No.Name

Type

1[2015-06-01, oo)

[2015-05-01,2015-09-15)

C100

John Doe

Silver

2

[2015-06-01,2015-09-15)

[2015-09-15,

oo

)

C100

John Doe

Silver

3

[2015-09-15, oo)

[2015-09-15,

oo

)

C100

John Doe

Platinum

4

[2015-09-15,

oo

)

[ 2015-09-22,

oo

)

Cl 00

John Doe

Platinum

Asserted

3

1

2

3

1

2

Effective

now = 2015-09-15

select

ll_bitemporal_correction

($$customers$$,

$$type

$$,

$$

Platinum$$,

$$

customer_no

$$,

$$

C100$$,

timeperiod

('2015-09-15'

,

'infinity')

,

34

Sep-17

Bitemporal Data Model - PG Open 2017Slide35

High-level FunctionsThe app developers do not care about how exactly bitemporal support is implemented.

We created a high-level functions for them to perform select/insert/update/delete without knowing about the

bitemporal

operationsExample:create or replace function common.hl_bitemporal_insert

(p_schema_name

text,

p_table_name

text

,

p_list_of_fields

text

,

p_list_of_values

text) returns integer

35

Sep-17

Bitemporal Data Model - PG Open 2017Slide36

PerformanceWe are using pg_bitemporal for both OLTP and OLAP systems, and the data volumes appear to be manageable

Indexing Strategy: GIST indexes are efficient, but occasionally we need to build extra indexes, for example conditional index with UPPER (asserted)=‘infinity’

Sep-17

Bitemporal Data Model - PG Open 201736Slide37

Future WorkFinalizing FK supportFinalizing constraints creation Finalizing

bitemporal

UPDATE

Continue research on performance and the choice of indexesMany thanks to Braviant Holdings leadership for their continuous support!37Sep-17

Bitemporal Data Model - PG Open 2017