
Physical and Logical Structure - PowerPoint Presentation

briana-ranney
Uploaded On 2016-06-14


Presentation Transcript

Slide1

Physical and Logical Structure

SNU IDB Lab.

Slide2

XML Documents 1: structure

Peeping into an XML document

Physical view: entities

Logical view: DTD

Slide3

Peeping into an XML document (1/5)

<?xml version="1.0" standalone="yes"?>
<GREETING>
Hello, XML!!
<!--this is greeting-->
</GREETING>
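A minimal sketch of how a parser separates the mark-up from the character data in the greeting document. It assumes Python's standard xml.etree.ElementTree module, which checks well-formedness only and drops comments by default. The function names are illustrative, not part of the slides.

```python
import xml.etree.ElementTree as ET

# The greeting document from this slide, with straight quotes.
GREETING_XML = (
    '<?xml version="1.0" standalone="yes"?>\n'
    "<GREETING>\n"
    "Hello, XML!!\n"
    "<!--this is greeting-->\n"
    "</GREETING>"
)


def split_markup_and_data(xml_text):
    """Return (element name, character data) for the root element."""
    root = ET.fromstring(xml_text)
    return root.tag, root.text.strip()


def is_well_formed(xml_text):
    """True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(split_markup_and_data(GREETING_XML))
    print(is_well_formed("<GREETING>Hello"))  # missing end tag
```

The tag name comes from the mark-up, the stripped text is the character data, and the comment never reaches the application.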

[Figure labels: mark-up (XML declaration, tags, comment) and data (the text Hello, XML!!)]

Mark-up and character data

Slide4

Peeping into an XML document (2/5)

XML document: date.xml

<?xml version="1.0" standalone="yes"?>
<!DOCTYPE DATE [
<!ELEMENT DATE (#PCDATA)>
]>
<DATE> 001224</DATE>
<!--This is date -->

XML declaration: declares that this is an XML document. It starts with <? and ends with ?>.

DTD (Document Type Definition): defines the tags the user will use. Here it defines the DATE tag.

Content: the DATE element.

Comment: the parser ignores it.

Slide5

Peeping into an XML document (3/5)

Structure of an XML document

Physical structure: allows components of the document, called entities, to be named and stored separately

Logical structure: allows a document to be divided into named units and sub-units, called elements

Slide6

Peeping into an XML document (4/5)

[Figure: the logical structure divides a document into units and sub-units (elements); the physical structure divides it into entities, either internal or separate]

Slide7

Peeping into an XML document (5/5)

<person>
<name> kim </name>
<ID>771224</ID>
<office>301-453</office>
<phone>1830</phone>
<photo source="k.jpg"/>
</person>

[Figure: the same document shown twice, once with its elements marked and once with the file k.jpg pulled out as a separate entity]

Slide8

XML Documents 1: structure

Peeping into an XML document

Physical view: entities

Logical view: DTD

Slide9

Content of Physical structure

Entity

Figures of Document Entity

Defining an entity

Grammar in Declaring Entity

Examples of Entity Declaration

URL format

Slide10

Entity (1/3)

An entity is a unit for physically isolating and storing any part of a document (a unit of information storage)

Each unit of information is called an entity

[Figure: physical structure with internal and separate entities. In the <person> example, the file k.jpg referenced by <photo source="k.jpg"/> is marked as an entity]

SNU OOPSLA Lab.

Slide11

Entity (2/3)

Purpose of an entity

An entity can hold any kind of information (well-formed XML data, other text files, binary data, …)

[Figure: in the <person> example, the XML file is the document entity and k.jpg is an image entity]

Slide12

Entity (3/3)

Internal entity: an entity defined completely within the document itself

External entity: an entity whose content is fetched from an external source identified by a URL

Slide13

Figures of Document Entity

[Figure: three kinds of document entity: one with no other entities, one holding the main content, and one serving as a framework file. The other entities in the figure are labeled A, B, C, and D]

Slide14

Defining an entity

An entity must be defined before the first reference to it in the data stream

Entities are declared in the DTD (Document Type Definition)
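A small sketch of what "declared before the first reference" means in practice. It assumes Python's standard xml.etree.ElementTree module, whose underlying parser expands internal entities declared in the DOCTYPE. The DOCUMENT wrapper text and the function name are made up for illustration; the EMAIL entity is the one used in this deck.

```python
import xml.etree.ElementTree as ET

# EMAIL is declared in the internal subset, then referenced in the content.
DECLARED = """<!DOCTYPE DOCUMENT [
<!ENTITY EMAIL "sjlee@oopsla.snu.ac.kr">
]>
<DOCUMENT>Contact: &EMAIL;</DOCUMENT>"""

# Same reference, but no declaration anywhere.
UNDECLARED = "<DOCUMENT>Contact: &EMAIL;</DOCUMENT>"


def expand(xml_text):
    """Parse the document and return the root text with entities expanded."""
    return ET.fromstring(xml_text).text


if __name__ == "__main__":
    print(expand(DECLARED))
    try:
        expand(UNDECLARED)
    except ET.ParseError as err:
        print("rejected:", err)
```

The declared reference is replaced by its text; the undeclared one makes the parser reject the document.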

<!DOCTYPE DOCUMENT [
<!ENTITY EMAIL "sjlee@oopsla.snu.ac.kr">
<!ENTITY TEXT "(#PCDATA)">
]>

Entity definition in DTD

Slide15

Example: Entity Declaration (1/3)

Internal text entities

<!ENTITY XML "eXtensible Markup Language">
<!ENTITY DemoEntity 'The rule is 6" long.'>

Built-in entities

<!ENTITY sample "Use &quot; and ' as delimiters.">

&lt; for <
&gt; for >
&amp; for &
&apos; for '
&quot; for "

Slide16

Example: Entity Declaration (2/3)

External text entities

<!ENTITY myent SYSTEM "/ENTS/MYENT.XML">
<!ENTITY myent PUBLIC "-//MyCorp//ENTITY Superscript Chars//EN" ….>

Binary entities (the name after NDATA is a notation name, so it is not quoted)

<!ENTITY Jsphoto SYSTEM "/ENTS/Jsphoto.tif" NDATA TIFF>

Slide17

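Example: Entity Declaration (3/3)

URL format

A relative system identifier is resolved against the location of the document that contains the declaration.

Document at /xml/document.xml:

<!ENTITY ent9 SYSTEM "entities/entity9.xml">

resolves to /xml/entities/entity9.xml

Document at /xml/docs/document.xml:

<!ENTITY ent9 SYSTEM "../entities/entity9.xml">

resolves to /xml/entities/entity9.xml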
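The two resolutions on this slide can be reproduced with ordinary relative-URL resolution. This is a sketch using Python's standard urllib.parse.urljoin. The host http://example.com and the function name are assumptions added for illustration; the paths are the ones from the slide.

```python
from urllib.parse import urljoin

BASE = "http://example.com"  # hypothetical host, not from the slides


def resolve_system_id(document_url, system_id):
    """Resolve an entity's system identifier relative to its document."""
    return urljoin(document_url, system_id)


if __name__ == "__main__":
    print(resolve_system_id(BASE + "/xml/document.xml", "entities/entity9.xml"))
    print(resolve_system_id(BASE + "/xml/docs/document.xml", "../entities/entity9.xml"))
```

Both calls point at the same file, /xml/entities/entity9.xml, which is why the second declaration needs the leading "../".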

[Figure: directory trees for the two cases. First: xml/ contains document.xml and entities/entity9.xml. Second: xml/ contains entities/entity9.xml and docs/document.xml]

Slide18

XML Documents 1: structure

Peeping into an XML document

Physical view: entities

Logical view: DTD

Slide19

Content of Logical structure

Concepts

DTD Structure

Element Declaration

Attribute Declarations

Parameter Entities

Conditional Sections

Notation Declarations

DTD Processing Issues

Slide20

Concepts of DTD (1/3)

DTD (Document Type Definition)

An optional but powerful feature of XML

Comprises a set of declarations that define a document structure tree

Validating XML processors read the DTD, check whether the document is valid, and use it to build the document model in memory

Describes the user's own tag set, acting as a meta markup language

Slide21

Concepts of DTD (2/3)

A DTD describes:

Elements, attributes, notations, and the relationships between elements

It establishes formal document structure rules

Slide22

Concepts of DTD (3/3)

Declare vs. define

Declare: "This document is a concert poster"

Define: "A concert poster must have the following features"

A DTD defines element types, attributes, and entities

Valid vs. invalid

Valid: conforms to the DTD

Invalid: fails to conform to the DTD

[Figure: valid XML documents shown within well-formed XML documents]

Slide23

Valid & Invalid Documents

Valid:

<GREETING>
various random text but no markup
</GREETING>

Invalid: anything else, including

<GREETING>
<sometag>various random text</sometag>
<someEmptyTag/>
</GREETING>
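A hand-written check for this one rule, as a sketch only. Python's standard xml.etree.ElementTree module does not validate against a DTD, so the function below simply tests what a (#PCDATA) content model requires: the element is named GREETING and has no child elements. The function name and the two sample strings are illustrative.

```python
import xml.etree.ElementTree as ET

VALID = "<GREETING>various random text but no markup</GREETING>"
INVALID = (
    "<GREETING>"
    "<sometag>various random text</sometag>"
    "<someEmptyTag/>"
    "</GREETING>"
)


def is_pcdata_only(xml_text, name="GREETING"):
    """True if the root is `name` and contains text only, no child elements."""
    root = ET.fromstring(xml_text)
    return root.tag == name and len(root) == 0


if __name__ == "__main__":
    print(is_pcdata_only(VALID))
    print(is_pcdata_only(INVALID))
```

Both samples are well-formed; only the first satisfies the content model. A validating parser would make this decision from the DTD itself.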

Example:

<!DOCTYPE GREETING [ <!ELEMENT GREETING (#PCDATA)> ]>

Slide24

DTD structure

A DTD is composed of a number of declarations:

ELEMENT (tag definition)

ATTLIST (attribute definitions)

ENTITY (entity definition)

NOTATION (data type notation definition)

A DTD can be stored in an external subset or an internal subset

Slide25

Internal and External Subset (1/3)

Internal subset

Form:

<!DOCTYPE … [
<!-- Internal Subset -->
…
]>

Pros

Easy to write XML

Cons

Editing two files without moving

Other documents can't reuse the declarations without copying the internal subset

Slide26

Internal and External Subset (2/3)

External subset

It is usually better to use external DTDs. Why?

Many benefits

document management

updating

editing

A few reasons

If you use an external DTD, you can use public DTDs (capability)

External DTDs provide for better document management

External DTDs make it easier to validate your document

Slide27

Internal and External Subset (3/3)

[Figure: the full parsing path through the internal subset and the external subset]

Slide28

Element Declarations

Used to define a new element: each declaration gives the element's name and its content model, which specifies the allowed content

Each tag must be declared in an <!ELEMENT> declaration.

The content model uses a simple regular expression-like grammar to precisely specify what is and isn't allowed in an element

Element type declaration:

'<!ELEMENT' S Name S contentspec S? '>'

Slide29

Content Specifications

ANY

#PCDATA

Sequences

Choices

Mixed Content

Modifiers

Empty

Slide30

ANY

<!ELEMENT SEASON ANY>

A SEASON can contain any declared child element and/or raw text (parsed character data)

Rarely used in practice, because it places no constraint on structure.

Slide31

#PCDATA

<!ELEMENT YEAR (#PCDATA)>

Parsed Character Data, i.e. raw text with no child elements

The keyword represents normal data. It is preceded by the hash symbol, '#', to avoid confusion with an element of the same name when used within a model group (for example, '(#PCDATA | PCDATA)')

Slide32

Use of #PCDATA in XML

Valid:

<YEAR>1999</YEAR>

<YEAR>99</YEAR>

<YEAR>1999 C.E.</YEAR>

<YEAR>
The year of our Lord one thousand, nine hundred, and ninety-nine
</YEAR>

Invalid:

<YEAR>
<MONTH>January</MONTH>
<MONTH>February</MONTH>
<MONTH>March</MONTH>
<MONTH>April</MONTH>
<MONTH>May</MONTH>
<MONTH>June</MONTH>
<MONTH>July</MONTH>
<MONTH>August</MONTH>
<MONTH>September</MONTH>
<MONTH>October</MONTH>
<MONTH>November</MONTH>
<MONTH>December</MONTH>
</YEAR>

Slide33

Child Elements

To declare that a LEAGUE element must have a LEAGUE_NAME child:

<!ELEMENT LEAGUE (LEAGUE_NAME)>
<!ELEMENT LEAGUE_NAME (#PCDATA)>

Slide34

Sequences (1/2)

Separate multiple required child elements with commas; e.g.

<!ELEMENT SEASON (YEAR, LEAGUE, LEAGUE)>
<!ELEMENT LEAGUE (LEAGUE_NAME, DIVISION, DIVISION, DIVISION)>

One or more children: +

<!ELEMENT DIVISION_NAME (#PCDATA)>
<!ELEMENT DIVISION (DIVISION_NAME, TEAM+)>

Slide35

Sequences (2/2)

Zero or more children: *

<!ELEMENT TEAM (TEAM_CITY, TEAM_NAME, PLAYER*)>
<!ELEMENT TEAM_CITY (#PCDATA)>
<!ELEMENT TEAM_NAME (#PCDATA)>

Choices

<!ELEMENT PAYMENT (CASH | CREDIT_CARD)>

or, with three alternatives:

<!ELEMENT PAYMENT (CASH | CREDIT_CARD | CHECK)>

Slide36

Grouping With Parentheses

Parentheses combine several elements into a single group.

A parenthesized group can be nested inside other parentheses in place of a single element.

A parenthesized group can be suffixed with a plus sign, an asterisk, or a question mark.
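Because content models are regular-expression-like, an element-only model can be turned into an actual regular expression over the sequence of child element names. The sketch below does this in Python with the standard re module. It is an illustration rather than how any particular XML processor works: the helper names are made up, and it does not handle #PCDATA, ANY, or EMPTY.

```python
import re


def model_to_regex(model):
    """Translate an element-only content model into a compiled regex.

    Each child name becomes "NAME " (name plus a space) so that names such
    as P and PHOTO cannot be confused with each other.
    """
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.\-]*|[(),|*+?]", model):
        if token == ",":
            continue  # a sequence is plain concatenation
        if token == "(":
            parts.append("(?:")
        elif token in ")|*+?":
            parts.append(token)  # same meaning in both grammars
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def matches(model, children):
    """True if the list of child element names satisfies the content model."""
    text = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(text) is not None


if __name__ == "__main__":
    print(matches("(dt, dd)*", ["dt", "dd", "dt", "dd"]))
    print(matches("(dt, dd)*", ["dt"]))
```

For example, a dl holding dt, dd, dt, dd satisfies (dt, dd)*, while a lone dt does not, because the group must repeat as a whole.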

<!ELEMENT dl (dt, dd)*>
<!ELEMENT ARTICLE (TITLE, (P | PHOTO | GRAPH | SIDEBAR | PULLQUOTE | SUBHEAD)*, BYLINE?)>

Slide37

Mixed Content

Both #PCDATA and child elements appear in a choice

#PCDATA must come first

The choice group must be followed by an asterisk

#PCDATA cannot be used in a sequence
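A sketch of what mixed content looks like to an application, assuming Python's standard xml.etree.ElementTree module, which stores text before the first child in .text and text after each child in that child's .tail. The sample TEAM instance and the function name are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: text interleaved with child elements.
SAMPLE = (
    "<TEAM>The <TEAM_CITY>New York</TEAM_CITY> "
    "<TEAM_NAME>Yankees</TEAM_NAME> play here</TEAM>"
)


def flatten(elem):
    """List the mixed content in document order as (kind, value) pairs."""
    pieces = []
    if elem.text:
        pieces.append(("text", elem.text))
    for child in elem:
        pieces.append(("element", child.tag))
        if child.tail:
            pieces.append(("text", child.tail))
    return pieces


if __name__ == "__main__":
    for piece in flatten(ET.fromstring(SAMPLE)):
        print(piece)
```

Text and elements may appear in any order and any number of times, which is exactly what a starred choice allows.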

<!ELEMENT TEAM (#PCDATA | TEAM_CITY | TEAM_NAME | PLAYER)*>

Empty elements

<!ELEMENT BR EMPTY>

Slide38

Attribute Declarations

Consider this element:

<GREETING LANGUAGE="Spanish">
Hola!
</GREETING>

It is declared like this:

<!ELEMENT GREETING (#PCDATA)>
<!ATTLIST GREETING LANGUAGE CDATA "English">
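A sketch of the default value at work, assuming Python's standard xml.parsers.expat module with its default settings, under which attributes defaulted from an ATTLIST in the internal subset are reported alongside those written in the start tag. The helper name and the second sample document are illustrative.

```python
import xml.parsers.expat

# The declarations from this slide, wrapped in an internal subset.
DTD = """<!DOCTYPE GREETING [
<!ELEMENT GREETING (#PCDATA)>
<!ATTLIST GREETING LANGUAGE CDATA "English">
]>
"""


def root_attributes(xml_text):
    """Return the attributes expat reports for the first start tag."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(dict(attrs))
    parser.Parse(xml_text, True)
    return seen[0]


if __name__ == "__main__":
    print(root_attributes(DTD + '<GREETING LANGUAGE="Spanish">Hola!</GREETING>'))
    print(root_attributes(DTD + "<GREETING>Hello</GREETING>"))
```

When the author writes LANGUAGE="Spanish", that value is used; when the attribute is omitted, the declared default "English" is supplied.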

General form:

<!ATTLIST Element_name Attribute_name Type Default_value>

Slide39

Multiple Attribute Declarations

Consider this element:

<RECTANGLE LENGTH="70px" WIDTH="85px"/>

With two attribute declarations:

<!ELEMENT RECTANGLE EMPTY>
<!ATTLIST RECTANGLE LENGTH CDATA "0px">
<!ATTLIST RECTANGLE WIDTH CDATA "0px">

With one attribute declaration:

<!ATTLIST RECTANGLE LENGTH CDATA "0px"
                    WIDTH  CDATA "0px">

Indentation is a convention, not a requirement

Slide40

Attribute Types

CDATA

ID

IDREF

IDREFS

ENTITY

ENTITIES

NOTATION

NMTOKEN

NMTOKENS

Enumerated

Slide41

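CDATA

Most general attribute type

Value can be any string of text, except that a literal less-than sign (<), an ampersand (&), and the quote character used as the delimiter must be written as entity references (&lt;, &amp;, &quot; or &apos;)

Slide42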

ID

Value must be an XML name

Must begin with a letter or an underscore

May include letters, digits, underscores, hyphens, and periods

May not include whitespace

May contain colons only if used for namespaces

Value must be unique among all ID-type attributes in the document

The default must be #REQUIRED or #IMPLIED; #REQUIRED is the usual choice
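The two ID rules (name syntax and uniqueness) can be sketched as a small check in Python. This is a simplification under stated assumptions: the pattern covers ASCII names only, whereas real XML names also admit many Unicode characters, and colons are left out. The function name and sample values are made up.

```python
import re
from collections import Counter

# ASCII-only approximation of an XML name (no colons, no Unicode letters).
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def check_ids(values):
    """Return (values that are not names, values used more than once)."""
    bad = [v for v in values if NAME.fullmatch(v) is None]
    duplicates = [v for v, count in Counter(values).items() if count > 1]
    return bad, duplicates


if __name__ == "__main__":
    print(check_ids(["p1", "p2", "p1", "771224", "kim lee"]))
```

A purely numeric value such as 771224 fails the name rule because it starts with a digit, which is why numeric identifiers usually get a letter or underscore prefix when used as IDs.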

Slide43

IDREF

Value matches the ID of an element in the same document

Used for links and the like

IDREFS

A list of ID values in the same document

Separated by white space

Slide44

ENTITY

Value is the name of an unparsed general entity declared in the DTD

ENTITIES

Value is a list of unparsed general entities declared in the DTD

Separated by white space

Slide45

NOTATION

Value is the name of a notation declared in the DTD

<!NOTATION Tex SYSTEM "..\TEXVIEW.EXE">
<!ENTITY Logo SYSTEM "LOGO.TEX" NDATA Tex>

[Figure: four numbered steps connecting the declarations above to the files TEXVIEW.EXE and LOGO.TEX]

Slide46

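NMTOKEN

Value is a name token: it uses the same characters as an XML name, but with no restriction on the first character (so it may start with a digit, hyphen, or period)

NMTOKENS

Value is a list of name tokens

Separated by white space

Slide47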

Enumerated

Not a keyword

Refers to a list of possible values from which one must be chosen

Default value is generally provided explicitly

<!ATTLIST P VISIBLE (TRUE | FALSE) "TRUE">

Slide48

Attribute Default Values

A literal string value

One of these three keywords:

#REQUIRED

#IMPLIED

#FIXED

Slide49

#REQUIRED

No default value is provided in the DTD

Document authors must provide the attribute value for each element

<!ELEMENT IMG EMPTY>
<!ATTLIST IMG ALT CDATA #REQUIRED>
<!ATTLIST IMG WIDTH CDATA #REQUIRED>
<!ATTLIST IMG HEIGHT CDATA #REQUIRED>

Slide50

#IMPLIED

No default value in the DTD

Author may (but does not have to) provide a value with each element

Slide51

#FIXED

Value is the same for all elements

Default value must be provided in the DTD

Document author may not change the default value
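The four kinds of attribute default (a literal value, #REQUIRED, #IMPLIED, #FIXED) can be summarized as a small rule table. The Python sketch below is a plain re-statement of those rules, not the behavior of any specific parser. The function name, the dictionary layout, and the sample e-mail address are assumptions; the AUTHOR attributes mirror the declarations on this slide.

```python
def resolve(decls, given):
    """Apply DTD-style attribute defaults to the attributes an author wrote.

    decls maps attribute name -> (kind, value), where kind is "#REQUIRED",
    "#IMPLIED", "#FIXED", or "default" (a literal default value).
    """
    result = dict(given)
    for name, (kind, value) in decls.items():
        if kind == "#REQUIRED":
            if name not in given:
                raise ValueError("missing required attribute " + name)
        elif kind == "#FIXED":
            if given.get(name, value) != value:
                raise ValueError(name + " is fixed and may not be changed")
            result[name] = value
        elif kind == "default":
            result.setdefault(name, value)
        # "#IMPLIED": nothing to supply; the attribute may simply be absent
    return result


# The AUTHOR attribute list from this slide.
AUTHOR = {
    "NAME": ("#REQUIRED", None),
    "EMAIL": ("#REQUIRED", None),
    "EXTENSION": ("#IMPLIED", None),
    "COMPANY": ("#FIXED", "TIC"),
}

if __name__ == "__main__":
    print(resolve(AUTHOR, {"NAME": "kim", "EMAIL": "kim@example.com"}))
```

Omitting NAME or EMAIL, or supplying a COMPANY other than "TIC", is rejected; EXTENSION is optional and COMPANY is filled in when absent.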

<!ELEMENT AUTHOR EMPTY>
<!ATTLIST AUTHOR NAME CDATA #REQUIRED>
<!ATTLIST AUTHOR EMAIL CDATA #REQUIRED>
<!ATTLIST AUTHOR EXTENSION CDATA #IMPLIED>
<!ATTLIST AUTHOR COMPANY CDATA #FIXED "TIC">

Slide52

Example of Internal DTDs

<?xml version="1.0"?>
<!DOCTYPE GREETING [
<!ELEMENT GREETING (#PCDATA)>
]>
<GREETING>
Hello XML!
</GREETING>

Slide53

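Internal DTD Subsets

Internal declarations override external declarations: the internal subset is read first, and for entity and attribute declarations the first declaration encountered wins

An element type may be declared only once, so greeting.dtd must not declare GREETING again

<?xml version="1.0"?>
<!DOCTYPE GREETING SYSTEM "greeting.dtd" [
<!ELEMENT GREETING (#PCDATA)>
]>
<GREETING>
Hello XML!
</GREETING>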
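The "first declaration wins" rule behind this override can be observed with a single internal subset that declares the same entity twice. This is a sketch assuming Python's standard xml.etree.ElementTree module. That parser does not fetch greeting.dtd, so the later declaration here merely stands in for one that would arrive from an external subset. The entity name who and the function name are invented for illustration.

```python
import xml.etree.ElementTree as ET

# The same entity declared twice; the second stands in for a declaration
# that would be read later, for example from an external subset.
DOC = """<!DOCTYPE GREETING [
<!ENTITY who "internal">
<!ENTITY who "declared later">
]>
<GREETING>Hello from the &who; declaration</GREETING>"""


def greeting_text(xml_text):
    """Parse the document and return the root text with entities expanded."""
    return ET.fromstring(xml_text).text


if __name__ == "__main__":
    print(greeting_text(DOC))
```

Since the internal subset is processed before the external one, its entity and attribute declarations are the ones that take effect.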