Leafing through XML. NLM Journal Article Tag Suite Conference 2010. Martin Latterner and Marilu Hoeppner, National Center for Biotechnology Information, National Library of Medicine
Slide1
Bookshelf
Leafing through XML
NLM Journal Article Tag Suite Conference 2010
Martin Latterner and Marilu Hoeppner
National Center for Biotechnology Information
National Library of Medicine
Slide2
NLM Book DTD v2.3
Slide3
NLM Collection Catalog
PubMed
Abstracts
Electronic Literature Archive
Books, Monographs, Reports
Journals
Other publication formats
Book chapters, Monographs, Reports
Books in PubMed
Non-PubMed Books
User guides, Documentation
Journal articles
PMC Journals
PubMed Central
Bookshelf
Entrez Literature Resources
Slide4
Features of the Book DTD
Books and journals within PubMed Central
Bookshelf Workflows
Integration of information between databases
Slide5
Modifications
Allowed icon as a child of exlnk.
Allowed pre as a child of entry.
Allowed glossary as a child of chapter.
Added type: ppt.
Added attributes id and BID to <foot>.
Added attribute id to <p>.
Added <title>, child of <bibsect>.
Added <bb>, <gf> and <figgrp> as children of <linkgrp>.
Added <email> as child of <txtstyle>.
Added <pdf> as child of <glossary>.
Added <figgrp1> as child of <entry>.
…
NCBI Book DTD 1.0
Based on ISO 12083 Article DTD
Slide6
NCBI BOOK DTD
March 2003: v1.0
December 2004: v2.0
November 2005: v2.1
BOOKSHELF XML DATA
Slide7
Book DTD of the NLM Journal Article Tag Suite
Slide8
Designed to capture the semantic elements of the content, not form
e.g. bibliographic metadata
Slide9
<front>
  <div type="titlepage" level="1" id="2001902bddd00001">
    <booktitle>
      <ils style="strong">CONFLICT OF INTEREST IN MEDICAL RESEARCH</ils>
    </booktitle>
    <bookauthor>
      <bookauthor.name>Committee on Conflict of Interest in Medical Research</bookauthor.name>
      <bookauthor.info>Board on Health Sciences Policy</bookauthor.info>
      <bookauthor.info>INSTITUTE OF MEDICINE <ils style="smallcap"><ils style="emphasis"> OF THE NATIONAL ACADEMIES</ils></ils></bookauthor.info>
    </bookauthor>
    <publication.stmt>
      <p style="center">
        <publisher>
          <publisher.name>THE NATIONAL ACADEMIES PRESS</publisher.name>
          <publisher.address><state>Washington, D.C.</state></publisher.address>
        </publisher>
      </p>
    </publication.stmt>
    <page number="ii" id="2001902bppp00002"/>
  </div>
  <div type="copyrightpage" level="1" id="2001902bddd00002">
    <publication.stmt>
      <p style="normal">
        <publisher>
          <publisher.name><ils style="strong">THE NATIONAL ACADEMIES PRESS</ils></publisher.name>
          <publisher.address>
            <street><ils style="strong">500 Fifth Street, N.W.</ils></street>
            <state><ils style="strong">Washington, DC</ils></state>
            <postcode><ils style="strong">20001</ils></postcode>
          </publisher.address>
        </publisher>
      </p>
    </publication.stmt>
    <publication.stmt>
      <p style="flindent">ISBN <isbn>978-0-309-13188-9</isbn> (hardcover)</p>
    </publication.stmt>
    <copyright>Copyright <copyright.year>2009</copyright.year> by the <copyright.holder>National Academy of Sciences</copyright.holder>. All rights reserved.</copyright>
    <printinfo>
      <print>Printed in the United States of America</print>
    </printinfo>
  </div>
</front>
Slide10
<book-meta>
  <book-title-group>
    <book-title>Conflict of Interest in Medical Research</book-title>
  </book-title-group>
  <contrib-group>
    <contrib contrib-type="author">
      <collab>Institute of Medicine (US) Committee on Conflict of Interest in Medical Research, Education, and Practice</collab>
    </contrib>
  </contrib-group>
  <publisher>
    <publisher-name>National Academies Press (US)</publisher-name>
    <publisher-loc>Washington (DC)</publisher-loc>
  </publisher>
  <isbn>978-0-309-13188-9</isbn>
  <pub-date pub-type="ppub">
    <year>2009</year>
  </pub-date>
  <permissions>
    <copyright-statement>Copyright © 2009, National Academy of Sciences</copyright-statement>
    <copyright-year>2009</copyright-year>
  </permissions>
</book-meta>
Slide11
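As a rough illustration of why semantic tagging helps, the sketch below reads bibliographic fields directly out of a <book-meta> fragment like the one on Slide10. It uses Python's standard xml.etree.ElementTree module. The function name read_book_meta and the dictionary keys are assumptions made for this example; they are not part of the DTD or of any Bookshelf code.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the <book-meta> sample from Slide10.
BOOK_META = """<book-meta>
  <book-title-group>
    <book-title>Conflict of Interest in Medical Research</book-title>
  </book-title-group>
  <publisher>
    <publisher-name>National Academies Press (US)</publisher-name>
    <publisher-loc>Washington (DC)</publisher-loc>
  </publisher>
  <isbn>978-0-309-13188-9</isbn>
  <pub-date pub-type="ppub"><year>2009</year></pub-date>
</book-meta>"""


def read_book_meta(xml_text):
    """Return a dict of bibliographic fields found in a <book-meta> element.

    Hypothetical helper: the key names are chosen for this sketch only.
    """
    meta = ET.fromstring(xml_text)
    return {
        "title": meta.findtext("book-title-group/book-title"),
        "publisher": meta.findtext("publisher/publisher-name"),
        "location": meta.findtext("publisher/publisher-loc"),
        "isbn": meta.findtext("isbn"),
        "year": meta.findtext("pub-date[@pub-type='ppub']/year"),
    }


record = read_book_meta(BOOK_META)
print(record["title"], record["isbn"], record["year"])
```

The same lookups against the older <front> markup on Slide9 would have to work through layout-oriented wrappers such as <div type="copyrightpage"> and <p style="flindent">, which is the contrast the two slides draw.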
More granular text descriptions are handled at the attribute level
e.g. preface, foreword
<sec sec-type="preface">
Slide12
Slide13
DTD v3.0 Elements
Article
<abbrev-journal-title>
<article>
<article-categories>
<article-id>
<article-meta>
<conf-acronym>
<conference>
<conf-num>
<conf-theme>
<floats-group>
<front>
<front-stub>
<issue-sponsor>
<journal-meta>
<journal-subtitle>
<journal-title>
<journal-title-group>
<response>
<series-text>
<series-title>
<string-conf>
<sub-article>
<unstructured-kwd-group>
<x>
Book
<alternate-form>
<area>
<book>
<book-front>
<book-meta>
<book-part>
<book-part-categories>
<book-part-meta>
<book-title>
<book-title-group>
<collection>
<collection-id>
<collection-list>
<collection-member>
<collection-meta>
<collection-name>
<map>
<map-group>
<multi-link>
Slide14
<map-group>
XML
<map-group id="my-map-id">
  <graphic xlink:href="img-uri"/>
  <map map-name="my-map">
    <area map-shape="rect" map-coords="1,1,51,76" xlink:href="uri1"/>
    <area map-shape="rect" map-coords="54,4,94,74" xlink:href="uri2"/>
  </map>
</map-group>
XHTML
<img src="img-uri" usemap="#my-map-id"/>
<map id="my-map-id" name="my-map">
  <area href="uri1" shape="rect" coords="1,1,51,76"/>
  <area href="uri2" shape="rect" coords="54,4,94,74"/>
</map>
Slide15
<multi-link>
XML
<multi-link>
  <term>IDDM2</term>
  <ext-link ext-link-type="url" xlink:href="LINK1">Bookshelf</ext-link>
  <ext-link ext-link-type="url" xlink:href="LINK2">PubMed Central</ext-link>
  …
</multi-link>
Slide16
DTD v3.0 Attributes
Article
abbrev-type
article-type
response-type
Book
alternate-form-type
book-id
book-part-number
book-part-type
graphic-type (obsolete)
indexed
map-alt
map-coords
map-name
map-shape
primary
qualifier
taxonomic-id
Slide17
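The map-shape, map-coords and map-name attributes listed for the Book DTD line up closely with XHTML image-map attributes, as the Slide14 example shows. The sketch below is one possible way to perform that mapping with Python's standard xml.etree.ElementTree module. The function name map_group_to_xhtml and the wrapping <div> are assumptions made for this example; the actual Bookshelf rendering is described later as XSLT-based and is not reproduced here.

```python
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# The <map-group> sample from Slide14, with the xlink namespace declared
# so that it parses as a standalone fragment.
MAP_GROUP = """<map-group id="my-map-id" xmlns:xlink="http://www.w3.org/1999/xlink">
  <graphic xlink:href="img-uri"/>
  <map map-name="my-map">
    <area map-shape="rect" map-coords="1,1,51,76" xlink:href="uri1"/>
    <area map-shape="rect" map-coords="54,4,94,74" xlink:href="uri2"/>
  </map>
</map-group>"""


def map_group_to_xhtml(xml_text):
    """Build an <img> plus <map> pair from a <map-group> (hypothetical helper)."""
    group = ET.fromstring(xml_text)
    map_id = group.get("id")
    src_map = group.find("map")

    # The <div> wrapper is only there to return both elements together.
    container = ET.Element("div")
    ET.SubElement(container, "img", {
        "src": group.find("graphic").get(XLINK_HREF),
        "usemap": "#" + map_id,
    })
    out_map = ET.SubElement(container, "map", {
        "id": map_id,
        "name": src_map.get("map-name"),
    })
    for area in src_map.findall("area"):
        ET.SubElement(out_map, "area", {
            "href": area.get(XLINK_HREF),
            "shape": area.get("map-shape"),
            "coords": area.get("map-coords"),
        })
    return container


xhtml = map_group_to_xhtml(MAP_GROUP)
print(ET.tostring(xhtml, encoding="unicode"))
```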
Books & Journals in PubMed Central
Slide18
Source Conversion
Third-party vendor services: Tagging rules for journals can be applied to book content, especially for lower-level document objects.
  Citations
  Figures
  Tables
In-house conversion: For content submitted in external DTDs, code from PMC journal modules is reused for handling:
  Dates
  Strings
  CALS to XHTML table conversion
Slide19
Data Processing and Ingest
Software to look up PubMed IDs in citations
<pub-id pub-id-type="pmid">
Image resizing software and validation checks for graphics and supplementary data files such as PDF
Loading code for the extraction of key information, such as dates, subject categories, etc.
Slide20
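The PubMed ID lookup mentioned under Data Processing and Ingest ends with a <pub-id pub-id-type="pmid"> element in the citation. The sketch below shows only that tagging step. The lookup table PMID_BY_TITLE, the placeholder ID "00000000", the function name add_pmids and the use of <element-citation> with <article-title> are all assumptions made for illustration; a real pipeline would query PubMed rather than a dictionary.

```python
import xml.etree.ElementTree as ET

REF_LIST = """<ref-list>
  <ref id="r1"><element-citation><article-title>Known article</article-title></element-citation></ref>
  <ref id="r2"><element-citation><article-title>Unknown article</article-title></element-citation></ref>
</ref-list>"""

# Hypothetical stand-in for a PubMed lookup; the ID is a placeholder, not a real PMID.
PMID_BY_TITLE = {"known article": "00000000"}


def add_pmids(ref_list, lookup):
    """Append <pub-id pub-id-type="pmid"> to citations found in the lookup.

    Returns the number of citations that were tagged.
    """
    tagged = 0
    for citation in ref_list.iter("element-citation"):
        title = (citation.findtext("article-title") or "").strip().lower()
        pmid = lookup.get(title)
        already_tagged = citation.find("pub-id[@pub-id-type='pmid']") is not None
        if pmid and not already_tagged:
            pub_id = ET.SubElement(citation, "pub-id", {"pub-id-type": "pmid"})
            pub_id.text = pmid
            tagged += 1
    return tagged


refs = ET.fromstring(REF_LIST)
tagged_count = add_pmids(refs, PMID_BY_TITLE)
print(tagged_count, ET.tostring(refs, encoding="unicode"))
```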
CHOP-IT-UP
Slide21
Output Formats
HTML
Uses base XSLT Article rendering rules for conversion of XML to HTML; book-specific overrides or modifications
PDF
Uses XSL-FO base code for articles; book-specific overrides or modifications
Slide22
Advantages of using a Shared Tag Set
Share XSLT modules during ingest, conversion processes, and rendering
Use similar database infrastructure
Enables closer integration for a variety of processes, such as PubMed submission and indexing
Slide23
Bookshelf Workflows
Slide24
Submission of Content to Bookshelf
PDF or Word
XML in NLM Book DTD
XML in external DTDs
Word authoring followed by conversion to XML (in-house)
Slide25
<book>
Submitted Files
PDF
Word
XML (External DTD)
NLM Book DTD XML
Third-party vendor or In-house Converters
Requirements
Pass validation
Pass stylecheck
Slide26
<book-part>
PMC
<book-part>
<book-part>
<book-part>
CMS
<book>
CHOP-IT-UP
Slide27
Slide28
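The Slide26 diagram shows a single <book> passing through CHOP-IT-UP and emerging as separate <book-part> units. The slides do not describe how CHOP-IT-UP works internally, so the following is only a guess at the general idea: walk the top-level <book-part> children and serialize each one on its own. The function name chop and the assumption that the parts sit directly under <body> are made for this example.

```python
import xml.etree.ElementTree as ET

BOOK = """<book>
  <book-meta>
    <book-title-group><book-title>Sample Book</book-title></book-title-group>
  </book-meta>
  <body>
    <book-part id="ch1"><book-part-meta/><body><p>First chapter.</p></body></book-part>
    <book-part id="ch2"><book-part-meta/><body><p>Second chapter.</p></body></book-part>
  </body>
</book>"""


def chop(book_xml):
    """Split a <book> into standalone <book-part> strings keyed by id.

    Illustrative only: assumes every top-level part has a unique id.
    """
    book = ET.fromstring(book_xml)
    parts = {}
    for part in book.findall("body/book-part"):
        # strip() drops the whitespace that followed the element in the source.
        parts[part.get("id")] = ET.tostring(part, encoding="unicode").strip()
    return parts


chunks = chop(BOOK)
for part_id, xml_text in chunks.items():
    print(part_id, len(xml_text))
```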
Word Authoring Followed by Conversion to XML
Microsoft Word document
NCBI Word converter
XML
Instant HTML Preview
Publish to Bookshelf
Slide29
Stylechecker
Check business rules
Goal: one set of rendering rules for uniform source XML data
2 Checkpoints
Whole book (modified article stylechecker)
Individual book-part (article stylechecker)
Slide30
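The two stylechecker checkpoints (whole book and individual book-part) can be pictured as two layers of rule functions, with the book-level check reusing the part-level one. The sketch below uses invented rules purely for illustration; the real business rules are not given in the slides, and the names check_book and check_book_part are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_book_part(part):
    """Part-level checkpoint with made-up example rules."""
    errors = []
    if not part.get("id"):
        errors.append("book-part is missing an id attribute")
    if part.find("book-part-meta") is None:
        errors.append("book-part has no <book-part-meta>")
    return errors


def check_book(book):
    """Book-level checkpoint: book-wide rules plus every part-level check."""
    errors = []
    title = book.findtext("book-meta/book-title-group/book-title") or ""
    if not title.strip():
        errors.append("book has no <book-title>")
    for part in book.findall("body/book-part"):
        errors.extend(check_book_part(part))
    return errors


SAMPLE = """<book>
  <book-meta>
    <book-title-group><book-title>Sample Book</book-title></book-title-group>
  </book-meta>
  <body>
    <book-part id="ch1"><book-part-meta/></book-part>
    <book-part><book-part-meta/></book-part>
  </body>
</book>"""

problems = check_book(ET.fromstring(SAMPLE))
for problem in problems:
    print(problem)
```

Keeping the part-level rules in their own function mirrors the slide's point that one checker serves both checkpoints, so a single set of rendering rules can rely on uniform source XML.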
Integrating Content from Different Databases
Slide31
Slide32
Processing Instruction in Source XML
<?get-external-xml molgen.tables?>
Data in the JATS Book DTD Delivered from External Database
<!DOCTYPE sec SYSTEM "book.dtd">
<sec>
  <title/>
  <sec id="molgen.tables">
    <title/>
    <p content-type="molecular_genetics"><italic>Information in the Molecular Genetics and OMIM tables may differ from that elsewhere in the GeneReview: tables may contain more recent information. —</italic>ED.</p>
    <table-wrap id="pkd-ar.molgen.TA" position="anchor">
      <caption>
        <p>Table A. Polycystic Kidney Disease, Autosomal Recessive: Genes and Databases</p>
      </caption>
      <table>
        <tbody>
          <tr>
            <th>Gene Symbol</th>
            <th>Chromosomal Locus</th>
            <th>Protein Name</th>
            <th>Locus Specific</th>
            <th>HGMD</th>
          </tr>
Slide33
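The Slide32 example pairs a <?get-external-xml molgen.tables?> processing instruction in the source XML with a <sec> delivered from an external database. One way to picture the substitution is sketched below with Python's standard xml.dom.minidom, which keeps processing instructions in the parsed tree. The EXTERNAL dictionary stands in for the external database, and the function name resolve_external_xml is hypothetical; the slides do not say how Bookshelf performs this step.

```python
from xml.dom import minidom

SOURCE = (
    '<sec id="molgen"><title>Molecular Genetics</title>'
    "<?get-external-xml molgen.tables?></sec>"
)

# Hypothetical stand-in for the external database, keyed by the PI's data.
EXTERNAL = {
    "molgen.tables": '<sec id="molgen.tables"><title/><p>Table A.</p></sec>',
}


def resolve_external_xml(source_xml, lookup):
    """Replace each get-external-xml PI with the XML fragment it names."""
    doc = minidom.parseString(source_xml)

    def walk(node):
        for child in list(node.childNodes):
            is_pi = child.nodeType == child.PROCESSING_INSTRUCTION_NODE
            if is_pi and child.target == "get-external-xml":
                fragment = minidom.parseString(lookup[child.data.strip()])
                imported = doc.importNode(fragment.documentElement, True)
                node.replaceChild(imported, child)
            else:
                walk(child)

    walk(doc.documentElement)
    return doc.documentElement.toxml()


resolved = resolve_external_xml(SOURCE, EXTERNAL)
print(resolved)
```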