Preserving Our Past and Present for the Future Generations

Raj Reddy
Carnegie Mellon University
June 15, 2017
Keynote Speech at International Conference on Digital Library and Knowledge
Zhejiang University, Hangzhou, China






Preserving Our Cultural Heritage for Future Generations

Preserving Culture

Much is Freely Available on the web

Wikipedia: Donation Model

New Trend: Free Online Access and pay for physical copy

Neural Networks and Deep Learning by

Michael Nielsen

Except Commercially Published Copyrighted Knowledge

Most Important Works are Copyrighted and must be preserved!

Books, Newspapers, Music, Movies and Video, and Paintings

Multiple Media Now All Accessible In Digital Representation

Online Digital Access

Instantly Available

To Anyone, Anywhere in the World, in Any Language, and

Searchable, Findable And


By Humans And Intelligent Agents


Future Generations?

10 to 1000 generations?

Next Millennium: Y3K – 40 Generations?


Preserving Our Past: CADAL and The Million Book Project

Y2K: Technology for Large Scale scanning became available around 2000

No need to cut up books

OCR over 99% correct

Not orphan languages



from libraries in US, China and India

MOU between CMU and ZJU signed 2002

Scanned over a Million Books in China

Google Scanning of Books started around 2007

Over 20 million scanned

but not accessible pending litigation

We Also Need to Continue Preservation of the Past and Expand to

Newspapers, Music, Movies,


and Software


Preserving Our Present: Accessing and Archiving Born Digital Content

At present, there is no cogent plan for saving born-digital entities for future generations.

Unlike webpages, copyright protected objects are being lost forever except for a few best sellers.

GRAND CHALLENGE: Create a Digital Archive of All Born-Digital Content: books, newspapers, paintings, music, movies, software, etc. from now in perpetuity

Instantly Available to Anyone, Anywhere in the World, in any language, and searchable, findable and


by humans and intelligent agents.

Requires a Government Ordinance

within which all the born-digital content is captured and archived before it is irretrievably lost.


The Main Bottleneck: Copyright Laws Incompatible with Speed of Progress in Information Age

Different Countries have Different Policies

Some Countries Have Compulsory Licensing

Status not always Knowable

Life + 50 years

Mickey Mouse law

Some Content Not Digital


Universal Copyright Summit

Fixed Copyright Term for 100 years from the Date of Creation

Require Registration and Self Archiving and Renewal Every 10 years

Life + 50 and other Arcane laws superseded

For options such as Royalties for number of access

Online Global Copyright Registry

OPT-IN: Owner can reduce the Copyright Period

Public Domain Works

Orphan and/or Abandoned works and Government Publications

Digital Depository

All Media that Enjoys Copyright Protection: Books, Music, Movies, Newspapers, Paintings

Ordinance requiring all publishers of all media to submit a digital copy (along with usually required physical copies) to the National Archive of the Country.
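
The term rules proposed on this slide (a fixed 100-year term from the date of creation, renewal every 10 years, and an opt-in shorter term) can be read as a simple status check. The following is only a minimal sketch under stated assumptions: the `Registration` fields, the `status` function, and the "renewal lapsed" outcome are hypothetical, since the slide does not say what happens when a renewal is missed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

TERM_YEARS = 100     # fixed term from the date of creation (from the slide)
RENEWAL_YEARS = 10   # registration renewed every 10 years (from the slide)


@dataclass
class Registration:
    """Hypothetical registry entry; field names are illustrative only."""
    created: date
    last_renewed: date
    opt_in_term_years: Optional[int] = None  # owner may shorten the term


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def status(reg: Registration, today: date) -> str:
    """Classify a work as 'public domain', 'renewal lapsed', or 'protected'."""
    term = TERM_YEARS
    if reg.opt_in_term_years is not None:
        # Opt-in can only reduce the term, never extend it.
        term = min(TERM_YEARS, reg.opt_in_term_years)
    if today >= add_years(reg.created, term):
        return "public domain"
    if today >= add_years(reg.last_renewed, RENEWAL_YEARS):
        # Assumption: a missed renewal is flagged rather than decided here.
        return "renewal lapsed"
    return "protected"
```

Under these assumptions, a work created and registered on the date of this talk would be reported as protected until its first 10-year renewal falls due, and as public domain from June 15, 2117.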


Technical Issues

How to Better Preserve Our Culture, Heritage and Creative Works.

Establish a Digital Cultural Conservancy of the World

A Thousand Year Archive of All Books, Newspapers, Music, Movies, and Paintings

How Can We Be Sure We Can Read or View Books, Newspapers, Music, Movies, Paintings etc. Created Long Ago?


In a World Where File Formats Change Monthly, How Can We Retrieve and Experience Archived Content?

Format conversion Tools

VMware

Who pays to maintain our ability to access artifacts? Stakeholders:






Authors and Creators (of Books, Movies, Music, Newspapers, Paintings)




National Digital Archive of China (NDAC)

Establish National Digital Archives of China as part of State Administration of Cultural Heritage

Industry and Universities may provide Technology, Training and Management support

Government shall provide space, personnel, equipment and operating costs of NDAC


Digital Depository for All Born Digital Copyrighted Works

Software and Tools to enable drag-drop-and-click submission of the copyrighted work to the NDA without Leaving the Office/Home

Such material would be released to the public only when the work is out-of-copyright or when requested by the author/publisher on an opt-in basis.


PDF/A: ISO Standard for Archiving Digital Content

PDF/A is an ISO-standardized version of the Portable Document Format (PDF) specialized for use in the archiving and long-term preservation of electronic documents.

PDF/A differs from PDF by prohibiting features ill-suited to long-term archiving, such as font linking (as opposed to font embedding) and encryption.[1] 

The ISO requirements for PDF/A file viewers include color management guidelines, support for embedded fonts, and a user interface for reading embedded annotations.
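
A PDF/A file declares its conformance level in its XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties. As a rough illustration, an archive intake tool could read that declaration before accepting a file. This is a sketch only: the function name `declared_pdfa_level` is hypothetical, it assumes the XMP packet is stored uncompressed in the file, and it reports what the file claims rather than validating the file against the standard.

```python
import re
from typing import Optional

# XMP may carry the properties as elements (<pdfaid:part>1</pdfaid:part>)
# or as attributes (pdfaid:part="1"); both forms are matched here.
_PART = re.compile(rb"pdfaid:part(?:>|=[\"'])\s*(\d)")
_CONF = re.compile(rb"pdfaid:conformance(?:>|=[\"'])\s*([A-Za-z])")


def declared_pdfa_level(data: bytes) -> Optional[str]:
    """Return the PDF/A level a file declares (e.g. 'PDF/A-1b'), or None.

    This reads the self-declaration only; it does not check embedded
    fonts, encryption, or any other PDF/A requirement.
    """
    part = _PART.search(data)
    if part is None:
        return None
    level = "PDF/A-" + part.group(1).decode("ascii")
    conf = _CONF.search(data)
    if conf is not None:
        level += conf.group(1).decode("ascii").lower()
    return level
```

In use, the raw bytes of a submitted file would be passed in, for example `declared_pdfa_level(open(path, "rb").read())`, and a `None` result would mean the file makes no PDF/A claim. A full conformance check would still require a dedicated validator.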


Orphan Works and Public Lending Right

Orphan and abandoned works problem

State Usually Inherits the IP

Define that A Creative Work is “Orphan” When It is Generating Little or No Revenue to The Creator.

A Creative Work is Considered “Abandoned” When Attempts to Locate Owners Of Unclaimed Works Through Letters and Newspaper Announcements are Unsuccessful.

An “Opt-in Process” May Be Provided If There is an Incentive to Receive Royalty for Every Access

Analogous To Public Lending Right For Paying Royalties To Authors In UK And Other Countries.


Value Added Services of NDAC to Enhance the Income to Authors and Creators

National Digital Archives of China may Undertake Value Added Services of Access and Distribution such as

Find Copyrighted Content by Discovery and Search Tools

Enable buying and Selling Copyrighted Content thru


Enable Dynamic Pricing based on Market Demand

Pay Royalty to Authors and Creators when Users Borrow Copyrighted Content under Public Lending Right

Discoverability by Search


using Alibaba and Amazon-like Services

Market Pricing

Public Lending Right


Next Steps

Set-up a Cloud based Server to Serve as Global Digital Archive

Government to Approve Establishment of an Archive for National Digital Depository of Copyrighted Works replacing the Current Physical Depository

Define Meta Data for Digital Works: Ease of Use for Self Archiving by Authors

Self Archiving, but not free for 100 years unless placed in Public Domain
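
To make the metadata step concrete, a self-archiving tool might attach a small record to each submitted work. The sketch below is one possible shape, not a proposed standard: the `DepositRecord` fields and the `make_record` and `to_json` helpers are hypothetical, and the SHA-256 checksum is an added assumption so that the archive can later verify the stored bytes are unchanged.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass
class DepositRecord:
    """Hypothetical minimal metadata for one deposited work."""
    title: str
    creator: str
    media_type: str  # e.g. "book", "newspaper", "music", "movie", "painting"
    created: str     # ISO 8601 date of creation, e.g. "2017-06-15"
    sha256: str      # checksum of the deposited file's bytes


def make_record(title: str, creator: str, media_type: str,
                created: str, content: bytes) -> DepositRecord:
    """Build a record and compute the checksum from the file content."""
    digest = hashlib.sha256(content).hexdigest()
    return DepositRecord(title, creator, media_type, created, digest)


def to_json(record: DepositRecord) -> str:
    """Serialize the record; non-ASCII titles and names are kept as-is."""
    return json.dumps(asdict(record), ensure_ascii=False, sort_keys=True)
```

Keeping the record this small is a deliberate choice in the sketch: the fewer fields an author has to fill in, the easier self-archiving becomes, and the checksum is computed automatically rather than entered by hand.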



Conclusions and Action Items

IKCEST, the UNESCO International Knowledge Center for Engineering Science and Technology, may convene an International Summit on Universal Copyright Policies to discuss and establish:

Terms and Conditions of a Universal Copyright Law.

A Global Copyright Registry

A Digital Depository at National Digital Archives

Compulsory Licensing of Orphan works

While respecting the rights of authors and creative artists.

Establish National Digital Archives of China as part of State Administration of Cultural Heritage

With Responsibility of Acquisition, Preservation and Access to All Born-Digital Copyrighted Content

Including Books, Newspapers, Music, Movies and Videos, Paintings

Establish scanning equipment, computers, petabytes of storage, and Software tools for Management and Preservation