Deep Representation of Biological Knowledge for Question Answering

Uploaded On 2020-06-25


Presentation Transcript

Slide1

Deep Representation of Biological Knowledge for Question Answering

Vinay K. Chaudhri


Slide2

Outline

Introduction
KB_Bio_101 - Biology textbook knowledge base
Representing structure and function
  Representation and reasoning needs
  Upper ontology
  Representing structure
  Representing function
  Representing structure function relationship
Answering questions
Some open research problems
Summary

Slide3

Deep Representation

Bio-medical Ontologies (e.g., Gene Ontology)
Representation Language: Web Ontology Language (OWL)
Reasoning operations: Classification, Consistency checking

KB Bio 101
Representation Language: Existential Rules (OOKB)
Every instance of a class is related to a set of individuals, which are themselves related in arbitrary ways
Reasoning operations: Query answering, Non-standard unification

Frame-based Systems
Representation Language: Open Knowledge Base Connectivity (OKBC)
Reasoning operations: get-class-subclasses, get-slot-values, get-facet-values, get-instance-types

Representational features named on the slide: Classes, Relations, Disjointness, Multiple Inheritance, Domain/range constraints, Sufficient properties

Slide4

Deep Representation

Bio-medical Ontologies (e.g., Gene Ontology)
Ontology Language: Very few relations
is_a, has_part, located_in, has_participant

KB Bio 101
Ontology Language: Small number of domain-general relations
has_part, possesses, has_region, material, element
agent, object, instrument, base, raw-material, result

Frame-based Systems
Ontology Language: Small number of domain-specific relations
Process_direction, Cell_line_location, Associated_downstream_activities
T-edge, A-edge, P-edge

Slide5

Question Answering

Watson
  Primarily w-h questions: What, when, where, etc.
  Short answers: 94.5% of correct answers were the titles of some Wikipedia page
  No dialog: There was a single-shot interaction with the system

KB Bio 101
  Educationally useful questions: Describe, compare, relate
  Long answers: Answers are synthesized and are like an essay
  Drill down: Unlimited drill-down on the returned answer is available

Slide6

KB_Bio_101

Campbell Biology is a textbook used in advanced placement biology courses in schools and introductory biology courses in colleges

A team of biologists used AURA to curate the KB from the textbook, using a sophisticated knowledge authoring process

The KB is a valuable asset: it was created by an estimated 12 person years of encoding effort by biologists, and an estimated 5 person years of work on the upper ontology (CLib)

Vulcan has released this asset free of charge for research purposes

http://www.ai.sri.com/halo/halobook2010/exported-kb/biokb.html

KB Bio 101 is used in an electronic textbook application


Slide7


Slide8

KB_Bio_101 Statistics

# Classes: 6430
# Relations: 455
# Constants: 634
Avg. # Skolems / Class: 24
Avg. # Atoms / Necessary Condition: 64
Avg. # Atoms / Sufficient Condition: 4

Regarding Class Axioms:
# Constant Typings: 714
# Taxonomical Axioms: 6993
# Disjointness Axioms: 18616
# Equality Assertions: 108755
# Qualified Number Restrictions: 936

Regarding Relation Axioms:
# DRAs: 449
# RRAs: 447
# RHAs: 13
# QRHAs: 39
# IRAs: 212
# 12NAs / # N21As: 10 / 132
# TRANS + # GTRANS: 431

Regarding Other Aspects:
# Cyclical Classes: 1008
# Cycles: 8604
Avg. Cycle Length: 41
# Skolem Functions: 73815

Slide9

Core Themes in Biology

Theme: Challenge

Structure and Function: Relating structure and function
Regulation: Qualitative reasoning about dynamic processes
Energy Transfer: Representing energy production, consumption
Continuity and Change: Representing genetic change across generations
Evolution: Models of population dynamics
Science as a Process: Experimentation and hypothesis testing
Interdependence in Nature: Represent large inter-related complex systems
Science, Technology, Society: Represent technological and social forces

Slide10

Core Theme: Structure & Function

Structure and function are correlated at all levels of biological organization: the form fits the function

Figures from Biology (9th Edition) by Neil A. Campbell and Jane B. Reece.

Copyright © 2011 by Pearson Education, Inc. Used by permission of Pearson Education, Inc.

Slide11

Computational Meaning of a Core Theme

Identify the requirements in terms of a set of questions
  Diagnostic questions
    Help assess the basics of KR&R
  Educationally useful questions
    The question must be of interest to teachers and students
    The question must be “Google hard”
    The question should not require solving an open-ended research problem

Slide12

Diagnostic S&F Questions

What is the structure of X?

What is the function of X?


Slide13

Educationally Useful Questions

Relate Structures to Functions
  What structure of a biomembrane facilitates a function of the biomembrane, namely phagocytosis?
Qualitative Comparisons
  If the Loop of Henle gets longer, how will its function be impacted?
Detailed Comparisons
  What is the functional similarity between prions and viroids?
Similarity Reasoning
  Glucose is to Glycogen as ATP is to what?
Negatively Modified Structures Impacting Functions
  If hydrogen is removed from a saturated fatty acid, then how is its function impacted?

Slide14

Starting Point - Component Library

A simple upper ontology designed to be accessible to domain experts (Barker et al., KCAP 2001)

Other key distinctions:
  Roles
  Properties

See http://www.ai.sri.com/halo/public/clib/20130328/clib-tree.html for more information

Slide15

Component Library

A vocabulary of relations to describe events

Event to Entity: agent, object, instrument, raw-material, result, site, origin
Event to Event: first-subevent, next-event, causes, enables, prevents, inhibits, by-means-of
Event to Value: direction, distance, duration, frequency, intensity, rate
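As a minimal sketch of how this relation vocabulary might be used, the Python below groups the slots by the slide's three categories and checks that an event description only uses slots from the vocabulary. The `slot_kind` and `describe_event` helpers and the active-transport fillers are illustrative assumptions, not part of the Component Library.

```python
# Slot vocabulary copied from the slide, grouped by its three categories.
EVENT_SLOTS = {
    "event-to-entity": {"agent", "object", "instrument", "raw-material",
                        "result", "site", "origin"},
    "event-to-event": {"first-subevent", "next-event", "causes", "enables",
                       "prevents", "inhibits", "by-means-of"},
    "event-to-value": {"direction", "distance", "duration", "frequency",
                       "intensity", "rate"},
}


def slot_kind(slot):
    """Return the category a slot belongs to, or raise if it is not in the vocabulary."""
    for kind, slots in EVENT_SLOTS.items():
        if slot in slots:
            return kind
    raise ValueError(f"unknown event slot: {slot}")


def describe_event(event_class, **slots):
    """Hypothetical helper: build (event, slot, filler) triples for one event."""
    triples = []
    for name, filler in slots.items():
        slot = name.replace("_", "-")  # keyword arguments cannot contain hyphens
        slot_kind(slot)                # raises ValueError for slots outside the vocabulary
        triples.append((event_class, slot, filler))
    return triples


# Illustrative example only; the class and filler names are assumptions.
active_transport = describe_event(
    "Active-Transport",
    agent="Transport-Protein",
    object="Ion",
    site="Plasma-Membrane",
)
```

Keeping the vocabulary closed in this way reflects the slide's point that events are described with a small, fixed set of relations rather than ad hoc ones.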


Slide16

Representing Structure

Structure of an entity represents its parts, their spatial arrangements and sizes

Meronymic: has-part, has-region, material, possesses, element
Spatial: is-at, is-inside, is-outside, abuts, is-between, is-along
Properties: length, diameter, height, area, depth, volume

Slide17

Choosing Structural Slots

Inspired by the work of Maria Keet, but simplified for use by biologists:
  It must make sense to say “X has Y” in English
  X has-region Y if
    Y is a region of space defined in relation to X
    It does not make sense to associate Y with properties such as mass or density, but Y can be associated with measures such as length, area, or volume
  X has material Y only if
    Y is tangible and pervasive in X
  X has element Y if
    X is a set of entities of the same type (or sibling types) that Y is an instance of
  X possesses Y only if
    Y is Energy, bond or gradient
  Otherwise X has part Y
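The guidelines above read like a decision procedure, which can be sketched in Python as follows. The `PartCandidate` fields are hypothetical paraphrases of the slide's tests, and the order in which overlapping criteria are checked is an assumption (it simply follows the order on the slide).

```python
from dataclasses import dataclass


@dataclass
class PartCandidate:
    """A biologist's answers about a candidate pair (X, Y). Field names are hypothetical."""
    makes_sense_as_x_has_y: bool = True
    y_is_region_defined_relative_to_x: bool = False
    y_can_have_mass_or_density: bool = True
    y_is_tangible_and_pervasive_in_x: bool = False
    x_is_set_of_entities_like_y: bool = False
    y_is_energy_bond_or_gradient: bool = False


def choose_structural_slot(c):
    """Return the structural slot for 'X <slot> Y', or None if no structural slot applies."""
    if not c.makes_sense_as_x_has_y:
        return None
    if c.y_is_region_defined_relative_to_x and not c.y_can_have_mass_or_density:
        return "has-region"
    if c.y_is_tangible_and_pervasive_in_x:
        return "material"
    if c.x_is_set_of_entities_like_y:
        return "element"
    if c.y_is_energy_bond_or_gradient:
        return "possesses"
    return "has-part"  # the default when no special case matches


# A region with length but no mass or density falls under has-region.
region_like = PartCandidate(
    y_is_region_defined_relative_to_x=True,
    y_can_have_mass_or_density=False,
)
slot = choose_structural_slot(region_like)
```

Making has-part the fall-through case mirrors the slide's final rule, "Otherwise X has part Y".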


Slide18

Example Structure Representation


Slide19

A difficult example: Carbon Skeleton

What should be the relationship between an organic molecule and a skeleton?

It is more than simply a set of entities

Can have length and shape

Is not an entity in its own right
  - Biologists do not associate mass with it

The remaining choice is has-region
  - Behaves differently than a human skeleton

Slide20

Representing Functions

Is function a primitive or a computed notion?
  Could function be inferred from participant relations, thus reducing the encoding time?

Slide21

Representing Functions

Is function a primitive or a computed notion?
  It is a primitive notion and should be encoded by a biologist

has-function

Slide22

Representing Functions

What is a function?
  We understand functions as “special” events in which an entity participates
  Alternatively, a function is an event which is a reason for an entity’s existence
  The “special” nature of functions will be indicated by using a new slot called has-function
Types of functions
  Inherent functions of an entity
    These will appear on the entity’s concept graph
  Contextual functions of an entity
    These will appear on another entity or event’s concept graph

Slide23

Example of an Inherent Function

An inherent function of a Golgi Apparatus is to store chemicals
This is true regardless of which specific type of cell it is a part of
Inherent functions are placed on the Entity graph, using the has-function slot

Slide24

Example of Function in an Environment

Not every smooth ER detoxifies drugs
However, drug detoxification is the function of a smooth ER in a liver cell
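The distinction between inherent and contextual functions can be sketched as two lookup tables, one keyed by the entity and one keyed by the enclosing context together with the entity. This is only an illustration: the table names, the `functions_of` helper, and the hyphenated concept names are assumptions, while the two example facts come from the slides.

```python
from collections import defaultdict

# Inherent functions live on the entity's own concept graph.
inherent_functions = defaultdict(set)
# Contextual functions live on another entity's concept graph, so they are
# keyed by (context, entity) and hold only for the entity in that context.
contextual_functions = defaultdict(set)

inherent_functions["Golgi-Apparatus"].add("Store-Chemical")
contextual_functions[("Liver-Cell", "Smooth-ER")].add("Detoxify-Drug")


def functions_of(entity, context=None):
    """Return the has-function values of an entity, optionally within a context."""
    found = set(inherent_functions.get(entity, set()))
    if context is not None:
        found |= contextual_functions.get((context, entity), set())
    return found
```

With this layout, asking for the functions of a smooth ER without a context returns nothing, while asking within a liver cell returns drug detoxification, and the Golgi apparatus stores chemicals in any context.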


Slide25

Answering Questions

Create an ABOX
  Instantiate every concept in the knowledge base and compute the individuals it is related to, up to depth three
Conjunctive query answering
  Reduce questions to conjunctive queries on an ABOX
Path finding
  Find all possible paths between two individuals
Comparisons
  Compute the intersection and difference between two sets of triples
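The four steps above can be sketched on a toy knowledge base. Everything in this block is an assumption made for illustration: the `KB` dictionary (including the Chromosome has-part DNA entry), the `Class#n` naming of individuals, forward-only path search, and all function names. It is not the system's actual implementation.

```python
from itertools import count

# Toy class-level knowledge: each class lists (relation, filler class) pairs.
KB = {
    "Cell": [("has-part", "Ribosome"), ("has-part", "Chromosome")],
    "Chromosome": [("has-part", "DNA")],
    "Eukaryotic-Cell": [("has-part", "Ribosome"),
                        ("has-part", "Eukaryotic-Chromosome"),
                        ("has-part", "Nucleus")],
}


def instantiate(cls, depth=3):
    """Create an individual of cls plus the individuals it is related to, up to depth."""
    ids = count(1)
    triples = set()

    def expand(c, d):
        ind = f"{c}#{next(ids)}"
        triples.add((ind, "instance-of", c))
        if d > 0:
            for relation, filler in KB.get(c, []):
                triples.add((ind, relation, expand(filler, d - 1)))
        return ind

    return expand(cls, depth), triples


def query(triples, patterns, binding=None):
    """Yield bindings that satisfy every triple pattern; variables start with '?'."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    for triple in triples:
        new = dict(binding)
        for term, value in zip(patterns[0], triple):
            if term.startswith("?"):
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:  # every position matched
            yield from query(triples, patterns[1:], new)


def paths(triples, start, goal, max_len=4):
    """Return all simple paths from start to goal, following edges forward."""
    found = []

    def walk(node, path, seen):
        if node == goal and path:
            found.append(path)
            return
        if len(path) == max_len:
            return
        for s, r, o in sorted(triples):
            if s == node and o not in seen:
                walk(o, path + [(s, r, o)], seen | {o})

    walk(start, [], {start})
    return found


def description(root, triples):
    """Describe an individual by (relation, filler class) pairs, dropping individual names."""
    cls_of = {s: o for s, r, o in triples if r == "instance-of"}
    return {(r, cls_of[o]) for s, r, o in triples if s == root and r != "instance-of"}


def compare(a, b):
    """Return (shared, only in a, only in b) for two sets of triples or pairs."""
    return a & b, a - b, b - a


cell, abox = instantiate("Cell")
answers = list(query(abox, [("?c", "has-part", "?p"),
                            ("?p", "instance-of", "Chromosome")]))
euk, euk_abox = instantiate("Eukaryotic-Cell")
shared, only_cell, only_euk = compare(description(cell, abox),
                                      description(euk, euk_abox))
```

Comparing class-level descriptions rather than raw triples is a design choice of this sketch: individuals from two separate instantiations never share names, so their raw triple sets would not intersect.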


Slide26

Slide27

Slide28

Slide29

Generating good sentences is a research problem in natural language generation

See http://kbgen.org

Slide30


Slide31

Relate Structure to Function

What structures of a plasma membrane facilitate a function of the plasma membrane, namely active movement of ions?


Slide32

Path-Based Similarity Reasoning

Model Relation

Path Similar Relation


Slide33

Prior Work

Structure, Behavior & Function (Chandrasekaran, 2000)
Basic Formal Ontology (Arp & Smith, 2008)
General Formal Ontology (Herre et al., 2006)
DOLCE (Borgo et al., 2010)

Slide34

Open Research Problems

Representing core themes
Generality outside the current textbook

Slide35

Representing Structure and Function

What are some longer-term research problems?
  Defining spatial slots for the whole book
    Specifically, boundaries, regions and cavities
    Preliminary work done by Bennett et al., published at the 2013 Conference on Spatial Information Theory
    Available at: http://www.ai.sri.com/pub_list/1959
  Specifying the structure at multiple levels of detail and from multiple perspectives

Slide36

Representing the rest of the textbook

Frequently, the challenge lies in taking a piece of biological information and reducing it to a known modeling approach

Deep KR Challenge Workshop

https://sites.google.com/site/dkrckcap2011/

Number of Sentences from chapters 2-12

Slide37

Generalization to multiple textbooks

Textbook

Middle school biology

Comparable to Campbell biology

Cell biology

Neuroscience

Introductory college physics

Introductory college algebra

Introductory college US history

Introductory college psychology

Slide38

Generalization to multiple textbooks

Textbook

General Aspects:

Conceptual and qualitative knowledge cuts across domains

Some domains are more mathematical than others and require mathematical/symbolic problem solving

Challenges in representing Campbell also exist in other disciplines: models, hypotheses, experiments

Unique aspects:

Each domain requires domain-specific vocabulary design

Each domain has some new question formulation challenges

Each domain has some new, unique representation needs

Slide39

Summary

Deep representation of biological knowledge can enable advanced forms of question answering, such as comparing and relating
We have made substantial advances toward this goal by using a language based on existential rules and an ontology richer than those used in current bio-ontologies
Such a capability has been found useful for education, and we expect similar benefits in bio-informatics

Slide40


Thank You!

Slide41


Backup Slides

Slide42

KR in KB_Bio_101

All the favorite features
  Classes
    Necessary and sufficient conditions
    Disjointness
    Multiple inheritance
  Relations and property values
    Domain, range
    Inverse relations
    Transitivity
    Relation hierarchy
    Relation composition
    Qualified number restrictions
  Nominals

Cell ⊑ Living-Entity ⊓ (∃hasPart.Ribosome) ⊓ (∃hasPart.Chromosome)

Every Cell is a Living Entity and has a Ribosome and a Chromosome part
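Read as an existential rule, the axiom above can be written in first-order form, and then Skolemized so that each existential variable becomes a function of the cell. This is the standard translation of the description-logic notation, given here as a sketch; the variable names and the Skolem function names $f_1$ and $f_2$ are assumptions introduced for illustration.

```latex
\begin{align*}
\forall x\, \bigl(\mathit{Cell}(x) \rightarrow{}
  & \mathit{LivingEntity}(x) \\
  & \wedge \exists y_1\, (\mathit{hasPart}(x, y_1) \wedge \mathit{Ribosome}(y_1)) \\
  & \wedge \exists y_2\, (\mathit{hasPart}(x, y_2) \wedge \mathit{Chromosome}(y_2)) \bigr) \\[4pt]
\text{Skolemized:}\quad
\forall x\, \bigl(\mathit{Cell}(x) \rightarrow{}
  & \mathit{LivingEntity}(x) \\
  & \wedge \mathit{hasPart}(x, f_1(x)) \wedge \mathit{Ribosome}(f_1(x)) \\
  & \wedge \mathit{hasPart}(x, f_2(x)) \wedge \mathit{Chromosome}(f_2(x)) \bigr)
\end{align*}
```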


Slide43

KR in OOKB

Unique Features in relation to DLs (and FDNC, Datalog±, ASPfs)
  Graph structured descriptions

Every Eukaryotic Cell is a Cell and has parts Eukaryotic Chromosome, Nucleus and a Ribosome such that Eukaryotic Chromosome is inside the Nucleus

The knowledge shown in red is not expressible in known decidable description logics such as OWL 2

This can be captured in Rule Languages
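As a sketch of what such a rule might look like, the statement about the Eukaryotic Cell can be written as a single existential rule. The predicate and variable names are assumptions; the point is the final atom, which relates two existentially quantified parts to each other and so makes the description a graph rather than a tree.

```latex
\begin{align*}
\forall x\, \bigl(\mathit{EukaryoticCell}(x) \rightarrow{}
  & \mathit{Cell}(x) \wedge \exists y_1, y_2, y_3\, ( \\
  & \quad \mathit{hasPart}(x, y_1) \wedge \mathit{EukaryoticChromosome}(y_1) \\
  & \quad \wedge \mathit{hasPart}(x, y_2) \wedge \mathit{Nucleus}(y_2) \\
  & \quad \wedge \mathit{hasPart}(x, y_3) \wedge \mathit{Ribosome}(y_3) \\
  & \quad \wedge \mathit{isInside}(y_1, y_2) ) \bigr)
\end{align*}
```

The atom $\mathit{isInside}(y_1, y_2)$ corresponds to "Eukaryotic Chromosome is inside the Nucleus"; it is presumably the part the slide highlights as not expressible in decidable description logics.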


Slide44

KR in OOKB

Unique Features in relation to DLs
  Inherit and Specialize

In the Eukaryotic Cell

Ribosome was inherited from Cell

Chromosome was inherited from Cell and specialized to Eukaryotic Chromosome
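A minimal sketch of inherit-and-specialize, assuming a frame-style encoding in which each class names its superclass and its own has-part fillers, and in which a filler may record the inherited part it refines. The `CLASSES` layout and the `effective_parts` helper are hypothetical and only cover the two classes on the slide.

```python
# Hypothetical encoding: "parts" maps each declared part to the inherited
# part it specializes, or to None if it is a newly declared part.
CLASSES = {
    "Cell": {
        "super": None,
        "parts": {"Ribosome": None, "Chromosome": None},
    },
    "Eukaryotic-Cell": {
        "super": "Cell",
        "parts": {"Eukaryotic-Chromosome": "Chromosome", "Nucleus": None},
    },
}


def effective_parts(cls):
    """Return {part: origin}, where origin records how the part reached cls."""
    spec = CLASSES[cls]
    parts = {}
    if spec["super"] is not None:
        for part in effective_parts(spec["super"]):
            parts[part] = f"inherited from {spec['super']}"
    for part, refined in spec["parts"].items():
        if refined is None:
            parts[part] = f"declared on {cls}"
        else:
            # The specialized filler replaces the inherited one instead of
            # adding a second, unrelated part.
            parts.pop(refined, None)
            parts[part] = f"inherited from {spec['super']} as {refined}, specialized"
    return parts


euk_parts = effective_parts("Eukaryotic-Cell")
```

Replacing the inherited Chromosome with the Eukaryotic Chromosome, rather than keeping both, is what distinguishes specialization from simply adding another part.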


Slide45


Slide46


Slide47

KR in KB_Bio_101

Computational properties
  Reasoning with KR in KB_Bio_101 is, in general, undecidable
  There are, however, some decidable fragments that introduce guardedness and acyclic structure in the KB
    Object Oriented Knowledge Bases in Logic Programming (Chaudhri et al., Technical Communication of the International Conference on Logic Programming, 2013)
    Available at: http://www.ai.sri.com/pub_list/1958
  A more thorough formal investigation is an open problem
    Challenge for TPTP reasoners: http://www.ai.sri.com/pub_list/1937
    Challenge for OWL reasoners: http://www.ai.sri.com/pub_list/1961
    Challenge for ASP reasoners

Slide48

Structure Function Relationship

We know how an entity participates in a function


Slide49

Structure Function Relationship

We do not know how an entity participates in a function

has-part

For example, Chlorophyll-A contains Porphyrin. The textbook says that Porphyrin facilitates Chlorophyll-A’s function of absorbing violet-blue light, but does not say how.

Slide50

Structure Function Relationship

We do not know how an entity participates in a function


Slide51

System architecture

Encoding

Problem solving

Significant effort devoted to the usability of answers by students


Slide52

Knowledge Authoring in AURA

Knowledge engineers provide a small library of domain-independent representations
  The Component Library (CLIB) contains classes representing physical actions, e.g., Move, Attach, Penetrate, and semantic relations, e.g., agent, object, has-part (Barker, Clark, Porter, KCAP’01)
  See http://www.ai.sri.com/pub_list/864
Biologists apply those representations to encode biology knowledge
  AURA provides graphical editing
  See http://www.ai.sri.com/pub_list/1545 and http://www.ai.sri.com/pub_list/865