Bigscholar 2014, April 8, Seoul, South Korea
Trust and Hybrid Reasoning for Ontological Knowledge Bases
Hui Shi, Kurt Maly, and Steven Zeil
Contact: maly@cs.odu.edu

Outline
Problem
  Semantic web subject to changes
  How to scale a reasoner to big data?
Background
  Knowledge base using ontologies
  Inference strategies
  Benchmarks
  Optimized backward chaining
Hybrid Reasoner
  Materialization with search and retrieval
  Change and marking trusted/untrusted areas
  Reason over untrusted goals
Conservative trust assessment
  Property based trust
  Pattern based trust
Evaluation
Conclusions

Problem
Efficiency of reasoning in the face of large scale and frequent changes within a question/answer system over a semantic web
Issue
Forward chaining scales well for fixed knowledge bases
Backward chaining can handle changes in the knowledge base but does not scale

Background
Existing semantic applications: question/answer systems
  Libra, Cimple, Arnetminer
Semantic Web
  Resource Description Framework (RDF)
  Web Ontology Language (OWL) for specific knowledge domains
  SPARQL query language for RDF
  SWRL rule language
Reasoning systems
  Jena: proprietary Jena rules
  Pellet and KAON2
  Oracle 11g
  OWLIM

Background
Knowledge base (KB)
Ontologies
Representation formalism: Description Logic (DL)
Inference methods for First Order Logic
  Materialization and forward chaining
    pre-computes inferred truths and starts with the known data
    suitable for frequent computation of answers with data that are relatively static
    OWLIM and Oracle
  Query-rewriting and backward chaining
    expands the queries and starts with goals
    suitable for efficient computation of answers with data that are dynamic and infrequent queries
    Virtuoso

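The contrast between the two inference methods can be illustrated with a toy triple store. This is only a minimal sketch: the data, the single rule (worksFor implies memberOf), and the function names `materialize` and `backward_query` are assumptions made for illustration and are not taken from any of the systems named above.

```python
# Toy triple store: (subject, property, object). All data here is made up.
FACTS = {("prof0", "worksFor", "univ0"), ("student0", "memberOf", "univ0")}

# One assumed rule: worksFor(x, y) implies memberOf(x, y).
SUB_PROPERTY = {"worksFor": "memberOf"}


def materialize(facts):
    """Forward chaining: start from known data and pre-compute inferred facts."""
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(closed):
            parent = SUB_PROPERTY.get(p)
            if parent and (s, parent, o) not in closed:
                closed.add((s, parent, o))
                changed = True
    return closed


def backward_query(facts, prop, obj):
    """Backward chaining: start from the goal prop(?x, obj) and expand it."""
    answers = {s for s, p, o in facts if p == prop and o == obj}
    for child, parent in SUB_PROPERTY.items():
        if parent == prop:  # rewrite the goal into a subgoal on the child property
            answers |= backward_query(facts, child, obj)
    return answers


kb = materialize(FACTS)  # inference cost paid once, up front
lookup = {s for s, p, o in kb if p == "memberOf" and o == "univ0"}
on_demand = backward_query(FACTS, "memberOf", "univ0")  # cost paid per query
print(sorted(lookup), sorted(on_demand))
```

Both strategies return the same answers; they differ in when the inference work is done, which is why a change to `FACTS` invalidates `kb` but not the backward-chaining query.
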
Background
Benchmarks evaluate and compare the performance of different reasoning systems
  The Lehigh University Benchmark (LUBM)
  The University Ontology Benchmark (UOBM)

Background
Optimized backward-chaining algorithm
  generates a query response for a given query pattern based on a specific rule set (RDFS, Horst, custom)
Ordered Selection Function
Switching between Binding Propagation and Free Variable Resolution
Avoid Repetition and Non-Termination (OLDT)
owl:sameAs Optimization

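Why repetition matters can be seen on cyclic data, where a naive goal-directed proof of a transitive property would recurse forever. The sketch below is a much simpler stand-in for OLDT resolution, not the algorithm itself: it only records answers already found so that a repeated subgoal stops. The property, the data, and the name `super_orgs` are hypothetical.

```python
# Cyclic toy data for a transitive property; all names are made up.
SUB_ORG = {("dept0", "college0"), ("college0", "univ0"), ("univ0", "dept0")}


def super_orgs(org, table=None):
    """Answer subOrganizationOf(org, ?y) transitively, tabling answers found so far."""
    table = set() if table is None else table
    for child, parent in SUB_ORG:
        if child == org and parent not in table:
            table.add(parent)          # record the answer before recursing
            super_orgs(parent, table)  # a repeated subgoal finds nothing new and stops
    return table


print(sorted(super_orgs("dept0")))  # ['college0', 'dept0', 'univ0']
```

Without the `table` check the call on `dept0` would reach `dept0` again through the cycle and never terminate.
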
Hybrid reasoner
Motivating example
Assume fully materialized KB
Harvester adds new fact: student0 enrolled in course0
Query ‘Who is enrolled in course0?’ ok
Assume fact Prof0 teaches course0 in KB
Query ‘Who is being taught by Prof0?’ not ok as simple lookup; needs reasoning with a rule such as (premises on the left, conclusion on the right):
  enrolledIn(?Student, ?Course), teaches(?Faculty, ?Course) :- isTaughtBy(?Student, ?Faculty)

Hybrid reasoner
Mark region of KB ‘trusted’ that is not affected by change
Hybrid algorithm:
  If a goal is in the trusted region then
    return substitutions from KB
  else
    for each rule R and substitution σ1 such that the head of R under σ1 matches the goal
      proveTheRuleBody(R.body, σ1)
proveTheRuleBody: prove each goal in the rule body one by one recursively

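The hybrid algorithm can be sketched as a small backward chainer that consults rules only for untrusted goals. This is a minimal sketch under several assumptions, not the authors' implementation: trust is decided per property, stored facts are also consulted for untrusted goals, rules are stored as (head, body) pairs, there is no tabling, and the names `prove`, `prove_body`, and `UNTRUSTED` are hypothetical.

```python
import itertools

# Atoms are (property, subject, object); strings starting with "?" are variables.
KB = {("teaches", "prof0", "course0"),
      ("enrolledIn", "student0", "course0")}  # second fact was just harvested

# (head, body): isTaughtBy(?s, ?f) follows from enrolledIn(?s, ?c) and teaches(?f, ?c).
RULES = [(("isTaughtBy", "?s", "?f"),
          [("enrolledIn", "?s", "?c"), ("teaches", "?f", "?c")])]

# Assumed result of a marking step: properties whose stored facts may be incomplete.
UNTRUSTED = {"enrolledIn", "isTaughtBy"}

_fresh = itertools.count()  # used to rename rule variables apart


def is_var(t):
    return t.startswith("?")


def walk(t, s):
    """Follow variable bindings in substitution s until a value or free variable."""
    while is_var(t) and t in s:
        t = s[t]
    return t


def unify(a, b, s):
    """Extend substitution s so atoms a and b agree, or return None."""
    s = dict(s)
    for x, y in zip(a, b):
        x, y = walk(x, s), walk(y, s)
        if x == y:
            continue
        if is_var(x):
            s[x] = y
        elif is_var(y):
            s[y] = x
        else:
            return None
    return s


def rename(atom, n):
    return tuple(f"{t}_{n}" if is_var(t) else t for t in atom)


def prove(goal, s):
    """Yield substitutions that satisfy goal under substitution s."""
    for fact in KB:  # stored facts are always usable
        s1 = unify(goal, fact, s)
        if s1 is not None:
            yield s1
    if goal[0] not in UNTRUSTED:  # trusted region: the lookup is complete
        return
    for head, body in RULES:  # untrusted region: fall back to the rules
        n = next(_fresh)
        s1 = unify(goal, rename(head, n), s)
        if s1 is not None:
            yield from prove_body([rename(b, n) for b in body], s1)


def prove_body(goals, s):
    """Prove each goal in the rule body one by one, recursively."""
    if not goals:
        yield s
        return
    for s1 in prove(goals[0], s):
        yield from prove_body(goals[1:], s1)


query = ("isTaughtBy", "?who", "prof0")
answers = sorted({walk("?who", s) for s in prove(query, {})})
print(answers)  # ['student0']
```

If `isTaughtBy` were wrongly treated as trusted, the same query would return nothing, because no `isTaughtBy` fact has been materialized since the change.
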
Trustworthy Goals
A proof goal p(?X,?Y) is trustworthy if all instances of that goal derivable from the facts and rules in the knowledge base are already present in the knowledge base as stored facts
In practice, we will need to approximate the set of trustworthy goals
A partition into trusted and untrusted sets is called conservative if no untrustworthy goals are trusted

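The two definitions can be written down directly as set checks. The sketch below is illustrative only: goals are simplified to whole properties, the `derivable` set is supplied by hand, and the function names are hypothetical.

```python
def trustworthy(prop, stored, derivable):
    """True if every derivable fact with this property is already stored."""
    return {f for f in derivable if f[0] == prop} <= stored


def is_conservative(trusted, stored, derivable):
    """A partition is conservative if no untrustworthy goal is trusted."""
    return all(trustworthy(p, stored, derivable) for p in trusted)


# Facts are (property, subject, object); the data is made up.
stored = {("teaches", "prof0", "course0"), ("enrolledIn", "student0", "course0")}
derivable = stored | {("isTaughtBy", "student0", "prof0")}

print(trustworthy("teaches", stored, derivable))                      # True
print(trustworthy("isTaughtBy", stored, derivable))                   # False
print(is_conservative({"teaches"}, stored, derivable))                # True
print(is_conservative({"teaches", "isTaughtBy"}, stored, derivable))  # False
```

Computing `derivable` amounts to full materialization, which is exactly the cost to be avoided after a change; this is why the trustworthy set has to be approximated rather than computed this way.
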
Approximation 1: Trusted Properties
Property-based trust: assume that any property P that was involved in a change is itself untrusted
Take the closure of the “is used as a premise of” relation: if P occurs in the body of a rule used to prove R,
  …, P(x,y), … :- R(w,z)
then R is also untrusted.

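The closure of the “is used as a premise of” relation is a plain reachability computation over properties. The following is a minimal sketch with a hypothetical rule set; each rule is reduced to the set of properties in its body and the property it concludes, and `untrusted_properties` is an illustrative name.

```python
def untrusted_properties(changed, rules):
    """Close the changed properties under "is used as a premise of"."""
    untrusted = set(changed)
    frontier = list(changed)
    while frontier:
        p = frontier.pop()
        for body, head in rules:
            if p in body and head not in untrusted:
                untrusted.add(head)    # P is a premise of a rule proving head
                frontier.append(head)  # head may in turn be a premise elsewhere
    return untrusted


# Hypothetical rule set: (premise properties, concluded property).
RULES = [
    ({"enrolledIn", "teaches"}, "isTaughtBy"),
    ({"worksFor"}, "memberOf"),
    ({"memberOf"}, "member"),
]

print(sorted(untrusted_properties({"worksFor"}, RULES)))
# ['member', 'memberOf', 'worksFor']
print(sorted(untrusted_properties({"enrolledIn"}, RULES)))
# ['enrolledIn', 'isTaughtBy']
```

Note that a single changed fact marks every fact of each reachable property as untrusted, regardless of which individuals the change actually concerned.
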
Approximation 1: Trusted Properties
Property-based trust breaks down in the face of “meta-rules” in the knowledge base, rules that permit reasoning about properties themselves, e.g., the inverse rule
Special handling of the meta-rules common to RDF and OWL
Results in significant fractions of the knowledge base being marked as untrusted unnecessarily

Approximation 2: Trusted Patterns
Pattern-based trust: a pattern P(X,Y) (where X and Y could be ground instances or free variables) is untrusted if
  it matches a change to the knowledge base, or
  it can be derived from a rule with an untrusted pattern as a premise
Offers finer discrimination than property-based trust

Computing Untrusted Pattern
Marking algorithm
Add change
Check each rule in the rule set to see if we can propagate the “untrust” forward by a limited, specialized analogue of forward chaining
Add the untrusted set produced from this one change to the existing untrusted set, discarding any patterns that are specializations of other elements

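The marking steps can be sketched as follows. This is a simplified illustration, not the authors' algorithm: patterns are (property, subject, object) tuples, every free variable in a derived pattern is collapsed to an anonymous `"?"` (which can only make patterns more general, so the result stays conservative), and the rule set and function names are hypothetical.

```python
FREE = "?"  # anonymous free variable in a pattern


def is_var(t):
    return t.startswith("?")


def match(atom, pattern):
    """Bind atom variables against a pattern; None if the two cannot overlap."""
    if atom[0] != pattern[0]:
        return None
    binding = {}
    for a, b in zip(atom[1:], pattern[1:]):
        if is_var(a):
            if not is_var(b) and binding.setdefault(a, b) != b:
                return None
        elif not is_var(b) and a != b:
            return None
    return binding


def conclude(head, binding):
    """Instantiate a rule head; variables left unbound become free."""
    return tuple(binding.get(t, FREE) if is_var(t) else t for t in head)


def subsumes(general, specific):
    return all(g == FREE or g == s for g, s in zip(general, specific))


def mark(change, rules, untrusted):
    """Propagate one change through the rules and merge into the untrusted set."""
    new, frontier = set(), [change]
    while frontier:
        pat = frontier.pop()
        if pat in new:
            continue
        new.add(pat)
        for body, head in rules:
            for premise in body:
                binding = match(premise, pat)
                if binding is not None:
                    frontier.append(conclude(head, binding))
    merged = set(untrusted) | new
    # Discard patterns that are specializations of other elements.
    return {p for p in merged if not any(q != p and subsumes(q, p) for q in merged)}


def is_trusted(goal, untrusted):
    """A goal can be answered by lookup if it overlaps no untrusted pattern."""
    return all(match(goal, u) is None for u in untrusted)


# Hypothetical rules: (premises, conclusion).
RULES = [
    ([("worksFor", "?x", "?y")], ("memberOf", "?x", "?y")),
    ([("memberOf", "?x", "?y")], ("member", "?y", "?x")),
]

untrusted = mark(("worksFor", "Fullprofessor0", "University0"), RULES, set())
print(sorted(untrusted))
print(is_trusted(("memberOf", "?x", "University0"), untrusted))  # False
print(is_trusted(("memberOf", "?x", "University1"), untrusted))  # True
```

Because the loop only ever visits patterns built from constants already in the change or the rules, it terminates, and the final filter keeps the untrusted set from growing with redundant entries over a sequence of changes.
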
Untrusted Pattern Example
Harvester adds ‘worksFor(Fullprofessor0, University0)’
Marking algorithm discovers as untrusted
  worksFor(Fullprofessor0, University0)
  member(University0, Fullprofessor0)
  memberOf(Fullprofessor0, University0)
Query: “Who are members of University0?” needs reasoning
Query: “Who are members of University1?” ok for direct retrieval as memberOf(?x, University1) is trusted

Evaluation: property-based
| Changes | Actual # new properties | Actual # new facts | # untrusted properties |
|---|---|---|---|
| Adding a new class | 2 | 3 | 12 |
| Add a subclass relationship between two new classes | 2 | 6 | 12 |
| Add new Class as subClass of existing class | 2 | 5 | 12 |
| Adding a new Property | 2 | 2 | 12 |
| Add a new Property as subPropertyOf of another new Property | 2 | 4 | 12 |
| Add new Property as subPropertyOf of existing Property | 2 | 3 | 12 |
| Add new Class as domain to a new Property | 3 | 5 | 13 |
| Add new Class as range to a new Property | 3 | 5 | 13 |

Evaluation: pattern-based
Produces the same number of properties as the ‘Actual’ columns show
Comparison of performance of our hybrid pattern-based proof algorithm against our regular, optimized backward chaining algorithm and against OWLIM, using LUBM1, LUBM10, and LUBM40, of size 100,839, 1,272,871, and 5,307,754 objects
Query response time (ms) after adding student

Evaluation
Query response time (ms) after adding undergraduate student
Percentage of untrusted facts in KB ranges from 0 to a high of 10%
Percentage of untrusted patterns in KB ranges from 0 to a high of 5%

Conclusions
We reported on our efforts to use ‘trust’ in backward-chaining reasoners to accommodate a changing knowledge base.
We have shown that a pattern-based marking algorithm errs on the conservative side at an acceptable level.
We have shown that our hybrid algorithm performs better than a forward chaining algorithm and a pure backward chaining algorithm in almost all cases tested.

Future Work
Explore the performance of the trust marking algorithm and of the hybrid reasoner as a function of the fraction of the knowledge base that is untrusted
Explore the impact of long sequences of individual changes on the marking algorithm time and subsequently on the hybrid reasoner
Explore performance of the hybrid reasoner as a function of the overall degree of inter-connection within the knowledge base semantics, as a loosely connected network will lead to faster termination of the trust marking algorithm