Chapter 6 - Basic Similarity Topics
Case-based reasoning
Introduction
Similarity is a common term in everyday language, where two objects are usually considered similar if they look or sound similar
Similarity is a core concept within CBR
From a CBR perspective: «Two problems are similar if they have similar solutions»
Not as clearly defined as the term equality
It is accepted that similarity is subjective and requires approximate rather than exact reasoning
Similarity and case representation
Similarity measures are defined to compare objects (cases)
The measures operate on the case representation
Similarity is the essential function used for retrieval and the link between case representation and retrieval
Only attribute-value case representations and attribute-based similarity measures are considered here
The mathematics of similarity
Two influencing factors:
Fuzzy sets offer a background for modeling inexact expressions. They do not deal with classical yes-or-no answers, but rather with answers of a vague character
Metrics are used in mathematics whenever approximations (rather than exact solutions) are involved. This makes them suitable for modeling similarity
Similarity measures may inherit and benefit from properties of these two factors. Examples of such properties are symmetry, transitivity, etc.
Two mathematical models of similarity
Similarity as a relation:
Qualitative measure comparing different similarities
Example: two objects are more similar to each other than two other objects are
R(x,y,z) ⇔ «x is at least as similar to y as x is to z»
Allows the definition of the nearest neighbour concept
The nearest neighbour of x is the y for which the R-relation above holds for all z
Example of k-NN where k=3
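The relation R and the nearest neighbour concept can be sketched in a few lines of Python. This is only an illustration: the slides do not say how R is decided, so the sketch assumes the cases are plain numbers and that R is induced by the absolute difference (the helper `abs_diff` and the sample values are hypothetical).

```python
def at_least_as_similar(x, y, z, dist):
    """R(x, y, z): x is at least as similar to y as x is to z."""
    return dist(x, y) <= dist(x, z)


def nearest_neighbour(x, candidates, dist):
    """Return the y for which R(x, y, z) holds for all z in candidates."""
    for y in candidates:
        if all(at_least_as_similar(x, y, z, dist) for z in candidates):
            return y
    return None  # only reached when candidates is empty


def k_nearest_neighbours(x, candidates, dist, k=3):
    """Return the k candidates closest to x (ties keep input order)."""
    return sorted(candidates, key=lambda y: dist(x, y))[:k]


def abs_diff(a, b):
    # Assumed way of deciding R for numeric cases; not taken from the slides.
    return abs(a - b)


cases = [1.0, 4.0, 5.5, 7.0, 12.0]
query = 5.0

nn = nearest_neighbour(query, cases, abs_diff)
knn = k_nearest_neighbours(query, cases, abs_diff, k=3)
```

With these sample values the nearest neighbour of 5.0 is 5.5, and the three nearest neighbours are 5.5, 4.0 and 7.0.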
Two mathematical models of similarity
Similarity as a function:
Makes similarity quantitative by expressing how similar two objects are
Assigns a number (a degree of similarity) to pairs of objects
Def.: A similarity measure for a problem space P is a function sim: P × P → [0,1]
Example of similarity functions and how they may be compared
sim(x,y) ≥ sim(x,z) ⇔ «x is at least as similar to y as x is to z»
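A minimal sketch of such a function, assuming the problem space P is a bounded numeric range. The linear formula and the factory name `make_numeric_similarity` are illustrative assumptions, not definitions from the slides; any function into [0,1] would fit the definition.

```python
def make_numeric_similarity(lo, hi):
    """Build sim: P x P -> [0, 1] for numbers in the range [lo, hi]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    span = hi - lo

    def sim(x, y):
        # Clamp so that values outside [lo, hi] still map into [0, 1].
        return max(0.0, 1.0 - abs(x - y) / span)

    return sim


sim = make_numeric_similarity(0.0, 100.0)
x, y, z = 50.0, 60.0, 90.0

# "x is at least as similar to y as x is to z" becomes a numeric comparison.
x_prefers_y = sim(x, y) >= sim(x, z)
```

Here `x_prefers_y` is true because 60 is closer to 50 than 90 is, so the function reproduces the qualitative relation R while also stating how large the difference is.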
Distances
Distances are a proxy for similarities; both look at the same objects from different points of view
In most situations we can freely choose between distances and similarities
It is possible to convert between similarities and distances. However, such a transformation may not necessarily conserve the exact numerical similarity/distance values
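The conversion can be illustrated with two common choices of transformation. Both are assumptions made for this sketch (the slides do not name a specific conversion): a reciprocal form and an exponential form. They agree on the ordering of the cases but not on the numbers.

```python
import math


def sim_reciprocal(d):
    """Map a distance d >= 0 to a similarity in (0, 1]."""
    return 1.0 / (1.0 + d)


def sim_exponential(d):
    """A second, equally valid mapping from distance to similarity."""
    return math.exp(-d)


def dist_from_reciprocal(s):
    """Inverse of sim_reciprocal, valid for 0 < s <= 1."""
    return 1.0 / s - 1.0


distances = [0.0, 0.5, 2.0, 10.0]
recip = [sim_reciprocal(d) for d in distances]
expo = [sim_exponential(d) for d in distances]

# Converting back with the matching inverse recovers the distances
# (up to floating-point rounding).
round_trip = [dist_from_reciprocal(s) for s in recip]
```

Both lists decrease as the distance grows, so nearest-neighbour rankings are unchanged, while the similarity assigned to a given distance depends on which transformation was chosen.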
Types of similarity measures
Counting similarities
Metric similarities
Transformation similarities
Structure-oriented similarities
Information-oriented similarities
Relevance-oriented similarities
Dynamic-oriented similarities
Types of similarity measures: Counting similarities
Measures similarity by counting certain occurrences in the representation
Count the number of family members for tax purposes
Example: Hamming measures
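A Hamming measure counts the attributes on which two cases disagree. The sketch below assumes attribute-value cases stored as Python dictionaries with the same keys; the car attributes are made-up sample data.

```python
def hamming_distance(case_a, case_b):
    """Count the attributes whose values differ between the two cases."""
    if case_a.keys() != case_b.keys():
        raise ValueError("cases must have the same attributes")
    return sum(1 for attr in case_a if case_a[attr] != case_b[attr])


def hamming_similarity(case_a, case_b):
    """Share of attributes with equal values, as a number in [0, 1]."""
    if not case_a:
        return 1.0  # two empty cases have nothing that differs
    return 1.0 - hamming_distance(case_a, case_b) / len(case_a)


car_a = {"colour": "red", "fuel": "diesel", "doors": 4, "gearbox": "manual"}
car_b = {"colour": "red", "fuel": "petrol", "doors": 4, "gearbox": "manual"}

distance = hamming_distance(car_a, car_b)
similarity = hamming_similarity(car_a, car_b)
```

The two sample cars differ only in `fuel`, so the distance is 1 and the similarity is 0.75.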
Types of similarity measures: Metric similarities
Applicable to attributes with numerical values
Arise as variations of Euclidean metrics
Typically distance functions that represent a travel view
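As an illustration, here are the Euclidean distance and one well-known variation, the city-block (Manhattan) distance. Choosing Manhattan as the variation is an assumption of this sketch. It fits the travel view: straight-line travel versus travel along a street grid.

```python
import math


def _check_lengths(a, b):
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of attributes")


def euclidean(a, b):
    """Straight-line distance between two numeric attribute vectors."""
    _check_lengths(a, b)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def manhattan(a, b):
    """Distance when travel is restricted to axis-parallel steps."""
    _check_lengths(a, b)
    return sum(abs(p - q) for p, q in zip(a, b))


origin = (0.0, 0.0)
target = (3.0, 4.0)

straight_line = euclidean(origin, target)
grid_route = manhattan(origin, target)
```

For these two points the straight-line distance is 5.0 and the grid route is 7.0.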
Types of similarity measures: Transformation similarities
The measure counts the number of operations required to transform one object into another
Example: Levenshtein distance. It uses insertion, deletion and modification as the possible change actions and counts the number of changes required
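A compact sketch of the Levenshtein distance for strings, using the standard dynamic-programming formulation. The slides do not prescribe an algorithm, so this is one common way to compute it, with each of the three change actions costing 1.

```python
def levenshtein(source, target):
    """Minimum number of insertions, deletions and modifications
    needed to transform source into target."""
    # previous[j] = distance between the processed prefix of source
    # and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # modification (or match)
            ))
        previous = current
    return previous[-1]


changes = levenshtein("kitten", "sitting")
```

Transforming "kitten" into "sitting" takes three changes: two modifications (k to s, e to i) and one insertion (g).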
Types of similarity measures: Structure-oriented similarities
The structure in which the knowledge is presented plays a role, e.g. object-oriented representation
Refers mainly to attributes that have symbolic attribute values from which the attribute-based structure is built
Types of similarity measures: Information-oriented similarities
Information and knowledge play an essential role
Often used for texts, which are considered similar if they provide similar information to the user
Types of similarity measures: Relevance-oriented similarities
Weight the importance of different aspects contributing to similarity
Not a type in itself, but may rather be used in combination with the other types
Types of similarity measures: Dynamic-oriented similarities
Consider and compare dynamic processes
Local-global principle of similarity
Useful when dealing with complex structures
The principle: Each object is constructed from atomic parts, by some construction process.
Possible to compare the atomic parts by using local measures, before comparing the more complex structure.
Determine the influence each local part should have on the global measure by assigning a weight to each part
Determining the weights is a difficult problem
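The principle can be sketched as a weighted average of local similarities. The aggregation by weighted average, the two local measures, and the attribute names and weights below are all assumptions made for illustration; the slides only state that local measures are combined using weights.

```python
def numeric_local(span):
    """Local measure for a numeric attribute whose values cover `span`."""
    def sim(a, b):
        return max(0.0, 1.0 - abs(a - b) / span)
    return sim


def symbolic_local(a, b):
    """Local measure for a symbolic attribute: equal or not."""
    return 1.0 if a == b else 0.0


def global_similarity(query, case, local_measures, weights):
    """Weighted average of the local similarities of all attributes."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(
        weights[attr] * local_measures[attr](query[attr], case[attr])
        for attr in weights
    )
    return weighted_sum / total_weight


local_measures = {"price": numeric_local(20000.0), "colour": symbolic_local}
weights = {"price": 3.0, "colour": 1.0}  # hypothetical weights

query = {"price": 10000.0, "colour": "red"}
case = {"price": 12000.0, "colour": "blue"}

score = global_similarity(query, case, local_measures, weights)
```

With these values the price attributes are locally quite similar (about 0.9) and the colours do not match (0.0), so the global score is roughly 0.675. Changing the weights changes the result, which is why choosing them is the hard part.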
Virtual attributes
A problem with the local-global principle arises when there are dependencies between the attributes that influence similarity
Example: bank loans
Reliability for getting a loan depends on both income and spending
Assigning weights to the attributes independently of each other makes little sense
Introduce additional attributes that reflect the dependencies explicitly
Such attributes are defined in terms of the given attributes and are called virtual attributes
Allows simpler similarity measures
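For the bank loan example, a virtual attribute could look as follows. The attribute `disposable` (income minus spending) and the value range used for comparison are hypothetical choices for this sketch; the slides do not define the virtual attribute.

```python
def with_virtual_attributes(case):
    """Return a copy of the case extended with a virtual attribute."""
    extended = dict(case)
    # Hypothetical dependency: what is left of the income after spending.
    extended["disposable"] = case["income"] - case["spending"]
    return extended


def disposable_similarity(a, b, span=5000.0):
    """Simple local measure on the virtual attribute alone."""
    diff = abs(a["disposable"] - b["disposable"])
    return max(0.0, 1.0 - diff / span)


a = with_virtual_attributes({"income": 5000.0, "spending": 4500.0})
b = with_virtual_attributes({"income": 2000.0, "spending": 1500.0})
c = with_virtual_attributes({"income": 5000.0, "spending": 1500.0})

sim_ab = disposable_similarity(a, b)  # same disposable amount
sim_ac = disposable_similarity(a, c)  # same income, different disposable amount
```

Applicants `a` and `b` differ in both given attributes but have the same disposable amount, so they come out as fully similar. Applicants `a` and `c` share the same income yet are far less similar. A single measure on the virtual attribute captures a dependency that separate weights on income and spending would miss.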
Which similarity measure should be used?
Some influencing factors for the choice are:
Case representation
Size of case base
Efficiency needed for retrieval
Number of values in the domain of the attributes
Useful guidelines:
Try to ensure compatibility between case representation and the similarity measure
If possible, apply the local-global principle for complex structures
Summary
Similarity is the link between case representation and retrieval
There is no clear definition of the concept, and there exists a variety of different types of measures
Similarity measures are heavily influenced by mathematics. The two mathematical ways to represent similarity are as a function or as a relation
The local-global principle may also apply to similarity measures
Which type of similarity measure should be used depends on the objects to be compared
Comments
Few comparisons, missing an overview of the differences between the different types of similarity measures
Mainly descriptive presentation, making it difficult to distinguish between the different measures
What are the implications of choosing one type of measure over another?
In a later chapter?