
Slide 1

Mining Subjective Properties from the Web

Writers: Immanuel Trummer, Alon Halevy, Hongrae Lee, Sunita Sarawagi, Rahul Gupta

Presenting: Amir Taubenfeld

Slide 2

Outline for Today’s Lecture

Motivation: the future of search is in structured data
Introduction to the Surveyor system
Getting into the details: extracting subjective properties from the Web and the polarity of statements; determining the dominant opinion of the authors of the Web
Experimental evaluation & conclusion

Slide 3

Outline for Today’s Lecture

Motivation: the future of search is in structured data
Introduction to the Surveyor system
Getting into the details: extracting subjective properties from the Web and the polarity of statements; determining the dominant opinion of the authors of the Web
Experimental evaluation & conclusion

Slide 4

Answering queries with links

Slide 5

Answering queries with links – PageRank

PageRank: the algorithm from which Google began.

It calculates the probability that a person randomly clicking on links will arrive at any particular page.
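As a side note, that random-surfer probability can be approximated with a few lines of power iteration. The toy link graph and damping factor below are made-up values used only to illustrate the idea on this slide; this is not code from the paper.

```python
# Illustrative sketch: estimating the "random surfer" probability with power
# iteration on a tiny, hypothetical link graph.
import numpy as np

links = {            # page -> pages it links to (toy graph, assumed for illustration)
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
}
pages = sorted(links)
n = len(pages)
idx = {p: i for i, p in enumerate(pages)}

# Column-stochastic transition matrix: a surfer on page j moves to a random outlink.
M = np.zeros((n, n))
for j, outs in links.items():
    for i in outs:
        M[idx[i], idx[j]] = 1.0 / len(outs)

damping = 0.85                      # probability of following a link vs. jumping anywhere
rank = np.full(n, 1.0 / n)          # start from the uniform distribution
for _ in range(100):
    rank = (1 - damping) / n + damping * M @ rank

print(dict(zip(pages, rank.round(3))))  # long-run probability of landing on each page
```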

Slide 6

Answering queries with links

PageRank is awesome!

It shows us the best links for the words we are looking for, but it does not understand our queries.

Slide 7

Now: Answering queries from Structured Data

Slide 8

Now: Answering queries from Structured Data

We all remember YAGO from the Databases course.

Slide 9

Now: Answering queries from Structured Data

But regular knowledge bases don’t capture subjective properties.

Slide 10

Current limitation: Objective Queries

We need a subjective knowledge base.

Slide 11

Subjective Property Mining

Objective: Create a subjective knowledge base.

Main challenge: no ground truth - we need to aggregate many opinions.

Slide 12

Outline for Today’s Lecture

Motivation: the future of search is in structured data
Introduction to the Surveyor system
Getting into the details: extracting subjective properties from the Web and the polarity of statements; determining the dominant opinion of the authors of the Web
Experimental evaluation & conclusion

Slide 13

The Surveyor System

A specialized system for mining subjective properties from the Web (Surveyor is derived from survey).

Slide 14

System overview - Course of action

Extract statements involving entities and subjective properties from the Web
Determine the polarity of each statement
Aggregate the results
Determine the dominant opinion
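To make the course of action concrete, here is a minimal sketch of the four stages as a pipeline of stub functions. All names, data shapes, and the naive majority vote in the last stage are my own assumptions for illustration; they are not Surveyor's actual interfaces.

```python
# A sketch of Surveyor's course of action as a pipeline of stub functions.
from collections import Counter
from typing import Iterable, Tuple

Statement = Tuple[str, str, str]  # (entity, property, polarity: "+" or "-")

def extract_statements(documents: Iterable[str]) -> Iterable[Statement]:
    """Stage 1: find sentences linking an entity to a subjective property."""
    ...

def determine_polarity(statement: Statement) -> Statement:
    """Stage 2: decide whether the statement is pro ('+') or contra ('-')."""
    ...

def aggregate(statements: Iterable[Statement]) -> Counter:
    """Stage 3: count pro/contra statements per (entity, property) pair."""
    counts = Counter()
    for entity, prop, polarity in statements:
        counts[(entity, prop, polarity)] += 1
    return counts

def dominant_opinion(counts: Counter, entity: str, prop: str) -> str:
    """Stage 4: infer the dominant opinion. Later slides replace this naive
    majority vote with a Bayesian model that accounts for skew."""
    pro, contra = counts[(entity, prop, "+")], counts[(entity, prop, "-")]
    return "+" if pro >= contra else "-"
```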

Slide 15

System overview – Extraction & aggregation

Slide 16

System overview – dominant opinion

Is it enough to conclude that kittens are cute and tigers are not cute?

Slide 17

System overview – dominant opinion

Is it enough to conclude that kittens are cute and tigers are not cute?

Of course not, otherwise I wouldn’t have asked this question. But why?

Slide 18

System overview – dominant opinion

(Tel-Aviv, Safe city)
pro: 5, contra: 10

Slide 19

System overview – dominant opinion

(Tel-Aviv, Safe city)
pro: 5, contra: 10

Does it mean that Tel-Aviv is not safe?

Slide 20

System overview – dominant opinion

(Tel-Aviv, Safe city)
pro: 5, contra: 10

Does it mean that Tel-Aviv is not safe?

No! We must consider skew.
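A small hypothetical calculation illustrates the skew problem: if authors who consider a city unsafe are several times more likely to post about it, a large "safe" majority can still produce more contra than pro statements. The numbers below are invented solely for illustration.

```python
# Hypothetical numbers, only to illustrate the skew argument: suppose authors who
# think a city is unsafe are five times as likely to write about it as authors
# who think it is safe.
p_post_given_thinks_safe = 0.01     # chance a "safe" believer posts a pro statement
p_post_given_thinks_unsafe = 0.05   # chance an "unsafe" believer posts a contra statement

# If 80% of authors actually consider Tel-Aviv safe, the expected counts per 1000 authors:
authors = 1000
expected_pro = authors * 0.8 * p_post_given_thinks_safe       # 8 pro statements
expected_contra = authors * 0.2 * p_post_given_thinks_unsafe  # 10 contra statements

print(expected_pro, expected_contra)
# Even with a large "safe" majority, the observed counts come out 8 pro vs. 10 contra,
# so a majority vote over raw counts points the wrong way.
```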

Slide 21

System overview – dominant opinion

(Ibtin, Big city)
pro: 0, contra: 0

Can we use this to our advantage?

Slide 22

System overview – dominant opinion

Can we use this to our advantage?

Slide 23

System overview – dominant opinion

Can we use this to our advantage?

Big cities tend to be mentioned more often on the Web than small cities.

Slide 24

System overview – dominant opinion

Example: Taking skew and correlation into account improves the model.

Slide 25

System overview – dominant opinion

Conclusion I: Skew & correlation exist, and we can use them to our advantage.

Conclusion II: Skew and correlation are property- & type-specific.

Slide 26

System overview – Putting it together

Slide 27

Outline for Today’s Lecture

Motivation: the future of search is in structured data
Introduction to the Surveyor system
Getting into the details: extracting subjective properties from the Web and the polarity of statements; determining the dominant opinion of the authors of the Web
Experimental evaluation & conclusion

Slide 28

Getting into the details of Surveyor

The Surveyor system can be divided into two main algorithms:

Extracting evidence about entities from the Web. Evidence = a statement connecting an entity to a property.
Aggregating the evidence from the previous step and determining the dominant opinion.

Slide 29

Outline for Today’s Lecture

Motivation: the future of search is in structured data
Introduction to the Surveyor system
Getting into the details: extracting subjective properties from the Web and the polarity of statements; determining the dominant opinion of the authors of the Web
Experimental evaluation & conclusion

Slide 30

Extracting evidence – problem definition

Input:
Collection of annotated web documents
Knowledge base containing entities and their types

Output:
Set of tuples, each connecting an entity E to a subjective property P together with the polarity of the supporting statement

Slide 31

Extracting evidence - annotating documents

The Surveyor system receives as input a web snapshot that was preprocessed with NLP methods such as the Stanford Parser.

The output of such parsers is a dependency tree that represents the grammatical structure of the sentence.
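For illustration only, here is a rough sketch of extracting evidence from a dependency parse. The slides use the Stanford Parser; this sketch uses spaCy instead, and the simple subject-plus-adjective pattern (and the "for parking"-style restriction check) is my own simplification rather than Surveyor's actual pattern set.

```python
# A minimal sketch of pattern matching over a dependency parse (spaCy, not the
# Stanford Parser used by the paper; pattern is a simplification for illustration).
import spacy

nlp = spacy.load("en_core_web_sm")   # requires: python -m spacy download en_core_web_sm

def extract_evidence(sentence: str):
    doc = nlp(sentence)
    for token in doc:
        if token.dep_ == "acomp":                       # adjectival complement, e.g. "warm"
            verb = token.head                           # the copula/verb, e.g. "is"
            subjects = [c for c in verb.children if c.dep_ == "nsubj"]
            restrictions = [c for c in token.children if c.dep_ == "prep"]
            for subj in subjects:
                yield {
                    "entity": " ".join(t.text for t in subj.subtree),
                    "property": token.lemma_,
                    # a "for parking"-style subtree restricts the claim, so the
                    # filtering step would drop this piece of evidence
                    "restricted": bool(restrictions),
                }

print(list(extract_evidence("France is warm")))
print(list(extract_evidence("New York is bad for parking")))
```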

Slide 32

60 seconds on NLP

Natural language processing is a very interesting story. Unfortunately it is also a very long story, so we will only discuss it in a nutshell.

Slide 33

60 seconds on NLP

Natural language processing refers to the ability of computers to process text in a natural human language such as Hebrew, rather than an artificial language such as Java.

Slide 34

60 seconds on NLP

In order to do that we need to parse natural language text into a more formal representation; usually this representation is a tree.

Slide 35

60 seconds on NLP

One basic model uses a probabilistic CFG to create a parse tree that represents the formal structure of a given sentence.

Slide 36

60 seconds on NLP

Each derivation rule has a different probability, and the goal is to find the parse tree with the highest total probability.

A naïve algorithm has exponential running time, but using dynamic programming we can get polynomial complexity.
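A compact sketch of that dynamic-programming idea is the CKY algorithm: for a probabilistic CFG in Chomsky normal form, it fills a table of best parse probabilities per span in polynomial time. The toy grammar and probabilities below are invented for illustration.

```python
# CKY for a tiny probabilistic CFG in Chomsky normal form (toy grammar, assumed values).
from collections import defaultdict

# Binary rules: (left, right) -> list of (head, probability)
rules = {
    ("NP", "VP"): [("S", 1.0)],
    ("V", "ADJ"): [("VP", 1.0)],
}
# Lexical rules: word -> possible (nonterminal, probability)
lexicon = {
    "greece": [("NP", 1.0)],
    "is":     [("V", 1.0)],
    "warm":   [("ADJ", 0.6), ("NP", 0.4)],
}

def cky(words):
    n = len(words)
    best = defaultdict(dict)        # best[(i, j)][symbol] = best probability for words[i:j]
    for i, w in enumerate(words):
        for sym, p in lexicon[w]:
            best[(i, i + 1)][sym] = p
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for split in range(i + 1, j):
                for lsym, lprob in best[(i, split)].items():
                    for rsym, rprob in best[(split, j)].items():
                        for head, ruleprob in rules.get((lsym, rsym), []):
                            p = lprob * rprob * ruleprob
                            if p > best[(i, j)].get(head, 0.0):
                                best[(i, j)][head] = p
    return best[(0, n)].get("S", 0.0)

print(cky("greece is warm".split()))   # probability of the best S parse: 0.6
```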

Slide 37

Back to the paper

Slide 38

Extracting evidence - matching patterns

Red = tokens that together form the property.
Green = the entities.

Slide 39

Extracting evidence – filtering

New York is bad for parking
France is warm
Greece is a southern country

Slide 40

Extracting evidence – filtering

New York is bad for parking
France is warm
Greece is a southern country

Slide 41

Extracting evidence – filtering

New York is bad for parking
France is warm
Greece is a southern country

Solution: Check for subtrees that can represent restrictions.

Slide 42

Extracting evidence – filtering

New York is bad for parking
France is warm
Greece is a southern country

Solution: Check for subtrees that can represent restrictions.

Slide 43

Extracting evidence – filtering

New York is bad for parking
France is warm
Greece is a southern country

Solution: Check for subtrees that can represent restrictions.

Solution: Don’t allow co-reference to the same entity.

Slide 44

Extracting evidence – determine polarity

Slide 45

Outline for Today’s Lecture

Motivation: the future of search is in structured data
Introduction to the Surveyor system
Getting into the details: extracting subjective properties from the Web and the polarity of statements; determining the dominant opinion of the authors of the Web
Experimental evaluation & conclusion

Slide 46

Estimating the dominant opinion - problem definition

Input:
Knowledge base that links types to entities
Set of tuples (from stage 1), each connecting an entity E to a subjective property P together with the polarity of the supporting statement

Output:
The dominant opinion on whether P applies to E

Slide 47

Estimating the dominant opinion

As we saw previously, estimating the dominant opinion based on majority vote counting does not work very well.

We must take into consideration different types of biases.

Slide 48

Estimating the dominant opinion

Each property-type combination is associated with two probability distributions over the statement counters.

The first distribution represents the probability of the evidence given that the dominant opinion applies the property to the entity, whereas in the second, the dominant opinion does not apply.

Slide 49

Estimating the dominant opinion

Therefore, we assume that each evidence tuple was drawn from one of the two possible probability distributions.

If we know how to express those two probability distributions, then we can calculate for each evidence tuple the probability with which it was drawn from one distribution or the other.

Slide 50

Estimating the dominant opinion

Slide 51

Modeling user behavior

In order to model the probability of receiving a certain number of positive or negative statements, we must model the probability that a single user decides to issue a positive or negative statement.

Slide 52

Modeling user behavior

Slide 53

Modeling user behavior

Now, let’s write our model as a Bayesian network.

Slide 54

Modeling user behavior

Now, let’s write our model as a Bayesian network.

But first, what is a Bayesian network?

Slide 55

Bayesian networks - definition

From Wikipedia: a model that represents a set of random variables and their conditional dependencies via a directed acyclic graph.

Example nodes: Rain, Sprinkler, Grass wet

Slide 56

Bayesian networks - example

Slide 57

Bayesian networks - example

If we know that the grass is wet, we can calculate the probability that it was raining:

P(Rain = T | Grass wet = T) = Σ_Sprinkler P(Grass wet = T, Sprinkler, Rain = T) / Σ_Sprinkler,Rain P(Grass wet = T, Sprinkler, Rain)

Slide 58

Bayesian networks - example

We calculate each element in the sum by the tables:

P(Grass wet, Sprinkler, Rain) = P(Grass wet | Sprinkler, Rain) · P(Sprinkler | Rain) · P(Rain)

Slide 59

Bayesian networks - example

We calculate each element in the sum by the tables:

P(Grass wet, Sprinkler, Rain) = P(Grass wet | Sprinkler, Rain) · P(Sprinkler | Rain) · P(Rain)
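The same inference can be spelled out in a few lines of code. The conditional probability tables below are the standard illustrative values from the Wikipedia rain/sprinkler example, not necessarily the exact numbers shown on the slide.

```python
# Inference by enumeration on the rain/sprinkler/grass-wet network
# (CPT values assumed from the standard Wikipedia example).
from itertools import product

p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: {True: 0.01, False: 0.99},    # P(Sprinkler | Rain)
               False: {True: 0.4, False: 0.6}}
p_wet = {(True, True): 0.99, (True, False): 0.9,   # P(GrassWet=T | Sprinkler, Rain)
         (False, True): 0.8, (False, False): 0.0}

def joint(rain, sprinkler, wet=True):
    p = p_rain[rain] * p_sprinkler[rain][sprinkler]
    p_w = p_wet[(sprinkler, rain)]
    return p * (p_w if wet else 1 - p_w)

# P(Rain=T | GrassWet=T): sum over Sprinkler, normalised by P(GrassWet=T)
numerator = sum(joint(True, s) for s in (True, False))
evidence = sum(joint(r, s) for r, s in product((True, False), repeat=2))
print(round(numerator / evidence, 4))   # ~0.3577
```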

Slide 60

Modeling user behavior

Now we are ready to write our model as a Bayesian network.

Slide 61

Modeling user behavior

Network node: Dominant opinion

Slide 62

Modeling user behavior

Network nodes: Dominant opinion → User 1 opinion, …, User n opinion

Slide 63

Modeling user behavior

Network nodes: Dominant opinion → User 1 opinion, …, User n opinion → User 1 post opinion?, …, User n post opinion?

Slide 64

Modeling user behavior

Network nodes: Dominant opinion → User 1 opinion, …, User n opinion → User 1 post opinion?, …, User n post opinion? → # Pro statements, # Contra statements

Slide 65

Modeling user behavior

Network nodes: Dominant opinion → User 1 opinion, …, User n opinion → User 1 post opinion?, …, User n post opinion? → # Pro statements, # Contra statements

Given: # Pro statements, # Contra statements
To infer: Dominant opinion

Slide 66

Modeling user behavior

Legend: d = Dominant opinion, o = User’s Opinion, s = User makes Statement, c = Count

Slide 67

Modeling user behavior

Legend: d = Dominant opinion, o = User’s Opinion, s = User makes Statement, c = Count

Slide 68

Modeling user behavior

Legend: d = Dominant opinion, o = User’s Opinion, s = User makes Statement, c = Count

Slide 69

Modeling user behavior

Our goal is to compute P(d | c).

Legend: d = Dominant opinion, o = User’s Opinion, s = User makes Statement, c = Count

Slide 70

Modeling user behavior

Our goal is to compute P(d | c), which by Bayes is:

P(d | c) = P(c | d) · P(d) / P(c)

Legend: d = Dominant opinion, o = User’s Opinion, s = User makes Statement, c = Count

Slide 71

Modeling user behavior

Our goal is to compute P(d | c), which by Bayes is:

P(d | c) = P(c | d) · P(d) / P(c)

Because c is a deterministic function of s, we first solve P(s | d).

Legend: d = Dominant opinion, o = User’s Opinion, s = User makes Statement, c = Count

Slide 72

Modeling user behavior

P(s | d) = Σ_o P(s, o | d) = Σ_o P(s | o) · P(o | d)

Legend: d = Dominant opinion, o = User’s Opinion, s = User makes Statement, c = Count

And from the Bayesian network, we obtain each factor in this sum.

Slide 73

Modeling user behavior

The count variables c are obtained by summing up n variables s, each of which can be +, −, or neutral.

We assume that the variables s are independent for different users, since the chances that two randomly selected documents on the Web are authored by the same person are negligible.

This implies that c follows a Multinomial distribution with parameters n and p, where p is the probability that a single user issues a positive, negative, or no statement.

Slide 74

Modeling user behavior

And by assuming that n is very big compared to the statement counts, we can approximate the distribution as a product of two Poisson distributions, where each rate is n times the corresponding single-user probability.
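Putting the pieces together, here is a sketch of how the two Poisson distributions turn pro/contra counts into a posterior over the dominant opinion. The rates and the uniform prior are assumed values chosen to mimic the Tel-Aviv example, not parameters estimated by the paper.

```python
# Compare the likelihood of observed (pro, contra) counts under "dominant opinion
# is +" vs. "dominant opinion is -" (Poisson rates below are assumptions).
from math import exp, factorial

def poisson(k, lam):
    return lam ** k * exp(-lam) / factorial(k)

# Expected number of pro/contra statements per entity, by hypothesis (assumed values)
rates = {
    "+": {"pro": 9.0, "contra": 12.0},   # even when the dominant opinion is '+',
    "-": {"pro": 2.0, "contra": 15.0},   # contra statements are posted more readily
}
prior = {"+": 0.5, "-": 0.5}

def posterior(pro, contra):
    like = {d: poisson(pro, r["pro"]) * poisson(contra, r["contra"]) * prior[d]
            for d, r in rates.items()}
    z = sum(like.values())
    return {d: v / z for d, v in like.items()}

print(posterior(pro=5, contra=10))   # pro: 5, contra: 10 can still favour '+'
```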

Slide 75

Modeling user behavior

The bottom line is this: if we know the model parameters, then we can use our Bayesian network to compute two expressions that depend on the pro and contra counts.

The first represents the probability distribution when the dominant opinion is positive, whereas the second represents the probability distribution when the dominant opinion is negative.

Slide 76

Modeling user behavior

Example: If we assume that the agreement parameter is relatively high, and that the probability of posting a statement when the dominant opinion is ‘+’ is significantly higher than when it is ‘-’, we get the two distributions that we saw in the big-cities example.

Slide 77

Estimating model parameters

But how do we choose the model parameters?

Slide 78

Estimating model parameters

But how do we choose the model parameters?

Well, it is a kind of maximum likelihood problem.

Slide 79

Maximum Likelihood - Reminder

A method for estimating the parameters of a statistical model given a set of sample data.

Example: Estimating a Bernoulli parameter.

Assumption: the samples are independent draws from a Bernoulli(p) distribution.

Input: a set of samples x_1, …, x_n.
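As a tiny refresher, here is the Bernoulli case in code; the sample data is made up for illustration.

```python
# Maximum-likelihood estimate for a Bernoulli parameter: with i.i.d. samples,
# maximising L(p) = prod p^x_i (1-p)^(1-x_i) gives p_hat = (# ones) / n.
samples = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]   # hypothetical coin flips

p_hat = sum(samples) / len(samples)
print(p_hat)   # 0.7
```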

Slide 80

Estimating model parameters

But in our case we don’t have all the sample data.

Slide 81

Estimating model parameters

But in our case we don’t have all the sample data.

Slide 82

Estimating model parameters - definitions

E = set of tuples with positive/negative evidence counts
D = vector of random variables representing the possible dominant opinion for each entity
The vector of parameters of our Bayesian network (which we are trying to estimate)

Slide 83

Estimating model parameters

Slide 84

Estimating model parameters

Slide 85

Estimating model parameters

Line 6: done using the Bayesian network we defined previously.
Line 7: done by a special ML method that takes into account all possible sample data sets; each data set is weighted using the probabilities from the previous step.

Slide 86

Estimating model parameters - definitions

The “expectation maximization” function which we want to maximize: an expansion of the ML function that considers all the data sets, each one with a different weight.

The weight function is the probability function that we computed in the previous step.
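To make this loop concrete, here is a minimal EM sketch on the simplified two-Poisson model from above: the per-entity dominant opinions are the hidden variables and the Poisson rates are the parameters. The data, initialization, and update rules are illustrative assumptions, not the paper's exact algorithm.

```python
# Minimal EM sketch: mixture of two (pro, contra) Poisson components, where the
# hidden variable per entity is its dominant opinion (all values assumed).
from math import exp, factorial

counts = [(9, 3), (11, 2), (1, 8), (0, 7), (6, 4)]   # hypothetical (pro, contra) per entity

def poisson(k, lam):
    return lam ** k * exp(-lam) / factorial(k)

prior_pos = 0.5
rates = {"+": (6.0, 4.0), "-": (2.0, 6.0)}   # asymmetric initial guess to break symmetry

for _ in range(50):
    # E-step: posterior probability that each entity's dominant opinion is '+'
    resp = []
    for pro, con in counts:
        lp = prior_pos * poisson(pro, rates["+"][0]) * poisson(con, rates["+"][1])
        ln = (1 - prior_pos) * poisson(pro, rates["-"][0]) * poisson(con, rates["-"][1])
        resp.append(lp / (lp + ln))
    # M-step: weighted maximum likelihood (Poisson MLE = weighted mean of the counts)
    w_pos = sum(resp)
    w_neg = len(counts) - w_pos
    prior_pos = w_pos / len(counts)
    rates["+"] = (sum(r * p for r, (p, c) in zip(resp, counts)) / w_pos,
                  sum(r * c for r, (p, c) in zip(resp, counts)) / w_pos)
    rates["-"] = (sum((1 - r) * p for r, (p, c) in zip(resp, counts)) / w_neg,
                  sum((1 - r) * c for r, (p, c) in zip(resp, counts)) / w_neg)

print(round(prior_pos, 2), {d: tuple(round(x, 2) for x in r) for d, r in rates.items()})
```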

Slide 87

Estimating model parameters

Problem: exponential number of terms.

Slide 88

Estimating model parameters

Problem: exponential number of terms.

Thus, we will work with a summarized version that is linear in the number of entities.

Slide 89

Estimating model parameters

Last step - differentiate and set to 0.

Slide 90

Estimating model parameters

Last step - differentiate and set to 0.

I will leave it for you as a home exercise.

Slide 91

Estimating model parameters

Last step - differentiate and set to 0.

I will leave it for you as a home exercise.

Submit it to Schreiber, cell 777.

Slide 92

Outline for Today’s Lecture

Motivation: the future of search is in structured data
Introduction to the Surveyor system
Getting into the details: extracting subjective properties from the Web and the polarity of statements; determining the dominant opinion of the authors of the Web
Experimental evaluation & conclusion

Slide 93

Experimental evaluation

Surveyor was applied on a 40 TB annotated web snapshot.
The data processing pipeline was executed on a large cluster (5,000 nodes) and took 2 hours.
It inferred the dominant opinion for over 4 billion entity-property pairs.

Slide 94

Statistics

Surveyor was applied on a 40 TB annotated web snapshot.
The data processing pipeline was executed on a large cluster (5,000 nodes) and took 2 hours.
It inferred the dominant opinion for over 4 billion entity-property pairs.

Slide 95

Experiment against AMT workers

Selected 500 entity-property pairs: 5 types × 20 entities × 5 properties.
Compared against 20 AMT workers.
Each worker was asked about each of the 500 entity-property pairs, for a total of 10,000 opinions.

Slide 96

Experiment against AMT workers

Slide 97

Experiment against AMT workers

Slide 98

Conclusions

Introduced the new problem of “Subjective Property Mining”.
There is a need for a special type of system to solve this problem.
Introduced the Surveyor system.

Slide 99