WEB FORUM MINING BASED ON USER SATISFACTION

Slide1

WEB FORUM MINING BASED ON USER SATISFACTION

By: Suresh Pokharel
Information and Communications Technologies
Asian Institute of Technology

Committee:
Dr. Sumanta Guha (Chairperson)
Prof. Phan Minh Dung
Assoc. Prof. Tapio J. Erke

May 2010

Slide2

Agenda

Introduction
Problem Statement
Objectives
Methodology
Implementation
Results and Discussion
Conclusion and Future Work
Demonstration

Slide3

Introduction

Internet-Forum or Message Board

Online Discussion Site

Asynchronous

People participating in an Internet forum may cultivate social bonds, and interest groups for a topic may form from the discussions.


Slide4

Introduction

Figure 1: Organization of Threads

Slide5

Introduction

Table 1: An example of a thread

Title: Software for Ubuntu.

Post No. | Post | User | Category
1 | I am really new to Linux, where can I find software for Ubuntu? | avacomputers | Question
2 | Applications>Add/remove http://www.getdeb.net/ If you are using Ubuntu Firefox, you can also use http://allmyapps.com/ | danielrmt | Answer
3 | http://www.getdeb.net | theozzlives | Answer
4 | something else I was reading about is a port from OzOs (http://www.cafelinux.org/OzOs/) called apt:foo | devildoc5 | Answer
5 | or the easy way >.> CLICK SYSTEM > PREFERENCES > SYNAPTIC PACKAGE MANAGER there you can search all kinds of games, software! anything you like, thousands to choose from! They download and install easily onto your system | Codix121 | Answer
6 | Thanks GUys. | avacomputers | Answer

Callouts: Questioner (post 1), Repliers (posts 2 to 5), Questioner (post 6, the questioner post)

Slide7

Problem Statement

Which forum may have a solution?

Lots of forums... Oops.

I don't want to test all forums...

Slide9

Objectives

To categorize a post as a question post or an answer post.

To classify a thread as answered or unanswered based on the questioner's satisfaction and forum features.

To predict a solution post based on the interaction and satisfaction of the questioner.

Slide11

Methodology: Framework of Study

Figure 4: Framework of Study

Slide12

Methodology: Sentence Classification

Figure 5: Sentence Classification

Example

Slide13

Methodology: Sentence Classification

Labeled Sequential Patterns (LSPs): a pattern p has the form <LHS, c>.

LHS is a sequence <a1, ..., am>, where each ai is called an "item".
c is a class label (question/non-question).

A sequence A = <a, b, c, d, e, f, g, h> has a subsequence B = <b, d, e, g>, so A contains B.
An LSP p1 is contained by p2 if the sequence p1.LHS is contained by p2.LHS and p1.c = p2.c.

Example:
t1 = (<a, d, e, f>, Q)
t2 = (<a, f, e, f>, Q)
t3 = (<d, a, f>, NQ)

1) LSP p1 = (<a, e, f>, Q) is contained in t1 and t2.
   sup(p1) = 2/3 = 66.7%, conf(p1) = (2/3)/(2/3) = 100%

2) LSP p2 = (<a, f>, Q)
   sup(p2) = 2/3 = 66.7%, conf(p2) = (2/3)/(3/3) = 66.7%
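The support and confidence figures in the LSP example can be reproduced with a short Python sketch. It assumes that sup(p) is the fraction of labeled sequences that contain p.LHS and carry the same class label, and that conf(p) is sup(p) divided by the fraction of sequences that contain p.LHS under any label; this reading matches the numbers on the slide but is not spelled out there. The function names are illustrative only.

```python
from fractions import Fraction


def contains(sequence, pattern):
    """True if `pattern` is a subsequence of `sequence` (order kept, gaps allowed)."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so order is respected.
    return all(item in remaining for item in pattern)


def support(lsp, database):
    """Fraction of tuples whose sequence contains the LHS and whose label matches."""
    lhs, label = lsp
    hits = sum(1 for seq, c in database if c == label and contains(seq, lhs))
    return Fraction(hits, len(database))


def confidence(lsp, database):
    """Support of the LSP divided by the support of its LHS under any label."""
    lhs, _ = lsp
    lhs_hits = sum(1 for seq, _c in database if contains(seq, lhs))
    if lhs_hits == 0:
        return Fraction(0)
    return support(lsp, database) / Fraction(lhs_hits, len(database))


# The three labeled sequences from the example (Q = question, NQ = non-question).
database = [
    (("a", "d", "e", "f"), "Q"),
    (("a", "f", "e", "f"), "Q"),
    (("d", "a", "f"), "NQ"),
]
p1 = (("a", "e", "f"), "Q")
p2 = (("a", "f"), "Q")

for name, p in (("p1", p1), ("p2", p2)):
    print(name, float(support(p, database)), float(confidence(p, database)))
```

Exact fractions are used so that 2/3 divided by 3/3 stays 2/3 rather than a rounded float.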

Slide14

Methodology: Sentence Classification

Mining LSPs
Word length of sequence: 4
Minimum support set at 0.01% and minimum confidence at 95%

Converting to features
LSP
5W1H word
Auxiliary Verb
Question Mark

The corresponding feature is set to 1 if a sentence includes an LSP or a question mark, or starts with a 5W1H word or an auxiliary verb.
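The conversion of a sentence into the four binary features can be sketched in Python as below. This is a minimal illustration under several assumptions: the word lists, the simple regular-expression tokenizer, and the sample pattern are hypothetical and do not come from the slides, and patterns are matched on lowercase word tokens only. The original system used OpenNLP for tokenization and tagging.

```python
import re

# Assumed word lists; the slides do not enumerate them.
FIVE_W_ONE_H = {"what", "when", "where", "who", "why", "how"}
AUXILIARY_VERBS = {
    "is", "are", "was", "were", "am", "do", "does", "did", "can", "could",
    "will", "would", "shall", "should", "may", "might", "must",
    "have", "has", "had",
}


def tokenize(sentence):
    """Lowercase word tokens plus '?' as its own token (a crude stand-in tokenizer)."""
    return re.findall(r"[a-z0-9']+|[?]", sentence.lower())


def contains(sequence, pattern):
    """True if `pattern` is a subsequence of `sequence` (order kept, gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def sentence_features(sentence, lsp_patterns):
    """Return the four 0/1 features: LSP, 5W1H word, auxiliary verb, question mark."""
    tokens = tokenize(sentence)
    words = [t for t in tokens if t != "?"]
    first = words[0] if words else ""
    return {
        "lsp": int(any(contains(words, p) for p in lsp_patterns)),
        "5w1h": int(first in FIVE_W_ONE_H),   # sentence starts with a 5W1H word
        "aux": int(first in AUXILIARY_VERBS),  # sentence starts with an auxiliary verb
        "qm": int("?" in tokens),
    }


# Hypothetical mined pattern, at most 4 items long as on the slide.
patterns = [("where", "can", "i")]
print(sentence_features("Where can I find software for Ubuntu?", patterns))
print(sentence_features("Thanks guys.", patterns))
```

The resulting 0/1 vector is the kind of input an SVM classifier would be trained on.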

Slide15

Methodology: Thread Classification

Figure 6: Classification of Thread

Slide16

Methodology: Questioner Post Classification

Derive Features:
Satisfied Phrase
Unsatisfied Phrase

Slide17

Methodology: Questioner Post Classification

Derive Features:
Question Present
Presence of More Post

Slide18

Methodology: Questioner Post Classification

Derive Features:
Satisfied Post Length
Presence of Emoticons
Happy emoticon: mood of satisfaction
Unhappy emoticon: mood of dissatisfaction

Slide19

Methodology: Questioner Post Classification

Derive Features:
Original Post
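The derived features for a questioner post (satisfied and unsatisfied phrases, question present, presence of more posts, post length, emoticons, original post) might be computed along the following lines. This Python sketch rests on assumptions: the phrase and emoticon lists are hypothetical, "more post" is read as "further posts follow this one in the thread", "original post" is read as "this is the first post of the thread", and a question is detected only by a question mark rather than by the sentence classifier.

```python
# Hypothetical cue lists; the slides do not enumerate them.
SATISFIED_PHRASES = ("thanks", "thank you", "it works", "solved")
UNSATISFIED_PHRASES = ("not working", "doesn't work", "does not work")
HAPPY_EMOTICONS = (":)", ":-)")
UNHAPPY_EMOTICONS = (":(", ":-(")


def has_any(text, cues):
    """1 if any cue occurs as a substring of `text`, else 0."""
    return int(any(cue in text for cue in cues))


def questioner_post_features(posts, index):
    """Feature dictionary for posts[index]; `posts` holds post texts in thread order.

    Keys follow the abbreviations used in the results table (SW, UW, Ques, ...).
    """
    text = posts[index].lower()
    return {
        "SW": has_any(text, SATISFIED_PHRASES),
        "UW": has_any(text, UNSATISFIED_PHRASES),
        "Ques": int("?" in text),           # crude stand-in for the sentence classifier
        "MP": int(index < len(posts) - 1),  # assumed meaning: more posts follow
        "WC": len(text.split()),            # post length in whitespace-separated words
        "HE": has_any(text, HAPPY_EMOTICONS),
        "UE": has_any(text, UNHAPPY_EMOTICONS),
        "OP": int(index == 0),              # assumed meaning: first post of the thread
    }


thread = [
    "I am really new to Linux, where can I find software for Ubuntu?",
    "http://www.getdeb.net",
    "Thanks GUys.",
]
print(questioner_post_features(thread, 0))
print(questioner_post_features(thread, 2))
```

Substring matching is deliberately simple here; a real system would need tokenization to avoid accidental matches inside longer words.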

Slide20

Methodology: Questioner Post Classification

Figure 7: Classification of Questioner Post

Slide21

Methodology: Predict Solution Posts

Find Solution Post:
Presence of Quote
Presence of User Name

Slide22

Methodology: Predict Solution Posts

Solution Posts
May be between the questioner posts

QP  R  R  SP
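One possible way to use the two cues named on these slides (presence of a quote, presence of a user name) is sketched below in Python. The slides only name the cues, so the selection order, the fallback to all intervening replies, the `Post` structure, and the sample thread are all assumptions made for illustration, not the author's algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Post:
    number: int                        # 1-based position in the thread
    user: str
    text: str
    quoted_post: Optional[int] = None  # number of the post this one quotes, if any


def candidate_solution_posts(thread: List[Post], qp_number: int) -> List[int]:
    """Return post numbers of likely solution posts for a satisfied questioner post."""
    questioner = thread[0].user
    qp = next(p for p in thread if p.number == qp_number)
    # Replies lying between the questioner's previous post and this one.
    previous_qp = max(
        (p.number for p in thread if p.user == questioner and p.number < qp_number),
        default=0,
    )
    replies = [
        p for p in thread
        if previous_qp < p.number < qp_number and p.user != questioner
    ]
    # Cue 1: the questioner post quotes one of the replies.
    quoted = [p.number for p in replies if p.number == qp.quoted_post]
    if quoted:
        return quoted
    # Cue 2: the questioner post mentions a replier's user name.
    named = [p.number for p in replies if p.user.lower() in qp.text.lower()]
    if named:
        return named
    # Assumed fallback: every reply between the two questioner posts.
    return [p.number for p in replies]


# Hypothetical thread for illustration.
thread = [
    Post(1, "asker", "Where can I find software?"),
    Post(2, "helper_one", "Try the package manager."),
    Post(3, "helper_two", "Or download it from the project site."),
    Post(4, "asker", "Thanks helper_one, that worked."),
]
print(candidate_solution_posts(thread, 4))
```

When neither cue is present, the sketch returns all intervening replies, which reflects the slide's note that solution posts may lie between the questioner posts.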

Slide23

Agenda

Introduction
Problem Statement
Objectives
Scope and Limitation
Methodology
Implementation
Results and Discussion
Conclusion and Future Work
Demonstration

Slide24

Implementation

Dataset
Forum: Ubuntu (http://ubuntuforums.org/)
Sentence classification: datasets of 100, 200 and 300 from 3000 sentences
Questioner post classification: 250 posts from 79 threads
Manual evaluation: 100 threads, by two teams of five people each

Tools and Language
POS tagging, tokenization, sentence detection: OpenNLP
Classifier: Support Vector Machine (LibSVM, SMO)
Model: the SVM classifier model is trained using LibSVM
Language: Java

Slide26

Results and Discussion: Sentence Classification Comparison

Table 3: Accuracy of Sentence Classification by using LSP+ by Class

Features | Accuracy | Recall | Precision | F-Measure
WH word (5W1H) | 0.59 | 0.59 | 0.77 | 0.67
Question Mark (QM) | 0.86 | 0.86 | 0.88 | 0.87
Auxiliary Verb (Aux) | 0.66 | 0.66 | 0.78 | 0.72
5W1H + QM + Aux | 0.88 | 0.88 | 0.89 | 0.88
Labeled Sequential Pattern (LSP) | 0.94 | 0.94 | 0.94 | 0.94
QM+Aux+LSP+5W1H (LSP+) | 0.96 | 0.96 | 0.96 | 0.96

All results were obtained from 10-fold cross-validation.
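The F-Measure column appears to be the balanced harmonic mean of precision and recall (F1); the slides do not state the formula, so this is an assumption, although it agrees with the rows checked below. A minimal Python sketch:

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall taken from two rows of the sentence classification table.
rows = {
    "WH word (5W1H)": (0.77, 0.59),
    "Question Mark (QM)": (0.88, 0.86),
}
for name, (p, r) in rows.items():
    print(name, round(f_measure(p, r), 2))
```

Because the published precision and recall are already rounded to two decimals, recomputed values for other rows can differ from the table in the last digit.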

Slide27

Results and Discussion: QP Classification Comparison

Table 5: Questioner Post Classification Comparison using Different Features

Features | Precision | Recall | F-Measure
Satisfied Words (SW) | 0.78 | 0.78 | 0.78
Unsatisfied Words (UW) | 0.72 | 0.72 | 0.72
Question (Ques) | 0.84 | 0.85 | 0.84
More Post (MP) | 0.83 | 0.83 | 0.83
Word Count (WC) | 0.70 | 0.58 | 0.63
Happy Emoticon (HE) | 0.62 | 0.55 | 0.58
Unhappy Emoticon (UE) | 0.76 | 0.56 | 0.64
Original Post (OP) | 0.83 | 0.76 | 0.79
Ques+MP | 0.84 | 0.84 | 0.84
Ques+OP | 0.86 | 0.86 | 0.86
MP+UE+OP | 0.88 | 0.88 | 0.88
Ques+MP+OP | 0.85 | 0.85 | 0.85
SW+UW+Ques | 0.83 | 0.82 | 0.83
SW+UW+Ques+MP+WC+HE+UE+OP | 0.91 | 0.91 | 0.91

Slide28

Results and Discussion: Comparison with Team's Evaluation

Table 6: Comparison of System Result with Manual Evaluation for thread classification

Performance | Accuracy | Recall | Precision | F-Measure
System with Team A | 0.84 | 0.79 | 0.82 | 0.80
System with Team B | 0.79 | 0.73 | 0.78 | 0.75
Average | 0.81 | 0.76 | 0.80 | 0.78

Table 7: System Accuracy for Prediction of Solution Posts

Accuracy | Recall | Precision | F-Measure
System Accuracy with Team A | 0.45 | 0.65 | 0.54
System Accuracy with Team B | 0.43 | 0.65 | 0.52
Average System Accuracy | 0.44 | 0.65 | 0.53

Slide30

Conclusion and Future Work: Conclusion

Finding answered threads in a web forum is achieved by tracing user satisfaction.
Threads and sentences are classified by deriving different features.
The performance of the system increases when different features are combined.

Slide31

Conclusion and Future Work: Future Work

It can be used for query-based ranking of threads.
It can be used for extracting answer sentences with better accuracy.
The performance of the system can be increased by incorporating semantics.

Slide32

Demo
