Uploaded On 2017-08-19

Presentation Transcript

Slide1

Concept-Based Analysis of Scientific Literature

Chen-Tse Tsai, Gourab Kundu, Dan Roth

CS @ UIUC

Slide2

Understanding Research Communities

Consider the following questions:
    What are the key applications studied by the community?
    What applications have matured enough to be used as a technique for other applications?
    What methods were developed to solve a particular problem?

In this paper:
    Extract concepts from scientific papers.
    A concept is a cluster of possible mentions: {svm, support vector machines, maximal margin classifiers, …}
    Analyze computational linguistics research by answering the above questions.

Slide3

Outline

Computational Approach
    Concept Mention Extraction
    Citation-Context based Concept Clustering
Evaluation of Algorithms
Understanding Computational Linguistic Research

Slide4

Concept Mention Extraction

Identify and categorize mentions of concepts (Gupta and Manning, 2011)
    Categories: TECHNIQUE and APPLICATION
    Example: “We apply support vector machines on text classification.”

Unsupervised bootstrapping algorithm (Yarowsky, 1995; Collins and Singer, 1999)

The proposed algorithm:
    Extract noun phrases (Punyakanok and Roth, 2001).
    For each category, initialize a decision list with seeds.
    For several rounds:
        Annotate NPs using the decision lists.
        Extract top features from the newly annotated phrases and add them to the decision lists.
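The bootstrapping loop above can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: the feature set (words of the phrase plus the word to its left), the seed rules, the example phrases, and the use of unordered rule sets with match counts in place of a ranked decision list are all assumptions made for the sake of a small runnable example.

```python
from collections import Counter

# Hypothetical (noun phrase, word to its left) pairs; not data from the paper.
PHRASES = [
    ("support vector machines", "apply"),
    ("text classification", "on"),
    ("support vector machines", "using"),
    ("document classification", "for"),
]
# Hypothetical seed rules, one small set per category.
SEEDS = {"TECHNIQUE": {"left=apply"}, "APPLICATION": {"left=on"}}


def features(np_text, left_word):
    """Assumed features: each word of the noun phrase plus the word to its left."""
    feats = {f"word={w}" for w in np_text.lower().split()}
    feats.add(f"left={left_word.lower()}")
    return feats


def annotate(phrases, decision_lists):
    """Label a phrase when exactly one category matches the most rules."""
    labels = {}
    for phrase in phrases:
        feats = features(*phrase)
        scores = sorted(
            ((len(feats & rules), cat) for cat, rules in decision_lists.items()),
            reverse=True,
        )
        if scores[0][0] > 0 and (len(scores) == 1 or scores[0][0] > scores[1][0]):
            labels[phrase] = scores[0][1]
    return labels


def bootstrap(phrases, seeds, rounds=3, top_k=2):
    """Alternate between annotating phrases and growing each rule set."""
    decision_lists = {cat: set(rules) for cat, rules in seeds.items()}
    labels = {}
    for _ in range(rounds):
        labels = annotate(phrases, decision_lists)
        counts = {cat: Counter() for cat in decision_lists}
        for phrase, cat in labels.items():
            counts[cat].update(features(*phrase))
        for cat, counter in counts.items():
            # New rules: features seen only with this category, most frequent first.
            candidates = sorted(
                (f for f in counter
                 if f not in decision_lists[cat]
                 and all(f not in counts[other] for other in counts if other != cat)),
                key=lambda f: (-counter[f], f),
            )
            decision_lists[cat].update(candidates[:top_k])
    return labels, decision_lists


labels, decision_lists = bootstrap(PHRASES, SEEDS)
for phrase, category in labels.items():
    print(phrase, "->", category)
```

In this toy run the seeds label only two phrases in the first round; the rules learned from them then label the remaining two in the second round.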

Slide5

Citation-Context Based Concept Clustering (CitClus)

Cluster mentions into semantically coherent concepts:
    Group concept mentions by citation context.
    Merge clusters based on lexical similarity between mentions in the clusters.

[Figure: four papers (Paper1 to Paper4) containing the mentions "support vector machine", "c4.5", "svm-based classification", "decision trees", "maximal margin classifiers", and "svm", each appearing near citations to (Cortes, 1995), (Vapnik, 1995), or (Quinlan, 1993). Resulting clusters: {c4.5, decision trees} and {support vector machine, svm-based classification, svm, maximal margin classifiers}]

Slide6
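The two CitClus steps from Slide5 (group mentions by citation context, then merge clusters by lexical similarity) can be sketched as below. The slides do not give the exact definitions, so several things here are assumptions: each mention is paired with a single nearby citation, lexical similarity is token-level Jaccard overlap, two clusters merge when their most similar pair of mentions reaches a hypothetical threshold of 0.3, and the pairing of mentions to citations in `MENTIONS` is invented for illustration.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical (mention, nearby citation) pairs, loosely modeled on the Slide5 figure.
MENTIONS = [
    ("support vector machine", "Cortes,1995"),
    ("c4.5", "Quinlan,1993"),
    ("svm-based classification", "Vapnik,1995"),
    ("decision trees", "Quinlan,1993"),
    ("maximal margin classifiers", "Vapnik,1995"),
    ("svm", "Cortes,1995"),
]


def tokens(mention):
    """Lowercased alphanumeric tokens of a mention."""
    return set(re.findall(r"[a-z0-9]+", mention.lower()))


def jaccard(a, b):
    """Token-level Jaccard similarity between two mentions (assumed measure)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def cluster_similarity(c1, c2):
    """Similarity of the most similar pair of mentions across two clusters."""
    return max(jaccard(a, b) for a in c1 for b in c2)


def citclus(mentions, threshold=0.3):
    # Step 1: group mentions that share a citation context.
    by_citation = defaultdict(set)
    for mention, citation in mentions:
        by_citation[citation].add(mention)
    clusters = [set(group) for group in by_citation.values()]

    # Step 2: repeatedly merge any two clusters that are lexically similar.
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            if cluster_similarity(clusters[i], clusters[j]) >= threshold:
                clusters[i] |= clusters[j]
                del clusters[j]
                merged = True
                break
    return clusters


clusters = citclus(MENTIONS)
for cluster in clusters:
    print(sorted(cluster))
```

With this toy input, the citation step alone already puts "c4.5" with "decision trees", which share no tokens; the lexical step then joins the two SVM-related groups through the shared token "svm".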

Outline

Computational Approach
    Concept Mention Extraction
    Citation-Context based Concept Clustering
Evaluation of Algorithms
Understanding Computational Linguistic Research

Slide7

Evaluation of Mention Extraction

ACL Anthology Network Corpus (Radev et al., 2009)
    Training data: 11,005 abstracts
    Test data: 474 abstracts (Gupta and Manning, 2011)
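The F1 scores reported for this evaluation are the harmonic mean of precision and recall, F1 = 2PR / (P + R). As a quick consistency check, the snippet below recomputes F1 from the precision and recall values in this slide's results table; the dictionary layout and names are only for illustration.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1), copied from this slide's results table.
REPORTED = {
    ("GM 2011", "Technique"): (30.5, 46.7, 36.9),
    ("GM 2011", "Application"): (27.6, 57.5, 37.3),
    ("Our approach", "Technique"): (48.2, 48.8, 48.5),
    ("Our approach", "Application"): (44.0, 47.3, 45.6),
}

for (approach, category), (p, r, reported) in REPORTED.items():
    print(f"{approach:12s} {category:11s} computed {f1(p, r):.1f}  reported {reported}")
```

Each recomputed value agrees with the reported one to one decimal place.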

Approach        Technique                 Application
                Pre.    Rec.    F1        Pre.    Rec.    F1
GM 2011         30.5    46.7    36.9      27.6    57.5    37.3
Our approach    48.2    48.8    48.5      44.0    47.3    45.6

Slide8

Evaluation of Concept Clustering

Manually cluster the extracted mentions from 1000 full-text papers.
    CitClus: the proposed approach
    LexClus: group the concept mentions by lexical similarity

CitClus groups:
    “maximal entropy classifier” and “logistic classifier”
    “topic modeling” and “latent dirichlet allocation”

Approach    Technique    Application
LexClus     1.72         1.62
CitClus     1.28         1.49

Slide9

Outline

Computational Approach
    Concept Mention Extraction
    Citation-Context based Concept Clustering
Evaluation of Algorithms
Understanding Computational Linguistic Research

Slide10

Trends Analysis

[Figure: trend plots comparing CitClus, LexClus, and LDA. Panels: The emergence of SVM; The emergence of Topic modeling]

Topic modeling is high in the 90’s because LDA cannot generate a tight enough cluster for a specific concept.

Slide11

Predictive Quality

For a concept, predict the number of papers in a year, given the number of papers in the previous three years.
    Linear regression over every three consecutive years.
    The better the grouping of mentions into coherent concepts is, the more stable the trend graph is.
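One way to read this setup is sketched below: fit a least-squares line to the paper counts of three consecutive years, extrapolate one year ahead, and compare with the actual count. This is an interpretation, not the authors' code. The slide does not name the error measure, so the mean relative error used here is an assumption, and both yearly series are made-up numbers.

```python
def predict_next(counts):
    """Fit y = a + b*x by least squares to the given points and extrapolate one step."""
    n = len(counts)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * n


def mean_relative_error(series, window=3):
    """Average |predicted - actual| / actual over every window of consecutive years.

    Assumes all actual counts are positive.
    """
    errors = []
    for i in range(window, len(series)):
        predicted = predict_next(series[i - window:i])
        errors.append(abs(predicted - series[i]) / series[i])
    return sum(errors) / len(errors)


# Hypothetical yearly paper counts for one concept under two clusterings.
smooth_trend = [10, 12, 14, 16, 18]   # coherent cluster: steady growth
noisy_trend = [10, 30, 5, 40, 8]      # incoherent cluster: erratic counts

print(mean_relative_error(smooth_trend))
print(mean_relative_error(noisy_trend))
```

Under this reading, a lower error means a more stable trend, which matches the slide's claim that better grouping of mentions yields a more predictable trend graph.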

Approach    SVM     Decision Tree    Topic Modeling    Sentiment Analysis
LexClus     0.97    0.83             0.73              0.48
CitClus     0.52    0.37             0.37              0.46

Slide12

Relations Between Concept Categories

For a given concept, calculate the ratio between the number of application mentions and the number of technique mentions.

Three concepts in the ACL community: Support vector machines, Machine translation, POS tagging
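A minimal sketch of the ratio computation, assuming each extracted mention has already been resolved to a concept and tagged with a year and a category. The per-year breakdown, the record layout, and the sample records are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical (concept, year, category) records; not data from the paper.
MENTION_RECORDS = [
    ("svm", 2000, "TECHNIQUE"),
    ("svm", 2000, "TECHNIQUE"),
    ("svm", 2000, "APPLICATION"),
    ("svm", 2001, "TECHNIQUE"),
    ("machine translation", 2000, "APPLICATION"),
]


def category_ratio(records, concept, numerator, denominator):
    """Per-year ratio of numerator-category to denominator-category mentions."""
    counts = Counter((year, cat) for c, year, cat in records if c == concept)
    years = sorted({year for year, _ in counts})
    return {
        year: counts[(year, numerator)] / counts[(year, denominator)]
        for year in years
        if counts[(year, denominator)] > 0  # skip years with an undefined ratio
    }


svm_ratio = category_ratio(MENTION_RECORDS, "svm", "APPLICATION", "TECHNIQUE")
print(svm_ratio)
```

Swapping the last two arguments gives the inverse ratio (#tech/#app), which is the form the figure uses for Machine translation and POS tagging.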

[Figure: ratio plots. Panels: SVM, #app/#tech; MT, #tech/#app; POS tagging, #tech/#app]

Slide13

Relations Between Concept Categories

For a given application, what techniques have been applied to it?
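The slide does not say how techniques are linked to an application. One simple possibility, shown here purely as an assumption, is to count the technique concepts that occur in the same papers as the application, year by year. The paper records below, including their years, are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical paper records; years and contents are made up.
PAPERS = [
    {"year": 1998, "applications": {"named entity recognition"},
     "techniques": {"decision tree"}},
    {"year": 2005, "applications": {"named entity recognition"},
     "techniques": {"crf"}},
    {"year": 2005, "applications": {"machine translation"},
     "techniques": {"phrase-based", "mert"}},
]


def techniques_for(papers, application):
    """Count, per year, the techniques co-occurring with the given application."""
    by_year = defaultdict(Counter)
    for paper in papers:
        if application in paper["applications"]:
            by_year[paper["year"]].update(paper["techniques"])
    return {year: dict(counter) for year, counter in sorted(by_year.items())}


ner_techniques = techniques_for(PAPERS, "named entity recognition")
print(ner_techniques)
```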

[Figure: techniques applied to an application over time. Panels: Machine translation; Named entity recognition. Annotations: Phrase-based and MERT; Decision Tree; Decision Tree disappears; CRF]

Slide14

Conclusion

This work proposed algorithms for identifying, categorizing, and clustering mentions of scientific concepts.
These tools can provide a rather deep understanding of, and useful insight into, research communities.