Word AdHoc Network: Using Google Core Distance to extract the most relevant information. Presenter: Wei-Hao Huang. Authors: Ping-I Chen, Shi-Jen Lin. KBS 2010.
Slide1
Word AdHoc Network: Using Google Core Distance to extract the most relevant information
Presenter: Wei-Hao Huang
Authors: Ping-I Chen, Shi-Jen Lin
KBS 2010
Slide2
Outlines
Motivation
Objectives
Methodology
Experiments
Conclusions
Comments
Slide3
Motivation
Most previous research methods need predictive models, which are based on training data or on Web logs of the users' browsing behaviors.
Those models are complex, and the keyword extraction methods are limited to certain areas.
Slide4
Objectives
To present a new algorithm called "Word AdHoc Network" (WANET).
This method needs no pre-processing, and all the executions are real-time.
To extract any keyword sequence from various knowledge domains.
Document -> WANET System -> Relevant Documents
Slide5
Methodology
Word AdHoc Network System Architecture
1-gram filtering method
  Part-of-speech
  Length of the words
  Number of Google search results
Google Core Distance
Hop-by-Hop Routing algorithm
  PageRank algorithm
  BB's graph-based clustering algorithm
Slide6
WANET System Architecture
Slide7
1-gram filtering method
  Part-of-speech: NN (common noun, singular), NP (proper noun), DT (determiner), or JJ (adjective)
  Length of the words: at least 3
  Number of Google search results
Slide8
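The 1-gram filtering rules listed above could be sketched roughly as follows. This is only an illustration, not the authors' implementation: the function name `one_gram_filter`, the reading of "at least 3" as a minimum length in characters, and the `hits_ok` callback (standing in for the Google search result check, whose exact rule the slide does not give) are all assumptions.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Part-of-speech tags named on the slide.
ALLOWED_TAGS = frozenset({"NN", "NP", "DT", "JJ"})


def one_gram_filter(
    tagged_words: Iterable[Tuple[str, str]],
    min_length: int = 3,
    hits_ok: Optional[Callable[[str], bool]] = None,
) -> List[str]:
    """Keep words that pass the part-of-speech, length, and hit-count checks.

    tagged_words: (word, tag) pairs from any part-of-speech tagger.
    min_length:   assumed to be a length in characters.
    hits_ok:      hypothetical predicate for the search result check;
                  the slide does not state the rule, so it is optional here.
    """
    kept = []
    for word, tag in tagged_words:
        if tag not in ALLOWED_TAGS:
            continue
        if len(word) < min_length:
            continue
        if hits_ok is not None and not hits_ok(word):
            continue
        kept.append(word)
    return kept


sample = [("network", "NN"), ("is", "VBZ"), ("ad", "JJ"), ("Google", "NP"), ("the", "DT")]
filtered = one_gram_filter(sample)
```

Passing a `hits_ok` function lets a caller plug in whatever hit-count rule is appropriate without changing the other two checks.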
Google Core Distance
The original algorithm: NGD
The new algorithm: GCD
Slide9
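For reference, the NGD (Normalized Google Distance) named above is usually defined from search hit counts as NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))). The sketch below implements that standard definition; it is not taken from the slide, the parameter names are illustrative, and the GCD formula is not reproduced because it does not survive in this transcript.

```python
import math


def ngd(fx: float, fy: float, fxy: float, n: float) -> float:
    """Normalized Google Distance computed from hit counts.

    fx, fy: number of pages containing term x and term y respectively.
    fxy:    number of pages containing both terms.
    n:      total number of indexed pages (a normalizing constant).
    All counts must be positive, since the formula takes logarithms.
    """
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    numerator = max(log_fx, log_fy) - log_fxy
    denominator = math.log(n) - min(log_fx, log_fy)
    return numerator / denominator


# Terms that always co-occur are at distance 0.
same = ngd(1000, 1000, 1000, 1_000_000)
# Terms that rarely co-occur are farther apart.
apart = ngd(1000, 100, 10, 1_000_000)
```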
Hop-by-Hop Routing Algorithm
PageRank algorithm
Slide10
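The slide above names PageRank without detail, so the following is only a generic power-iteration sketch on a small word graph. How WANET builds its graph and applies the scores is not stated in this transcript; the example graph, the damping factor of 0.85, and the iteration count are assumptions chosen for illustration.

```python
from typing import Dict, List


def pagerank(
    graph: Dict[str, List[str]],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Plain PageRank by power iteration.

    graph maps each node to the nodes it links to; every link target
    must also appear as a key of the dictionary.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node, targets in graph.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node without outgoing links spreads its rank evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical word graph: "word" is linked from both other nodes.
word_graph = {
    "word": ["network", "distance"],
    "network": ["word"],
    "distance": ["word"],
}
ranks = pagerank(word_graph)
top_word = max(ranks, key=ranks.get)
```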
Hop-by-Hop Routing Algorithm
BB's graph-based clustering algorithm
BB score = [formula not preserved]
Slide11
Hop-by-Hop Routing Algorithm
Slide12
Experiments
Time variance effect of the Google search results
Execution time
Precision and recall rate
Top-k search results analysis
Dataset: Four knowledge domains were selected from the Elsevier Web site, and the top 25 most-downloaded papers in each journal were chosen.
Slide13
Time variance effect of the Google search results
Spearman's footrule is used to compare the sequences extracted by the two algorithms.
Slide14
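Spearman's footrule, mentioned above for the time variance comparison, is the sum of the absolute differences between each item's positions in two rankings. A minimal sketch follows, assuming both sequences contain exactly the same items with no repeats; how the study handles sequences that only partly overlap is not stated here, and the example keywords are made up.

```python
from typing import Sequence


def spearman_footrule(seq_a: Sequence[str], seq_b: Sequence[str]) -> int:
    """Sum of absolute position differences between two rankings."""
    if len(seq_a) != len(seq_b) or len(set(seq_a)) != len(seq_a) or set(seq_a) != set(seq_b):
        raise ValueError("both sequences must rank the same distinct items")
    position_in_b = {item: index for index, item in enumerate(seq_b)}
    return sum(abs(index - position_in_b[item]) for index, item in enumerate(seq_a))


first = ["word", "adhoc", "network", "google"]
second = ["adhoc", "word", "network", "google"]

swapped = spearman_footrule(first, second)                  # two neighbours swapped
reversed_order = spearman_footrule(first, first[::-1])      # fully reversed ranking
```

A distance of 0 means the two sequences are identical; larger values mean the keyword order changed more between the two extractions.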
Execution time
Slide15
Precision and recall rate
Slide16
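Precision and recall, as named above, are conventionally computed from the set of retrieved documents and the set of relevant documents. The sketch below uses that standard definition; the document identifiers are invented for illustration and are not results from the study.

```python
from typing import Iterable, Tuple


def precision_recall(retrieved: Iterable[str], relevant: Iterable[str]) -> Tuple[float, float]:
    """Return (precision, recall) for a set of retrieved documents."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Precision: share of retrieved documents that are relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: share of relevant documents that were retrieved.
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: 4 documents retrieved, 3 relevant, 2 in common.
precision, recall = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d5"])
```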
Top-k search results analysis
Slide17
Conclusions
To propose a new system that can extract the most important keyword sequence to represent a document.
To help users automatically find relevant documents or Web pages.
Future work
It is hoped that the system can be used in a mobile device or an e-book.
Slide18
Comments
Advantages
  To extract the most important keyword sequence.
Applications
  Information retrieval