Tagging with DHARMA

Uploaded On 2016-09-09


Presentation Transcript

Slide1

Tagging with DHARMA
A DHT-based Approach for Resource Mapping through Approximation

Luca Maria Aiello, Marco Milanesio, Giancarlo Ruffo, Rossano Schifanella
Università degli Studi di Torino, Computer Science Department

Keywords: navigational search, folksonomy, DHT, approximated graph mapping, last.fm

Seventh International Workshop on Hot Topics in Peer-to-Peer Systems

Speaker: Luca Maria Aiello, PhD student

aiello@di.unito.it

Slide2

Overview

Goal: enrich the P2P layer with a tag-based navigational search engine
Task: map a folksonomy onto a DHT
Problem: mapping dense graphs onto a distributed layer is very inefficient
Solution: approximate mapping using a complexity-bounded algorithm
Evaluation: the approximate solution does not upset the structure or semantics of the folksonomy

23/04/2010 - HOTP2P 2010 - Luca Maria Aiello, Università degli Studi di Torino

Slide3

Motivations

Direct search / navigational search
Taxonomies (e.g. Yahoo! directory): not successful
Folksonomies: successfully emerging thanks to the social web
A tag-based search engine on DHTs is profitable
Applications layered on DHTs use exact-match search (e.g. eMule) → folksonomic search adds flexibility
Recent research activity on P2P for privacy-aware online social networks (OSNs)
A folksonomic search engine is an important OSN feature

Slide4

Folksonomy structure

Represented as a tripartite hypergraph

[Figure: example hypergraph with the nodes Metallica, Iron Maiden, metal, classic and John]

Slide5

Tag-Resource Graph (TRG)

Projection on the user dimension
Bipartite graph
Edges are weighted by the number of tag assignments

[Figure: nodes Metallica, Iron Maiden, metal, classic and John; edge weights 3, 5 and 1]

Slide6

Folksonomy Graph (FG)

Models tag-to-tag similarity with co-occurrence

Co-occurrence similarity is widely known and used
Local calculation
Directional

[Figure: nodes Metallica, Iron Maiden, metal and classic; TRG edge weights 3, 5, 1 and 2; FG arc weights 5+3=8 and 1+2=3]

Slide7
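The co-occurrence weighting of the Folksonomy Graph can be illustrated with a minimal Python sketch. It assumes one directional definition, in which the arc from tag t1 to tag t2 sums the TRG weights of t1 over the resources that both tags label; the slides do not spell out the formula. The placement of the weights 5, 3, 1 and 2 on particular tag-resource edges is also a guess, chosen only so that the sums 5+3=8 and 1+2=3 from the FG figure come out. The names `trg` and `folksonomy_graph` are hypothetical.

```python
from collections import defaultdict

# Hypothetical TRG: trg[tag][resource] = number of tag assignments.
# Which weight sits on which edge is an assumption, not read off the slides.
trg = {
    "metal": {"Metallica": 5, "Iron Maiden": 3},
    "classic": {"Metallica": 1, "Iron Maiden": 2},
}


def folksonomy_graph(trg):
    """Build directed tag-to-tag arcs from a weighted tag-resource graph.

    Assumed definition: weight(t1 -> t2) is the sum of t1's TRG weights
    over the resources that t1 and t2 have in common.
    """
    fg = defaultdict(dict)
    for t1, res1 in trg.items():
        for t2, res2 in trg.items():
            if t1 == t2:
                continue
            shared = res1.keys() & res2.keys()
            if shared:
                fg[t1][t2] = sum(res1[r] for r in shared)
    return dict(fg)


fg = folksonomy_graph(trg)
print(fg)
```

Under these assumptions the two arcs between metal and classic get different weights, which is what makes the similarity directional.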

Folksonomic search

The user selects a tag
Related resources and tags are displayed
Ranking is based on arc weights
The user can shift to another tag or select a resource

[Figure: tags metal, thrash, power, grind and classic; resources Iron Maiden, Metallica and Manowar; arc weights 8, 3, 12, 1, 5, 7 and 9]

Slide8
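One step of folksonomic search, ranking the tags and resources related to a selected tag by arc weight, might look like the following sketch. The function name `search_step` and the dictionary layout are hypothetical, and the weights are illustrative only: they are not mapped from the figure, whose edge assignments cannot be recovered from the transcript.

```python
def search_step(tag, fg, trg):
    """Return (related tags, related resources), each ranked by descending weight.

    fg[tag]  -> {related tag: arc weight}
    trg[tag] -> {resource: edge weight}
    Ties are broken alphabetically so the ranking is deterministic.
    """
    def by_weight(item):
        name, weight = item
        return (-weight, name)

    tags = sorted(fg.get(tag, {}).items(), key=by_weight)
    resources = sorted(trg.get(tag, {}).items(), key=by_weight)
    return [t for t, _ in tags], [r for r, _ in resources]


# Illustrative weights only; they are not read off the slide's figure.
example_fg = {"metal": {"thrash": 8, "power": 3, "classic": 1}}
example_trg = {"metal": {"Metallica": 9, "Iron Maiden": 7, "Manowar": 5}}

related_tags, related_resources = search_step("metal", example_fg, example_trg)
print(related_tags)
print(related_resources)
```

The user would then either pick one of the ranked resources or call the same step again with one of the related tags.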

Folksonomy evolution

The folksonomy grows quickly due to massive user activity:
Resource insertion
Tag insertion

Both FG and TRG should be updated properly!

Slide9

Folksonomy maintenance

Slide10

Mapping on a DHT

Map FG and TRG onto the DHT and implement the update operations on the distributed layer

Idea:
Split the FG and TRG into small structural parts, or modules
Each module contains a node (tag or resource) and its outgoing edges/arcs
Each module is mapped onto a different P2P index node

Slide11

Mapping FG on a DHT

Slide12

Mapping TRG on a DHT

Slide13
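The module idea (one graph node plus its outgoing weighted edges stored under one DHT key) can be sketched with an in-memory stand-in for the DHT. Everything here is an assumption made for illustration: `ToyDHT`, `publish_modules`, the SHA-1 key derivation and the `kind:name` key format are hypothetical and are not taken from the DHARMA implementation.

```python
import hashlib


class ToyDHT:
    """In-memory stand-in for a DHT: one dict entry per key, lookups counted."""

    def __init__(self):
        self.store = {}
        self.lookups = 0

    @staticmethod
    def key(kind, name):
        # Hash node type and name into a 160-bit identifier (hex string).
        return hashlib.sha1(f"{kind}:{name}".encode("utf-8")).hexdigest()

    def put(self, kind, name, module):
        self.lookups += 1
        self.store[self.key(kind, name)] = module

    def get(self, kind, name):
        self.lookups += 1
        return self.store.get(self.key(kind, name), {})


def publish_modules(dht, kind, graph):
    """Store each node of `graph` with its outgoing weighted edges as one module."""
    for node, out_edges in graph.items():
        dht.put(kind, node, dict(out_edges))


example_graph = {"metal": {"classic": 8}, "classic": {"metal": 3}}
dht = ToyDHT()
publish_modules(dht, "tag", example_graph)
print(dht.get("tag", "metal"))
```

In a real DHT each key would be served by whichever index node is responsible for that part of the identifier space, so different modules generally end up on different peers.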

Folksonomy maintenance on the DHT

When inserting a new tag or resource, the proper modules have to be modified
The complexity of resource insertion and search is linear in the input size → OK!
The tagging operation is linear in |Tags(r)|, where Tags(r) is the set of tags labeling the resource r
How many tags for a resource? Let's see a real-world example…

Operation              #lookups
Insert(r, t1,…, tm)    2 + 2m
Tag(t, r)              4 + |Tags(r)|
Search step            2

Slide14

Last.fm folksonomy sample

99,405 users, 1,413,657 items and 285,182 different tags

|Tags(r)| can be huge!

Slide15

An approximate solution

DHARMA: DHT-based Approach for Resource Mapping through Approximation

Idea:
Put a constant upper bound on the number of lookups for the tagging operation
The resulting FG will be approximated…

Slide16

Approximate tagging: k=1

[Animation: the resource Metallica with the tags metal, thrash, classic and cool; the lookup count goes through 0, 1, 3, 4 and 5]

Now, select k tags at random in Tags(Metallica) and link them with the new tag

Slide17
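The approximate tagging step, in which the new tag is linked with only k randomly chosen tags of the resource, can be sketched as follows. The data layout, the function name `approximate_tag` and in particular the weight update (incrementing the arc by one in both directions) are assumptions for illustration; the slides only state that k tags are selected at random and linked with the new tag.

```python
import random
from collections import defaultdict


def approximate_tag(tags_of, fg, resource, new_tag, k, rng=random):
    """Attach new_tag to resource, updating at most k FG neighbours.

    tags_of[resource] -> {tag: number of assignments}
    fg[t1][t2]        -> arc weight from t1 to t2
    Returns the list of tags that were linked with new_tag.
    """
    tags = tags_of[resource]
    candidates = [t for t in tags if t != new_tag]
    chosen = rng.sample(candidates, min(k, len(candidates)))
    for other in chosen:
        # Assumed update rule: bump the co-occurrence arc in both directions.
        fg[new_tag][other] += 1
        fg[other][new_tag] += 1
    # Record the tag assignment on the resource itself.
    tags[new_tag] = tags.get(new_tag, 0) + 1
    return chosen


tags_of = {"Metallica": {"metal": 5, "thrash": 3, "classic": 1}}
approx_fg = defaultdict(lambda: defaultdict(int))

linked = approximate_tag(tags_of, approx_fg, "Metallica", "cool", k=1,
                         rng=random.Random(0))
print(linked)
```

With k=1 only one existing tag of Metallica is touched, no matter how many tags the resource already carries; this is what bounds the number of lookups per tagging operation.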

Approximate graph vs real graph

With the DHARMA approximation, the complexity is affordable for very small k
The smaller k is, the more the structure of the Folksonomy Graph is upset!
We need to compare the approximate and the real Folksonomy Graph

Operation              #lookups
Insert(r, t1,…, tm)    2 + 2m
Tag(t, r)              4 + k
Search step            2

Slide18
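The lookup counts tabulated for the exact protocol (4 + |Tags(r)| per tagging operation) and for the DHARMA approximation (4 + k) can be compared directly. The function names below are hypothetical; the formulas are the ones given in the two tables.

```python
def insert_lookups(m):
    """Lookups for Insert(r, t1, ..., tm): 2 + 2m."""
    return 2 + 2 * m


def tag_lookups_exact(tags_on_resource):
    """Lookups for Tag(t, r) in the exact protocol: 4 + |Tags(r)|."""
    return 4 + tags_on_resource


def tag_lookups_approx(k):
    """Lookups for Tag(t, r) with the approximation: 4 + k."""
    return 4 + k


SEARCH_STEP_LOOKUPS = 2

# The exact cost grows with the number of tags already on the resource,
# while the approximate cost depends only on the constant k.
for n in (10, 1_000, 100_000):
    print(n, tag_lookups_exact(n), tag_lookups_approx(1))
```

For k=1 the tagging cost stays at 5 lookups, which matches the final lookup count shown in the k=1 animation.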

Approximation in Last.fm

We simulate the evolution of the FG with the approximate protocol
Simulated resource insertion and tagging activity
Goal: compare the real Last.fm FG with the approximate FG for different values of k
Nodal degree
Arc weights

Slide19

Nodal degree comparison

Slide20

Arc weights comparison

Slide21

Keeping proportions

The simulated and real graphs are different, but they do not necessarily need to be equal
Only proportions must be kept:
Arc weight ordering must be kept
Proportions between weights must not be lost (no flattening)
Only the "less informative" arcs should be deleted
Keeping the same proportions preserves the navigational semantics

Slide22

Keeping proportions: k = 1

Arc weight ordering is kept
Kendall's tau between arc weight rankings is very high (0.7) → OK!
Proportions between weights are not lost (no flattening)
Cosine similarity between arc weight sets is very high (0.8) → OK!
Only "less informative" arcs are deleted
40% of arcs get lost, but 99% of them have weight <= 3 → OK! (we wipe out only noisy links)

Slide23
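The two measures named here, Kendall's tau for ordering and cosine similarity for proportions, can be computed with a few lines of plain Python. This is a sketch under assumptions: it uses the simple tau-a variant, compares only arcs present in both graphs, and runs on made-up weights. How the original evaluation handled ties and deleted arcs is not stated in the slides.

```python
import math
from itertools import combinations


def cosine_similarity(x, y):
    """Cosine of the angle between two equally long weight vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / pairs if pairs else 0.0


# Made-up weights of the same four arcs in the real and the approximated FG.
real = [12, 8, 3, 1]
approx = [6, 5, 1, 2]

print(kendall_tau(real, approx))
print(cosine_similarity(real, approx))
```

A tau close to 1 means the approximated graph ranks arcs in nearly the same order as the real one, while a cosine close to 1 means the weights keep their relative proportions even if their absolute values shrink.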

Query convergence rapidity: results

Steps                         Last      Rand      First
Original graph       μ        3.47      6.412     33.94
                     σ        1.4175    4.4587    15.9942
                     μ1/2     3         5         33
Approximated graph   μ        3.38      5.2140    19.17
                     σ        1.2373    2.6994    10.3065
                     μ1/2     3         5         16

A query converges through subsequent filtering on the resource set
Estimation of convergence with simulation
Convergence is quick
Simulated search has no semantic notion
Approximation speeds up the search

Slide24
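Query convergence through successive filtering of the resource set can be simulated along the following lines. The sketch rests on several assumptions that the slides do not confirm: that "Last", "Rand" and "First" denote picking the last, a random or the first tag in the ranked list of related tags, that a step intersects the current resource set with the resources of the chosen tag, and that the query has converged when no unvisited related tag narrows the set any further. All names and the toy data are hypothetical.

```python
import random


def simulate_query(start_tag, resources_of, ranked_neighbours, pick):
    """Narrow the resource set tag by tag; return the number of steps taken.

    resources_of[tag]      -> set of resources labelled with tag
    ranked_neighbours[tag] -> related tags, strongest arc first
    pick(candidates)       -> chooses the next tag from the ranked candidates
    """
    current = set(resources_of[start_tag])
    tag, visited, steps = start_tag, {start_tag}, 0
    while True:
        # Keep only unvisited tags that strictly narrow the current set
        # without emptying it.
        candidates = [
            t for t in ranked_neighbours.get(tag, [])
            if t not in visited
            and current & resources_of[t]
            and (current & resources_of[t]) != current
        ]
        if not candidates:
            return steps
        tag = pick(candidates)
        visited.add(tag)
        current &= resources_of[tag]
        steps += 1


def pick_first(candidates):
    return candidates[0]


def pick_last(candidates):
    return candidates[-1]


# Toy folksonomy, for illustration only.
resources_of = {
    "metal": {"Metallica", "Iron Maiden", "Manowar"},
    "classic": {"Metallica", "Iron Maiden"},
    "thrash": {"Metallica"},
    "power": {"Manowar"},
}
ranked_neighbours = {
    "metal": ["classic", "thrash", "power"],
    "classic": ["metal", "thrash"],
    "thrash": ["metal", "classic"],
    "power": ["metal"],
}

steps_first = simulate_query("metal", resources_of, ranked_neighbours, pick_first)
steps_last = simulate_query("metal", resources_of, ranked_neighbours, pick_last)
steps_rand = simulate_query("metal", resources_of, ranked_neighbours,
                            random.Random(0).choice)
print(steps_first, steps_last, steps_rand)
```

Because every step strictly shrinks the resource set, the loop always terminates; averaging the step counts over many start tags would give figures comparable in kind to the μ and σ rows of the table.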

Slide25

Slide26

Slide27

Slide28

Conclusions

Approximate folksonomy mapping allows an efficient P2P implementation
The information lost in the approximation is mostly noise (automatic filtering)
Search navigation achieves vocabulary specialization, converging to narrow semantic categories
Convergence is quick (and even quicker with approximation)
DHARMA (Java implementation) is available at: http://likir.di.unito.it/applications

Slide29

Related work

Other attempts at mapping folksonomies onto P2P systems, e.g. Tagster [1]
Privacy-aware P2P online social networks: Safebook [2], PeerSon [3], Likir [4,5]

[1] Görlitz, Sizov, Staab, ESWC 2008
[2] Cutillo, Molva, Strufe, WONS 2009
[3] Buchegger, Schiöberg, Vu, Datta, SocialNets 2009
[4] Aiello, Milanesio, Ruffo, Schifanella, P2P 2008
[5] Aiello, Ruffo, SESOC 2010

Slide30

Speaker: Luca Maria Aiello, PhD student
aiello@di.unito.it

Thank you for your attention!

Seventh International Workshop on Hot Topics in Peer-to-Peer Systems