Gize: A Time Warp in the Web of Data

Valeria Fionda¹, Melisachew Wudage Chekol², and Giuseppe Pirrò³

¹ University of Calabria, Italy, fionda@mat.unical.it
² University of Mannheim, Germany, mel@informatik.uni-mannheim.de
³ Institute for High Performance Computing and Networking, ICAR-CNR, Italy, pirro@icar.cnr.it

Abstract. We introduce the Gize framework for querying historical RDF data. Gize builds upon two main pillars: a lightweight approach to keep historical data, and an extension of SPARQL called SPARQ-LTL, which incorporates temporal logic primitives to enable a rich class of queries. One striking point of Gize is that its features can be readily made available in existing query processors.

1 Introduction

Querying historical data is of utmost importance in many contexts, from city monitoring, where one needs to track different aspects (e.g., pollution, population), to generic exploratory research, where one is interested in posing queries like "Retrieve players that are now managing some club they played for" or "Retrieve the annotation of a gene since the discovery of a particular interaction". The classical approach for querying RDF data (e.g., via SPARQL endpoints) only considers the latest version. In fact, querying historical RDF data poses some challenges. The first concerns the representation and storage of historical data. Some approaches (e.g., [4, 7]) allow data to be retrieved by providing timestamps. Others resort to dedicated indexing structures (e.g., [5]) to speed up query processing. The second problem concerns what type of querying primitives to provide. The most common approach is to devise SPARQL extensions that work with intervals, or SPARQL translations into temporal logic. The addition of ad-hoc components, in terms of data representation, query language, or both, hinders applicability on existing RDF (query) processing infrastructures.

To cope with these issues we present the Gize⁴ framework. Gize is built around two main components. The first is a lightweight approach to store RDF data, where each version of the data is stored in a separate named graph. The second component is a powerful extension of SPARQL called SPARQ-LTL. This language inherits a variety of temporal operators from Linear Temporal Logics (LTL) [2]. To evaluate SPARQ-LTL on existing SPARQL processors we devise a translation to standard SPARQL queries.

⁴ Gize is the Ethiopian word for time.

Related Work. Approaches like the DBpedia Wayback Machine [4] allow data to be retrieved at a certain timestamp (provided by the user). Other approaches (e.g., [7]) access historical data via (HTTP) content negotiation, typically using the Memento framework. Yet other approaches (e.g., [5]) introduce timestamps into RDF triples along with ad-hoc indexing strategies. In Description Logics, some proposals focus on temporal conjunctive query answering (e.g., [1]); other efforts (e.g., [6]) have focused on the translation of SPARQL into LTL. Surprisingly, the design of solutions for querying historical RDF data on existing SPARQL processors is still in its infancy. Gize fills this gap by contributing an extension of SPARQL, called SPARQ-LTL, that allows for a rich class of temporal primitives borrowed from LTL (e.g., SINCE, NEXT, PREVIOUS) along with a translation from SPARQ-LTL queries into standard SPARQL queries.

2 The Gize Framework

Representing Historical RDF Data. The tenet of Gize is to enable querying of historical RDF data on existing processors. To represent versions, Gize leverages the notion of RDF quad. An RDF quad (for simplicity, we omit bnodes) is a tuple of the form $\langle s, p, o, c \rangle \in I \times I \times (I \cup L) \times I$, where $I$ (IRIs) and $L$ (literals) are countably infinite sets. The fourth element of the quad represents the named graph to which the triple belongs.

Fig. 1. An excerpt of evolving data from DBpedia.

Fig. 1 shows the evolution of some data taken from DBpedia. Each of the 5 versions considered is represented by a named graph. From this small data sample one may notice that Italy has changed 3 coaches from July 2012 (C. Prandelli) to July 2016 (G. Ventura). Interestingly, the latest coach (G. Ventura) will start on July 18th. One may also notice that some players like A. Pirlo were part of the team along the whole period, while some others like G. Pazzini or M. Darmian were left out or added, respectively. The figure also shows (bottom right corner) update statistics about People in DBpedia. Each percentage is computed with respect to the previous version. The availability of historical data makes it possible to pose queries like "Find all players that played with the highest number of coaches" or "Find players that played since C. Prandelli was the coach". Most existing approaches either cannot express such queries or have to resort to ad-hoc processing infrastructures.
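The version-per-named-graph representation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the graph IRIs (`http://example.org/v1`, `http://example.org/v2`), the helper names, and the two-version toy data are hypothetical and do not reproduce the content of Fig. 1. Each quad carries its version as the fourth element, so grouping quads by that element recovers one triple set per version.

```python
from collections import defaultdict

# Hypothetical identifiers, chosen for illustration only.
TEAM = "dbp:Italy_national_football_team"
V1 = "http://example.org/v1"
V2 = "http://example.org/v2"

# Toy quads (s, p, o, c): the fourth element names the version graph.
QUADS = [
    (TEAM, "dbpo:coach", "C. Prandelli", V1),
    (TEAM, "dbpo:name", "A. Pirlo", V1),
    (TEAM, "dbpo:name", "G. Pazzini", V1),
    (TEAM, "dbpo:coach", "G. Ventura", V2),
    (TEAM, "dbpo:name", "A. Pirlo", V2),
    (TEAM, "dbpo:name", "M. Darmian", V2),
]


def group_by_graph(quads):
    """Map each named graph (version) to its set of triples."""
    versions = defaultdict(set)
    for s, p, o, c in quads:
        versions[c].add((s, p, o))
    return dict(versions)


def objects(versions, graph, subject, predicate):
    """Objects of (subject, predicate, ?o) inside one version graph."""
    return {o for (s, p, o) in versions.get(graph, set())
            if s == subject and p == predicate}


def in_every_version(versions, subject, predicate):
    """Objects that appear for (subject, predicate) in all versions."""
    per_version = [objects(versions, g, subject, predicate)
                   for g in sorted(versions)]
    return set.intersection(*per_version) if per_version else set()


versions = group_by_graph(QUADS)
stable_players = in_every_version(versions, TEAM, "dbpo:name")
coaches = {c for g in versions for c in objects(versions, g, TEAM, "dbpo:coach")}
```

Here `stable_players` holds the players present in both toy versions, and `coaches` collects the distinct coaches across versions. No dedicated index or timestamp annotation is involved, which is what lets the same data be loaded into an ordinary quad store.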
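The SPARQ-LTL Language. The syntax of SPARQ-LTL is shown below, while Table 1 provides a description of the temporal operators. Let $q$ be a SPARQ-LTL query, $H = \{h_1, h_2, \ldots, h_m\}$ be the set of versions, and $h_c$ be the current (not necessarily the latest) version of the data.

$$
\begin{array}{rcl}
QP & ::= & QP_1 \,.\, QP_2 \mid \{QP_1\}\ \mathtt{UNION}\ \{QP_2\} \mid \{QP_1\}\ \mathtt{MINUS}\ \{QP_2\} \mid QP_1\ \mathtt{OPTIONAL}\ \{QP_2\} \\
 & \mid & \mathtt{GRAPH}\ I \cup V\ QP_1 \mid QP_1\ \mathtt{FILTER}\ \{R\} \mid t = (I \cup V) \times (I \cup V) \times (I \cup L \cup V) \\
 & \mid & \mathtt{X}\{QP\} \mid \mathtt{W}\{QP\} \mid \mathtt{F}\{QP\} \mid \mathtt{G}\{QP\} \mid \{QP_1\}\ \mathtt{U}\ \{QP_2\} \\
 & \mid & \mathtt{Y}\{QP\} \mid \mathtt{Z}\{QP\} \mid \mathtt{P}\{QP\} \mid \mathtt{H}\{QP\} \mid \{QP_1\}\ \mathtt{S}\ \{QP_2\}
\end{array}
$$

| Operator | SPARQ-LTL Syntax | Meaning |
|---|---|---|
| $X q$ | NEXT | Evaluate $q$ on version $h_{c+1}$ |
| $F q$ | EVENTUALLY | Evaluate $q$ on all versions $h_c, \ldots, h_m$ |
| $G q$ | ALWAYS | The evaluation of $q$ must be the same on all versions $h_c, \ldots, h_m$ |
| $q_1\, U\, q_2$ | UNTIL | If $S_2$ is the solution of $q_2$ in a version $h_k \in \{h_c, \ldots, h_m\}$, then there exists a solution $S_1$ of $q_1$ on $h_c$ such that $S_1$ is compatible with $S_2$ in all versions $\{h_c, \ldots, h_k\}$ |
| $Y q$ | PREVIOUS | Evaluate $q$ on version $h_{c-1}$ |
| $P q$ | PAST | Evaluate $q$ on all versions $h_1, \ldots, h_c$ |
| $H q$ | ALWAYS PAST | The evaluation of $q$ must be the same on all versions $h_1, \ldots, h_c$ |
| $q_1\, S\, q_2$ | SINCE | If $S_2$ is the solution of $q_2$ in a version $h_k \in \{h_1, \ldots, h_c\}$, then there exists a solution $S_1$ of $q_1$ on $\{h_k, \ldots, h_c\}$ such that $S_1$ is compatible with $S_2$ in all versions $\{h_k, \ldots, h_c\}$ |

Table 1. Meaning of the temporal operators in SPARQ-LTL.

SPARQ-LTL makes an additional set of keywords available when writing SPARQL queries. SPARQ-LTL queries are evaluated by translating the temporal operators via a (set of) pattern(s) evaluated on the named graphs that maintain the data versions. We give some examples by considering the five versions of the data shown in Fig. 1 (stored in separate named graphs). In what follows, dbp:INFT is a shorthand for dbp:Italy_national_football_team.

Example 1. Select players who are playing in the Italian national football team and who played under at least one coach different from the current one.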
SPARQ-LTL:

    SELECT ?p WHERE {
      dbp:INFT dbpo:name ?p .
      dbp:INFT dbpo:coach ?c1 .
      PAST {
        dbp:INFT dbpo:name ?p .
        dbp:INFT dbpo:coach ?c2 .
        FILTER(?c1 != ?c2)
      }
    }

Translation into SPARQL:

    SELECT ?p WHERE {
      dbp:INFT dbpo:name ?p .
      dbp:INFT dbpo:coach ?c1 .
      { GRAPH <http://gize.org/v5> { dbp:INFT dbpo:name ?p . dbp:INFT dbpo:coach ?c2 . FILTER(?c1 != ?c2) } }
      UNION
      { GRAPH <http://gize.org/v4> { dbp:INFT dbpo:name ?p . dbp:INFT dbpo:coach ?c2 . FILTER(?c1 != ?c2) } }
      UNION
      { GRAPH <http://gize.org/v3> { dbp:INFT dbpo:name ?p . dbp:INFT dbpo:coach ?c2 . FILTER(?c1 != ?c2) } }
      UNION
      { GRAPH <http://gize.org/v2> { dbp:INFT dbpo:name ?p . dbp:INFT dbpo:coach ?c2 . FILTER(?c1 != ?c2) } }
      UNION
      { GRAPH <http://gize.org/v1> { dbp:INFT dbpo:name ?p . dbp:INFT dbpo:coach ?c2 . FILTER(?c1 != ?c2) } }
    }

The translated SPARQL query is generated automatically and can be evaluated on existing processors. Note that the translation (because of the semantics of PAST described in Table 1) requires looking into all versions.

Example 2. Find the name of the coach of the Italian national football team after the sacking of Cesare Prandelli.

SPARQ-LTL:

    SELECT ?n WHERE {
      PAST {
        dbp:INFT dbpo:coach dbp:CP .
        NEXT {
          dbp:INFT dbpo:coach ?n .
          FILTER(?n != dbp:CP)
        }
      }
    }

Translation into SPARQL:

    SELECT ?n WHERE {
      { GRAPH <http://gize.org/v5> {
          dbp:INFT dbpo:coach dbp:CP .
          GRAPH <http://gize.org/v6> { dbp:INFT dbpo:coach ?n . FILTER(?n != dbp:CP) } } }
      UNION ... UNION
      { GRAPH <http://gize.org/v1> {
          dbp:INFT dbpo:coach dbp:CP .
          GRAPH <http://gize.org/v2> { dbp:INFT dbpo:coach ?n . FILTER(?n != dbp:CP) } } }
    }

In the previous query, dbp:CP is a shorthand for dbp:Cesare_Prandelli. As before, the translation of PAST makes use of UNION queries over each version $v_i$; then, for each $v_i$, NEXT checks in version $v_{i+1}$ (via a FILTER) that the coach changed.

3 Conclusions

We have outlined Gize, which makes it possible to set up an infrastructure for querying historical RDF data on existing SPARQL processors. Gize adopts a simple approach to store different versions of the data and a powerful temporal extension of SPARQL called SPARQ-LTL. As future work, we are considering approaches like RDF HDT [3] to reduce storage space consumption.

References

1. S. Borgwardt, M. Lippmann, and V. Thost. Temporalizing Rewritable Query Languages Over Knowledge Bases. JWS, 33:50-70, 2015.
2. G. D. De Giacomo and M. Y. Vardi. Linear Temporal Logic and Linear Dynamic Logic on Finite Traces. In IJCAI, 2013.
3. J. D. Fernández, M. A. Martínez-Prieto, C. Gutierrez, A. Polleres, and M. Arias. Binary RDF Representation for Publication and Exchange. JWS, 19:22-41, 2013.
4. J. D. Fernández, P. Schneider, and J. Umbrich. The DBpedia Wayback Machine. In SEMANTICS, pages 192-195, 2015.
5. S. Gao, J. Gu, and C. Zaniolo. RDF-TX: A Fast, User-Friendly System for Querying the History of RDF Knowledge Bases. In EDBT, pages 269-280, 2016.
6. R. Mateescu, S. Meriot, and S. Rampacek. Extending SPARQL with Temporal Logic. Report, 2009.
7. H. Van de Sompel, R. Sanderson, M. L. Nelson, L. L. Balakireva, H. Shankar, and S. Ainsworth. An HTTP-based Versioning Mechanism for Linked Data. 201