unimannheimde Abstract Type information is very valuable in knowledge bases How ever most large open knowledge bases are incomplete with respect to type information and at the same time contain noisy and incorrect data That makes classic type inferen ID: 25981
Download Pdf The PPT/PDF document "Type Inference on Noisy RDF Data Heiko P..." is the property of its rightful owner. Permission is granted to download and print the materials on this web site for personal, non-commercial use only, and to display it on your personal computer provided you do not modify the materials and that you retain all copyright notices contained in the materials. By downloading content from our website, you accept the terms of this agreement.
2Therestofthispaperisstructuredasfollows.Section2motivatesourworkbyshowingtypicalproblemsofreasoningonreal-worlddatasets.Section3in-troducestheSDTypeapproach,whichisevaluatedinSect.4indierentex-perimentalsettings.InSect.5,weshowhowSDTypecanbeappliedtosolveareal-worldproblem,i.e.,thecompletionofmissingtypeinformationinDBpedia.WeconcludeourpaperwithareviewofrelatedworkinSect.6,andasummaryandanoutlookonfuturework.2ProblemswithTypeInferenceonReal-worldDatasetsAstandardwaytoinfertypeinformationintheSemanticWebistheuseofreasoning,e.g.,standardRDFSreasoningviaentailmentrules[20].Toillustratetheproblemsthatcanoccurwiththatapproach,wehaveconductedanexper-imentwithDBpediaknowledgebase[2].Wehaveusedthefollowingsubsetofentailmentrules:{?xa?t1.?t1rdfs:subClassOf?t2entails?xa?t2{?x?r?y.?rrdfs:domain?tentails?xa?t{?y?r?x.?rrdfs:range?tentails?xa?tWehaveappliedthesethreerulestotheinstancedbpedia:Germany.Theserulesintotalinduce23typesfordbpedia:Germany,onlythreeofwhicharecorrect.Thelistofinferredtypescontains,amongothers,thetypesaward,city,sportsteam,mountain,stadium,recordlabel,person,andmilitarycon ict.Areasonerrequiresonlyonefalsestatementtocometoawrongconclusion.Intheexampleofdbpedia:Germany,atmost20wrongstatementsareenoughtomakeareasonerinfer20wrongtypes.However,therearemorethan38,000statementsaboutdbpedia:Germany,i.e.,anerrorrateofonly0:0005isenoughtoendupwithsuchacompletelynonsensicalreasoningresult.Inotherwords:evenwithaknowledgebasethatis99.9%correct,anRDFSreasonerwillnotprovidemeaningfulresults.However,acorrectnessof99.9%isdicult,ifnotimpossible,toachievewithreal-worlddatasetspopulatedeither(semi-)automatically,e.g.,byinformationextractionfromdocuments,orbythecrowd.Intheexampleabove,theclassMountainintheaboveisinducedfromasinglewrongstatementamongthe38,000statementsaboutdbpedia:Germany,whichisdbpedia:Mzedbpedia-owl:sourceMountaindbpedia:Germany.Like-wise,theclassMilitaryConflictisinducedfromasinglewrongstatement,i.e.,dbpedia:XII Corps (United Kingdom)dbpedia-owl:battledbpedia:Ger-many.Theseproblemsexistbecausetraditionalreasoningisonlyusefulifa)boththeknowledgebaseandtheschemadonotcontainanyerrorsandb)theschemaisonlyusedinwaysforeseenbyitscreator[4].Bothassumptionsarenotreal-isticforlargeandopenknowledgebases.Thisshowsthat,althoughreasoningseemsthestraightforwardapproachtotackletheproblemofcompletingmissingtypes,itis{atleastinitsstandardform{notapplicableforlarge,openknowl-edgebases,sincetheyareunlikelytohavecorrectenoughdataforreasoningto 4property.Furthermore,eachpropertyisassignedacertainweightwp,whichre ectsitscapabilityofpredictingthetype(seebelow).Withthoseelements,wecancomputethecondenceforaresourcerhavingatypetasconf(T(r)):=1 NXallpropertiespofrP(T(r)j(9p:)(r));(1)whereNisthenumberofpropertiesthatconnectsaresourcetoanotherone.Byusingtheaverageprobabilitiesofeachtype,weaddresstheproblemoffaultylinks,sincetheydonotcontributetoomuchtotheoverallprobability.Intheexamplewithdbpedia:Germanyusedabove,theclassMountainwasinferredduetoonewrongstatementoutof38,000.Withtheabovedenition,thatrelationwouldonlybeweightedwith1 38;000,thus,thetypeMountainwouldreceiveacomparablysmalloverallcondence.Bylookingattheactualdistributionoftypesco-occurringwithaproperty,insteadofthedeneddomainsandranges,propertieswhichare\abused",i.e.,useddierentlythanconceivedbytheschemacreator,donotcauseanyproblemsforSDType.Aslongasapropertyisusedmoreorlessconsistentlythroughouttheknowledgebase,theinferenceswillalwaysbeconsistentaswell.Singlein-consistentusages,justlikesinglewrongstatements,donotcontributetoomuchtotheoverallresult.Furthermore,whenlookingattheactualusageofaschema,theresultscanbemorene-grainedthanwhenusingtheschemaonly.Forex-ample,ontheMusicBrainzdataset2,foaf:nameisalwaysusedasapropertyofmo:MusicArtist.WhileRDFSentailmentrulescouldnotinferanyspecictypefromthefoaf:nameproperty,sinceithasnoexplicitdomaindened.3Whileusingtheactualdistributioninsteadofdeneddomainsandrangeseliminatesthoseproblems,itcaninducenewoneswhenadatasetisheavilyskewed,i.e.,theextensionsofsomeclassesareseveralordersofmagnitudelargerthanothers.Thisisaprobleminparticularwithgeneralpurposeproperties,suchasrdfs:labelorowl:sameAs,whichareratherequallydistributedintheoverallknowledgebase.Ifthatknowledgebaseisheavilyskewed(e.g.,adatabaseaboutcitiesandcountrieswhichcontains10,000citiespercountryonaverage),anditcontainsmanyofsuchgeneralpurposeproperties,thereisadangerofoverratingthemorefrequenttypes.Thus,wedeneaweightwpforeachproperty(notethatpandp1aretreatedindependentlyandareeachassignedanindividualweight),whichmeasuresthedeviationofthatpropertyfromtheaprioridistributionofalltypes:wp:=Xalltypest(P(t)P(tj9p:))2(2)Withthosetypes,wecanrenetheabovedenitiontoconf(T(r)):=XallpropertiespofrwpP(T(r)j(9p:)(r));(3) 2http://dbtune.org/musicbrainz/3Thedeneddomainoffoaf:nameisowl:Thing,seehttp://xmlns.com/foaf/spec/ 6asintable1,denition(1)wouldyieldacondencescorefor:xadbpedia-owl:Personand:xadbpedia-owl:Placeof0.14and0.60,respectively.4Whenusingweights,thenumbersaredierent.InourexamplefromDBpe-dia,theobtainedweightsfordbpedia-owl:locationandfoaf:nameare0:77and0:17,hence,theoverallcondencescoresfor:xadbpedia-owl:Personand:xadbpedia-owl:Placeinthatexample,usingdenition(3),are0:05and0:78,respectively.Thisshowsthattheweightshelpreducingthein uenceofgeneralpurposepropertiesandthusassigningmoresensiblescorestothetypesthatarefoundbySDType,andintheendhelpreducingwrongresultscomingfromskeweddatasets.Insummary,wearecapableofcomputingascoreforeachpairofaresourceandatype.Givenareasonablecutothreshold,wecanthusinfermissingtypesatarbitrarylevelsofquality{thresholdsbetween0.4and0.6typicallyyieldstatementsataprecisionbetween0.95and0.99.3.2ImplementationSDTypehasbeenimplementedbasedonarelationaldatabase,asshowninFig.1.Theinputdataconsistsoftwotables,onecontainingalldirectpropertyassertionsbetweeninstances,theothercontainingalldirecttypeassertions.Fromtheseinputles,basicstatisticsandaggregationsarecomputed:thenumberofeachtypeofrelationforallresources,andthetheaprioriprobabilityofalltypes,i.e.,thepercentageofinstancesthatareofthattype.Eachofthosetablescanbecomputedwithonepassovertheinputtablesortheirjoin.Thebasicstatistictablesserveasintermediateresultsforcomputingtheweightsandconditionalprobabilitiesusedintheformulasabove.Onceagain,thoseweightsandconditionalprobabilitiescanbecomputedwithonepassovertheintermediatetablesortheirjoins.Inanalstep,newtypescanbematerializedincludingthecondencescores.Thiscanbedoneforallinstances,orimplementedasaservice,whichtypesaninstanceondemand.Sinceofeachofthestepsrequiresonepassoverthedatabase,theoverallcomplexityislinearinthenumberofstatementsintheknowledgebase.4EvaluationToevaluatethevalidityofourapproach,weusetheexistingtypeinformationintwolargedatasets,i.e.,DBpedia[2]andOpenCyc[9],asagoldstandard,5andletSDTypereproducethatinformation,allowingustoevaluaterecall,precision,andF-measure. 4TheactualnumbersforDBpediaare:P(Personjfoaf#name)=0:273941,P(Placejfoaf#name)=0:314562,P(Personjdbpedia#location)=0:000236836,P(Placejdbpedia#location)=0:876949.5InthecaseofDBpedia,thedatasetisratherasilverstandard.However,itprovidesthepossibilityofalarger-scaleevaluation.Aner-grainedevaluationwithmanualvalidationoftheresultsbyanexpertcanbefoundinSect.5. 8 Fig.2.Precision/recallcurvesofSDTypeonDBpedia,forinstanceswithatleastone,atleast10,andatleast25incominglinksleastoneingoinglink),achievinganF-measureof88:5%,theresultsareslightlybetteroninstancesthathaveatleast10or25ingoinglinks,withanF-measureof88:9%and89:9%,respectively.Thedierencesshowmoresignicantlyintheprecision@95%(i.e.theprecisionthatcanbeachievedat95%recall),whichis0:69(minimumonelink),0:75(minimumtenlinks),and0:82(minimum25links),respectively.Figure3depictsthecorrespondingresultsforOpenCyc.TherstobservationisthattheoverallresultsarenotasgoodasonDBpedia,achievingamaximumF-measureof60:1%(60:3%and60:4%whenrestrictingtoinstancesthathaveatleast10or25ingoinglinks).Thesecondobservationisthattheresultsforinstanceswithdierentnumbersofingoingpropertiesdonotdiermuch{infact,mostofthedierencesaretoosmalltobevisibleinthegure.While95%recallcannotbereachedonOpenCycwithSDType,theprecision@90%is0:18(minimumonelink),0:23(minimumtenand25links),respectively.ThestrongdivergenceoftheresultsbetweenDBpediaandOpenCyc,asdiscussedabove,wastobeexpected,sinceOpenCychasontheonehandmore(andmorespecic)typesperinstance,ontheotherhandlessevidenceperinstance,sincethenumberofpropertiesconnectinginstancesissmaller.Asthediagramsshow,lookingatinstanceswithmorelinksimprovestheresultsonDBpedia,butnotonOpenCyc(apartfromasmallimprovementinprecisionatarecallofaround0.9).ThereasonforthatisthatDBpedia,withitsstrongerfocusoncoveragethanoncorrectness,containsmorefaultystatements.Whenmorelinksarepresent,thein uenceofeachindividualstatementisre-duced,whichallowsforcorrectingerrors.OpenCyc,ontheotherhand,withitsstrongerfocusonprecision,benetslessfromthaterrorcorrectionmechanism.Sinceweassumethatitismorediculttopredictmorespecictypes(suchasHeavyMetalBand)thanpredictingmoregeneralones(likeBandorevenOrganization),wehaveadditionallyexaminedthebestF-measurethatcanbe 9 Fig.3.Precision/recallcurvesofSDTypeonOpenCyc,takingintoaccountonlyin-coming,onlyoutgoing,andbothincomingandoutgoingpropertiesachievedwhenrestrictingtheapproachtoacertainmaximumclasshierarchydepth.TheresultsaredepictedinFig.4.ItcanbeobservedthatSDTypeinfactworksbetteronmoregeneraltypes(achievinganF-measureofupto97:0%onDBpediaand71:6%onOpenCycwhenrestrictingtheapproachtopredictingonlytop-levelclasses).However,theeectsareweakerthanweexpected.5Application:CompletingMissingTypesinDBpediaInthefollowing,weapplySDTypetoinfermissingtypeinformationinDBpe-dia.WhileDBpediahasaquitelargecoverage,therearemillionsofmissingtypestatements.Toinferthosemissingtypes,wehavecombinedtheapproachsketchedabovewithapreclassicationstepseparatingtypeablefromuntypeableresourcesinordertoreducefalseinferences.5.1EstimatingTypeCompletenessinDBpediaAsidefromthetypeinformationinDBpediausingtheDBpediaontology,whichisgeneratedusingWikipediainfoboxes,resourcesinDBpediaarealsomappedtotheYAGOontology[18].ThosemappingsaregeneratedfromWikipediapagecategories.Thus,theyarecomplementarytoDBpediatypes{anarticlemayhaveacorrectinfobox,butmissingcategoryinformation,orviceversa.Bothmethodsofgeneratingtypeinformationareproneto(dierenttypesof)errors.However,lookingattheoverlapsanddierencesoftypestatementscreatedbybothmethodsmayprovidesomeapproximateestimatesaboutthecompletenessofDBpediatypes.ToestimatethecompletenessoftypeinformationinDBpedia,weusedapartialmappingbetweentheYAGOontology[18]andtheDBpediaontology.8 8http://www.netestate.de/De/Loesungen/DBpedia-YAGO-Ontology-Matching 10 Fig.4.MaximumachievableF-measurebymaximumclassdepthforDBpediaandOpenCyc.ThegraphdepictsthemaximumF-measurethatcanbeachievedwhenrestrictingtheapproachtondingclassesofamaximumhiearchydepthof1,2,etc.AssumingthattheYAGOtypesareatleastmoreorlesscorrect,wecanestimatethecompletenessofaDBpediatypedbpedia#tusingthemappedYAGOtypeyago#tbylookingattherelationofallinstancesofdbpedia#tandallinstancesthathaveatleastoneofthetypesdbpedia#tandyago#t:completeness(dbpedia#t)jdbpedia#tj jdbpedia#t[yago#tj(5)Thedenominatordenotesanestimateofallinstancesthatshouldhavethetypedbpedia#t.Sincetheactualnumberofresourcesthatshouldhavethattypecanbelargerthanthat(i.e.,neithertheDBpedianortheYAGOtypeisset),thecompletenesscanbesmallerthanthefraction,hencetheinequation.Calculatingthesumacrossalltypes,weobservethatDBpediatypesareatmost63.7%complete,withatleast2.7millionmissingtypestatements(whileYAGOtypes,whichcanbeassessedaccordingly,areatmost53.3%complete).TheclassesthemostmissingtypestatementsareshowninFig.5Classesthatareveryincompleteinclude{dbpedia-owl:Actor(completeness4%),with57,000instancesmissingthetype,including,e.g.,BradPittandTomHanks{dbpedia-owl:Game(completeness7%),with17,000instancesmissingthetype,includingTetrisandSimCity{dbpedia-owl:Sports(completeness5:3%),with3,300instancesmissingthetype,includingBeachVolleyballandBiathlonAsimilarexperimentusingtheclassesdbpedia-owl:Personandfoaf:Person(assumingthateachpersonshouldhavebothtypes)yieldedthattheclassdbpedia-owl:Personisatmost40%complete.TheseexamplesshowthattheproblemofmissingtypesinDBpediaislarge,andthatitdoesnotonlyaect 11 Fig.5.Largestnumberof(estimated)missingtypestatementsperclassmarginallyimportantinstances.InDBpedia,commonreasonsformissingtypestatementsare{Missinginfoboxes{anarticlewithoutaninfoboxisnotassignedanytype.{Toogeneralinfoboxes{ifanarticleaboutanactorusesapersoninfoboxinsteadofthemorespecicactorinfobox,theinstanceisassignedthetypedbpedia-owl:Person,butnotdbpedia-owl:Actor.{Wronginfoboxmappings{e.g.,thevideogameinfoboxismappedtodbpedia-owl:VideoGame,notdbpedia-owl:Game,anddbpedia-owl:VideoGameisnotasubclassofdbpedia-owl:GameintheDBpediaontology.{Unclearsemantics{someDBpediaontologyclassesdonothaveclearseman-tics.Forexample,thereisaclassdbpedia-owl:College,butitisnotclearwhichnotionofcollegeisdenotedbythatclass.Thetermcollege,accord-ingtodierentusages,e.g.,inBritishandUSEnglish,candenoteprivatesecondaryschools,universities,orinstitutionswithinuniversities.95.2TypingUntypedInstancesinDBpediaInoursecondexperiment,wehaveanalyzedhowwellSDTypeissuitableforaddingtypeinformationtountypedresources.Asdiscussedabove,resourcesmaybemissingatypebecausetheyusenoinfobox,aninfoboxnotmappedtoatype,orarederivedfromaWikipediaredlink.Inparticularinthelattercase,theonlyusableinformationaretheincomingproperties.SimplytypingalluntypedresourceswithSDTypewouldleadtomanyerrors,sincetherearequiteafewresourcesthatshouldnothaveatype,asdiscussed 9seehttp://oxforddictionaries.com/definition/english/college 12in[1].Examplesareresourcesderivedfromlistpages,10pagesaboutacategoryratherthananindividual,11orgeneralarticles.12Inordertoaddressthatproblem,wehavemanuallylabeled500untypedresourcesintotypeableandnon-typeableresources.Forthoseresources,wehavecreatedfeaturesusingtheFeGeLODframework[13],andlearnedarulesetforclassifyingtypeableandnon-typeableresourcesusingtheRipperrulelearner[3].Theresultingrulesethasaccuracyof91.8%(evaluatedusing10-foldcrossvalidation).Fromall550,048untypedresourcesinDBpedia,thisclassieridenties519,900(94.5%)astypeable.Wehavegeneratedtypesforthoseresourcesandevaluatedthemmanuallyonasampleof100randomresources.TheresultsforvariousthresholdsaredepictedinFig.6.Itcanbeobservedthat3.1typesperinstancecanbegeneratedwithaprecisionof0.99atathresholdof0.6,4.0typeswithaprecisionof0.97atathresholdof0.5,and4.8typeswithaprecisionof0.95atathresholdof0.4.13.Incontrast,RDFSreasoningonthetestdatasetgenerates3.0typesperinstancewithaprecisionof0.96,whichshowsthatSDTypeisbetterinbothprecisionandproductivity.Withthosethresholds,wecangenerateatotalof2,426,552and1,682,704typestatements,respectively,asdepictedinTable3.Itcanbeobservedthatwiththehigherthresholdguaranteeinghigherprecision,moregeneraltypesaregenerated,whilemorespecictypessuchasAthleteorArtist,arerarelyfound.Inmostcases,thegeneratedtypesareconsistent,i.e.,anArtistisalsoaPer-son,whilecontradictingpredictions(e.g.,OrganizationandPersonforthesameinstance)areratherrare.6RelatedWorkTheproblemsofinferenceonnoisydataintheSemanticWebhasbeenidenti-ed,e.g.,in[16]and[8].Whilegeneral-purposereasoningonnoisydataisstillactivelyresearched,therehavebeensolutionsproposedforthespecicproblemoftypeinferencein(generalorparticular)RDFdatasetsintherecentpast,us-ingstrategiessuchasmachinelearning,statisticalmethods,andexploitationofexternalknowledgesuchaslinkstootherdatasourcesortextualinformation.[11]useasimilarapproachasours,butonadierentproblem:theytrytopredictpossiblepredicatesforresourcesbasedonco-occurrenceofproperties.TheyreportanF-measureof0.85atlinearruntimecomplexity.Manyontologylearningalgorithmsarecapableofdealingwithnoisydata[19].However,whenusingthelearnedontologiesforinferringmissinginformationusingareasoner,thesameproblemsaswithmanuallycreatedontologiesoccur. 10e.g.,http://dbpedia.org/resource/Lists_of_writers11e.g.,http://dbpedia.org/resource/Writer12e.g.,http://dbpedia.org/resource/History_of_writing13AwebserviceforDBpediatypecompletion,aswellasthecodeusedtoproducetheadditionaltypes,isavailableathttp://wifo5-21.informatik.uni-mannheim.de:8080/DBpediaTypeCompletionService/ 14 Fig.6.PrecisionandaveragenumberoftypestatementsperresourcegeneratedonuntypedresourcesinDBpediaDOLCEontologiesinordertondappropriatetypes.Theauthorsreportanoverallrecallof0.74,aprecisionof0.76,andanF-measureof0.75.Theauthorsof[7]exploittypesofresourcesderivedfromlinkedresources,wherelinksbetweenWikipediapagesareusedtondlinkedresources(whicharepotentiallymorethanresourcesactuallylinkedinDBpedia).Foreachresource,theyusetheclassesofrelatedresourcesasfeatures,anduseknearestneighborsforpredictingtypesbasedonthosefeatures.Theauthorsreportarecallof0.86,aprecsionof0.52,andhenceanF-measureof0.65.Theapproachdiscussedin[15]addressesaslightlydierentproblem,i.e.,themappingDBpediaentitiestothecategorysystemofOpenCyc.Theyusedierentindicators{infoboxes,textualdescriptions,Wikipediacategoriesandinstance-levellinkstoOpenCyc{andapplyanaposterioriconsistencycheckusingCyc'sownconsistencycheckingmechanism.Theauthorsreportarecallof0.78,aprecisionof0.93,andhenceanF-measureof0.85.Theapproachesdiscussedabove,exceptfor[12],areusingspecicfeaturesforDBpedia.Incontrast,SDTypeisagnostictothedatasetandcanbeappliedtoanyRDFknowledgebase.Furthermore,noneoftheapproachesdiscussedabovereachesthequalitylevelofSDType(i.e.,anF-measureof88:5%ontheDBpediadataset).WithrespecttoDBpedia,itisfurthernoteworthythatSDTypeisalsocapableoftypingresourcesderivedfromWikipediapageswithverysparsein-formation(i.e.,noinfoboxes,nocategories,etc.){asanextremecase,wearealsocapableoftypinginstancesderivedfromWikipediaredlinksonlybyusinginformationfromtheingoinglinks.7ConclusionandOutlookInthispaper,wehavediscussedtheSDTypeapproachforheuristicallycom-pletingtypesinlarge,cross-domaindatabases,basedonstatisticaldistributions.Unliketraditionalreasoning,ourapproachiscapableofdealingwithnoisydataaswellasfaultyschemasorunforeseenusageofschemas. 15TheevaluationhasshownthatSDTypecanpredicttypeinformationwithanF-measureofupto88:9%onDBpediaand63:7%onOpenCyc,andcanbeappliedtovirtuallyanycross-domaindataset.ForDBpedia,wehavefurther-moreenhancedSDTypetoproducevalidtypesonlyforuntypedresources.Tothatend,wehaveusedatrainedpreclassiertellingtypeablefromnon-typeableinstancesatanaccuracyof91:8%,andareabletopredict2.4millionmissingtypestatementsataprecisionof0.95,or1.7millionmissingtypestatementsataprecisionof0.99,respectively.Wehaveshownthatwiththesenumbers,weoutperformtraditionalRDFSreasoningbothinprecisionandproductivity.TheresultsshowthatSDTypeisgoodatpredictinghigher-levelclasses(suchasBand),whilepredictingmorene-grainedclasses(suchasHeavyMetalBand)ismuchmoredicult.Onestrategytoovercomethislimitationwouldbetousequaliedrelationsinsteadofonlyrelationinformation,i.e.,acombinationoftherelationandthetypeofrelatedobjects.Forexample,linksfromamusicgrouptoaninstanceofHeavyMetalAlbumcouldindicatethatthismusicgroupistobeclassiedasaHeavyMetalBand.However,usingsuchfeaturesresultsinamuchlargerfeaturespace[13]andthuscreatesnewchallengeswithrespecttoscalabilityofSDType.ThetypestatementscreatedbySDTypeareprovidedinawebservicein-terface,whichallowsforbuildingapplicationsandservicesatauser-denedtrade-oofrecallandprecision,assketchedin[14].Thestatisticalmeasuresusedinthispapercannotonlybeusedforpredict-ingmissingtypes.Otheroptionswewanttoexploreinthefutureincludethevalidationofexistingtypesandlinks.Likeeachlinkcanbeanindicatorforatypethatdoesnotexistintheknowledgebase,itmayalsobeanindicatorthatanexistingtype(orthelinkitself)iswrong.Insummary,wehaveshownanapproachthatiscapableofmakingtypeinferenceheuristicallyonnoisydata,whichsignicantlyoutperformspreviousapproachesaddressingthisproblems,andwhichworksonlarge-scaledatasetssuchasDBpedia.TheresultinghighprecisiontypesforDBpediahavebeenaddedtotheDBpedia3.9releaseandarethuspubliclyusableviatotheDBpediaservices.Acknowledgements.TheauthorswouldliketothankChristianMeilickeforhisvaluablefeedbackonthispaper.References1.AlessioPalmeroAprosio,ClaudioGiuliano,andAlbertoLavelli.Automaticexpan-sionofdbpediaexploitingwikipediacross-languageinformation.In10thExtendedSemanticWebConference(ESWC2013),2013.2.ChristianBizer,JensLehmann,GeorgiKobilarov,SorenAuer,ChristianBecker,RichardCyganiak,andSebastianHellmann.DBpedia-AcrystallizationpointfortheWebofData.WebSemantics,7(3):154{165,2009.3.WilliamW.Cohen.Fasteectiveruleinduction.In12thInternationalConferenceonMachineLearning,1995.