BANKS: Browsing and Keyword Searching in Relational Databases

B. Aditya, Gaurav Bhalotia, Soumen Chakrabarti, Arvind Hulgeri, Charuta Nakhe, Parag, S. Sudarshan
Computer Science and Engg. Dept., I.I.T. Bombay
{baditya, soumen, aru, parag, sudarsha}@cse.iitb.ac.in
bhalotia@eecs.berkeley.edu
charuta@pspl.co.in

Abstract

The BANKS system enables keyword-based search on databases, together with data and schema browsing. BANKS enables users to [...]

(Footnote: Partly supported by an IBM Faculty Fellowship grant and [...])

Figure 1: A Fragment of the DBLP Database
  Paper tuple (PaperId, PaperName): (ChakrabartiSD98, "Mining Surprising Patterns Using Temporal Description Length")
  Writes tuples: ..., (SunitaS, ChakrabartiSD98), (ByronD, ChakrabartiSD98)
  Author tuples (AuthorId, AuthorName): ..., (SunitaS, Sunita Sarawagi), (ByronD, Byron Dom)

[...] BANKS, including relevance score computation, is described in more detail in Section 2.

BANKS provides a rich interface to browse data, with automatic generation of hyperlinks. The browsing component of BANKS is described in more detail in Section 3.

The BANKS system is developed in Java using servlets and JDBC, and can be run on any database, without any programming. We are also developing a version of BANKS that handles XML data.

The greatest value of BANKS lies in almost zero-effort Web publishing of relational data which would otherwise remain invisible to the Web [2]. For example, BANKS may be used to publish organizational data, bibliographic data, and electronic catalogs. A demo of the BANKS system is accessible over the Web at the URL: http://www.cse.iitb.ac.in/banks/

2 Keyword search

Consider the fragment of a bibliographic database shown in Figure 1. This database contains paper titles, their authors and citations extracted from the DBLP repository. As we can see, due to normalization, information about a single paper is distributed across seven tuples related through foreign key references. A user looking for this paper may use queries like "sunita temporal" or "soumen sunita".

The BANKS system models the database as a directed graph, with each tuple in the database corresponding to a node in the graph. Each foreign-key to primary-key link is modeled as a directed edge between the corresponding tuples. (This can be easily extended to other types of connections.)

A keyword query in BANKS consists of $n \ge 1$ search terms $t_1, \ldots, t_n$. The first step is to locate the set of nodes matching search terms; a node matches a search term if it contains the search term as part of an attribute value or metadata (such as column, table or view names). Let $S_i$ denote the set of nodes matching term $t_i$; a node may match more than one search term, so the $S_i$'s may overlap.

Figure 2: Result of query "soumen sunita"

Intuitively, an answer to a query is a subgraph connecting some set of nodes that "cover" the keywords, i.e., each keyword must match one of the nodes in this set. Just by looking at a subgraph it may not be apparent what information it conveys. We wish to also identify a "central" node in the subgraph, that connects all the keyword nodes, and strongly reflects the relationship amongst them. An answer to a query is therefore modeled as a rooted directed tree containing at least one node from each $S_i$; edges are directed away from the root. The motivation for directionality is outlined later in this section. Note that the tree may also contain nodes not in any $S_i$ and is therefore a Steiner tree.

Figure 2 shows a sample result of a query containing the keywords soumen and sunita executed on the bibliographic database. Indentation is used to depict the tree structure, and nodes containing keywords are distinguished by their color.

2.1 Answer Relevance

In general, the importance of a link depends upon the type of the link, i.e., what relations it connects and on its semantics; for example, in a bibliographic database, the link between the Paper table and the Writes table is seen as a stronger link than the link between the Paper table and the Cites table. The link between Paper and Cites tables can correspondingly be given a higher weight. The weight of a tree is proportional to the total of its edge weights, and the relevance of a tree is inversely related to its weight.

The example in Figure 1 illustrates that some links point toward the root of the tree, instead of away from the root as required by our model. For instance, the Writes relation has foreign keys to the Paper and Author relations, whereas we require paths from Paper to Author, traversing a foreign key edge in the opposite direction.

However, we cannot simply regard the edges as undirected. Ignoring directionality would cause problems because of "hubs" that are connected to a large number of nodes. For example, in a university database a department with a large number of faculty and students would act as a hub. As a result, many nodes would be within a short distance of many other nodes, reducing the effectiveness of the tree-weight based scoring mechanism.

To solve the problem, we create for each edge $(u, v)$ a backward edge $(v, u)$; in the example from Figure 1, the backward edges ensure that there is a directed tree rooted at the paper, with a path to each leaf. We set the weight of $(v, u)$ to the weight of $(u, v)$ multiplied by a function of the number of links to $v$ from the nodes of the same type as $u$. Experiments with different functions indicated that the function $\log(1 + x)$, where $x$ is the number of inlinks, provided good results [3]. (If there was already an edge from $v$ to $u$, we set the edge weight to the lower of the original edge weight and the weight computed above.)

BANKS incorporates another interesting feature, namely node weights, inspired by prestige rankings such as PageRank in Google [4]. With this feature, nodes that have multiple pointers to them get a higher node weight (higher node weight corresponds to higher prestige). E.g., in a bibliography database containing citation information, if the user gives a query "Query Optimization" our technique would give higher prestige to the papers with more citations. As another example, in a TPC-D database storing information about parts, suppliers, customers and orders, the orders information contains references to parts, suppliers and customers. As a result, if a query matches two parts (or suppliers, or customers) the one with more orders would get a higher prestige.

In the current implementation we set the node weight to a function of the in-degree of the node. We experimented with different functions, and got good results with the function $\log(1 + x)$, where $x$ is the in-degree. Our use of logarithms in edge and node weights is similar to term weighting schemes in information retrieval.

Node weights and tree weights are combined to get an overall relevance score. We experimented with additive and multiplicative combinations, and found that both worked well when the relative weights for the two scores were appropriately chosen. Details of the search algorithm and the relevance computation, along with a preliminary performance study, can be found in [3].

Although a few other systems implement keyword search on databases (e.g., [5, 1, 6]), BANKS differs from all prior work in several ways: notably, in the techniques for edge weight computation and prestige based ranking, and the use of an in-memory graph structure for very efficient search while keeping the bulk of the data disk resident. The connections of BANKS to related work are described in more detail in [3].

2.2 Extensions

The BANKS system supports iterative refinement of queries:

- If multiple nodes match a keyword, the user can select one or more nodes as being relevant and ignore others; as an example, two authors in the DBLP database match the keyword "sudarshan", and the user can choose one of them and execute a query matching it with other keywords.
- Users can request more answers similar to one of the displayed answers; similarity can be defined on the basis of the answer tree structure.

Other refinements to tune node and edge weights are also under development.

Instead of displaying trees consisting of explicit tuples, system developers can specify answer formats based on the type of the root of the answer trees. For instance, one can specify that author, conference/journal and year be displayed whenever the root node is from the paper relation. We are currently working on implementing answer formatting, and on supporting negation and disjunction in queries.

3 Browsing

The BANKS system provides a rich interface to browse data stored in a relational database. The system automatically generates browsable views of database relations and query results; no content programming or user intervention is required.

Every displayed foreign key attribute value becomes a hyperlink to the referenced tuple. In addition, primary key columns can be browsed backwards, to find referencing tuples, organized by referencing relations (users can select a specific referencing relation).

Each table displayed comes with a variety of tools for interacting with data.

- Columns can be projected away (dropped), and selections can be imposed on any column.
- For foreign key columns, clicking on "join" results in the referenced table being joined in, and its columns also displayed. This eliminates the need for explicitly writing join queries for the normal case of foreign key join. The join feature can also be used in the other direction, from a primary key to a referencing foreign key.
- Results can be grouped by a column; this results in only the distinct values for that column being displayed. The user can click on any of the values to see the tuples associated with that value.
- Tuples in the displayed table can be sorted by a specified column.

Figure 3: Browsing Examples: (a) Sample browsing session (b) Pie chart

Controls for these operations can be accessed by clicking on the column names in the table header. In addition, displayed data is paginated, and schema browsing is supported.

Figure 3(a) shows the result of browsing the thesis database starting with the student relation, using a pop-up menu on the roll number attribute to effect a join with the thesis relation and dropping several columns. The join is made possible since the thesis relation has a foreign-key attribute referencing the student relation. A sample pop-up menu is shown for the femail attribute which references the faculty table.

Hyperlinks in the displayed data are automatically generated by the system. Each hyperlink corresponds to an SQL query that is executed when a user clicks on the link. Thus, all the pages in the system are generated on the fly by executing corresponding queries against the underlying relational database. No precomputation is required.

BANKS templates provide several predefined ways of displaying data. Template instances are customized, stored in the database, and given a hyperlink name, which is used to access the template. The BANKS system currently provides four types of templates:

- Cross-tabs (similar to OLAP cross-tabs), with drill-down facilities.
- The group-by template provides a hierarchical view of data, by specifying a sequence of grouping attributes.
- Folder-tree views, which provide another hierarchical view of data.
- The graphical interface template permits information to be displayed in bar chart, line chart, or pie chart format. Hyperlinks are provided on the graphical data via HTML image maps, to allow drill down on the data. Figure 3(b) shows an example pie chart generated by BANKS.

Another interesting feature of templates is that they can be composed together in a hyperlinked, visual manner. Several example templates are available on the BANKS web site.

4 Conclusions

To summarize, we have developed an integrated system for keyword searching and browsing of databases. The system has many useful features which allow casual users to access database information in an intuitive manner. BANKS enables almost effortless Web publishing of relational and XML data that would otherwise remain (at least partially) invisible to the Web.

We have also developed a prototype version of BANKS that works on XML data, supporting keyword searching and scalable browsing of large XML datasets. We are working on integrating XML search/browsing with the rest of the BANKS system.

References

[1] Sanjay Agrawal, Surajit Chaudhuri, and Gautam Das. DBXplorer: A system for keyword-based search over relational databases. In Procs. ICDE, Feb. 2002.
[2] Peter Bailey, Nick Craswell, and David Hawking. Dark matter on the Web. In Poster Proceedings, 9th World-Wide Web Conference, 2000.
[3] Gaurav Bhalotia, Arvind Hulgeri, Charuta Nakhe, Soumen Chakrabarti, and S. Sudarshan. Keyword searching and browsing in databases using BANKS. In Procs. ICDE, Feb. 2002.
[4] Sergey Brin and Lawrence Page. The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 30(1–7), 1998.
[5] Shaul Dar, Gadi Entin, Shai Geva, and Eran Palmon. DTL's DataSpot: Database exploration using plain language. In Procs. VLDB, 1998.
[6] Vagelis Hristidis and Yannis Papakonstantinou. DISCOVER: Keyword search in relational databases. In Procs. VLDB, Aug. 2002.
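
The graph model and weighting scheme of Section 2.1 can be illustrated with a small sketch. This is not the BANKS implementation (which is written in Java); it is a minimal Python illustration under stated assumptions: the node identifiers, the uniform forward edge weight of 1.0, the base-2 logarithm, the additive form of the relevance score, and the mixing parameter `lam` are all choices made for this example and are not given in the text above. The exact scoring formula is described in [3].

```python
import math
from collections import defaultdict

# Toy tuples based on the Figure 1 fragment: node id -> relation (node type).
# The "relation:key" identifiers are hypothetical names for this sketch.
NODE_TYPE = {
    "paper:ChakrabartiSD98": "Paper",
    "author:SunitaS": "Author",
    "author:ByronD": "Author",
    "writes:SunitaS": "Writes",
    "writes:ByronD": "Writes",
}

# Forward edges follow foreign keys: referencing tuple -> referenced tuple.
# The uniform weight 1.0 is an assumption made for this sketch.
FORWARD_EDGES = [
    ("writes:SunitaS", "paper:ChakrabartiSD98", 1.0),
    ("writes:SunitaS", "author:SunitaS", 1.0),
    ("writes:ByronD", "paper:ChakrabartiSD98", 1.0),
    ("writes:ByronD", "author:ByronD", 1.0),
]


def build_graph(node_type, forward_edges):
    """Return {(u, v): weight} containing forward and backward edges."""
    # inlinks[v][t] = number of forward links into v from nodes of type t.
    inlinks = defaultdict(lambda: defaultdict(int))
    for u, v, _ in forward_edges:
        inlinks[v][node_type[u]] += 1

    weights = {(u, v): w for u, v, w in forward_edges}
    for u, v, w in forward_edges:
        # Backward weight: forward weight times log(1 + x), where x counts
        # links to v from nodes of the same type as u (base 2 is assumed).
        backward = w * math.log2(1 + inlinks[v][node_type[u]])
        # If an edge (v, u) already exists, keep the lower of the two weights.
        weights[(v, u)] = min(weights.get((v, u), backward), backward)
    return weights


def node_prestige(node, forward_edges):
    """Node weight log(1 + in-degree), counting forward (foreign-key) links."""
    indegree = sum(1 for _, v, _ in forward_edges if v == node)
    return math.log2(1 + indegree)


def relevance(tree_edges, weights, forward_edges, lam=0.2):
    """Hypothetical additive combination of an edge score and a node score."""
    tree_weight = sum(weights[e] for e in tree_edges)
    edge_score = 1.0 / (1.0 + tree_weight)  # inversely related to tree weight
    nodes = {n for e in tree_edges for n in e}
    node_score = sum(node_prestige(n, forward_edges) for n in nodes) / len(nodes)
    return (1 - lam) * edge_score + lam * node_score


WEIGHTS = build_graph(NODE_TYPE, FORWARD_EDGES)

# Answer tree rooted at the paper, with a directed path to each author leaf.
ANSWER_TREE = [
    ("paper:ChakrabartiSD98", "writes:SunitaS"),
    ("writes:SunitaS", "author:SunitaS"),
    ("paper:ChakrabartiSD98", "writes:ByronD"),
    ("writes:ByronD", "author:ByronD"),
]

if __name__ == "__main__":
    print("relevance:", relevance(ANSWER_TREE, WEIGHTS, FORWARD_EDGES))
```

In this toy graph the paper has two incoming Writes links, so each backward edge from the paper to a Writes tuple costs more than the forward edge it mirrors, while a backward edge out of an author with a single paper costs the same as the forward edge. That is the intended effect: a hub with many inlinks becomes expensive to traverse backwards, so it no longer places many unrelated nodes within a short distance of each other.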

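The browsing behaviour of Section 3, where each hyperlink corresponds to an SQL query executed on click, can be sketched along the following lines. This is only an illustration, assuming a hypothetical in-code foreign-key catalog (`FOREIGN_KEYS`), hypothetical function names, and made-up sample rows; BANKS itself uses Java servlets and JDBC, and its actual link format is not described in the text. The sketch uses Python's standard `sqlite3` module with an in-memory database.

```python
import sqlite3

# Hypothetical foreign-key catalog: (table, column) -> (referenced table, key).
# A real system would read this from the database schema metadata.
FOREIGN_KEYS = {
    ("thesis", "rollno"): ("student", "rollno"),
}


def forward_link(table, column, value):
    """SQL (and parameters) for the tuple referenced by a foreign-key value."""
    ref_table, ref_key = FOREIGN_KEYS[(table, column)]
    # Table and column names come from the trusted catalog, never from users;
    # the clicked value is passed as a bound parameter.
    return f"SELECT * FROM {ref_table} WHERE {ref_key} = ?", (value,)


def backward_links(table, key, value):
    """SQL for tuples referencing a primary-key value, per referencing relation.

    Keyed by relation name, so this sketch assumes at most one foreign key
    from each referencing relation to the target.
    """
    return {
        src_table: (f"SELECT * FROM {src_table} WHERE {src_col} = ?", (value,))
        for (src_table, src_col), (ref_table, ref_key) in FOREIGN_KEYS.items()
        if (ref_table, ref_key) == (table, key)
    }


# Made-up sample data loosely following the thesis database example.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (rollno TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE thesis (rollno TEXT REFERENCES student(rollno), title TEXT);
    INSERT INTO student VALUES ('r1', 'A. Student');
    INSERT INTO thesis VALUES ('r1', 'A Thesis Title');
    """
)

# Clicking a displayed thesis.rollno value follows the foreign key forward.
sql, params = forward_link("thesis", "rollno", "r1")
referenced = conn.execute(sql, params).fetchall()

# Browsing student.rollno backwards lists referencing tuples per relation.
referencing = {
    rel: conn.execute(query, args).fetchall()
    for rel, (query, args) in backward_links("student", "rollno", "r1").items()
}
```

Because every link is just a query built from schema metadata, pages can be generated on demand without precomputation, which matches the on-the-fly generation described above.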