... interfaces (e.g., chatbot, natural language search, etc.) for any kind of domain (e.g., weather, music, finance, travel, healthcare, etc.). These custom or domain-specific natural language assistants usually target a range of domain-specific tasks, such as booking a flight or finding a drug dosage. Such task-oriented agents limit the scope of the interaction to accomplishing the task at hand and hence are more tractable to design and build. However, these task-oriented agents fail to address the challenges involved in iterative data exploration through conversational interfaces to gain information and derive meaningful insights.

Recently, several business intelligence tools, such as Ask Data Tableau [2], Power BI [8] by Microsoft, Microstrategy [6], and IBM's Cognos Assistant [3], have also explored natural language interfaces. These early systems have many restrictions in terms of the conversational interaction they provide, as they rely on the user to specify several parameters, and only offer a fixed set of patterns.

There are several challenges in creating a conversational interface for a BI application. The first challenge is creating a data model that captures the entities, their relationships and the associated semantics that are relevant to the underlying data and the common set of BI queries and operations. We have two options: modeling the underlying data in the RDBMS, or modeling the cube definition. We chose the latter, because a cube definition provides important BI specific information, such as measures, dimensions, dimension hierarchies, and how they are related.

The second challenge is building the necessary capability of the conversation system to capture user intent, and to recognize and interpret the different workload access patterns. We explore three different approaches, which we explain in detail in Section 3.3. The first two approaches use only the information available in the ontology, capturing the structural relationships between measures and dimensions. The third approach also takes into account the user's access patterns.

The third and final challenge is the integration with the underlying BI platform to issue appropriate structured queries and render the intended visualizations.

In this paper, we explore the use of conversational interfaces for BI applications. In earlier work [22], we developed an ontology-based approach to developing conversational services to explore the underlying structured data sets. In particular, we developed techniques to bootstrap the conversation workspace in terms of intents, entities, and training samples, by exploiting the semantic information in an ontology. In this paper, we extend that work for BI applications. In particular, we observe that users follow certain BI patterns and operations when analyzing their data using BI tools. We exploit this information in the construction of the conversation workspace, as well as the conversation design. We have implemented our techniques in Health Insights (HI), an IBM Watson Healthcare offering, providing analysis over insurance data on claims, and our initial feedback from users has been very positive.

We demonstrate the effective exploitation of the BI access patterns to provide a more dynamic and intuitive conversational interaction to derive business insights from the underlying data, without being tied to a fixed set of pre-existing dashboards and visualizations. We evaluate our approach and show that our conversational approach to BI not only covers the use cases supported by pre-defined dashboards, but goes well beyond them: through the dynamic generation of structured queries and integration with the underlying BI platform, it assists users in better understanding the insights from existing visualizations as well as discovering new and useful insights that are not covered by the pre-defined dashboards.

The main contributions of this paper can be summarized as:
• We propose an end-to-end ontology-based framework, and tools to create a conversation service for BI applications.
• We create an ontology from a business model, capturing all the key information for the BI application, including measures, dimensions, dimension hierarchies, and their relationships.
• We exploit common BI access patterns and use the ontology to generate several conversation space artifacts automatically, including intents, entities, and training examples.
• We adapt the dialog structure to support the BI access patterns and operations to provide an intuitive conversational interaction for BI applications.
• We implement and demonstrate the effectiveness of our proposed techniques for Health Insights, an IBM Watson Healthcare offering.

The rest of the paper is organized as follows. Section 2 provides a brief overview of our ontology-driven approach for building conversational interfaces for BI applications. Section 3 describes in detail our approach for data modeling and generation of conversational artifacts including intents, entities and dialog. We discuss the implementation of our proposed techniques in a healthcare use case, Health Insights, in Section 4 and provide a detailed system evaluation in Section 5. We discuss related work in Section 6 and conclude in Section 7.

2. SYSTEM OVERVIEW

In this section we provide a brief overview of our ontology driven approach to building a conversational BI system for deriving useful insights from data in different domains.

2.1 Ontology driven approach

In our prior work [22], we demonstrate the viability of using an ontology-based approach for building conversational systems for exploring knowledge bases. Ontologies provide a powerful abstraction for representing domain knowledge in terms of relevant entities, data properties and relationships between the entities, which is much closer to, and more intuitive for, natural language interaction. We have shown in [22] the effectiveness of capturing patterns in the expected workload and mapping them against the domain knowledge represented using an ontology to generate artifacts for building a conversational system. In this paper we build further on this approach to create a Conversational BI system for supporting natural language interfaces for BI applications, where the workload is characterized by a rich set of access patterns against an OLAP [13] business model defined over the underlying data.

Figure 2 outlines our ontology-driven approach to building a natural conversation interface (NCI)² for supporting BI applications. We capture the business model defined over the raw data in the form of an ontology, which provides rich semantics, reasoning capabilities and an entity-centric view of the business model that is closer to natural language conversation. In addition to this, the ontology provides the necessary formalism to capture and represent the structure and content of the information defined in the business model using a well accepted industry standard [1]. The ontology represents a central repository for capturing the domain schema and any changes to it over time, thus making the design of our system more dynamic and enabling adaptability to different domains with additional input from subject matter experts (SMEs) (see Section 3.4.2).

Figure 2: An Ontology Driven Approach for Conversational BI systems.

More specifically, the ontology captures the measures and dimensions defined in the business model as entities, and their taxonomy or hierarchies, as described in the cube definition, in terms of parent-child relationships. Measures correspond to quantifiable elements computed over one or more elements in the physical schema, such as columns in a relational schema, and dimensions represent categorical or qualifying attributes. The ontology captures the individual relationships between the measures, dimensions and dimension groups, and represents the attributes describing individual measures and dimensions as data properties. In addition to this, we also define special concepts in the ontology called Meta Concepts. These meta concepts represent a higher level grouping of measures/dimensions provided by SMEs or learnt from the underlying data through machine learning or deep learning techniques, which we refer to as ontology enrichment. Meta concepts provide a powerful abstraction for reasoning at a semantically higher level and enable the conversation system to support the querying needs of a much wider range of personas (Section 3.4.1).

We map the common BI access patterns against the ontology and use it to drive the process of building the conversation system that allows users to interact with the underlying data using an NCI. We use IBM's cloud based Watson Assistant (WA) service to build the conversation system.

2.1.1 Automated Workflow

The automated workflow represented in Figure 2 describes the process of automatically generating the necessary artifacts for building a domain specific conversational BI system in a domain agnostic way.

² We use the terms natural conversation interface, conversational interface, and conversational system interchangeably in the rest of the paper.

The automated workflow accelerates the process of building a conversational BI system and is key to enabling rapid prototyping and system development against data in different domains. The workflow has three distinct steps. The first step involves the generation of the ontology from the business model. In the second step, the information captured/modeled in the ontology is used to drive the generation of the required artifacts/components of the conversation space. The conversation space consists of three main components that enable it to interact with users: intents, entities, and dialogue. Intents are goals/actions that are expressed in the user utterances, while entities represent real world objects relevant in the context of the user utterance. Typically conversational systems use a classifier or a deep neural network to identify the intent in a user utterance [15] and hence require training examples in terms of sample user utterances for each intent. The dialogue provides a response to a user conditioned on the identified intents, the entities in the user's input and the current context of the conversation. The final step is the integration of the conversation space with an external data source or analytics platform that stores and processes the data. This integration is achieved through structured query generation (Section 3.7) against the analytics
platform to enable the conversational system to respond to user utterances with insights in the form of charts/visualizations. As can be seen, the automated workflow utilizes the domain specific aspects, including the domain ontology and the domain vocabulary (entities), and enables the creation of a domain specific conversational system while making the process itself repeatable across different domains. A detailed description of each of these steps is provided in Section 3.

3. CONVERSATIONAL BI SYSTEM

3.1 Data Modeling

The OLAP [13] business models describe the underlying data in terms of measures, dimensions, their relationships and hierarchies. We have developed an ontology generation module that creates an OWL ontology [1] given an OLAP business model as input. For each measure and dimension specified in the business model, the ontology generator creates an OWL concept/class in the ontology. Further, for each measure and dimension that are connected in the business model, it creates an OWL functional object property with the measure as the domain and the dimension as the range of the object property. The attributes associated with the measures and dimensions in the business model are added as OWL data properties for the respective measure and dimension concepts in the ontology. Finally, all the dimensional hierarchies in the business model are captured as isA relationships between the dimensions created in the ontology.

The ontology is further enriched to define meta-concepts as a hierarchy of logical groupings of existing measures and dimensions extracted from the business model with the help of SMEs. These hierarchical groupings of measures and dimensions, called meta-concepts, are annotated as such in the ontology, with appropriate labels marking ontology concepts as actual measures, dimensions and meta-concepts. Figure 3 shows an example measure hierarchy and Figure 4 an example dimension hierarchy captured in the enriched ontology along with annotations for meta-concepts.

As we describe in the next section, this entity-centric modelling of the business model is key to representing and reasoning about the common BI workload access patterns and operations (BI Access Patterns³) as subgraphs over the ontology, and to generating the necessary artifacts for the conversation system.
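The ontology generation steps of Section 3.1 can be sketched roughly as follows. This is a minimal illustration using plain Python data structures rather than an OWL library; the `Ontology` class, the `generate_ontology` function, the shape of the business model dictionary and the toy measures and dimensions are all assumptions made for the example, not the actual module described above.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Hypothetical in-memory stand-in for an OWL ontology."""
    classes: dict = field(default_factory=dict)            # concept name -> "measure" | "dimension"
    object_properties: list = field(default_factory=list)  # (domain measure, range dimension)
    data_properties: dict = field(default_factory=dict)    # concept name -> attribute names
    is_a: list = field(default_factory=list)               # (child dimension, parent dimension)


def generate_ontology(business_model):
    """Map a (toy) business model onto ontology elements, step by step."""
    onto = Ontology()
    # One class per measure and per dimension; attributes become data properties.
    for kind in ("measure", "dimension"):
        for name, attributes in business_model[kind + "s"].items():
            onto.classes[name] = kind
            onto.data_properties[name] = list(attributes)
    # One object property per connected (measure, dimension) pair.
    for measure, dimension in business_model["relationships"]:
        onto.object_properties.append((measure, dimension))
    # Dimension hierarchies are recorded as isA edges (direction is an assumption).
    for child, parent in business_model["hierarchies"]:
        onto.is_a.append((child, parent))
    return onto


# Hypothetical business model with invented names.
toy_model = {
    "measures": {"Admits": ["aggregation"], "NetPayment": ["aggregation", "currency"]},
    "dimensions": {"Year": ["format"], "Quarter": ["format"], "Facility": []},
    "relationships": [("Admits", "Year"), ("Admits", "Facility"), ("NetPayment", "Year")],
    "hierarchies": [("Quarter", "Year")],
}
ontology = generate_ontology(toy_model)
```

In a real deployment the same four steps would emit OWL classes, object properties, data properties and subclass axioms; the dictionary form here only makes the mapping easy to inspect.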
Figure 3: Captured Measure Hierarchy.

Figure 4: Captured Dimension Hierarchy.

3.2 Ontology driven generation of conversational artifacts

The second step in the automated workflow (Figure 2) consists of generation of conversational artifacts from the information captured in the ontology. The central tenet of the artifact generation process is supporting the BI access patterns to gain business insights using a conversational interface. Figure 5 describes the artifacts required for constructing a conversation space in terms of intents, entities and dialog, and how we map them to the specific elements relevant to BI. More specifically, we map intents to BI patterns. Entities are mapped to the measures and dimensions defined in the business model and captured in the ontology. The dialog is specifically designed to support interaction with the user based on the BI pattern/intent and entities detected in the user utterances and the current context of the user conversation. Integration with an external data source such as an analytics platform is required to support actions such as responding to user requests with appropriate results including charts/visualizations.

Next we describe in detail the modeling and generation of intents, their training examples (Section 3.3) and entities (Section 3.4.1) for the conversation space. Construction of the dialog is described in detail in Section 3.6. Integration with an external data source requires structured query generation, which we describe in Section 3.7.

³ We use BI Access Patterns and BI patterns interchangeably in the rest of the paper.
Figure 5: Conversation Space and artifacts required.

3.3 Intent Modeling for BI

As described in Figure 5, intents capture the purpose or goal in the user query/input. While designing the conversational BI system, we considered three different approaches for modeling intents. The first two approaches are based on the structural relationships between the measures and dimensions in the ontology. The third approach combines the user access patterns extracted from prior or expected workloads with the structural information in the ontology. We describe each of these approaches below and provide a brief evaluation to ascertain their effectiveness.

3.3.1 Modeling intents as combinations of Measures and Dimensions

In this approach we traverse the ontology and capture valid combinations of individual measures and dimensions as intents. For each identified measure in the ontology, the algorithm traverses each edge that connects the measure to a dimension. Each such pair that is connected via an edge in the ontology is identified as a valid combination. This is the finest granularity of generating intents for the conversation system, and it captures the user's goal/purpose of obtaining information about a particular measure with respect to a particular dimension.

This approach has problems in terms of both scalability and accuracy. Modeling intents at such fine granularity leads to a combinatorial explosion of the number of intents and their corresponding training examples. Further, as several intents may contain overlapping sets of measures and entities, the classification accuracy in terms of F1-score drops, leading to poor user experience.

3.3.2 Modeling Measures as intents

This approach models each individual measure as a separate intent. Such an approach allows us to capture the user's intent in terms of obtaining information about a particular measure, irrespective of the dimension(s) it needs to be sliced by. In order to generate training examples for each intent, we traverse the ontology to determine the valid combinations of measures and dimensions and use them to create training examples for each intent.

This approach reduces the combinatorial explosion of the number of intents as compared to the previous approach. However, as the number of measures captured in the ontology from the underlying business model grows larger, the number of intents and their associated training examples may still be quite large. This again may lead to significant scalability problems. Another issue with this approach is that there might be considerable overlap between the training examples of certain intents, leading to low accuracy. This is due to the fact that different measures may
be related to the same dimensions. For example, #Admits and #Discharges can both be related to dimensions such as year or facility.

3.3.3 Modeling BI patterns as intents

In this approach we identify the common BI workload access patterns from prior user experience and BI application logs. Each such identified pattern is modeled as an intent. We develop ontology traversal algorithms that map these identified patterns to subgraphs over the ontology. For each such subgraph, we identify the measures, dimensions and their associated instance data crawled from the underlying data store to generate training examples for each intent.

Modeling intents as BI patterns has the critical advantage of combining user access patterns with the domain knowledge in terms of the structural relationships between the measures and dimensions in the ontology. Combining this information allows us to better model the intents for conversational BI applications. In our experience this approach provides the maximum coverage (recall) of user queries without having to deal with a combinatorial explosion in terms of the number of intents. This makes this approach the most scalable. Each intent is very well defined and is sufficiently distinct in terms of its associated training examples, thereby giving the highest accuracy in terms of F1-scores amongst all the approaches discussed above. We describe each of these patterns in detail with examples in Section 3.3.4.

Table 1 shows a summary of the comparison of the different approaches for modeling intents for the HI data set (see Section 4), containing 64 measures, 274 dimensions and 576 relationships. The comparison is done to assess the scalability of each approach in terms of the number of intents and training examples that would be required to cover combinations of user utterances involving on average one measure and two dimensions. A detailed analysis of accuracy is provided in Section 5.

Table 1: Comparison of intent modeling approaches

Modeling approach                            #Intents    #Training E.g.s
Measure, Dimension combination as intents    5763        (5763)10
Measures as intents                          12          125762
BI patterns as intents                       7           764274
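To make the scalability argument of Sections 3.3.1 to 3.3.3 concrete, the sketch below counts the intents each approach would produce on a synthetic ontology. It is a rough illustration under stated assumptions: the ontology is reduced to a set of (measure, dimension) edges, the helper names are hypothetical, the seven pattern labels follow the patterns named in Section 3.3.4, and only single measure/dimension pairs are enumerated, so the counts are not comparable to the figures in Table 1.

```python
# Assumed labels for the BI patterns (analysis, three operations, ranking, trend, comparison).
BI_PATTERNS = ["analysis", "drill_down", "roll_up", "pivot", "ranking", "trend", "comparison"]


def synthetic_edges(n_measures, n_dimensions):
    """Fully connected toy ontology: every measure is related to every dimension."""
    return {(f"m{i}", f"d{j}") for i in range(n_measures) for j in range(n_dimensions)}


def pair_intents(edges):
    """Approach 1: one intent per connected (measure, dimension) pair."""
    return sorted(f"{measure}_by_{dimension}" for measure, dimension in edges)


def measure_intents(edges):
    """Approach 2: one intent per measure, regardless of the slicing dimension."""
    return sorted({measure for measure, _ in edges})


def pattern_intents(patterns):
    """Approach 3: one intent per BI pattern, independent of ontology size."""
    return list(patterns)


counts = {}
for name, edges in (("small", synthetic_edges(3, 2)), ("large", synthetic_edges(20, 10))):
    counts[name] = (
        len(pair_intents(edges)),
        len(measure_intents(edges)),
        len(pattern_intents(BI_PATTERNS)),
    )
```

On the small ontology the pattern-based approach does not yet pay off, but as the ontology grows the first count scales with the number of edges and the second with the number of measures, while the third stays fixed at the number of patterns.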
3.3.4 BI Conversation Patterns

In this section we describe the commonly identified BI access patterns learnt from prior BI workloads and application logs. Each of these patterns is modeled as an intent in the conversation space and requires the generation of training examples. A classifier in the conversation space is trained using these examples to classify user utterances into one of the BI patterns. Once the BI conversation pattern is identified, the conversational system extracts the relevant entities mentioned in the user utterance in terms of measures, dimensions and filter values. The dialog structure uses these extracted intents and entities and the current conversational context to provide appropriate responses. We describe below the common BI access patterns that we have used in our Health Insights use case (Section 4).

• BI Analysis pattern. This is the most common BI pattern. It allows users to see a measure(s) sliced along a particular dimension(s), optionally applying a filter(s). Figure 6 shows the pattern along with an example.

Figure 6: BI Analysis query pattern (Best viewed in color).

Figure 7: BI Ranking pattern (Best viewed in color).

• BI Operation patterns. The BI operation patterns capture some common BI operations which usually follow other BI queries, such as a BI Analysis query, for further analysis of the results obtained. We describe these operations below and provide example user interactions associated with these operations in a use case that we have implemented, described in Section 4.
  - Drill down operation pattern: Access more granular information by adding dimensions to the current query.
  - Roll up operation pattern: Access higher level information by aggregating along the dimension hierarchy with respect to the current query.
  - Pivot operation pattern: Access different information by replacing dimensions in the current query.

• BI Ranking pattern. The BI ranking pattern allows users to order the results by a measure value (or an aggregation applied on a measure), generally to obtain the top k values. Figure 7 shows an example of a ranking BI pattern. The results are sorted by #Admits shown along the dimension MDC (Major Diagnostic Category).

• BI Trend pattern. This pattern captures the variation of a measure along dimensions such as time or geography to ascertain the trend associated with the measure of interest. Figure 8 shows an example of the BI trend pattern. The results for the example query display the variation of the measure Net Payment by Incurred Year or Paid Year, both of which are time dimensions as inferred from the ontology. The pattern looks similar to the BI Analysis pattern; however, we chose to model it as a separate pattern because the linguistic variability of queries requesting trends versus standard BI Analysis queries was sufficient to warrant a separate intent. For example, the same query could also be expressed as "Show me the trends in my net payment". This query is identified as a BI trend pattern and a default dimension of time (paid or incurred year) is chosen to show the variation of the measure net payment. We further describe the choice of default inferences for measures and dimensions in Section 3.4.1.

Figure 8: BI Trend pattern (Best viewed in color).

Figure 9: BI Comparison pattern (Best viewed in color).

• BI Comparison pattern. Another common BI pattern observed is the BI comparison pattern, which allows users to compare two or more measures against each other along a particular dimension(s), optionally applying filter value(s). Figure 9 shows an example BI pattern that compares the number of admits to discharges by hospital (dimension) for the year 2017 (a filter value).

3.3.5 Generation of Intent training examples

We follow a process similar to the one discussed in [22] for the automatic generation of training examples for the identified intents. The above-mentioned BI conversation patterns are mapped over the ontology as subgraphs and used as templates for generating training samples by plugging in the measure, dimension and filter values as discerned from the domain ontology and the instance values that map to different elements in the ontology.

More specifically, for each BI pattern modeled as an intent, the corresponding template (examples of which are shown above) is populated with the appropriate measure, dimension and filter values using an algorithm that traverses the ontology, discovers appropriate relationships between measures, dimension groups and their hierarchies, and populates the templates accordingly. These generated training examples are used to train the intent classifier model in the conversation space. Figure 10 shows a sample of training examples generated for the BI Analysis Query pattern. The initial phrases for each intent, such as "Show me", "Give me the number of", etc., are provided as an input to the algorithm, which picks these at random to generate the training examples. The automatically generated training examples are further augmented with more examples with the help of SMEs and from queries seen in prior workloads/user experiences, if available.

Figure 10: Generation of Intent Training Examples (Best viewed in color).

3.4 Entity Modeling for BI

This section describes in detail how we model entities relevant to the access patterns and the underlying business model. We first discuss how measures, dimensions and their hierarchies are captured and populated as entities. Next, we describe the addition of domain specific vocabulary and synonyms to the conversation space to provide greater flexibility and improve the recall of user utterances. Finally we talk about the use of default inferences and their relevance in the system design for providing a better user experience.

3.4.1 Modeling of Measures, Dimensions, their hierarchies and relationships

Concepts in the ontology generated from the business model are annotated as one of the following:

• Measures and dimensions. These entities are part of the cube definition and are mapped to appropriate columns in the underlying relational schema. The BI queries involving these measures and dimensions in the ontology are mapped to appropriate structured queries against an external data source to provide the required response.

• Meta concepts. These are part of a hierarchy which represents logical groupings of the underlying measures or dimensions and are not mapped directly to any elements in the underlying relational schema (see Figures 3 and 4). These meta concepts might be defined in and extracted from the business model if available, or be additional metadata provided by the SMEs and included in the ontology as a post processing or enrichment step.

Figure 11 shows example queries that demonstrate the effectiveness of modeling meta concepts in the ontology. These queries conform to the BI Analysis Query pattern and refer to costs as a measure that is a meta concept. On detecting a standard BI analysis query with the meta concept 'costs' as an entity, the conversation space utilizes the mappings from the ontology between costs and the actual measures, such as #Admits, Net Payments, etc. Depending on user preferences in the domain, it either offers users the option to choose from the actual set of measures associated with costs or provides results for all the measures associated with costs.

Additionally, inference of the meta concept Costs is also driven by the current context of the user conversation. For example, if the user has been talking about Admissions in prior utterances, measures associated with admissions would be captured in the current conversational context. Based on this, costs may be mapped to the measure Allowed Amount Admit.

The mechanism we built around the creation and utilization of meta-concept groupings or mappings in our conversational system design thus supports more complex and higher level queries (Figure 11). This helps increase the applicability of our system for a wide variety of personas that are interested in gaining business insights at different levels from the underlying data.

Figure 11: Example Queries referring to a measure Meta-Concept.

3.4.2 Domain specific vocabulary and synonyms

Domain specific vocabulary and synonyms allow users to express queries using terminology that is common to the domain, without restricting them to query terms that are specific to either the terminology/vocabulary used in the ontology or instances of data corresponding to the ontology. Our system incorporates domain specific vocabulary and synonyms collected from SMEs/domain experts, including standard taxonomies such as SNOMED in the medical domain as well as taxonomies developed by SMEs, such as those related to diagnosis, therapeutic drug classes, etc. These dictionaries/taxonomies help map the synonyms and other vocabulary terms to entities in the ontology and help increase the recall of entities that can be inferred by the conversational system from user utterances, thereby giving users a flexible mechanism to express a variety of queries against the underlying data.

3.5 Defaults and learning from experience

3.5.1 Default inferences

An important aspect of conversational system design, especially relevant to user experience, is the use of default inferences. These are used for inferring missing parameters in a query that the users assume the system would infer automatically given the context of the conversation. These often include inferring default measures for a particular dimension and vice-versa in a conversational thread with a user. For example, "Show me the top-K DRGs for pregnancy" requires showing #Admits or Allowed Amount as inferred measures (not explicitly mentioned in the user utterance) for the dimension DRG (Diagnosis Related Group) and sorting the results by the measure value. These default inferences are made by integrating dictionaries containing this information, obtained from SMEs, into the conversational workspace.

Using default inferences helps improve user experience by avoiding too many follow-up questions. The defaults can be dynamically adjusted as users either accept the default inference or provide feedback that enables us to update/modify the default inferences used by the system.

3.5.2 Learning from feedback

As a conversational system is tested with real users, alternative phrasings of known intents and synonyms of known entities will emerge. As these are discovered through testing, they are added to intent training examples or entity synonym lists so that the system learns over time. Utterances that were not recognized by the system are obtained from the application logs⁴, and alternative phrasings or synonyms are identified to be added to the system. This addition of new data can be automated through particular conversation patterns that enable the system and user to identify a new bit of data and then add it to the training corpus without the intervention of a developer or designer.

3.6 Building the Dialog

Natural language interaction platforms, such as IBM's Watson Assistant, enable many different styles of interaction. At their core, they consist of intents and entities, for understanding users' natural language inputs, and a dialog manager, with context variables, for deciding how to respond. In this section, we briefly define the particular interaction model we used and then detail how we built it and adapted it for BI.

3.6.1 Query Model

The simplest interaction model for a natural language-based system is perhaps the query model. Under this model, users submit queries and the system responds with answers, much like a search engine.

Example
01 U: Show me admits by DRG for 2017
02 A: Here are Admits by Diagnosis Related Group for 2017:
03    ((chart appears))

Example
01 U: Show me admits
02 A: I'm afraid that is an invalid query.

While this simple query model can be powerful in enabling access to domain information, it is not conversational. The system only produces one of two possible responses: Answer or No Answer. In addition, each 2-utterance sequence is independent. If the user utters a second query, it will be interpreted without any context from the previous query or answer. Finally, the system does not recognize conversational utterances, such as displays of appreciation or requests for repeats.

⁴ All application logs used for learning are anonymized and are devoid of any personal information for maintaining data privacy.

3.6.2 Natural Conversation Model

One alternative to a query-oriented interaction model is a natural conversation model. Although the term "conversation" is used for many different kinds of interaction, we define a natural conversation interface or natural conversation agent as one that exhibits the ability for natural language interaction (understanding and responding in natural language), persisting context across turns of conversation, and conversation management [21].

We created our natural conversation interface by using the Natural Conversation Framework (NCF) [21]. The NCF provides a pattern language of over 100 reusable interaction patterns, which we have implemented on the Watson Assistant platform. We reused primarily the NCF's Open Request module for enabling series of complex requests, in addition to some of its conversation management modules. The Open Request module enables standard "slot-filling," or agent-initiated detail elicitation, but it also includes multiple features for making the interaction with the user more conversational. It allows users great flexibility in the ways they express their requests, and it remembers the context across utterances so the system does not forget what it is talking about based on prior user utterances. We provide a sample interaction for such an interaction pattern in Section 4.

3.6.3 Natural Conversation for BI

In order to adapt our natural conversation interface to the use case of business intelligence (BI), we created a dialog logic table [22] (see Table 2 in Section 4). The table specifies the relationships among each of the particular intents, entities and responses. For example, it specifies which parameters are required for each intent, or request type, and which are optional, as well as specifying the natural language framing for both the users' and the agent's utterances. From such a table, it is easy to build a corresponding dialog tree that encodes the interaction patterns.

Intent and Entity Extractor. The core of the NCF's Open Request module [21] is the intent and entity extractor, which allows for a more natural and conversational interaction. Every user utterance is funneled through the extractor so that no use-case-specific intent or entity is missed. This enables users to produce their query incrementally, across multiple utterances, instead of requiring them to produce it in a single
utterance or to repeat the same entities for a new intent (as in slot-filling). Figure 12 shows an example of our intent entity extractor, which captures each request type, such as analysis query or trend query, and detail, such as admits or incurred year (not shown), to a context variable.

Figure 12: Intents and Entity Extractor.

Dialog Structure for handling BI query patterns. Figure 13 shows an example dialog tree structure in which each BI Query pattern, modelled as an intent, is assigned a separate dialog node(s) to trigger an appropriate response to the user or to elicit further information if required. Modeling the dialog structure in such a manner allows the conversation system to respond to each BI query pattern uniquely, as well as to assist in appropriate structured query generation (Section 3.7).

Query Completeness and Detail Elicitors. We incorporate a query completeness check mechanism using a special node, Complete Request, in the dialog tree to verify the completeness of each BI Query pattern (intent) identified in the user utterance. The completeness check is a two step process. First, the system checks whether the user utterance has all the required entities for the identified intent as per the dialog logic table. If not, the system checks the current conversational context to see if the required entities have already been provided by the user in a prior user utterance. If yes, then the query is marked complete. If not, we use the detail elicitors (sometimes called slots) mechanism to elicit further information from the user which he or she might have failed to provide in the initial user utterance, through oversight or lack of knowledge. When the query is marked as complete, the system provides an appropriate response, which might involve the use of structured query generation to obtain results or visualizations from an external data source.
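The two step completeness check can be sketched as follows. This is only an illustration under assumptions: the `REQUIRED_ENTITIES` table stands in for the dialog logic table (whose actual contents are given in Table 2), and the function names, slot names and prompt wording are hypothetical.

```python
# Hypothetical excerpt of a dialog logic table: required entity types per intent.
REQUIRED_ENTITIES = {
    "analysis": {"measure", "dimension"},
    "trend": {"measure"},
}


def check_completeness(intent, utterance_entities, context):
    """Return (resolved entities, entity types that still have to be elicited)."""
    required = REQUIRED_ENTITIES[intent]
    resolved = dict(utterance_entities)
    # Step 1: which required entities are absent from the current utterance?
    missing = required - set(utterance_entities)
    # Step 2: try to fill those from the conversational context of prior turns.
    for slot in sorted(missing):
        if slot in context:
            resolved[slot] = context[slot]
    still_missing = sorted(required - set(resolved))
    return resolved, still_missing


def elicit(still_missing):
    """Detail elicitors: one follow-up prompt per entity type that is still missing."""
    return [f"Which {slot} would you like to see?" for slot in still_missing]


# The dimension was given in a prior turn, so the query is complete.
resolved, todo = check_completeness("analysis", {"measure": "Admits"}, {"dimension": "DRG"})

# Nothing in the context: the system has to ask for the dimension.
_, todo_empty_context = check_completeness("analysis", {"measure": "Admits"}, {})
prompts = elicit(todo_empty_context)
```

A query is treated as complete exactly when the second return value is empty; only then would the dialog proceed to validation and structured query generation.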
Figure 13: Dialog Structure for handling BI query patterns.

Query Validation. Query validation is an additional step that we introduce to verify the semantic correctness of a user query that conforms to a particular BI pattern/operation and has been marked as complete by the query completeness check mechanism described above. The validation of the query is done using information captured in the ontology. For example, a user utterance/query might conform to the BI Analysis Query pattern (Figure 6) and contain the required entities, such as a measure, a dimension and a filter value, as per the dialog logic table (Table 2). The validation process traverses the ontology to verify that there are valid relationships between the measure, dimension and filter values specified by the user in the BI Analysis Query. If so, a structured query is generated against an external data source to respond to the user. If not, the user is informed of the incorrectness observed and asked to modify the query.

Support for BI Operation Patterns. As mentioned in Section 3.3.4, BI operation patterns capture typical BI operations that allow users to further investigate the results obtained from other BI patterns, such as a BI Analysis pattern. BI operations are supported using an incremental (or follow-up) request mechanism. The initial set of measures, dimensions and filter values specified in, say, a BI Analysis Query is captured in the current conversational context, and all further BI operations are executed on this set. BI operations such as Drill Down, Roll Up along a dimension hierarchy, or Pivot are supported incrementally by changing the appropriate values in the conversational context.

3.7 Structured Query Generation

In this section we briefly describe our mechanism for structured query generation against APIs exposed by an external data source (or analytical platform) such as Cognos⁵ [4], to provide appropriate responses to user queries, including charts and visualizations.

We use a simple template-based mechanism for structured query generation. For the Cognos analytics platform, a widget acts as a template which is populated using the information in the conversation context to form the actual structured query. Each BI Query pattern (or intent) is mapped to a specific widget template. Although our current implementation uses Cognos, our techniques and design are not specific to any particular external data source or analytics platform. The template-based query generation mechanism is
flexible and can be used to support any back-end analytics platform.

The widget templates allow the specification of the information required in terms of measures, dimensions, aggregations, filters, etc. as gathered from the conversational context. The choice of the actual format of the response or visualization (such as a bar chart, scatter plot, line chart, etc.) appropriate for the requested information is deferred to the analytics platform, which uses other internal recommendation tools and libraries to make the appropriate choice.

Other, more sophisticated deep learning based techniques such as Seq2Seq networks [24] could be employed in general for structured query generation, conditioned on the availability of enough training data for the appropriate analytics platform. We observe, however, that since the workload for BI applications is mostly characterized by the BI Query patterns, a template-based mechanism as described above is sufficient to address the requirements of structured query generation for the majority of practically observed workloads. We leave the detailed exploration of other deep learning based techniques and their effectiveness for supporting structured query generation for BI applications as future work.

4. USE CASE: HEALTH INSIGHTS

In this section we describe the building of a Conversational BI application using our ontology-driven approach for Health Insights, an IBM Watson Healthcare offering [5].

4.1 Health Insights Overview

The Health Insights (HI) product is an IBM Watson Healthcare offering that includes five different curated data sets of healthcare insurance data related to claims and transactions from a population covered by an insurer's healthcare
plans. The integrated data across the five data sets includes basic information about participants' drug prescriptions and admissions, service, key performance factors such as service categories, and data on individual patient episodes, each of which is a collection of claims that are part of the same incident to treat a patient. Finally, HI also includes the IBM MarketScan dataset [7], contributed by large employers, managed care organizations and hospitals. The dataset contains anonymized patient data including medical, drug and dental history, productivity including workplace absence, laboratory results, health risk assessments (HRAs), hospital discharges and electronic medical records (EMRs).

⁵ Cognos is a registered trademark of IBM.

HI Business Model and Ontology generation. The HI data across several different data stores is attached to the Cognos analytics platform using REST APIs. A business model was defined over this data that models the information in the underlying dataset in the form of measures, dimensions, their relationships and hierarchies. The business model defined a total of 64 measures, 274 dimensions and 576 distinct relationships between the different measures and dimensions. The business model was further enhanced using SME domain knowledge to group the underlying measures and dimensions into logical groups to create a hierarchical structure. The hierarchical tree structure for the measures grouped the 64 leaf level measures into 12 measures at the second level, and these 12 measures in turn were grouped into 3 top level measures. Similarly, the 274 leaf level dimensions were grouped into 8 second level dimension groups and 5 top level dimension groups. Figures 3 and 4 capture a snapshot of this grouping, where each higher level grouping of a measure or dimension is referred to as a meta-concept. We automatically generate an ontology from this business model in an OWL format, using the mechanism described in Section 3.1, thus providing an entity-centric view of the business model.

HI Conversation artifact generation. We derived the conversational artifacts for HI from the generated ontology, including a total of 7 intents, one corresponding to each BI Query Pattern, and about 20 intents to support conversation management. Automatically generated training examples (Section 3.3.5) for each of these intents were also included in the conversation space to train the intent classifier. Each identified measure, dimension and meta-concept was added as an entity. Instance values of the leaf level measures and dimensions crawled from the underlying data were also added as entities to the conversation space. SME knowledge was utilized to add synonyms for each of the populated entities for better recall and user experience.

HI Dialog Structure. Table 2 and Table 3 show versions of the dialog logic table that have been adapted specifically for the BI Query patterns. Table 2 illustrates an example of how three kinds of BI Query patterns can be represented: BI Analysis Query pattern, BI Trend pattern and BI Comparison pattern (column 1). One example is given of each intent (column 2), although in practice this would contain many variations for the same intent. A list of required entities that is shared across these intents is given (column 3), along with agent elicitations (column 4) for each required entity. Shared optional entities are also provided (column 5). Agent responses to each intent are provided (column 6). Meta-concepts are captured as optional entities (column 5), which are then used to trigger agent elicitations (column 4) for more specific, required entities (column 3).

Table 2: Dialogue Logic Table with BI Queries for HI.

| Intent Name | Intent Example | Required Entities | Agent Elicitation | Optional Entities | Agent Response |
|---|---|---|---|---|---|
| BI Analysis Query | Show me people admitted in 2017 | Measure(s), Dimension(s) | By which dimension? For which time period? | Filter value, Facilities | Here are the admits treat for 2017... |
| BI Trend Query | How does net pay cost vary over time? | Measures, Time | For which Measure? For which time period? | Geographies, facilities | Here is the net pay cost by incurred year... |
| BI Comparison Query | Show me admits vs discharges for 2017 | Measures, dimensions | For which Measures? By which dimension | Filter value | Here are admits vs discharges by hospital... |

Table 3: Dialogue Logic Table with BI Operations for HI.

| Intent Name | Intent Example | Required Entities | Agent Elicitation | Optional Entities | Agent Response |
|---|---|---|---|---|---|
| Drill Down Request | Show me by MDC | Dimensions, ranks, time | What do you want to add? | NA | Here are the admits by Medical diagnostic Code for 2017... |
| Roll Up Request | Remove MDC | Dimensions, ranks, time | What do you want to remove? | NA | Here are admits for 2017... |
| Pivot Request | How about by DRG? | Dimensions, ranks, time | What do you want to replace? | NA | Here are the admits by Diagnostic Related Group for 2017... |

Table 3 shows the specifications of additional dialog structures that go beyond the standard "slot-filling" structure, but still fall within the Open Request structure. It contains operations on a prior BI request and its associated context. For example, taking the Analysis Query from Table 2, the user can perform three BI operations (Table 3) on the agent response: Drill Down, Roll Up and Pivot. They enable the user to add, remove or replace dimensions in the current BI query, which is persisted in the conversational context from prior turns in the interaction. In this way, the user can articulate new BI queries incrementally.

Once all of the parts are specified in the Dialog Logic Tables above (Tables 2 and 3), a dialog structure of BI conversation patterns can be created easily.

Example: BI Natural Conversation

01 User: Show me people admitted in 2017?
02 Agent: By which dimension?
03 User: DRG
04 Agent: Here is Admits by Diagnosis Related Group for 2017:
05 ((chart #1 appears))
06 User: How about MDC?
07 Agent: Here is Admits by Medical Diagnostic Code for 2017:
08 ((chart #2 appears))
09 User: Show distribution by hospital
10 Agent: Here is Admits by Medical Diagnostic Code and by Facility for 2017:
11 ((chart #3 appears))
12 User: remove facility
13 Agent: Here is Admits by Medical Diagnostic Code for 2017:
14 ((chart #4 appears))
15 User: ok thanks
16 Agent: You're welcome!
17 User: Show me top 5 Admits
18 Agent: Here are the top 5 Admits by Medical Diagnostic Code for 2017:
19 ((chart #5 appears))
20 User: ok
21 User: How does my net pay cost vary over time?
22 Agent: Here is Net Payment Cost by Incurred Year:
23 ((chart #6 appears))
24 User: ok
25 User: Show me admits vs discharges by hospital for 2018?
26 Agent: Here are admits vs discharges by Facility for 2018:
27 ((chart #7 appears))
28 User: ok thank you very much
29 Agent: You're welcome!
30 ((pause))
31 Agent: Anything else?
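As a rough illustration of how the incremental BI operations in lines 06-14 of the conversation could act on the persisted context, the following Python sketch replays the pivot, drill down and roll up turns. It is an assumption-laden sketch rather than the system's actual logic: the context dictionary, the function names, and the choice that a pivot replaces the most recently added dimension are all hypothetical.

```python
def drill_down(ctx, new_dim):
    """Add a dimension to the current BI query (no-op if already present)."""
    dims = ctx["dimensions"]
    if new_dim not in dims:
        dims = dims + [new_dim]
    return {**ctx, "dimensions": dims}


def roll_up(ctx, dim):
    """Remove a dimension from the current BI query."""
    return {**ctx, "dimensions": [d for d in ctx["dimensions"] if d != dim]}


def pivot(ctx, new_dim):
    """Replace a dimension; assumed here to be the most recently added one."""
    return {**ctx, "dimensions": ctx["dimensions"][:-1] + [new_dim]}


def describe(ctx):
    """Render the agent response text from the conversational context."""
    dims = " and by ".join(ctx["dimensions"])
    return f"{ctx['measure']} by {dims} for {ctx['time']}"


# Context after the initial BI Analysis Query (lines 01-05).
ctx = {"measure": "Admits", "dimensions": ["Diagnosis Related Group"], "time": "2017"}

ctx = pivot(ctx, "Medical Diagnostic Code")   # line 06: "How about MDC?"
print(describe(ctx))   # Admits by Medical Diagnostic Code for 2017
ctx = drill_down(ctx, "Facility")             # line 09: "Show distribution by hospital"
print(describe(ctx))   # Admits by Medical Diagnostic Code and by Facility for 2017
ctx = roll_up(ctx, "Facility")                # line 12: "remove facility"
print(describe(ctx))   # Admits by Medical Diagnostic Code for 2017
```

Each operation returns a new context instead of mutating the old one, which keeps every turn's state available; a real dialog system might equally update the context in place.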
The example conversation above illustrates a conversational interaction using a Natural Conversation Interface (NCI) that has been adapted for BI analysis to support the HI application. In this example, there is a BI Analysis Query sequence (lines 01-05) followed by three BI operations: pivot (lines 06-08), drill down (lines 09-11) and roll up (lines 12-14). There is then an example of a BI Trend Query sequence (lines 21-23) and a BI Comparison Query sequence (lines 25-27). We also see the use of synonyms (hospital, facility) here. Around these BI-related conversational sequences are examples of generic conversation management sequences (lines 15-16, 20, 24 and 28-31). Although this example opens with the Analysis Query pattern, which follows the standard "slot-filling" pattern in dialog design, it proceeds
to demonstrate additional interaction patterns, incremental requests (or BI operations) and conversation management, which go beyond simple slot filling.

5. SYSTEM EVALUATION

Our conversational BI application, implemented in Health Insights (HI), was hosted in the IBM cloud and utilizes several other cloud services including IBM Watson Assistant for building the conversational space. We first describe the evaluation of our ontology-driven generative approach for creating conversational artifacts. More specifically, we evaluate the effectiveness of our proposed intent modeling techniques for BI applications in terms of their coverage and accuracy. Next, we describe a detailed user study that was conducted to ascertain the overall effectiveness of our proposed conversational BI system. Our prototype for the user study used an on-premise deployment of the Cognos analytics platform that was loaded with a subset of data from Health Insights. Finally, we summarize the section with some lessons learned from our experience of building the conversational BI application.

5.1 Intent Modeling evaluation

We evaluated our ontology-driven intent modelling approach based on (1) the coverage it provides for accessing a
statically defined set of dashboards for HI and (2) the accuracy with which the system can identify the correct intents from the user utterances, based on a user study (Section 5.2).

5.1.1 Intent modeling coverage evaluation

We evaluate the coverage of our ontology-based intent modeling approach in terms of the subset of statically defined dashboard visualizations for HI that can be accessed using our conversational BI interface. That said, our system is not limited to accessing the information from these statically defined visualizations. Our proposed conversational interface can support new queries and explorations that conform to one of the common BI patterns modelled as intents.

For the purposes of the evaluation, we defined, with the help of SMEs, a total of 150 different visualizations statically across 37 dashboards grouped under 4 different analysis themes. We assigned each of these statically defined visualizations to one of three complexity categories based on its information content: (1) Simple visualizations that require a single query to be issued by the analysis platform against a database. (2) Complex visualizations that require multiple queries to be issued against the database to create the visualization. (3) Visualizations that require domain-specific inference and expertise of SMEs to construct the query, such as Savings from a 25% reduction in potentially avoidable ER visits. Such a query would require domain expertise to classify which visits were potentially avoidable, and to decide what underlying measures would be used to calculate the potential savings. Figure 14 shows the distribution of the 150 statically defined visualizations by the different analysis themes and their complexity category.
Figure 14: Visualization Distribution By Analysis Theme (Best viewed in color).

Figure 15 provides Conversational BI coverage by analysis theme. This coverage is computed in terms of the number of visualizations that can be accessed through the conversational interface with user interaction across one or more turns of conversation (or iterations). Information for visualizations that have been statically defined using multiple queries (category 2 complexity) can be accessed over multiple turns of the conversation, one for each query, as long as the query falls under one of the identified BI patterns used to model the intents. Visualizations that require domain-specific inference or expertise from SMEs (complexity category 3) are mostly not covered by the current implementation of our system. The focus of our current work is on supporting the typical BI patterns, which cover the vast majority of the workload for BI applications. Out of a total of 150 statically defined visualizations, our conversational BI system covers 125 (83.34%), and the remaining 16.66% are visualizations that require inference. We leave further exploration of visualizations that require domain inference and customized queries, and of how to generate them with the help of SMEs, as future work.

Figure 15: Conversational BI coverage by analysis theme for statically defined visualizations (Best viewed in color).

5.2 User study

We conducted a detailed user study on the pre-release version of IBM's Health Insights product with real clients, to evaluate the overall user experience and to assess the usage frequency of different BI query patterns and the accuracy with which our system's intent classifier identifies these patterns as intents. The user study also provided us with valuable feedback, which we capture as lessons learned (Section 5.3).

We conducted the user study over several sessions, where the focus of each data exploration session was limited to a subset of the ontology relevant to different aspects of the information supported by the HI product, such as Admissions, Enrollment, etc. Within each such session, we focused on identifying the relevant subset of data to visualize using appropriate filters, such as filtering the data set for specific drugs in the therapeutic class for diabetes.

Table 4 shows the usage frequency of each intent and its F1-score. As shown in the table, in our initial user study we have high F1-scores for most patterns, except the BI Operations pattern. We traced the cause to the automatically generated training examples: they did not cover the different ways in which actual users expressed BI operation queries. Learning from this experience, we introduced a number of variations of initial phrases into our automatically generated training examples for this intent to help improve its classification accuracy and recall.

A large number of the users in our study specialized in the healthcare insurance domain, and were not familiar with writing structured queries against Cognos or other business intelligence tools/platforms. Through the conversational interface, participants were able to intuitively access a series of charts/visualizations without specific knowledge of Cognos and without writing structured queries. Our conversation system was able to guide users through clarifying prompts to collect the information necessary to create a chart/visualization.

Table 4: BI Query Pattern Detection Effectiveness.

| BI Query Pattern | Usage Frequency | F1-Score |
|---|---|---|
| BI Analysis Query | 32% | 0.97 |
| BI Comparison Query | 12% | 0.98 |
| BI Trend Query | 21% | 0.93 |
| BI Ranking Query | 18% | 0.98 |
| BI Operation | 17% | 0.85 |
5.3 Lessons learned

We learned several valuable lessons through our experience of building our system and evaluating it through a user study.

First, we received very positive feedback from the users in terms of ease of use and the ability to query the system using natural language without knowledge of the schema or of a programming/querying language.

Second, our bootstrapping mechanism is very effective in creating a rich and effective conversational workspace for BI applications. Our BI patterns cover 83.34% of statically defined visualizations and in addition enable users to access visualizations that have not been pre-defined.

Third, we realized that although the ontology-based automation of building a conversational BI system accelerates the process of building a prototype, extensive testing in the real world helps improve the system through feedback. More specifically, the feedback led us to improve the domain vocabulary of the system for better recall by adding, as synonyms, new variations of terms that users actually use to refer to specific entities. Similarly, we added variations of start phrases to the training examples of several intents to improve their classification accuracy. In particular, we realized that users are not accustomed to expressing BI operations such as roll-up, drill down and pivot in natural language, and use a wide variety of phrasings to express them.

Finally, another important lesson learned for better user experience was that users preferred the system not to ask too many clarifying questions, and instead to use defaults for missing information, which we have incorporated (Section 3.5.1).

6. RELATED WORK

We cover relevant related work in this section under the three categories described below.

Natural language support in existing BI tools. Several existing business intelligence tools, such as Ask Data Tableau [2], Power BI [8] by Microsoft, Microstrategy [6], and IBM's Cognos Assistant [3], support a natural language interface. However, these systems are restricted in terms of the conversational interaction they provide. A majority of these systems rely heavily on the user to drive the conversation. More specifically, they leave the onus on the user to select from a large number of options and parameters through user interfaces to get to an appropriate visualization, without much system support. Our system, on the other hand, uses information in the ontology to guide the user through meaningful conversational interactions and elicits further information to access appropriate visualizations. Further, unlike these systems, our ontology-driven approach provides a formal mechanism for defining a semantically rich entity-centric view of the business model, capturing the actual measures and dimensions as well as their higher level groupings, to support more complex queries catering to the querying needs of a wider range of personas. Finally, our novel automated workflow for constructing a conversational BI system enables rapid prototyping and building of conversational BI systems for different domains.

Current conversational systems. Existing conversational systems can be classified into three different categories [15] based on the kind of natural language interaction they support. The first category is one-shot question answering systems. The second is general purpose chatbots such as Microsoft Cortana [19], Apple Siri [10], Amazon Alexa [9], etc. that can converse on a range of different topics such as weather, music and news, or can be used to accomplish general tasks such as controlling devices and timers, and are agnostic to any particular domain. The third category is task-oriented agents that target tasks in specific domains such as travel, finance and healthcare, and are limited in scope to specific tasks such as booking a flight, finding an account balance, etc. These task-oriented chatbots, however, fail to address the challenges involved in data exploration and derivation of meaningful insights, especially for business applications. We propose an ontology-based approach for building conversational systems for supporting BI applications through natural language interfaces.

Approaches for dialogue management. Recent advances in machine learning, particularly in neural networks, have allowed for complex dialogue management methods and conversation flexibility for conversational interfaces. The approaches that are commonly used in building the dialogue structure for a conversational interface are: (1) Rule-based approaches [18, 17], used in finite-state dialogue management systems, are simple to construct for tasks that are straightforward and well-structured, but have the disadvantage of restricting user input to predetermined words and phrases. (2) Frame-based systems [14, 11, 16] address some of the limitations of finite-state dialogue management by enabling a more flexible dialogue. Frame-based systems enable the user to provide more information as required by the system, while keeping track of what information is required and asking questions accordingly. (3) Agent-based systems [12, 25, 23, 20]. Agent-based methods for dialogue management are typically statistical models and need to be trained on corpora of prior user interactions for better adaptation. We found frame-based systems the most suitable to adapt for building conversational BI systems that support the commonly observed BI query patterns.

7. CONCLUSIONS

In this paper, we describe an end-to-end ontology-driven approach for building a conversational interface to explore and derive business insights for a wide range of personas, ranging from business analysts to data scientists to top level executives and owners of data. We capture the domain semantics in an ontology created from the business model, and exploit the patterns in typical BI workloads to provide a more dynamic and intuitive conversational interaction to derive BI insights from the underlying data in different domains. Using the ontology, we provide an automated workflow to bootstrap the conversation space artifacts, including intents, entities, and training examples, while allowing the incorporation of user feedback and SME inputs. We implemented our techniques in Health Insights (HI), and provided lessons learned, as well as a detailed evaluation.

8. REFERENCES

[1] OWL 2 web ontology language document overview. https://www.w3.org/TR/owl2-overview/.
[2] Ask Data | Tableau Software. https://www.tableau.com/products/new-features/ask-data, March 2020.
[3] Cognos Assistant. https://tinyurl.com/u3sdaxa, March 2020.
[4] IBM Cognos Analytics. https://www.ibm.com/products/cognos-analytics, March 2020.
[5] IBM Health Insights. https://www.ibm.com/us-en/marketplace/health-insights, March 2020.
[6] Kb442148: Natural language query in a nutshell in microstrategy web. https://community.microstrategy.com/s/article/Natural-Language-Query-in-A-Nutshell-MicroStrategy-11-0?language=enUS, March 2020.
[7] Marketscan. https://www.ibm.com/products/marketscan-research-databases, March 2020.
[8] Power BI | Microsoft Power Platform. https://powerbi.microsoft.com/en-us/, March 2020.
[9] Amazon [US]. Amazon alexa. https://developer.amazon.com/alexa, 2018.
[10] Apple [US]. Siri. https://www.apple.com/ios/siri/, 2018.
[11] M. Beveridge and J. Fox. Automatic generation of spoken dialogue from medical plans and ontologies. J. of Biomedical Informatics, 39(5):482-499, 2006.
[12] Bing-Hwang Juang and S. Furui. Automatic recognition and understanding of spoken language - a first step toward natural human-machine communication. Proceedings of the IEEE, 88(8):1142-1165, 2000.
[13] S. Chaudhuri and U. Dayal. An overview of data warehousing and olap technology. SIGMOD Rec., 26:65-74, 1997.
[14] K. K. Fitzpatrick, A. Darcy, and M. Vierhile. Delivering cognitive behavior therapy to young adults with symptoms of depression and anxiety using a fully automated conversational agent (woebot): A randomized controlled trial. JMIR Ment Health, 4(2):e19, 2017.
[15] J. Gao, M. Galley, and L. Li. Neural approaches to conversational AI. CoRR, abs/1809.08267, 2018.
[16] T. Giorgino, I. Azzini, C. Rognoni, S. Quaglini, M. Stefanelli, R. Gretter, and D. Falavigna. Automated spoken dialogue system for hypertensive patient home management. International Journal of Medical Informatics, 74(2):159-167, 2005.
[17] S. Mallios and N. G. Bourbakis. A survey on human machine dialogue systems. In IISA, pages 1-7, 2016.
[18] M. F. McTear. Spoken dialogue technology: Enabling the conversational user interface. ACM Comput. Surv., 34(1):90-169, 2002.
[19] Microsoft [US]. Microsoft cortana. https://www.microsoft.com/en-us/windows/cortana, 2018.
[20] A. S. Miner, A. Milstein, S. Schueller, et al. Smartphone-Based Conversational Agents and Responses to Questions About Mental Health, Interpersonal Violence, and Physical Health. JAMA Internal Medicine, 176(5):619-625, 2016.
[21] R. J. Moore and R. Arar. Conversational UX Design: A Practitioner's Guide to the Natural Conversation Framework. ACM, New York, NY, USA, 2019.
[22] A. Quamar, C. Lei, D. Miller, F. Ozcan, J. Kreulen, R. J. Moore, and V. Efthymiou. An ontology-based conversation system for knowledge bases. In SIGMOD, 2020.
[23] N. M. Radziwill and M. C. Benton. Evaluating quality of chatbots and intelligent conversational agents. CoRR, abs/1704.04579, 2017.
[24] I. Sutskever, O. Vinyals, and Q. V. Le. Sequence to sequence learning with neural networks. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 27, pages 3104-3112. Curran Associates, Inc., 2014.
[25] S. J. Young, M. Gasic, B. Thomson, and J. D. Williams. Pomdp-based statistical spoken dialog systems: A review. Proceedings of the IEEE, 101(5):1160-1179, 2013.

Conversational BI: An Ontology Driven Conversation System for Business Intelligence Applications

Abdul Quamar¹, Fatma Özcan¹, Dorian Miller², Robert J Moore¹, Rebecca Niehus², Jeffrey Kreulen²
¹IBM Research AI, ²IBM Watson Health
¹ahquamar|fozcan|rjmoore@us.ibm.com, ²millerbd|rniehus|kreulen@us.ibm.com

ABSTRACT

Business intelligence (BI) applications play an important role in the enterprise to make critical business decisions. Conversational interfaces enable non-technical enterprise users to explore their data, democratizing access to data significantly. In this paper, we describe an ontology-based framework for creating a conversation system for BI applications, termed Conversational BI. We create an ontology from a business model underlying the BI application, and use this ontology to automatically generate various artifacts of the conversation system. These include the intents and entities, as well as the training samples for each intent. Our approach builds upon our earlier work, and exploits common BI access patterns to generate intents and their training examples, and to adapt the dialog structure to support typical BI operations. We have implemented our techniques in Health Insights (HI), an IBM Watson Healthcare offering, providing analysis over insurance data on claims. Our user study demonstrates that our system is quite intuitive for gaining business insights from data. We also show that our approach not only captures the analysis available in the fixed application dashboards, but also enables new queries and explorations.

PVLDB Reference Format:
Abdul Quamar, Fatma Ozcan, Dorian Miller, Robert J Moore, Rebecca Niehus and Jeffrey Kreulen. An Ontology-Based Conversation System for Knowledge Bases. PVLDB, 13(12): 3369-3381, 2020.
DOI: https://doi.org/10.14778/3415478.3415557

1. INTRODUCTION

Business Intelligence (BI) tools and applications play a key role in the enterprise to derive business decisions. BI dashboards provide a mechanism for line of business owners and executives to explore key performance metrics (KPIs) via visual interfaces. These dashboards are usually created by technical people. In fact, there are many technical people involved in the pipeline from the data to the dashboards, including database designers, DBAs, business analysts, etc. Figure 1 shows a typical architecture of a BI stack. The underlying data resides in a traditional RDBMS, and a business model is created in terms of an OLAP cube definition [13] that describes the underlying data in terms of Measures (numeric or quantifiable values), Dimensions (categorical or qualifying attributes), and the hierarchies and relationships between them. Then, business analysts create the BI reports and dashboards using the BI model (cube definition)¹. The reports and the dashboards are supported by structured queries that run against the underlying database to render the visualizations to the user.

Figure 1: Traditional BI System Architecture

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/. For any use beyond those covered by this license, obtain permission by emailing info@vldb.org. Copyright is held by the owner/author(s). Publication rights licensed to the VLDB Endowment. Proceedings of the VLDB Endowment, Vol. 13, No. 12 ISSN 2150-8097. DOI: https://doi.org/10.14778/3415478.3415557

To obtain answers to questions that are not contained in the existing dashboard visualizations, users need to enlist the help of technical people, and the turnaround for such cycles can be prohibitively time-consuming and expensive, delaying key business insights and decisions. Today's enterprises need faster access to their KPIs and faster decision making.

Conversational interfaces enable a wide range of personas, including non-technical line of business owners and executives, to explore their data, investigate various KPIs, and derive valuable business insights without relying on external technical expertise to create a dashboard for them. As such, conversational interfaces democratize access to data significantly, and also allow dynamic and more intuitive explorations of data and derivation of valuable business insights.

Today's chatbot and voice assistant platforms (e.g., Google Dialogflow, Facebook Wit.ai, Microsoft Bot Framework, IBM Watson Assistant, etc.) allow users to interact through natural language using speech or text. Using these platforms, developers can create many kinds of natural language in-

¹ In this paper, we use the terms cube definition and business model interchangeably