Will my Question be Answered? Predicting Question Answerability in Community Question-Answering Sites

Gideon Dror, Yoelle Maarek and Idan Szpektor
Yahoo! Labs, MATAM, Haifa 31905, Israel
{gideondr,yoelle,idan}@yahoo-inc.com

Abstract. All askers who post questions in Community-based Question Answering (CQA) sites such as Yahoo! Answers, Quora or Baidu's Zhidao, expect to receive an answer, and are frustrated when their questions remain unanswered. We propose to provide a type of "heads up" to askers by predicting how many answers, if any, they will get. Giving a preemptive warning to the asker at posting time should reduce the frustration effect and hopefully allow askers to rephrase their questions if needed. To the best of our knowledge, this is the first attempt to predict the actual number of answers, in addition to predicting whether the question will be answered or not. To this effect, we introduce a new prediction model, specifically tailored to hierarchically structured CQA sites. We conducted extensive experiments on a large corpus comprising 1 year of answering activity on Yahoo! Answers, as opposed to a single day in previous studies. These experiments show that the F1 we achieved is 24% better than in previous work, mostly due to the structure built into the novel model.

1 Introduction

In spite of the huge progress of Web search engines in the last 20 years, many users' needs still remain unanswered. Query assistance tools such as query completion and related queries cannot, as of today, deal with complex, heterogeneous needs. In addition, there will always exist subjective and narrow needs for which content has little chance to have been authored prior to the query being issued.

Community-based Question Answering (CQA) sites, such as Yahoo! Answers, Quora, Stack Overflow or Baidu Zhidao, have been precisely devised to answer these different needs. These services differ from the extensively investigated Factoid Question Answering that focuses on questions such as "When was Mozart born?", for which unambiguous answers typically exist [1]. Though CQA sites also feature factoid questions, they typically address other needs, such as opinion seeking, recommendations, open-ended questions or very specific needs, e.g. "What type of bird should I get?" or "What would you choose as your last meal?".

Questions not only reflect diverse needs but can be expressed in very different styles, yet all askers expect to receive answers, and are disappointed otherwise. Unanswered questions are not a rare phenomenon, reaching 13% of the questions in the Yahoo! Answers dataset that we studied, as detailed later, and users whose questions remain unanswered are considerably more prone to churning from the CQA service [2]. One way to reduce this frustration is to proactively recommend questions to potential answerers [3-6]. However, the asker has little or no influence on the answerers' behavior. Indeed, a question by itself may exhibit some characteristics that reduce its potential for answerability. Examples include a poor or ambiguous choice of words, a given type of underlying sentiment, the time of the day when the question was posted, as well as sheer semantic reasons if the question refers to a complex or rare need.

In this work, we focus on the askers, investigate a variety of features they can control, and attempt to predict, based on these features, the expected number of answers a new question might receive, even before it is posted. With such predictions, askers can be warned in advance, and adjust their expectations, if their questions have little chance of being answered. This work represents a first step towards the more ambitious goal of assisting askers in posting answerable questions, by not only indicating the expected number of answers but also suggesting adequate rephrasing. Furthermore, we can imagine additional usages of our prediction mechanism, depending on the site priorities. For instance, a CQA site such as Yahoo! Answers that attempts to satisfy all users might decide to promote questions with few predicted answers in order to achieve a higher answering rate. Alternatively, a socially oriented site like Quora might prefer to promote questions with many predicted answers in order to encourage social interaction between answerers.

We cast the problem of predicting the number of answers as a regression task, while the special case of predicting whether a question will receive any answer at all is viewed as a classification task. We focus on Yahoo! Answers, one of the most visited CQA sites, with 30 million questions and answers a month and 2.4 asked questions per second [7]. For each question in Yahoo! Answers, we generate a set of features that are extracted only from the question attributes and are available before question submission. These features capture the asker's attributes, the textual content of the question, the category to which the question is assigned and the time of submission.

In spite of this rich feature set, off-the-shelf regression and classification models do not provide adequate predictions in our tasks. Therefore, we introduce a series of models that better address the unique attributes of our dataset. Our main contributions are threefold:
1. we introduce a novel task of predicting the number of expected answers for a question before it is posted,
2. we devise hierarchical learning models that consider the category-driven structure of Yahoo! Answers and reflect their associated heterogeneous communities, each with its own answering behavior, and finally,
3. we conduct the largest experiment to date on answerability, as we study a year-long question and answer activity on Yahoo! Answers, as opposed to a day-long dataset in previous work.

2 Background

With millions of active users, Yahoo! Answers hosts a very large amount of questions and answers on a wide variety of topics and in many languages. The system is content-centric, as users are socially interacting by engaging in multiple activities around a specific question. When a user asks a new question, she also assigns it to a specific category, within a predefined hierarchy of categories, which should best match the general topic of the question. For example, the question "What can I do to fix my bumper?" was assigned to the category 'Cars & Transportation > Maintenance & Repairs'. Each new question remains open for four days (with an option for extension), or less if the asker chose a best answer within this period. Registered users may answer a question as long as it remains open.

One of the main issues in Yahoo! Answers, and in community-based question answering in general, is the high variance in perceived question and answer quality. This problem drew a lot of research in recent years. Some studies attempted to assess the quality of answers [8-11], or questions [12,13], and rank them accordingly. Others looked at active users for various tasks such as scoring their reliability as a signal for high quality answers or votes [14-16], identifying spammers [17], predicting whether the asker of a question will be satisfied with the received answers [18,19] or matching questions to specific users [3-5].

Our research belongs to the same general school of work but focuses on estimating the number of answers a question will receive. Prior work that analyzes questions did it in retrospect, either after the questions had been answered [9], or as a ranking task for a given collection of questions [12,13]. In contrast, we aim at predicting the number of answers for every new question before it is submitted.

In a related work, Richardson and White [20] studied whether a question will receive an answer or not. Yet, they conducted their study in a different environment, an IM-based synchronous system, in which potential answerers are known. Given this environment they could leverage features pertaining to the potential answerers, such as reputation. In addition, they considered the specific style of messages sent over IM, including whether a newline was entered and whether some polite words were added. Their experiment was of a small scale, on 1,725 questions, for which they showed improvement over the majority baseline. We note that their dataset is less skewed than in Yahoo! Answers. Indeed their dataset counted about 42% of unanswered questions, while Yahoo! Answers datasets typically count about 13% of unanswered questions. We will later discuss the challenges involved in dealing with such a skewed dataset.

A more related prior work that investigated question answerability is Yang et al. [21], who addressed the same task of coarse (yes/no) answerability as above but in the same settings as ours, namely Yahoo! Answers. Yang et al. approached the task as a classification problem with various features ranging from content analysis, such as category matching, polite words and hidden topics, to asker reputation and time of day. They used a one-day dataset of Yahoo! Answers questions and observed the same ratio of unanswered questions as we did in our one-year dataset, namely 13%. Failing to construct a classifier for this heavily skewed dataset, Yang et al. resorted to learning from an artificially balanced training set, which resulted in improvements over the majority baseline. In this paper, we also address this classification task, with the same type of skewed dataset. However, unlike Yang et al., we attempt to improve over the majority baseline without artificially balancing the dataset.

Finally, another major differentiator with the above previous work is that we do not stop at simply predicting whether a question will be answered or not, but predict the exact number of answers the question would receive.

3 Predicting Question Answerability

One key requirement of our work, as well as a differentiator with typical prior work on question analysis, is that we want to predict answerability before the question is posted. This imposes constraints on the type of data and signals we can leverage. Namely, we can only use data that is intrinsic to a new question before submission. In the case of Yahoo! Answers, this includes: (a) the title and the body of the question, (b) the category to which the question is assigned, (c) the identity of the user who asked the question and (d) the date and time the question is being posted.

We view the prediction of the expected number of answers as a regression problem, in which a target function (a.k.a. the model) $\hat{y} = f(x)$ is learned, with $x$ being a vector-space representation of a given question, and $\hat{y} \in \mathbb{R}$ an estimate for $y$, the number of answers this question will actually receive. All the different models we present in this section are learned from a training set of example questions and their known number of answers, $D = \{(x_i, y_i)\}$. The prediction task of whether a question will receive an answer at all is addressed as a classification task. It is similarly modeled by a target function $\hat{y} = f(x)$ and the same vector space representation of a question, yet the training target is binary, with answered (unanswered) questions being the positive (negative) examples.

To fully present our models for the two tasks, we next specify how a question representation $x$ is generated, and then introduce for each task novel models (e.g. $f(x)$) that address the unique properties of the dataset.

3.1 Question Features

In our approach, each question is represented by a feature vector. For any new question, we extract various attributes that belong to three main types of information: question metadata, question content, and user data. In the rest of this paper we use the term feature family to denote a single attribute extracted from the data. Question attributes may be numerical, categorical or set-valued (e.g. the set of word tokens in the title). Hence, in order to allow learning by gradient-based methods, we transformed all categorical attributes to binary features, and binned most of the numeric attributes. For example, the category of a question is represented as 1,287 binary features and the hour it was posted is represented as 24 binary features. Tables 1, 2 and 3 describe the different feature families we extract, grouped according to their information source: the question text, the asker and question metadata.

Table 1. Features extracted from title and body texts

| Feature Family | Description | # Features |
| Title tokens | The tokens extracted from the title, not including stop words | 45,011 |
| Body tokens | The tokens extracted from the body, not including stop words | 45,508 |
| Title sentiment | The positive and negative sentiment scores of the title, calculated by the SentiStrength tool [22] | 2 |
| Body sentiment | The mean positive and negative sentiment scores of the sentences in the body | 2 |
| Supervised LDA | The number of answers estimated by supervised Latent Dirichlet Allocation (SLDA) [23], which was trained over a small subset of the training set | 1 |
| Title WH | WH-words (what, when, where ...) extracted from the question's title | 11 |
| Body WH | WH-words extracted from the question's body | 11 |
| Title length | The title length measured by the number of tokens after stopword removal, binned on a linear scale | 10 |
| Body length | The body length, binned on an exponential scale since this length is not constrained | 20 |
| Title URL | The number of URLs that appear within the question title | 1 |
| Body URL | The number of URLs that appear within the question body | 1 |

Table 2. Features extracted based on the asker

| Feature Family | Description | # Features |
| Asker ID | The identity of the asker, if the asker asked at least 50 questions in the training set. We ignore askers who asked fewer questions since their ID statistics are unreliable | 175,714 |
| Mean # of answers | The past mean number of answers the asker received for her questions, binned on an exponential scale | 26 |
| # of questions | The past number of questions asked by the asker, binned on a linear scale and on an exponential scale | 26 |
| Log # of questions | The logarithm of the total number of questions posted by the asker in the training set, and the square of the logarithm. For both features we add 1 to the argument of the logarithm to handle test users with no training questions | 2 |

Table 3. Features extracted from the question's metadata

| Feature Family | Description | # Features |
| Category | The ID of the category that the question is assigned to | 1,287 |
| Parent Category | The ID of the parent category of the assigned category for the question, based on the category taxonomy | 119 |
| Hour | The hour at which the question was posted, capturing daily patterns | 24 |
| Day of week | The day of week on which the question was posted, capturing weekly patterns | 7 |
| Week of year | The week in the year in which the question was posted, capturing yearly patterns | 51 |

3.2 Regression Models

Following the description of the features extracted from each question, we now introduce different models (by order of complexity) that use the question feature vector in order to predict the number of answers. We remind the reader that our training set consists of pairs $D = \{(x_i, y_i)\}$, where $x_i \in \mathbb{R}^F$ is the $F$-dimensional feature vector representation of question $q_i$, and $y_i \in \{0, 1, 2, \ldots\}$ is the known number of answers for $q_i$.

Baseline Model. Yang et al. [21] compare the performance of several classifiers, linear and non-linear, on a similar dataset. They report that a linear SVM significantly outperforms all other classifiers. Given these findings, as well as the fact that a linear model is both robust [24] and can be trained very efficiently for large scale problems, we chose a linear model $f(x_i) = w^T x_i + b$ as our baseline model.

Feature Augmentation Model. One of the unique characteristics of the Yahoo! Answers site is that it consists of questions belonging to a variety of categories, each with its community of askers and answerers, temporal activity patterns, jargon etc., and that the categories are organized in a topical taxonomy. This structure, which is inherent to the data, suggests that more complex models might be useful in modeling the data. One effective way of incorporating the category structure of the data in a regression model is to enrich the features with category information. Specifically, we borrowed the idea from [25], which originally utilized such information for domain adaptation.

To formally describe this model, we consider the Yahoo! Answers category taxonomy as a rooted tree $T$ with "All Categories" as its root. When referring to the category tree we will use interchangeably the terms node and category. We denote the category of a question $q_i$ by $C(q_i)$. We further denote by $P(c)$ the set of all nodes on the path from the tree root to node $c$ (including $c$ and the root). For notational purposes, we use a binary representation for $P(c)$: $P(c) \in \{0, 1\}^{|T|}$, where $|T|$ is the number of nodes in the category tree.

The feature augmentation model represents each question $q_i$ by $\hat{x}_i \in \mathbb{R}^{F|T|}$, where $\hat{x}_i = P(C(q_i)) \otimes x_i$ and $\otimes$ represents the Kronecker product. For example, given question $q_i$ that is assigned to category 'Dogs', the respective node path in $T$ is 'All Questions/Pets/Dogs'. The feature vector $\hat{x}_i$ for $q_i$ is all zeros except for three copies of $x_i$ corresponding to each of the nodes 'All Questions', 'Pets' and 'Dogs'.

The rationale behind this representation is to allow a separate set of features for each category, thereby learning category specific patterns. These include learning patterns for leaf categories, but also learning lower resolution patterns for intermediate nodes in the tree, which correspond to parent and top categories in Yahoo! Answers. This permits a good tradeoff between high resolution modeling and robustness, obtained by the higher level category components shared by many examples.

Fig. 1. An illustration of the subtree model structure: (a) tree of categories, (b) tree of models. Shaded nodes in (a) represent categories populated with questions, while unshaded nodes are purely navigational.

Subtree Model. An alternative to the feature augmentation model is to train several linear models, each specializing on a different subtree of $T$. Let us denote a subset of the dataset as $D_c = \{(x_i, y_i) \mid q_i \in S(c)\}$, where $S(c)$ is the set of categories in the category subtree rooted at node $c$. We also denote a model trained on $D_c$ as $f_c$. Since there is a one-to-one correspondence between models and nodes in $T$, the set of models $f_c$ can be organized as a tree isomorphic to $T$. Models in deeper levels of the tree are specialized on fewer categories than models closer to the root. Figure 1 illustrates a category tree and its corresponding model tree structure. The shaded nodes in 1(a) represent categories to which some training questions are assigned.

One simplistic way of using the model tree structure is to apply the root model ($f_A$ in Figure 1(b)) to all test questions. Note that this is identical to the baseline model. Yet, there are many other ways to model the data using the model tree. Specifically, any set of nodes that also acts as a tree cut defines a regression model, in which the number of answers for a given question $q_i$ is predicted by the first model in the set encountered when traversing from the category $c_i$, assigned to $q_i$, to the root of $T$. In this work, we shall limit ourselves to three such cuts:
TOP: $q_i$ is predicted by model $f_{Top(c_i)}$, where $Top(c)$ is the category in $P(c)$ directly connected to the root.
PARENT: $q_i$ is predicted by model $f_{Parent(c_i)}$.
NODE: $q_i$ is predicted by model $f_{c_i}$.
In Figure 1 the TOP model refers to $\{f_B, f_C\}$, the PARENT model refers to $\{f_B, f_C, f_F, f_H\}$ and the NODE model refers to $\{f_D, f_E, \ldots, f_M\}$.

Ensemble of Subtree Models. In order to further exploit the structure of the category taxonomy in Yahoo! Answers, the questions in each category $c$ are addressed by all models in the path between this category and the tree root, under the subtree framework described above. Each model in this path introduces a different balance between robustness and specificity. For example, the root model is the most robust, but also the least specific in terms of the idiomatic attributes of the target category $c$. At the other end of the spectrum, $f_c$ is specifically trained for $c$, but it is more prone to overfitting the data, especially for categories with few training examples.

Instead of picking just one model on the path from $c$ to the root, the ensemble model for $c$ learns to combine all subtree models by training a meta linear model:

$$ f(x_i) = \sum_{c' \in P(c)} \alpha_{cc'} f_{c'}(x_i) + b_c \qquad (1) $$

where $f_{c'}$ are the subtree models described previously and the weights $\alpha_{cc'}$ and $b_c$ are optimized over a validation set. For example, the ensemble model for questions assigned to category E in Figure 1(a) is a linear combination of models $f_A$, $f_B$ and $f_E$, which are trained on training sets $D$, $D_B$ and $D_E$ respectively.

3.3 Classification Models

The task of predicting whether a question will be answered or not is an important special case of the regression task. In this classification task, we treat questions that were not answered as negative examples and questions that were answered as the positive examples. We emphasize that our dataset is skewed, with the negative class constituting only 12.68% of the dataset. Furthermore, as already noted in [26], the distribution of the number of answers per question is very skewed, with a long tail of questions having a high number of answers.

As described in the background section, this task was studied by Yang et al. [21], who failed to provide a solution for the unbalanced dataset. Instead, they artificially balanced the classes in their training set by sampling, which may reduce the performance of the classifier on the still skewed test set. Unlike Yang et al., who used off-the-shelf classifiers for the task, we devised classifiers that specifically address the class imbalance attribute of the data. We noticed that a question that received one or two answers could have easily gone unanswered, while this is unlikely for questions with a dozen answers or more. When projecting the number of answers $y_i$ into two values, this difference between positive examples is lost and may produce inferior models. The following models attempt to deal with this issue.

Baseline Model. Yang et al. [21] found that linear SVM provides superior performance on this task. Accordingly, we choose as baseline a linear model, $f(x_i) = w^T x_i + b$, trained with hinge loss. We train the model on the binarized dataset $D_0 = \{(x_i, \mathrm{sign}(y_i - 1/2))\}$ (see our experiment for more details).

Feature Augmentation Model. In this model, we train the same baseline classifier presented above. Yet the feature vector fed into the model is the augmented feature representation introduced for the regression models.

Ensemble Model. In order to capture the intuition that not all positive examples are equal, we use an idea closely related to works based on Error Correcting Output Coding for multi-class classification [27]. Specifically, we construct a series of binary classification datasets $D_t = \{(x_i, z_i^t)\}$ where $z_i^t = \mathrm{sign}(y_i - 1/2 - t)$ and $t = 0, 1, 2, \ldots$. In this series, $D_0$ is a dataset where questions with one or more answers are considered
positive, while in $D_{10}$ only examples with more than 10 answers are considered positive. We note that these datasets have varying degrees of imbalance between the positive and the negative classes. Denoting by $f_t$ the classifier trained on $D_t$, we construct the final ensemble classifier by using a logistic regression

$$ f(x) = \sigma\Big(\sum_t \alpha_t f_t(x) + b\Big) \qquad (2) $$

where $\sigma(u) = (1 + e^{-u})^{-1}$ and the coefficients $\alpha_t$ and $b$ are learned by minimizing the log-likelihood loss on the validation set.

Ensemble of Feature Augmentation Models. In this model, we train the same ensemble classifier presented above. Yet the feature vector fed into the model is the augmented feature representation introduced for the regression models.

Classification Ensemble of Subtree Models. As our last classification model, we directly utilize regression predictions to differentiate between positive examples. We use the same model tree structure used in the regression by ensemble of subtree models. All models are linear regression models trained exactly as in the regression problem, in order to predict the number of answers for each question. The final ensemble model is a logistic regression function of the outputs of the individual regression models:

$$ f(x_i) = \sigma\Big(\sum_{c' \in P(c)} \alpha_{cc'} f_{c'}(x_i) + b_c\Big) \qquad (3) $$

where $f_{c'}$ are the subtree regression models and the weights $\alpha_{cc'}$ and $b_c$ are trained using the validation set.

4 Experiments

We describe here the experiments we conducted to test our regression and classification models, starting with our experimental setup, then presenting our results and analyses.

4.1 Experimental Setup

Our dataset consists of a uniform sample of 10 million questions out of all non-spam English questions submitted to Yahoo! Answers in 2009. The questions in this dataset were asked by more than 3 million different users and were assigned to 1,287 categories out of the 1,569 categories. A significant fraction of the sampled questions (12.67%) remained unanswered. The average number of answers per question is 4.56 ($\sigma = 6.11$). The distribution of the number of answers follows approximately a geometric distribution. The distributions of questions among users and among categories are extremely skewed, with a long tail of users who posted one or two questions and sparsely populated categories. These distributions are depicted in Figure 2, showing a power law behavior for the questions per asker distribution. A large fraction of the categories have very few questions; for example, about half of all categories in our dataset count fewer than 50 examples.

Fig. 2. Distribution of number of questions depicted as a function of ranks: (a) questions per user, (b) questions per category.

We randomly divided our dataset into three sets: 80% training, 15% test and 5% validation (for hyper-parameter tuning). Very few questions in the dataset attracted hundreds of answers. To eliminate the ill effect of these questions on model training, we capped the number of answers per question at 64. This resulted in changing the target of about 0.03% of the questions.

Due to its speed, robustness and scalability, we used the Vowpal Wabbit tool (http://hunch.net/vw/) whenever possible. All regression models were trained with squared loss, except for the ensemble of subtree models, Eq. 1, whose coefficients were learned by a least squares fit. All classification models were trained using Vowpal Wabbit with hinge loss, except for the ensemble models, Eq. 2 and 3, whose coefficients were learned by maximizing the log-likelihood of the validation set using Stochastic Gradient Descent. We note that for a node $c$ where ensemble models should have been trained based on fewer than 50 validation examples, we refrained from training the ensemble model and used the NODE subtree model of $c$ as a single component of the ensemble model. Table 4 compares the various trained models with respect to the number of basic linear models used in composite models and the average number of features observed per linear model. The meta-parameters of the ensemble models (Eq. 1 and 3) were not included in the counting.

Table 4. Details of the regression and classification models, including the number of linear models in each composite model, and the average number of features used by each linear model

Regression
| Model | # linear models | Features per model |
| Baseline | 1 | 267,781 |
| Feature augmentation | 1 | 12,731,748 |
| Subtree - TOP | 26 | 88,358 |
| Subtree - PARENT | 119 | 26,986 |
| Subtree - NODE | 924 | 9,221 |
| Ens. of subtree models | 955 | 13,360 |

Classification
| Model | # linear models | Features per model |
| Baseline | 1 | 267,781 |
| Feature augmentation | 1 | 12,731,748 |
| Ensemble | 7 | 267,781 |
| Feature augmentation Ens. | 7 | 12,731,748 |
| Ens. of subtree models | 955 | 13,360 |

4.2 Results

The performance of the different regression models on our dataset was measured by Root Mean Square Error (RMSE) [28] and by the Pearson correlation between the predictions and the target. Table 5 presents these results. As can be seen, all our models outperform the baseline off-the-shelf linear regression model, with the best performing model achieving about 10% relative improvement. These results indicate the importance of explicitly modeling the different answering patterns within the heterogeneous communities in Yahoo! Answers, as captured by categories. Interestingly, the feature-augmentation model, which attempts to combine categories and their ancestors, performs worse than any specific subtree model. One of the reasons for this is the huge number of parameters this model had to train (see Table 4), compared to the ensemble of separately trained subtree models, each requiring considerably fewer parameters to tune. A t-test based on the Pearson correlations shows that each model in Table 5 is significantly better than the preceding one, with P-values close to zero.

Table 5. Test performance for the regression models

| Model | RMSE | Pearson Correlation |
| Baseline | 5.076 | 0.503 |
| Feature augmentation | 4.946 | 0.539 |
| Subtree - TOP | 4.905 | 0.550 |
| Subtree - PARENT | 4.894 | 0.552 |
| Subtree - NODE | 4.845 | 0.564 |
| Ens. of subtree models | 4.606 | 0.620 |

The performance of models for the classification task was measured by the area under the ROC Curve (AUC) [29]. AUC is a preferred performance measure when class distributions are skewed, since it measures the probability that a positive example is scored higher than a negative example. Specifically, the AUC of a majority model is always 0.5, independently of the distribution of the targets.

Table 6. Test performance for the classification models

| Model | AUC |
| Baseline | 0.619 |
| Feature augmentation | 0.646 |
| Ensemble | 0.725 |
| Feature augmentation ensemble | 0.739 |
| Ensemble of subtree models | 0.781 |

Inspecting the classification results in Table 6, we can see that all the novel models improve over the baseline classifier, with the best performing ensemble of subtrees classifier achieving an AUC of 0.781, a substantial relative improvement of 26% over the baseline's result of 0.619. A t-test based on the estimated variance of AUC [30]
shows that each model in Table 6 is statistically significantly superior to its predecessor, with P-values practically zero.

We next examine in more depth the performance of the ensemble of classifiers and the ensemble of subtree regressors (the third and fifth entries in Table 6 respectively). We see that the ensemble of classifiers explicitly models the differences between questions with many and few answers, significantly improving over the baseline. Yet, the ensemble of subtree regressors not only models this property of the data but also the differences in answering patterns within different categories. Its higher performance indicates that both attributes are key factors in prediction. Thus the task of predicting the actual number of answers has additional benefits: it allows for a better understanding of the structure of the dataset, which also helps the classification task.

Finally, we compared our results to those of Yang et al. [21]. They measured the F1 value on the predictions of the minority class of unanswered questions, for which their best classifier achieved an F1 of 0.325. Our best model for this measure was again the ensemble of subtree models classifier, which achieved an F1 of 0.403. This is a substantial increase of 24% over Yang et al.'s best result, showing again the benefits of a structured classifier.

4.3 Error Analysis

We investigated where our models err by measuring the average performance of our best performing models as a function of the number of answers per test question, see Figure 3. We split the test examples into disjoint sets characterized by a fixed number of answers per question and averaged the RMSE of our best regressor on each set (Figure 3(a)). Since our classifier is not optimized for the Accuracy measure, we set a specific threshold on the classifier output, choosing the 12.67 percentile of test examples with lowest scores as negatives (12.67% is the fraction of negative examples in our training set). Figure 3(b) shows the error rate for this threshold. We note that the error rate for zero answers refers to the false positive rate and for all other cases it refers to the false negative rate.

Fig. 3. Performance of the Subtree Ensemble models as a function of the number of answers: (a) regression task, (b) classification task.

Figure 3(a) exhibits a clear minimum in the region most populated with questions, which shows that the regressor is optimized for predicting values near 0. Although the RMSE increases substantially with the number of answers, it is still moderate. In general, the RMSE we obtained is approximately linear in the square root of the number of answers. Specifically, for questions with a large number of answers, the RMSE is much smaller than the true number of answers. For example, for questions with more than 10 answers, which constitute about 13% of the dataset, the actual number of answers is approximately twice the RMSE on average. This shows the benefit of using the regression models as input to an answered/unanswered classifier, as we did in our best performing classifier. This is reflected, for example, in the very low error rates (0.0064 or less) for questions with more than 10 answers in Figure 3(b).

While the regression output effectively directs the classifier to the correct decision for questions with around 5 or more answers, Figure 3(b) still exhibits substantial error rates for questions with very few or no answers. This is due to the inherent randomness in the answering process, in which questions that received very few answers could have easily gone unanswered and vice versa, and are thus difficult to predict accurately. In future work, we want to improve these results by employing boosting approaches and constructing specialized classifiers for questions with very few answers.

4.4 Temporal Analysis

Intuitively, the time at which a question is posted should play a role in a social-media site. Therefore, like Yang et al. [21], we use temporal features. Our dataset, which spans over one year, confirms their reported patterns of hourly answering: questions posted at night are most likely to be answered, while "afternoon questions" are about 40% more likely to remain unanswered.

To extend this analysis to longer time periods, we analyzed weekly and yearly patterns. We first calculated the mean number of answers per question and the fraction of unanswered questions as a function of the day of week, as shown in Figure 4. A clear pattern can be observed: questions are more often answered towards the end of the week, with a sharp peak on Fridays and a steep decline over the weekend. The differences between the days are highly statistically significant (t-test, two sided tests). The two graphs in Figure 4 exhibit extremely similar characteristics, indicating that the fraction of unanswered questions is negatively correlated with the average number of answers per question. This suggests that both phenomena are controlled by a supply and demand equilibrium. This can be explained by two hypotheses: (a) both phenomena are driven by an increase in questions (Yang et al.'s hypothesis) or (b) both phenomena are driven by a decrease in the number of answers.

Fig. 4. Mean number of answers (a) and fraction of unanswered questions (b) as a function of the day of the week, where '1' corresponds to Monday and '7' to Sunday.

To test the above two hypotheses, we extracted the number of questions, number of answers and fraction of unanswered questions on a daily basis. Each day is represented in Figure 5 as a single point, as we plot the daily fraction of unanswered questions as a function of the daily average number of answers per question (Figure 5(a)) and as a function of the total number of daily questions (Figure 5(b)). We note that while some answers are provided on a window of time longer than a day, this is a rare phenomenon. The vast majority of answers are obtained within about twenty minutes from the question posting time [5], hence our daily analysis.

Fig. 5. The daily fraction of unanswered questions as a function of the daily mean number of answers (a) and as a function of the total number of questions (b).

Figure 5(a) exhibits a strong negative correlation (Pearson correlation $r = -0.631$), while almost no effect is observed in Figure 5(b) ($r = 0.010$). We further tested the correlation between the daily total number of answers and the fraction of unanswered questions, and here as well a significant negative correlation was observed ($r = -0.386$). These findings support the hypothesis that deficiency in answerers is the key factor affecting the fraction of unanswered questions, and not the overall number of questions, which was Yang et al.'s hypothesis. This result is important, because it implies that more questions in a community-based question answering site will not reduce the performance of the site, as long as an active community of answerers thrives at its core.
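The daily analysis above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the code used for the experiments: the helper names `daily_stats` and `pearson` are hypothetical, and the toy records are made up purely to exercise the computation (they are not taken from the Yahoo! Answers dataset).

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def daily_stats(questions):
    """Aggregate (day, n_answers) records into one row per day.

    Each row is (day, n_questions, mean_answers, frac_unanswered),
    and rows are sorted by day.
    """
    by_day = defaultdict(list)
    for day, n_answers in questions:
        by_day[day].append(n_answers)
    rows = []
    for day in sorted(by_day):
        counts = by_day[day]
        n = len(counts)
        unanswered = sum(1 for c in counts if c == 0)
        rows.append((day, n, sum(counts) / n, unanswered / n))
    return rows


# Toy records (hypothetical): one (posting day, number of answers) pair per question.
toy_questions = [
    ("2009-01-01", 0), ("2009-01-01", 1), ("2009-01-01", 2), ("2009-01-01", 0),
    ("2009-01-02", 3), ("2009-01-02", 5), ("2009-01-02", 0),
    ("2009-01-03", 4), ("2009-01-03", 6), ("2009-01-03", 2),
    ("2009-01-03", 8), ("2009-01-03", 5),
]

rows = daily_stats(toy_questions)
n_questions = [r[1] for r in rows]
mean_answers = [r[2] for r in rows]
frac_unanswered = [r[3] for r in rows]

# Analogue of Figure 5(a): unanswered fraction vs. daily mean number of answers.
r_answers = pearson(mean_answers, frac_unanswered)
# Analogue of Figure 5(b): unanswered fraction vs. daily number of questions.
r_volume = pearson(n_questions, frac_unanswered)
print(f"r(mean answers, frac unanswered) = {r_answers:.3f}")
print(f"r(num questions, frac unanswered) = {r_volume:.3f}")
```

On real data, each point would be one calendar day of the year-long sample, and the two correlations would be compared to decide which hypothesis the data supports.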
5 Conclusions

In this paper, we investigated the answerability of questions in community-based question answering sites. We went beyond previous work that returned a binary result of whether or not the question will be answered. We focused on the novel task of predicting the actual number of expected answers for new questions in community-based question answering sites, so as to return feedback to askers before they post their questions. We introduced a series of novel regression and classification models explicitly designed for leveraging the unique attributes of category-organized community-based question answering sites. We observed that these categories host diverse communities with different answering patterns.

Our models were tested over a large set of questions from Yahoo! Answers, showing significant improvement over previous work and baseline models. Our results confirmed our intuition that predicting answerability at a finer-grained level is beneficial. They also showed the strong effect of the different communities interacting with questions on the number of answers a question will receive. Finally, we discovered an important and somewhat counter-intuitive fact, namely that an increased number of questions will not negatively impact answerability, as long as the community of answerers is maintained.

We constructed models that are performant at scale: even the ensemble models are extremely fast at inference time. In future work, we intend to reduce response time even further and consider incremental aspects in order to return predictions as the asker types, thus providing, in real-time, dynamic feedback and a more engaging experience. To complement this scenario, we are also interested in providing question rephrasing suggestions for a full assistance solution.

References

1. Voorhees, E.M., Tice, D.M.: Building a question answering test collection. In: SIGIR. (2000)
2. Dror, G., Pelleg, D., Rokhlenko, O., Szpektor, I.: Churn prediction in new users of Yahoo! answers. In: WWW (Companion Volume). (2012) 829-834
3. Li, B., King, I.: Routing questions to appropriate answerers in community question answering services. In: CIKM. (2010) 1585-1588
4. Horowitz, D., Kamvar, S.D.: The anatomy of a large-scale social search engine. In: Proceedings of the 19th international conference on World wide web. WWW '10, New York, NY, USA, ACM (2010) 431-440
5. Dror, G., Koren, Y., Maarek, Y., Szpektor, I.: I want to answer; who has a question?: Yahoo! answers recommender system. In: KDD. (2011) 1109-1117
6. Szpektor, I., Maarek, Y., Pelleg, D.: When relevance is not enough: Promoting diversity and freshness in personalized question recommendation. In: WWW. (2013)
7. Rao, L.: Yahoo mail and im users update their status 800 million times a month. TechCrunch (Oct 28 2009)
8. Jeon, J., Croft, W.B., Lee, J.H., Park, S.: A framework to predict the quality of answers with non-textual features. In: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval. SIGIR '06, New York, NY, USA, ACM (2006) 228-235
9. Agichtein, E., Castillo, C., Donato, D., Gionis, A., Mishne, G.: Finding high quality content in social media, with an application to community-based question answering. In: Proceedings of ACM WSDM. WSDM '08, Stanford, CA, USA, ACM Press (February 2008)
10. Shah, C., Pomerantz, J.: Evaluating and predicting answer quality in community qa. In: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval. SIGIR '10, New York, NY, USA, ACM (2010) 411-418
11. Surdeanu, M., Ciaramita, M., Zaragoza, H.: Learning to rank answers on large online qa collections. In: ACL. (2008) 719-727
12. Song, Y.I., Lin, C.Y., Cao, Y., Rim, H.C.: Question utility: A novel static ranking of question search. In: AAAI. (2008) 1231-1236
13. Sun, K., Cao, Y., Song, X., Song, Y.I., Wang, X., Lin, C.Y.: Learning to recommend questions based on user ratings. In: CIKM. (2009) 751-758
14. Jurczyk, P., Agichtein, E.: Discovering authorities in question answer communities by using link analysis. In: Proceedings of the sixteenth ACM conference on Conference on information and knowledge management. CIKM '07, New York, NY, USA, ACM (2007) 919-922
15. Bian, J., Liu, Y., Zhou, D., Agichtein, E., Zha, H.: Learning to recognize reliable users and content in social media with coupled mutual reinforcement. In: WWW. (2009) 51-60
16. Lee, C.T., Rodrigues, E.M., Kazai, G., Milic-Frayling, N., Ignjatovic, A.: Model for voter scoring and best answer selection in community q&a services. In: Web Intelligence. (2009) 116-123
17. Lee, K., Caverlee, J., Webb, S.: Uncovering social spammers: social honey
pots+machinelearning.In:SIGIR.(2010)43544218.Liu,Y.,Agichtein,E.:You'vegotanswers:Towardspersonalizedmodelsforpredictingsuccessincommunityquestionanswering.In:ACL(ShortPapers).(2008)9710019.Agichtein,E.,Liu,Y.,Bian,J.:Modelinginformation-seekersatisfactionincommunityquestionanswering.ACMTransactionsonKnowledgeDiscoveryfromData3(2)(April2009)10:110:2720.Richardson,M.,White,R.W.:Supportingsynchronoussocialq&athroughoutthequestionlifecycle.In:WWW.(2011)75576421.Yang,L.,Bao,S.,Lin,Q.,Wu,X.,Han,D.,Su,Z.,Yu,Y.:Analyzingandpredictingnot-answeredquestionsincommunity-basedquestionansweringservices.In:AAAI.(2011)22.Thelwall,M.,Buckley,K.,Paltoglou,G.,Cai,D.,Kappas,A.:Sentimentinshortstrengthdetectioninformaltext.J.Am.Soc.Inf.Sci.Technol.61(12)(December2010)2544255823.Blei,D.,McAuliffe,J.:Supervisedtopicmodels.InPlatt,J.,Koller,D.,Singer,Y.,Roweis,S.,eds.:AdvancesinNeuralInformationProcessingSystems20.MITPress,Cambridge,MA(2008)24.Draper,N.R.,Smith,H.:AppliedRegressionAnalysis(WileySeriesinProbabilityandStatistics).Thirdedn.Wiley-Interscience(April1998)25.DaumeIII,H.:Frustratinglyeasydomainadaptation.In:Proceedingsofthe45thAnnualMeetingoftheAssociationofComputationalLinguistics,Prague,CzechRepublic,Associ-ationforComputationalLinguistics(June2007)25626326.Adamic,L.A.,Zhang,J.,Bakshy,E.,Ackerman,M.S.:Knowledgesharingandyahooan-swers:everyoneknowssomething.In:Proceedingsofthe17thinternationalconferenceonWorldWideWeb.WWW'08,NewYork,NY,USA,ACM(2008)66567427.Dietterich,T.G.,Bakiri,G.:Solvingmulticlasslearningproblemsviaerror-correctingoutputcodes.JournalofArticialIntelligenceResearch2(1995)28.Bibby,J.,Toutenburg,H.:PredictionandImprovedEstimationinLinearModels.JohnWiley&Sons,Inc.,NewYork,NY,USA(1978)29.Provost,F.J.,Fawcett,T.:Analysisandvisualizationofclassierperformance:Comparisonunderimpreciseclassandcostdistributions.In:KDD.(1997)434830.Cortes,C.,Mohri,M.:Condenceintervalsfortheareaundertheroccurve.In:NIPS.(2004)