Understanding and Promoting Micro-Finance Activities in Kiva.org

Jaegul Choo, Georgia Institute of Technology, jaegul.choo@cc.gatech.edu
Changhyun Lee, Georgia Institute of Technology, clee407@gatech.edu
Daniel Lee, Georgia Tech Research Institute, daniel.lee@gtri.gatech.edu
Hongyuan Zha, Georgia Institute of Technology, zha@cc.gatech.edu
Haesun Park, Georgia Institute of Technology, hpark@cc.gatech.edu

ABSTRACT

Non-profit micro-finance organizations provide loaning opportunities to eradicate poverty by financially equipping impoverished, yet skilled entrepreneurs who are in desperate need of an institution that lends to those who have little. Kiva.org, a widely-used crowd-funded micro-financial service, provides researchers with an extensive amount of publicly available data containing a rich set of heterogeneous information regarding micro-financial transactions. Our objective in this paper is to identify the key factors that encourage people to make micro-financing donations, and ultimately, to keep them actively involved. In our contribution to further promote a healthy micro-finance ecosystem, we detail our personalized loan recommendation system which we formulate as a supervised learning problem where we try to predict how likely a given lender will fund a new loan. We construct the features for each data item by utilizing the available connectivity relationships in order to integrate all the available Kiva data sources. For those lenders with no such relationships, e.g., first-time lenders, we propose a novel method of feature construction by computing joint nonnegative matrix factorizations. Utilizing gradient boosting tree methods, a state-of-the-art prediction model, we are able to achieve up to 0.92 AUC (area under the curve) value, which shows the potential of our methods for practical deployment. Finally, we point out several interesting phenomena on lenders' social behaviors in micro-finance activities.

Categories and Subject Descriptors

H.3.3 [Information Search and Retrieval]: Information filtering; I.2.6 [Artificial Intelligence]: Learning

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. WSDM'14, February 24–28, 2014, New York, New York, USA. Copyright 2014 ACM 978-1-4503-2351-2/14/02 ...$15.00. http://dx.doi.org/10.1145/2556195.2556253.

Figure 1: An overview of how Kiva works. 1. A borrower requests a loan to a (field) partner, and a loan is disbursed. 2. The partner uploads a loan request to Kiva, and lenders fund the loan. 3. The borrower makes repayments through the partner, and Kiva then repays the lenders. They can make another loan, donate to Kiva, or withdraw the money to their PayPal account.

Keywords

Recommender systems; cold-start problem; microfinance; crowdfunding; joint matrix factorization; gradient boosting tree; heterogeneous data

1. INTRODUCTION

Kiva was founded by Matt Flannery and Jessica Jackley who based their concept on the inspiration of Muhammad Yunus' lecture on the Grameen Bank. The Grameen Bank, which won the Nobel Peace Prize in 2006 for its impact in helping the impoverished, was founded by Yunus in 1977 to address the lack of practical credit available to the under-utilized, yet skillful entrepreneurs in impoverished countries [32]. In Yunus' book, he documented how he came up with the concept by noticing that the very poor could barely sustain themselves, let alone work their trade, since many times the poor were taking loans to buy the materials, only to sell their finished product back as repayment. In response, Yunus began his credit-loaning program which provided loans without collateral and interest, and with an easy repayment plan. There are multitudes of success stories in Yunus' book as well as on the Kiva blog (footnote 1) that portray how micro-financing has given opportunity to change the lives of the borrowers, their businesses, and their local areas.

Footnote 1: http://pages.kiva.org/kivablog

Figure 2: Temporal lending patterns for different lender groups with a specific lending count m. (a) Number of days taken for a loan to be paid back. (b) Number of days taken to fund the next loan. Horizontal axes: days (0 to 700); vertical axes: number of loans; panels shown for m = 5, m = 15, and m = 50.

Since its inception in 2005, Kiva and its generous lenders now impact the lives of courageous, hardworking borrowers across 72 countries. Kiva is a non-profit micro-financial organization which acts as an intermediary service to provide people with the opportunity to lend money to underprivileged entrepreneurs in developing countries. Kiva's lending model is based on a crowd-funding model in which any individual can fund a particular loan by contributing to a loan individually or as a part of a lender team. The Kiva loan process is summarized in Fig. 1.

Open public access to Kiva's data, provided through daily snapshots and an API, is a part of Kiva's charitable initiative to provide the working poor with an infrastructure that Kiva hopes will encourage life-changing lending. This level of transparency lies at the core of Kiva's successful growth, as Matt Flannery puts it: "Transparency in this next period will be our best weapon against the challenges of growth. This model thrives on information, not marketing" [13].

Kiva data contain a rich set of heterogeneous information about lenders, loans, lender teams, borrowers, and field partners. As of June 2013, the publicly available Kiva dataset contained over 1,100,000 lenders, 500,000 loans, and 150,000 journal entries for over 4,000,000 transactions that resulted in over 400,000,000 USD of loans issued. There are also multiple types of many-to-many relationships between each of the data entities. For example, lenders may be a part of multiple lender teams while lenders may choose any number of loans to participate in. Furthermore, borrowers may optionally update their progress to their field partner for entry into the Kiva website as a journal entry. This dataset includes geospatial, temporal, and free-text data along with a variety of other numerical and categorical information, consequently forming a fascinating set of data for many data mining and social media researchers.

Loan recommendation and diverse lender behaviors. Kiva as a non-profit organization encourages lending by promoting the idea that those in need can create better lives for themselves and their families when given the opportunity, i.e., capital. Thus, one can naturally realize that lenders, who are also regarded as donors due to the lack of any interest or reward they receive in return for their loan, are a pivotal component to the Kiva model. Consequently, one of the keys to a healthy Kiva ecosystem relies on keeping their lenders interested in continuing in their generous donations. This is where active recommendation can play a major role by matching the lender with loans that they would be sincerely interested in.

In addition, what makes loan recommendation an interesting problem is the diversity of lenders' behaviors. How do lenders differ in their lending behaviors and what are the major factors to drive these differences? Fig. 2 displays an example showing temporal lending patterns. For a particular loan to be fully paid, it usually takes from a half to a full year (Fig. 2(a)). In case of passive lenders with a small number of lending experiences (a smaller m in Fig. 2(b)), the time taken between two consecutive lending activities shows a relatively high correlation with the time required for a loan to be paid, compared to the other cases. This behavior is most likely explained by the notion that some passive lenders participate in another loan when their initial loan is paid back, rather than contributing more money of their own. However, active lenders with more lending experiences continue their lending activities mainly within a short time interval, as shown in a strong peak with almost no tail in the examples with a larger m in Fig. 2(b).

Challenges in loan recommendation. The problem of loan recommendation presents various challenges compared to other traditional recommendation problems.

The first is the transient nature of loans. Standard recommendation techniques based on collaborative filtering primarily utilize other similar users' ratings or preferences on the items for recommendation. The key notion is that the items being recommended such as books or movies are persistent and reusable, i.e., an item (or a copy thereof) can serve many users. Loans, on the other hand, are transient and a particular loan can only serve a single borrower. More importantly, loans are only available for a short amount of time until the loan request is fully met, often leaving little or no information available to utilize from previous lenders.

The second challenge is the binary rating structure. Most rating systems are composed of a multi-grade set of ratings from which a user can select, yet in Kiva, the only information available similar to a rating is whether or not s/he funded the loan (footnote 4). Furthermore, the fact that the funding did not happen may not necessarily mean that s/he did not like it. This challenge is often found in other settings where the recommendation relies only on previous purchases, viewing of item pages, etc. Such limited information and ambiguity require more than just standard collaborative filtering approaches.

Footnote 4: Individual loan amount could be utilized similarly to rating information, but such information is not available from Kiva API for lender privacy.

Finally, another challenge is the heterogeneity of data. The Kiva dataset comprises a variety of intertwined entities giving rise to a rich set of heterogeneous information. Merging and fusing this diverse set of information in a unified predictive framework for loan recommendation presents a non-trivial problem.

Overview of proposed approaches. In order to better handle these challenges and deeply analyze various lending patterns among Kiva users, we propose a supervised learning approach to tackle this unique loan recommendation problem. That is, we formulate it as a binary classification/regression problem, where, given a lender and loan pair, the trained model computes the score that represents the likelihood of funding. In order to train our model with all the available information, we propose two main feature generation methods: (1) graph-based feature integration (for lenders with previous loans) and (2) feature alignment via joint nonnegative matrix factorization (for lenders with no previous loans). The former provides us with a general framework for incorporating all the available heterogeneous information to represent a lender-loan pair. On the other hand, the latter alleviates the lack of information for newcomers, which is a well-known issue referred to as the cold-start problem in many recommendation applications. Utilizing the proposed approaches along with a gradient boosting tree, a state-of-the-art learner model, we achieve a practically useful level of performance up to around 0.92 AUC (area under the curve) value. Furthermore, we present in-depth analysis of the resulting model and its output, revealing various interesting knowledge about lenders' social behaviors in micro-finance activities.

The rest of this paper is organized as follows. Section 2 describes our basic preprocessing steps to handle the heterogeneity of Kiva data; in addition, we have made the post-processed data readily available on the web for other researchers. Section 3 describes our main approaches for loan recommendation, and Section 4 presents the prediction performances as well as various findings from our analysis. Section 5 discusses related work. Finally, Section 6 concludes the paper and discusses future work.

2. BASIC DATA REPRESENTATION

The Kiva dataset is composed of various entities, each of which has its own set of rich information including unstructured data (e.g., text, image, and video) as well as structured data (e.g., geo-spatial, numerical, categorical, and ordinal data). Lender entities contain basic web profile data, i.e., profile image, registration timestamp, location, loan count, and other fields, in addition to links to various types of entities. For example, a lender will have links to loans that s/he has funded and to any number of lender teams with which s/he is affiliated. Field partners manage loans within their local region, while borrowers request loans from their local field partner owing to their lack of access to a computer with internet access.

Kiva provides a recent snapshot of its dataset in JSON and XML formats (footnote 5). For our work, we used a 2.9GB JSON snapshot which was collected on 5/31/2013. We preprocessed it to obtain the numerical representations of each available field. Particularly, the preprocessing of temporal, categorical, and textual fields all required a nontrivial amount of work.

Footnote 5: http://build.kiva.org/docs/data/snapshots

For temporal data, such as the loan's posting date and lender's sign-up date, we converted it to a serial date number using Matlab's datenum function, which represents the whole and fractional number of days from a
fixed preset date of January 0 in year 0000. For categorical data, such as a lender's gender and a loan's country code, we used a dummy encoding scheme which converts a variable with m categories into an m-dimensional binary vector where only the values in the corresponding categories are set to ones.

Finally, for textual data, we encoded each textual field separately as a bag-of-words vector where an individual dimension corresponds to a unique word. Afterwards we reduced the dimensionality using nonnegative matrix factorization (NMF; footnote 6) [21,19] to 100 for each textual field. We performed dimension reduction for two reasons. First, although the encoded representations may be in sparse format, the entire dimension easily amounts up to the hundreds of thousands requiring enormous computational time in learning our prediction model. Second, the reduced dimensions, which are composed of a group of words, are more semantically meaningful than individual term dimensions, and thus, they can be versatile for both good prediction performance and data/model understanding [10,30]. The reduced dimension was set to 100 because larger values did not improve the prediction performances reported in Section 4.

Footnote 6: http://www.cc.gatech.edu/~hpark/nmfsoftware.php

As a final preprocessing step we created mappings between entities from the different tables. For example, a lender entity found in the table containing metadata for lenders may have a different identifier in another table about the lender-loan graph, and even worse, it may exist in only one table, meaning that some information about it will be completely missing. The mappings we created allow these issues to be handled with ease. We made the processed formatted data as Matlab files available at edu/processed-kiva-data.

3. METHODOLOGY

In this section, we describe our methodology for promoting non-profit micro-finance activities in Kiva. We formulate this task as a binary classification/regression problem. That is, we consider a pair $(u, l)$ of a lender $u$ (footnote 7) and a loan $l$ as an individual data item, and given such a pair, we intend to predict how likely s/he will fund the loan, which we denote as $f(u, l)$. The associated label is set to 1 if funding occurred for the pair and 0 otherwise. Once the learner model is trained based on a set of data items along with these labels, it can then predict the likelihood of funding for any given lender-loan pair. Such a capability is broadly applicable in various loan recommendation problems. For example, it allows one to identify the best matching lender for a particular loan by solving $\arg\max_u f(u, l)$ for a fixed $l$ as well as the most appropriate loan to recommend given a particular lender by solving $\arg\max_l f(u, l)$ for a fixed $u$.

Footnote 7: We use an acronym u by viewing a lender as a kiva 'u'ser.

In this approach, the key procedure affecting the overall performance is feature generation, i.e., how we characterize and represent a particular lender-loan pair. This is especially challenging considering the complexity of the Kiva dataset which involves heterogeneous entities, such as borrowers, field partners, loans, lenders, and lender teams, with their own various set of information and complex relationships among them. To properly handle this issue, we act appropriately for two situations split by whether or not a lender has had previous funding experiences. In the following, we present our feature generation procedure for each case in detail.

Figure 3: A graph-based feature integration for a lender-loan pair (grey-colored).

3.1 Graph-based Feature Integration

When information about previous funding experiences of a particular lender is available, we utilize relationship links between different entities to take into account all the information available from the linked entities. As summarized in Fig. 3, given a lender-loan pair $(u, l)$, we first retrieve all the linked entities from both the lender and the loan. Specifically, a lender $u$ will contain links to the list of teams s/he is affiliated with, loans s/he funded previously, and partners and borrowers his/her previous loans were associated with. Similarly, a loan will contain the links to the associated partner and the lists of borrowers, lenders (excluding the lender of interest), and lender teams that lenders are affiliated with.

Lender- and loan-specific features. Each entity type, e.g., the $i$-th type among a borrower, a partner, a loan, a lender, and a lender team, composes the entity-type-wise feature (column) vectors, $v^u_i$ and $v^l_i$, to represent a lender $u$ and a loan $l$, respectively, which, in turn, form a lender-specific feature vector $v^u = [\, v^u_1 \ \cdots \ v^u_5 \,]^T$ and a loan-specific one $v^l = [\, v^l_1 \ \cdots \ v^l_5 \,]^T$ (circles in Fig. 3). In this process, one issue is that we may have a variable number of linked entities of the same type. For instance, one lender may have funded four loans in the past, yet another may have funded fifteen. To maintain a
fixed number of dimensions for $v^u_i$ (or $v^l_i$) given a variable number of entities, we aggregate them into a single set of features by adding up all the feature vectors of individual entities. Suppose the $i$-th entity type is a loan and a lender $u$ is associated with a set of entities (loans) $\{ (e^u_i)_j : j = 1, \ldots, n \}$ where an entity $(e^u_i)_j$ is represented as a feature vector $(v^u_i)_j$. The feature vector $v^u_i$ (of the $i$-th entity type) for $u$ is represented as

$$ v^u_i = \sum_j (v^u_i)_j. \qquad (1) $$

For example, a loan's requested amount (in dollars) will correspond to the summation of the values from multiple loans, a single value indicating a total requested amount. For categorical variables, such as a lender's gender which is represented as a binary vector in two dimensions, after summing up the feature vectors of lenders for a particular loan, the values corresponding to the two dimensions become the number of male and female lenders, respectively. The same idea can also be applied to textual features, which are nonnegative representations computed by NMF.

In addition, even if there are no links to entities of a particular entity type, e.g., no associated loans for a particular lender, Eq. (1) still holds since it will produce an equal-dimensional feature vector containing all zeros.

Lender-loan matching features. We have described how we generate lender- and loan-specific features by including information from each of the linked entities. We note that although the resulting data include links to heterogeneous entity types, both a lender and a loan now have counterparts generated from the same entity type, which can be directly compared with each other. In other words, both lenders and loans will have all the feature sets associated with borrowers,
field partners, loans, lenders, and lender teams. Intuitively, if the entities from a lender side and a loan side are similar, our predictor $f(u, l)$ should give a high score about the likelihood of funding. To leverage this in our feature representation, we generate an additional set of features $v^{ul}$ that indicate how well the entities of the same type match at an individual feature level. To this end, we compute the product of individual features, referring to them as lender-loan matching features (hexagons in Fig. 3), i.e., $v^{ul} = v^u \circ v^l$, where $\circ$ represents an element-wise product. Given the nonnegativity of $v^u$ and $v^l$, $v^{ul}$ indicates how strongly the values of a particular dimension are represented in 'both' the lender and the loan sides; this can be considered as the degree of matching at an individual feature level. These matching features, which are originally the second-order terms of existing features, may be inherently utilized in nonlinear or kernel models, but they are potentially critical information to many other models such as linear models and other tree-based models that deal with only one variable at a time, as will be described in Section 4.1.

Temporal features. Inspired by the analysis discussed earlier in Section 1, we generate additional features using temporal information about a lender and a loan. Available temporal information includes a lender's member_since and a loan's posted_date, funded_date, and paid_date. By considering the relative time differences between a loan $l$ and the most recent loan, $l_r$, that a lender funded in the past, we construct six temporal features having the form of $x - y$ where $x$ is one of $l$'s posted_date and funded_date and $y$ is one of $l_r$'s posted_date, funded_date, and paid_date. These features basically reflect the temporal patterns of consecutive lending activities.

3.2 Feature Alignment via Joint Nonnegative Matrix Factorization

Cold-start problem. The feature generation procedure described previously is quite general and flexible when incorporating all the information from each of the heterogeneous entities, but the main limitation of this approach arises when little or no relationship link between a lender and/or a loan exists. Although details may differ, this problem, which is often referred to as a cold-start problem, is common in many recommendation applications. For instance, suppose a new Kiva user considers funding a loan for the first time and we would like to recommend the most appropriate loan they would be likely to fund. It is very likely that they may not have any connections with lender teams, previous loans, and accordingly, any partners or borrowers. On the other hand, suppose a new loan webpage has just launched on the Kiva website and it currently has not secured a lender. In this scenario we would not have any available links to lenders and their lender teams that can be utilized in the feature generation process on the loan's side. These cold-start problems make our loan recommendation task challenging since a number of feature blocks depicted in Fig. 3 would be zero vectors, leaving little information useful for recommendation.

Figure 4: An overview of how joint NMF works. Given a high-dimensional space of lenders' and loans' textual data (' '-marked) along with their linked information (dashed lines), joint NMF generates a common aligned space where linked data points are closely placed. First-time lenders and fresh loans (' '-marked) are then mapped to the aligned space so that the resulting representations reveal their hidden linked relationships (dotted ellipses).

How joint NMF works. As a way to alleviate this problem, we propose a novel feature generation approach based on joint nonnegative matrix factorization (NMF) for a first-time lender and a fresh loan that have no available link information. As shown in Fig. 4, the main idea behind this approach is to transform the features generated from heterogeneous sources, one of which comes from a lender's side and the other from a loan's side, into a common space where the vectors representing a lender and a loan with which it is linked can be placed close to each other. Once we obtain the vector representations of a lender and a loan in the resulting common space, one can also easily generate the corresponding lender-loan matching features which would play a significant role in estimating the likelihood of funding.

Input matrices for joint NMF. To begin, we start with textual fields, e.g., a lender's loan_because, a lender's occupational_info, which a lender fills out when signing up at Kiva.org, and a loan's loan_description. As described in Section 2, each of these textual
eldsisinitiallyrepresentedasabag-of-wordsvectorbasedonitsownvocabulary.Notethatthevocabularysetofaparticulartextual eldisinde-pendentofthatofanyother,makingeachofthemrepre-sentedinaseparatespace.Now,weformtwoterm-documentmatricesAuandAlus-ingthetextual eldfromalenderandaloan,respectively.Thatis,Auencodeseitheralender'sloan becauseoroccupa-tionalinfowhileAlencodesaloan'sloan description.Ad-ditionally,weassumethecolumnsofAuandAlarealignedbasedonthelinkedrelationshipsbetweenlendersandloans.Forexample,the rstcolumnofAuandthatofAlrepresentalenderandaloan,respectively,thathavealink.Follow-ingthisassumption,weexcludethoselendersandloansthathavenolinkswhenformingAuandAl.Whenaparticularloan,i.e.,acolumnofAl,haslinkstomultiplelenders,wesumupthetextualvectorsofthecorrespondinglendersandputthissinglevectorinthecorrespondingcolumnofAu.Inthismanner,wemaintainaone-to-onemappingbetweenthecolumnsofAuandAl.Formulation.GiventhetwomatricesAu2RunandAl2Rmln,aninteger,andaparameter ,jointNMFsolvesminWu;Hu;Wl;Hl\r\r\rAuWuHTu\r\r\r2F\r\r\rAlWlHTl\r\r\r2F kHuHlk2F;(2)whereWu2Rmuk,Hu2Rnk,Wl2Rlk,Hl2Rnkarenonnegativefactors.Intheaboveequation,the rstandthesecondtermcorrespondtostandardNMFformulations,butatthesametime,thethirdtermenforcesHuandHltobeclosetoeachother.Asaresult,therowsofHuandHlcanbeconsideredasnewvectorrepresentationsinacommon-dimensionalspacewherethelinkedlenderandloanvectorsarecloselyplaced.Joint-NMFfeaturesfora rst-timelenderandafreshloan.Uptonow,wehavecomputedjointNMFusingthetextualinformationoflendersandloansbyus-ingtheirlinkedrelationships,leadingtoacommonspacewheretheserelationshipsarerevealed.However,westillneedtorepresenta rst-timelenderandafreshloantoajoint-NMFspace.Toachievethistask,weutilizetheresult-ingfactormatricesWuandWl,whichprovideamappingforanarbitrarybag-of-wordsrepresentationinanoriginalspacetothejoint-NMFspace.Indetail,givenu2Ru1andl2Rml1correspondingtoa 
first-time lender and a fresh loan, respectively, we solve the following nonnegativity-constrained least squares problem,

$$ \min_{h_u \ge 0} \left\| u - W_u h_u^T \right\|^2 \quad \text{and} \quad \min_{h_l \ge 0} \left\| l - W_l h_l^T \right\|^2 \qquad (3) $$

where $h_u^T \in \mathbb{R}^{k \times 1}$ and $h_l^T \in \mathbb{R}^{k \times 1}$ are our new representations in the joint-NMF space, i.e., joint-NMF features.

These joint-NMF features mainly have two advantages. First, even though a first-time lender and a fresh loan have no explicit links, one can expect their joint-NMF features generated in this way to better reveal their proximity owing to the learnt factor matrices $W_u$ and $W_l$. Second, since they are considered to be in the common space, we can now generate their lender-loan matching features in a way similar to that presented in the previous subsection.

4. EXPERIMENTS AND FINDINGS

In this section, we present our experiments and analysis on two loan recommendation cases depending on whether a lender of interest has previous funding history.

4.1 Experimental Setup

Learner. Considering the heterogeneity of our data and the complexity of the problem, it is crucial to use the most suitable and powerful prediction model to date. To this end, we have chosen a gradient boosting tree (GB tree; footnote 8) [17,14]. A GB tree is an ensemble method where an individual learner is a decision tree [6]. The reason for choosing a GB tree for our problem is as follows: First of all, an ensemble method is known for its superior generalization capability for unseen data. More importantly, a decision tree, our base learner, uses one variable at each node when it is trained/constructed as well as when it is applied to test data. This characteristic prevents us from worrying about heterogeneity in the features we generated. The downside to other learners, such as logistic regression and support vector machines, is that heterogeneous features have to be normalized via, say, standardization of their distributions, which transforms each feature to have zero mean and unit variance. Such normalization does not always make sense for binary and integer features, and furthermore it removes the nonnegativity of our feature representation that offers intuitive interpretation of them.

Footnote 8: The GB tree implementation we used is available at https://sites.google.com/site/carlosbecker/resources/gradient-boosting-boosted-trees

Figure 5: The ROC curve results for different lender groups with various numbers of previous loans m. Panels: (a) m = 5, (b) m = 10, (c) m = 15, (d) m = 20, (e) m = 25, (f) m = 50. Horizontal axis: false positive rate; vertical axis: true positive rate. Curves: y = x, +Lender/loan text, +Lender/loan info, +Loan delinquency, +Partner, +Borrower, +Temporal, +Lender Team.

Lender groups and data selection. As previously highlighted, it is important to handle different user behaviors properly. Therefore, we first selected lenders that have a specific lending count m, where we varied m from 5 to 50, indicating the degree of how actively lenders participated in loans. Then, we conducted our experiments separately on each of these lender groups. We felt that lenders within this range of m contained the set of lenders neither too active nor too passive, and thus we expect them to be more significantly influenced when given a recommendation for an appropriate loan.

Next, we formed a lender-by-loan adjacency matrix where only the components whose corresponding lenders funded the corresponding loans are set to 1 and 0 otherwise. From this graph, we randomly selected 5,000 positive (1-valued components) and 5,000 negative (0-valued components) samples and generated their feature vectors, as described in Section 3 (footnote 9). These samples are then used as our training and test sets under a 10-fold cross-validation setup.
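The sampling and cross-validation steps just described can be sketched as follows. This is a minimal illustration in Python rather than the Matlab pipeline used for the experiments; the toy random adjacency matrix, the function names sample_balanced_pairs and k_fold_indices, and the reduced sample size (500 per class instead of 5,000) are assumptions made only for this example.

```python
import random


def sample_balanced_pairs(adjacency, n_per_class, rng):
    """Draw equal numbers of funded (1) and unfunded (0) lender-loan pairs."""
    positives, negatives = [], []
    for lender, row in enumerate(adjacency):
        for loan, funded in enumerate(row):
            (positives if funded == 1 else negatives).append((lender, loan))
    # Sample without replacement from each class separately.
    pairs = rng.sample(positives, n_per_class) + rng.sample(negatives, n_per_class)
    labels = [1] * n_per_class + [0] * n_per_class
    return pairs, labels


def k_fold_indices(n_samples, k, rng):
    """Shuffle sample indices and deal them into k disjoint test folds."""
    order = list(range(n_samples))
    rng.shuffle(order)
    return [order[i::k] for i in range(k)]


# Hypothetical toy data: 200 lenders x 300 loans, about 5% of entries funded.
rng = random.Random(0)
adjacency = [[1 if rng.random() < 0.05 else 0 for _ in range(300)]
             for _ in range(200)]

pairs, labels = sample_balanced_pairs(adjacency, n_per_class=500, rng=rng)
folds = k_fold_indices(len(labels), k=10, rng=rng)

# For each fold, the held-out indices form the test set and the rest the training set.
test_idx = set(folds[0])
train_idx = [i for i in range(len(labels)) if i not in test_idx]
```

Each sampled pair would then be turned into a feature vector (Section 3) before training; that step is omitted here because it depends on the full Kiva entity tables.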
9Notethatweusedbalanceddatasetsintermsofpositivevs.negativesampleswhileoriginaldataareseverelyunbal-anced.However,theROC-basedperformancemeasuredoesnotdependonthebalancedness[12].Featuregroups.Forlenderswithfundinghistory,weutilizedvariousfeaturespresentedinSection3.1andcon-structedseveralfeaturegroupsasfollows:(1) Loan/lender text (600 dimensions):Textualfeaturesfromalender'sloan becauseandaloan'sloan description,whosedimensionisreducedbyNMF(Section2).(2) Loan/lender info (183 dimensions):Featuresfromalender'sandaloan'svarious elds.(3) Loan delinquency (13 dimensions):Featuresindicatinghowmanypreviousloansforalenderhavebeennon-paidordelinquent.(4) Partner (33 dimensions):Featuresabout eldpartners.(5) Borrower (12 dimensions):Featuresaboutborrowers,e.g.,aborrower'sgenderandpictured.(6) Temporal (6 dimensions):Timedi erencesbetweenanewloanandalender'smostrecentlyfundedloan(Section3.1).(7) Lender team (15 dimensions):Featuresaboutlenderteamsalenderisassociatedwith.Usingthisstructure,alender-loanpair,whichisourdataitem,isrepresentedasan862-dimensionalvector.Forlenderswithoutfundinghistory,manyofthesefea-turesarenotavailable.Thus,thetwosetsofjoint-NMFfeaturesdescribedinSection3.2weremainlyused:thosegeneratedfromaligning(1-a))alender'sloan becausever-susaloan'sloan description(300-dimensional)and(1-b))alender'soccupational infoversusaloan'sloan description(300-dimensional),respectively.Next,weincluded(2)loan/lenderinfo(61-dimensional),(3)partner(11-dimensional),and(4)borrower(4-dimensional)information.Performancemeasure.Althoughourexperimentalset-tingisabinaryclassi cation,thedesiredcapabilityfromlearningthefunctionf(u;l)byaGBtreeistocomputethelikelihoodoffunding,whichallowsustorankthemostap- Lender/loan text Lender/loan info Loan delinquency Partner Borrower Temporal Len

7 der Team 0 0.1 0.2 0.3 Var. importance
propriate loans for a particular lender as well as the most appropriate lenders for a particular loan. Therefore, we are interested in the quality of the resulting ranking of a given test set of lender-loan pairs, rather than the classification accuracy. In this respect, we report a receiver operating characteristic (ROC) curve and its area under the curve (AUC) value, which measures how much higher positive samples are ranked than negative samples.

Figure 6: The analysis on the variable importance. (a) The AUC value improvement over .5 when using only a particular feature group. (b) The AUC value degradation due to the exclusion of a particular feature group. [Bar charts; x axis: feature groups (Lender/loan text, Lender/loan info, Loan delinquency, Partner, Borrower, Temporal, Team); y axis: variable importance; one bar series per m = 5, 10, 15, 20, 25, 50.]

Table 1: The cumulative AUC value in Fig. 5

                      The number of previous loans m
Feature group           5      10     15     20     25     50
Lender/loan text      .6938  .5930  .5524  .5594  .5788  .6730
Lender/loan info      .7010  .5974  .5601  .5572  .5793  .6679
Loan delinquency      .8416  .7453  .6438  .6000  .6265  .6691
Partner               .8646  .7610  .6600  .6222  .6391  .6778
Borrower              .8879  .7852  .6760  .6275  .6415  .6909
Temporal              .9179  .8415  .7736  .7675  .7449  .7802
Team                  .9209  .8420  .7839  .7923  .7900  .8318

4.2 Predictive Performance

4.2.1 Lenders with available funding history

Overall performance. In cases where previous funding information of a lender is available, we gradually incorporated the additional features described in Section 4.1 for different lender groups. The performance results are shown by the ROC curves in Fig. 5 along with their AUC values summarized in Table 1. The best AUC values ranged from .78 to .92, which is a significant improvement over a baseline value of .5. These results were generally achieved only when using all the features available, indicating the advantage of our feature integration framework. Among different lender groups, lenders with 15 ≤ m ≤ 25 were the most difficult to predict in terms of their likely loans to fund, while lenders with a lower or higher m were relatively easier.

Analysis on feature groups. The analysis on the variable importance of each feature group, as shown in Fig. 6, reveals several interesting findings about micro-finance activities, as follows:

(1) The relative time with respect to the last funded loan plays an important role. Temporal features, which contain elapsed time information since the most recently funded loan, e.g., when it was posted and/or when it was repaid, consistently improve the performance by a non-trivial amount for all cases.

(2) Loan delinquencies discourage passive lenders although they do not impact active lenders as much. The performance increase due to the loan delinquency features is substantial for lender groups with m ≤ 15, but that increase drops significantly for lender groups with m = 50. Our further investigation showed these features were negatively correlated with the labels. For example, when m = 5, only 36% of the lenders who previously experienced loan delinquency had positive labels while 53% of the lenders without such experiences had positive labels. On the other hand, when m = 50, these two ratios were 49.7% and 50.1%, respectively, showing almost no correlation.

(3) Lender teams exhibit greater influence on active lenders. The performance gain due to the inclusion of lender team features grows as m increases. We conjecture that this is partly because passive lenders have not joined teams yet. In fact, we found that the average number of teams of each lender with m = 50 was .72 while that with m = 5 was only .25. In addition, from Figs. 5(e)(f), these features pull up the ROC curve mainly at false positive rate values (the x axis) from .4 to .7. This indicates that they are helpful in correctly classifying those somewhat ambiguous lender-loan pairs.

4.2.2 First-time lenders and fresh loans

A baseline approach. To evaluate the effectiveness of joint-NMF features, we designed a baseline approach to compare our method against, as follows. In the baseline approach, each pair of textual fields, (a lender's loan_because, a loan's loan_description) and (a lender's occupational_info, a loan's loan_description), has been aggregated into a single document corpus, which is encoded as a list of bag-of-words vectors based on a common vocabulary set. Next, we applied standard NMF in order to obtain their reduced-dimensional vectors. Note that, similar to the joint NMF approach, the resulting vector representations of lenders' and loans' textual data exist in a common space. Nonetheless, the main difference is that joint NMF utilizes additional link information and enforces linked lenders and loans to be close to each other in the common space (Fig. 4).

Performance comparison. For first-time lenders and fresh loans, Fig. 7 shows the comparisons in terms of AUC measures between the joint NMF and the baseline approaches. For each case, joint NMF was computed based on a different lender group and its associated loans depending on the value range of m. Note that all the training/test data in the supervised learning experiments have been selected only from first-time lenders and fresh loans and that loan delinquency and temporal features were excluded since they are not available for first-time lenders.

Figure 7: The AUC values for first-time lenders and fresh loans when training joint NMF with lenders with various numbers of previous loans m and their associated loans. (a) 4 ≤ m ≤ 6. (b) 15 ≤ m ≤ 20. (c) 50 ≤ m ≤ 100. [Each panel plots the AUC value (y axis, .5 to .75) for joint NMF and the baseline as feature groups are added (x axis): Text, +Lender/loan, +Partner, +Borrower.]

Table 2: The representative keywords of two topic pairs aligned by joint NMF

          A lender's occupational_info      A loan's loan_description
Topic 1   teacher, preschool, math,         children, school, family,
          librarian, school                 married, husband
Topic 2   student, mba, college,            business, activities,
          graduate, university              entrepreneur, revenue

In all the results, the joint-NMF approach shows significant improvement over the baseline approach, indicating that joint-NMF features are clearly helpful in revealing hidden links between first-time lenders and fresh loans. Combined with the other features available, the best AUC result, which is about .72, was found when using the active lender group with 50 ≤ m ≤ 100. This observation is somewhat counter-intuitive since first-time lenders would be expected to behave similarly to passive lenders with a smaller value of m. However, it can still be explained in the sense that active lenders likely provide detailed information about themselves in a lender's textual fields, which would have provided joint NMF with vital clues in learning the mapping between lenders and loans.

Aligned topics. The qualitative analysis of the resulting mapping of joint NMF offers an in-depth understanding of lending behaviors. Table 2 shows examples of aligned topics between a lender's occupational_info and a loan's loan_description. These representative keywords were obtained as the most highly weighted terms in the corresponding columns of the two matrices W_l and W_u in Eqs. (2) and (3).

Both topics in Table 2 are related to lenders with school-oriented occupations. Lenders in Topic 1 are shown to have professional jobs in an education environment, such as teachers and librarians, while those in Topic 2 mainly consist of students. By examining the associated topic keywords in a loan's loan_description, one can see that the former group tends to participate in family-related loans, e.g., helping children go to school and supporting a family and/or a husband. On the contrary, the latter group (students) likes to lend to entrepreneurs with a particular business such as running a restaurant.

4.3 Further Discussions

Our analysis on loan recommendation and lending behaviors suggests several important directions that Kiva should take to promote micro-financial activities.

First, as seen from the significant importance of temporal features, performing loan recommendation at the right time would be crucial in keeping lenders actively involved. As shown in Fig. 2, Kiva can give recommendations (1) soon after a lender has funded a loan as well as (2) when one's previous loans have been paid back. Otherwise, people tend to gradually lose interest in micro-finance activities as time goes on.

Second, Kiva should help lenders, especially passive or novice lenders, avoid potentially risky loans. From our analysis, non-paid and/or delinquent loans seem to be the major cause for passive lenders to stop their lending activities, and thus it would be important to lead them to loans with a high chance of repayment.

Finally, in order to secure as many active lenders as possible, Kiva should encourage passive lenders to join teams since lender teams seem to be one of the driving factors for active lenders.

5. RELATED WORK

In this section, we mainly discuss related work about (1) recommender systems (relevant to Section 3.1), (2) manifold alignment (relevant to Section 3.2), and (3) analysis on micro-financial activities.

Recommender systems. Basically, a recommender system, an active information filtering system [5], aims at estimating the so-called utility function for a given item and a user, which is analogous to our funding likelihood function f(u, l). Recommender systems typically fall into two families: content-based methods, which match users to products by matching a user's profile to the product's characteristics, and collaborative filtering methods, which recommend products that other users with similar preferences have chosen in the past [26, 1]. Numerous studies on recommender systems have focused on collaborative filtering approaches. These methods are generally categorized according to whether they are memory-based or model-based. For a comprehensive summary of collaborative filtering techniques, the reader is referred to the survey articles [28, 1].

Due to the challenges discussed in Section 1, which make collaborative filtering methods inapplicable, our work partly follows the content-based approach in that the proposed lender-specific features can be viewed as a user's profile while the loan-specific features represent the product content. However, the typical content-based approach, mainly originating from information retrieval literature [4], focuses only on textual information. In order to integrate all the other information available, our approach extends it in the context of ad-hoc information retrieval [24], which throws in various information as features and trains a learner model for predicting a relevance score of an item. These types of approaches are widely applicable in various novel applications including online dating systems [11].

Manifold alignment. This area has been actively studied recently in the context of image analysis [20] and cross-lingual information retrieval [29, 8, 9]. The problem setting is generally similar to that of our feature alignment where, given the different vector representations and/or relationships of the corresponding items, their new embedding in a common space is computed. Recently, from the perspective of multi-relational learning from multiple graphs or sources, several advanced methods based on joint matrix factorization have been proposed [33, 27]. In addition, a joint NMF-based approach has been proposed for multi-view clustering problems [22]. However, most of these methods focus on the best representations of existing data items while our proposed approach focuses on a generalization capability, i.e., embedding of unseen data into a common space so that their hidden correspondences are properly revealed.

Analysis on micro-financial activities. Previous work related to the complex micro-finance lending behavioral patterns and Kiva's now-integral role in the crowd-sourced micro-financing movement has looked at the effects of the internet on micro-financing [7] and other peer-to-peer lending transactions [3]. Studies on micro-finance decision-making have discovered that lenders favor lending opportunities not only to entities similar to themselves but also to individuals in situations that trigger an emotional reaction [2, 15].

Kiva-related findings have suggested bias in lending decisions by showing that particular borrower features generate a higher level of attraction from the wider lending audience. In particular, women and more physically attractive individuals have a greater chance of securing charitable loan support, at least from lenders that constitute the set of first-time and less active lenders [18]. Other studies on Kiva have observed the nature of lending behavior by correlating the impact of group dynamics with lending participation [16, 23]. These studies provide a basis for our work, in which we extend similar decision-making processes through automation to support our lender-loan recommendation system.

All these studies have analyzed Kiva's data in a number of ways, yet there is a lack of research that has utilized statistical numerical analysis approaches, which are closer to our body of work. One such study manually defined a set of categories about the motivation of lending and applied machine learning techniques to train automatic text classifiers using a lender's loan_because field [23]. It also incorporated several simple features such as the loan count and team affiliations in performing regression on lending frequency and amount. This work revealed several interesting findings about lending behavior, but the information and techniques it used are relatively limited compared to our work.

To the best of our knowledge, our work is the first in-depth study to directly tackle the loan recommendation problem by incorporating all the heterogeneous information available from Kiva. As seen in Section 4, we achieve performance viable for practical application and reveal significant findings about lending behavior that have not been discussed in any previous work.

6. CONCLUSIONS AND FUTURE WORK

In this paper, we presented a novel application of loan recommendation in the non-profit micro-finance sector. Starting with an extensive data set from Kiva, a famous micro-finance service, we tackled the problem using a supervised learning approach. In order to represent any given lender-loan pair as a feature vector, which is a key procedure in this approach, we proposed two main methodologies: (1) graph-based feature integration to flexibly incorporate all the heterogeneous information available and (2) feature alignment via joint NMF to enhance the limited information of first-time lenders and fresh loans. Based on the proposed approaches combined with a gradient boosting tree, a state-of-the-art prediction model, we achieved up to .92 AUC value. Furthermore, we presented interesting phenomena about micro-financing behaviors of Kiva lenders from temporal and social aspects.

The importance of our work and the information-rich nature of the Kiva data open up various future research possibilities. We describe a few of them in the following.

Selecting negative instances. Although we found our experiments showed consistent results over multiple runs of different sets of random samples, it would be beneficial to choose negative samples with more care. That is, not all negative examples are truly negative. For example, a lender may not have funded a particular loan simply because he did not know about it, not because he decided not to fund it. Advanced techniques such as the one-class type approach [25] and the one leveraging the context of user-system interactions [31], which tackle these issues in other recommendation applications, could be adopted in our work.

Fraud detection. As seen in Section 4, non-paid and delinquent loans significantly impact further lending activities of novice lenders, and thus, it is critical to detect potentially fraudulent loans and discourage lenders from lending to them. A fraud loan detection problem can be formulated and solved in a similar way to the proposed methods in this paper. Eventually, integrating the resulting potential fraud score into our feature representation will increase the loan recommendation performance even further.

7. ACKNOWLEDGMENTS

This work was supported in part by NSF IIS-1116886, NSF CCF-0808863, NSFC 61129001, and DARPA XDATA grant FA8750-12-2-0309. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of funding agencies. We also thank anonymous reviewers for their insightful comments and suggestions.

8. REFERENCES

[1] G. Adomavicius and A. Tuzhilin. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering (TKDE), 17(6):734–749, 2005.
[2] J. Andreoni. Impure altruism and donations to public goods: A theory of warm-glow giving. The Economic Journal, 100(401):464–477, 1990.
[3] A. Ashta and D. Assadi. Do social cause and social technology meet? impact of web 2.0 technologies on peer-to-peer lending transactions. Cahiers du CEREN, 29:177–192, 2009.
[4] R. Baeza-Yates and B. Ribeiro-Neto. Modern information retrieval. Addison-Wesley, 1999.
[5] N. J. Belkin and W. B. Croft. Information filtering and information retrieval: two sides of the same coin? Communications of the ACM, 35(12):29–38, 1992.
[6] L. Breiman. Classification and regression trees. CRC press, 1993.
[7] T. Bruett. Cows, kiva, and prosper.com: How disintermediation and the internet are changing microfinance. Community Development Investment Review, 3(2):44–50, 2007.
[8] P. A. Chew, B. W. Bader, T. G. Kolda, and A. Abdelali. Cross-language information retrieval using parafac2. In Proc. the 13th ACM international conference on Knowledge discovery and data mining (SIGKDD), pages 143–152, 2007.
[9] J. Choo, S. Bohn, G. Nakamura, A. White, and H. Park. Heterogeneous data fusion via space alignment using nonmetric multidimensional scaling. In Proc. the SIAM International Conference on Data Mining (SDM), pages 177–188, 2012.
[10] S. Deerwester, S. Dumais, G. Furnas, T. Landauer, and R. Harshman. Indexing by latent semantic analysis. Journal of the Society for Information Science, 41:391–407, 1990.
[11] F. Diaz, D. Metzler, and S. Amer-Yahia. Relevance and ranking in online dating systems. In Proc. the 33rd international ACM conference on Research and development in information retrieval (SIGIR), pages 66–73, 2010.
[12] T. Fawcett. Roc graphs: Notes and practical considerations for researchers. Machine Learning, 31:1–38, 2004.
[13] M. Flannery. Kiva and the birth of person-to-person microfinance. Innovations, 2(1-2):31–56, 2007.
[14] J. H. Friedman. Greedy function approximation: a gradient boosting machine. Annals of Statistics, pages 1189–1232, 2001.
[15] J. Galak, D. Small, and A. T. Stephen. Micro-finance decision making: A field study of prosocial lending. Journal of Marketing Research, 48(SPL):S130–S137, 2011.
[16] S. Hartley. Kiva.org: Crowd-sourced microfinance & cooperation in group lending. 2010.
[17] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2009.
[18] C. Jenq, J. Pan, and W. Theseira. What do donors discriminate on? evidence from kiva.org. 2012.
[19] H. Kim and H. Park. Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis. Bioinformatics, 23(12):1495–1502, 2007.
[20] S. Lafon, Y. Keller, and R. R. Coifman. Data fusion and multicue data matching by diffusion maps. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 28:1784–1797, 2006.
[21] D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401:788–791, 1999.
[22] J. Liu, C. Wang, J. Gao, and J. Han. Multi-view clustering via joint nonnegative matrix factorization. In Proc. the SIAM International Conference on Data Mining (SDM), pages 252–260, 2013.
[23] Y. Liu, R. Chen, Y. Chen, Q. Mei, and S. Salib. I loan because...: Understanding motivations for pro-social lending. In Proc. the 5th ACM International Conference on Web Search and Data Mining (WSDM), pages 503–512, 2012.
[24] C. D. Manning, P. Raghavan, and H. Schutze. Introduction to information retrieval, volume 1. Cambridge University Press Cambridge, 2008.
[25] R. Pan and M. Scholz. Mind the gaps: weighting the unknown in large-scale one-class collaborative filtering. In Proc. the 15th ACM international conference on Knowledge discovery and data mining (SIGKDD), pages 667–676, 2009.
[26] B. Sarwar, G. Karypis, J. Konstan, and J. Riedl. Analysis of recommendation algorithms for e-commerce. In Proc. the 2nd ACM conference on Electronic commerce, pages 158–167, 2000.
[27] A. P. Singh and G. J. Gordon. Relational learning via collective matrix factorization. In Proc. the 14th ACM international conference on Knowledge discovery and data mining (SIGKDD), pages 650–658, 2008.
[28] X. Su and T. M. Khoshgoftaar. A survey of collaborative filtering techniques. Advances in Artificial Intelligence, 2009:4:2–4:2, 2009.
[29] C. Wang and S. Mahadevan. Manifold alignment using procrustes analysis. In Proc. the 25th International Conference on Machine Learning (ICML), pages 1120–1127, 2008.
[30] W. Xu, X. Liu, and Y. Gong. Document clustering based on non-negative matrix factorization. In Proc. the 26th international ACM conference on Research and development in information retrieval (SIGIR), pages 267–273, 2003.
[31] S.-H. Yang, B. Long, A. J. Smola, H. Zha, and Z. Zheng. Collaborative competitive filtering: learning recommender using context of user choice. In Proc. the 34th international ACM conference on Research and development in Information Retrieval (SIGIR), pages 295–304, 2011.
[32] M. Yunus. Banker to the Poor. Penguin Books India, 1998.
[33] D. Zhou, S. Zhu, K. Yu, X. Song, B. L. Tseng, H. Zha, and C. L. Giles. Learning multiple graphs for document recommendations. In Proc. the 17th international conference on World Wide Web (WWW), pages 141–150, 20
