Journal of Machine Learning Research 10 (2009) 1755-1758    Submitted 10/08; Revised 4/09; Published 7/09
©2009 Davis E. King.

Dlib-ml: A Machine Learning Toolkit

Davis E. King    DAVISKING@USERS.SOURCEFORGE.NET
Northrop Grumman ES, ATR and Image Exploitation Group
Baltimore, Maryland, USA

Editor: Soeren Sonnenburg

Abstract

There are many excellent toolkits which provide support for developing machine learning software in Python, R, Matlab, and similar environments. Dlib-ml is an open source library, targeted at both engineers and research scientists, which aims to provide a similarly rich environment for developing machine learning software in the C++ language. Towards this end, dlib-ml contains an extensible linear algebra toolkit with built-in BLAS support. It also houses implementations of algorithms for performing inference in Bayesian networks and kernel-based methods for classification, regression, clustering, anomaly detection, and feature ranking. To enable easy use of these tools, the entire library has been developed with contract programming, which provides complete and precise documentation as well as powerful debugging tools.

Keywords: kernel-methods, svm, rvm, kernel clustering, C++, Bayesian networks

1. Introduction

Dlib-ml is a cross-platform open source software library written in the C++ programming language. Its design is heavily influenced by ideas from design by contract and component-based software engineering. This means it is first and foremost a collection of independent software components, each accompanied by extensive documentation and thorough debugging modes. Moreover, the library is intended to be useful in both research and real-world commercial projects and has been carefully designed to make it easy to integrate into a user's C++ application.

There are a number of well-known machine learning libraries. However, many of these libraries focus on providing a good environment for doing research using languages other than C++. Two examples of this kind of project are the Shogun (Sonnenburg et al., 2006) and Torch (Collobert and Bengio, 2001) toolkits which, while they are implemented in C++, are not focused on providing support for developing machine learning software in that language. Instead they are primarily intended to be used with languages like R, Python, Matlab, or Lua. Then there are toolkits such as Shark (Igel et al., 2008) and dlib-ml which are explicitly targeted at users who wish to develop software in C++. Given these considerations, dlib-ml attempts to help fill some of the gaps in tool support not already filled by libraries such as Shark. It is hoped that these efforts will prove useful for researchers and engineers who wish to develop machine learning software in this language.

2. Elements of the Library

The library is composed of the four distinct components shown in Figure 1. The linear algebra component provides a set of core functionality while the other three implement various useful tools. This paper addresses the two main components, linear algebra and machine learning tools.

[Figure 1: Elements of dlib-ml. Arrows show dependencies between components.]

2.1 Linear Algebra

The design of the linear algebra component of the library is based on the template expression techniques popularized by Veldhuizen and Ponnambalam (1996) in the Blitz++ numerical software. This technique allows an author to write simple Matlab-like expressions that, when compiled, execute with speed comparable to hand-optimized C code. The dlib-ml implementation extends this original design in a number of ways. Most notably, the library can use the BLAS when available, meaning that the performance of code developed using dlib-ml can gain the speed of highly optimized libraries such as ATLAS or the Intel MKL while still using a very simple syntax. Consider the following example involving matrix multiplies, transposes, and scalar multiplications:

    (1) result = 3*trans(A*B + trans(A)*2*B);
    (2) result = 3*trans(B)*trans(A) + 6*trans(B)*A;

The result of expression (1) could be computed using only two calls to the matrix multiply routine in BLAS, but first it is necessary to reorder the terms into form (2) to fit the form expected by the BLAS routines. Performing these transformations by hand is tedious and error-prone. Dlib-ml automatically performs these transformations on all expressions and invokes the appropriate BLAS calls. This enables the user to write equations in the form most intuitive to them and leave these details of software optimization to the library. This is a feature not found in the supporting tools of other C++ machine learning libraries.

2.2 Machine Learning Tools

A major design goal of this portion of the library is to provide a highly modular and simple architecture for dealing with kernel algorithms. In particular, each algorithm is parameterized to allow a user to supply either one of the predefined dlib-ml kernels, or a new user-defined kernel. Moreover, the implementations of the algorithms are totally separated from the data on which they operate. This makes the dlib-ml implementation generic enough to operate on any kind of data, be it column vectors, images, or some other form of structured data. All that is necessary is an appropriate kernel. This is a feature unique to dlib-ml. Many libraries allow arbitrary precomputed kernels and some even allow user-defined kernels but have interfaces which restrict them to operating on column vectors. However, none allow the flexibility to operate directly on arbitrary objects, making it much easier to apply custom kernels in the case where the kernels operate on objects other than fixed length vectors.

The library provides implementations of popular algorithms such as RBF networks and support vector machines for classification. It also includes algorithms not present in other major ML toolkits such as relevance vector machines for classification and regression (Tipping and Faul, 2003). All of these algorithms are implemented as generic trainer objects with a standard interface. This design allows trainer objects to be used by a number of generic meta-algorithms that do tasks such as performing cross validation, reducing the number of output support vectors (Suttorp and Igel, 2007), or fitting a sigmoid to the output decision function to make decisions interpretable in probabilistic terms (Platt, 1999). This generic trainer interface, along with the contract programming approach, makes the library easily extensible by other developers.

Another good example of a generic kernel algorithm provided by the library is the kernel RLS technique introduced by Engel et al. (2004). It is a kernelized version of the famous recursive least squares filter, and functions as an excellent online regression method. With it, Engel introduced a simple but very effective technique for producing sparse outputs from kernel learning algorithms.

Engel's sparsification technique is also used by one of dlib-ml's most versatile tools, the kcentroid object. It is a general utility for representing a weighted sum of sample points in a kernel-induced feature space. It can be used to easily kernelize any algorithm that requires only the ability to perform vector addition, subtraction, scalar multiplication, and inner products.

The kcentroid object enables the library to provide a number of useful kernel-based machine learning algorithms. The most straightforward of these is online anomaly detection, which simply marks data samples as novel if their distance from the centroid of a previously observed body of data is large (e.g., 3 standard deviations from the mean distance). A similarly simple but still powerful application is in feature ranking, where features are considered good if their inclusion results in a large distance between the centroids of different classes of data. Another straightforward application of this technique is in kernelized cluster analysis. Using the kcentroid it is easy to create sparse kernel clustering algorithms. To demonstrate this, the library comes with a sparse kernel k-means algorithm.

Finally, dlib-ml contains two SVM solvers. One is essentially a reimplementation of LIBSVM (Chang and Lin, 2001) but with the generic parameterized kernel approach used in the rest of the library. This solver has roughly the same CPU and memory utilization characteristics as LIBSVM. The other SVM solver is a kernelized version of the Pegasos algorithm introduced by Shalev-Shwartz et al. (2007). It is built using the kcentroid and thus produces sparse outputs.

3. Availability and Requirements

The library is released under the Boost Software License, allowing it to be incorporated into both open-source and commercial software. It requires no additional libraries, does not need to be configured or installed, and is frequently tested on MS Windows, Linux and Mac OS X but should work with any ISO C++ compliant compiler.
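
To make the centroid-distance idea from Section 2.2 concrete, the following is a minimal standard-C++ sketch and not dlib-ml's implementation or API. It relies on the identity ||phi(x) - c||^2 = k(x,x) - (2/n) sum_i k(x_i,x) + (1/n^2) sum_{i,j} k(x_i,x_j), so the distance between a sample and the centroid of previously seen samples needs only kernel evaluations. The names kernel_centroid, char_histogram_kernel, and is_novel are hypothetical, and the kernel is an illustrative choice that operates directly on strings rather than column vectors. Unlike the kcentroid object, this sketch keeps every sample instead of applying Engel's sparsification, so its cost grows with the number of samples.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical kernel on strings: the inner product of the two strings'
// character-count histograms. Being an explicit inner product of feature
// vectors, it is a valid (positive semi-definite) kernel.
struct char_histogram_kernel {
    double operator()(const std::string& a, const std::string& b) const {
        double count_a[256] = {0.0};
        double count_b[256] = {0.0};
        for (unsigned char ch : a) count_a[ch] += 1.0;
        for (unsigned char ch : b) count_b[ch] += 1.0;
        double sum = 0.0;
        for (int i = 0; i < 256; ++i) sum += count_a[i] * count_b[i];
        return sum;
    }
};

// Dense stand-in for a kernel centroid (assumption: no sparsification).
// Generic over the sample type; only the kernel ever touches the samples.
template <typename Sample, typename Kernel>
class kernel_centroid {
public:
    explicit kernel_centroid(Kernel k = Kernel()) : kernel_(k) {}

    // Add a sample and update the running sum of all pairwise kernel values.
    void add(const Sample& x) {
        double cross = 0.0;
        for (const Sample& s : samples_) cross += kernel_(s, x);
        gram_sum_ += 2.0 * cross + kernel_(x, x);
        samples_.push_back(x);
    }

    // Feature-space distance from x to the centroid of the added samples.
    double distance(const Sample& x) const {
        assert(!samples_.empty());  // precondition: at least one sample added
        const double n = static_cast<double>(samples_.size());
        double cross = 0.0;
        for (const Sample& s : samples_) cross += kernel_(s, x);
        const double d2 = kernel_(x, x) - 2.0 * cross / n + gram_sum_ / (n * n);
        return std::sqrt(std::max(0.0, d2));  // clamp rounding error below zero
    }

    // Flag x as novel when its distance exceeds the mean distance of the
    // stored samples by more than num_stddevs standard deviations.
    bool is_novel(const Sample& x, double num_stddevs = 3.0) const {
        assert(!samples_.empty());
        const double n = static_cast<double>(samples_.size());
        double sum = 0.0;
        double sum_sq = 0.0;
        for (const Sample& s : samples_) {
            const double d = distance(s);
            sum += d;
            sum_sq += d * d;
        }
        const double mean = sum / n;
        const double variance = std::max(0.0, sum_sq / n - mean * mean);
        return distance(x) > mean + num_stddevs * std::sqrt(variance);
    }

private:
    Kernel kernel_;
    std::vector<Sample> samples_;
    double gram_sum_ = 0.0;  // sum over all i, j of k(x_i, x_j)
};
```

After adding the strings "aa", "ab", and "bb" to a kernel_centroid<std::string, char_histogram_kernel>, the distance of "ab" is zero because its histogram coincides with the centroid, while a string that shares no characters with the samples, such as "zzzz", is flagged by is_novel. Because the class is templated on both the sample type and the kernel, moving to another kind of object requires only a different kernel functor, which is the separation of algorithm and data described in Section 2.2.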

Note that dlib-ml is a subset of a larger project named dlib hosted at http://dclib.sourceforge.net. Dlib is a general purpose software development library containing a graphical application for creating Bayesian networks as well as tools for handling threads, network I/O, and numerous other tasks. Dlib-ml is available from the dlib project's download page on SourceForge.

References

Chih-Chung Chang and Chih-Jen Lin. LIBSVM: A Library for Support Vector Machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm

Ronan Collobert and Samy Bengio. Svmtorch: support vector machines for large-scale regression problems. J. Mach. Learn. Res., 1:143–160, 2001. ISSN 1533-7928.

Yaakov Engel, Shie Mannor, and Ron Meir. Kernel recursive least squares. IEEE Transactions on Signal Processing, 52:2275–2285, 2004.

Christian Igel, Tobias Glasmachers, and Verena Heidrich-Meisner. Shark. Journal of Machine Learning Research, 9:993–996, 2008.

John C. Platt. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In Advances in Large Margin Classifiers, pages 61–74. MIT Press, 1999.

Shai Shalev-Shwartz, Yoram Singer, and Nathan Srebro. Pegasos: Primal estimated sub-gradient solver for svm. In ICML '07, pages 807–814, New York, NY, USA, 2007. ACM.

Sören Sonnenburg, Gunnar Rätsch, Christin Schäfer, and Bernhard Schölkopf. Large scale multiple kernel learning. J. Mach. Learn. Res., 7:1531–1565, 2006. ISSN 1533-7928.

Thorsten Suttorp and Christian Igel. Resilient approximation of kernel classifiers. Volume 4668 of Lecture Notes in Computer Science, pages 139–148. Springer, 2007.

Michael E. Tipping and Anita C. Faul. Fast marginal likelihood maximisation for sparse Bayesian models. In Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics, pages 3–6, 2003.

Todd Veldhuizen and Kumaraswamy Ponnambalam. Linear algebra with C++ template metaprograms. Dr. Dobb's Journal of Software Tools, 21(8):38–44, 1996.
