
Package 'FOCI' (March 19, 2021): Feature Ordering by Conditional Independence

elysha
Uploaded On 2021-07-07




Presentation Transcript

Package 'FOCI'
March 19, 2021

Type: Package
Title: Feature Ordering by Conditional Independence
Version: 0.1.3
Maintainer: Mona Azadkia <mona.azadkia@gmail.com>
Description: Feature Ordering by Conditional Independence (FOCI) is a variable selection algorithm based on the measure of conditional dependence. For more information, see the paper: Azadkia and Chatterjee (2019), "A simple measure of conditional dependence", arXiv:1910.12327.
License: GPL-3
Encoding: UTF-8
LazyData: true
VignetteBuilder: knitr
RoxygenNote: 7.1.0
Suggests: knitr, rmarkdown, testthat
Depends: R (>= 3.6.0), data.table
Imports: RANN, proxy, parallel, gmp
NeedsCompilation: no
Author: Mona Azadkia [aut, cre], Sourav Chatterjee [aut, ctb], Norman Matloff [aut, ctb]
Repository: CRAN
Date/Publication: 2021-03-18 23:00:07 UTC

R topics documented: codec, foci


codec    Estimate the conditional dependence coefficient (CODEC)

Description

The conditional dependence coefficient (CODEC) is a measure of the amount of conditional dependence between a random variable Y and a random vector Z given a random vector X, based on an i.i.d. sample of (Y, Z, X). The coefficient is asymptotically guaranteed to be between 0 and 1.

Usage

    codec(Y, Z, X = NULL, na.rm = TRUE)

Arguments

    Y        Vector (length n)
    Z        Matrix (n by q)
    X        Matrix (n by p), default is NULL
    na.rm    Remove NAs if TRUE

Details

The value returned by codec can be positive or negative. Asymptotically, it is guaranteed to be between 0 and 1. A small value indicates low conditional dependence between Y and Z given X, and a high value indicates strong conditional dependence. The codec function is used by the foci function for variable selection.

Value

The conditional dependence coefficient (CODEC) of Y and Z given X. If X == NULL, this is just a measure of the dependence between Y and Z.

Author(s)

Mona Azadkia, Sourav Chatterjee, Norman Matloff

References

Azadkia, M. and Chatterjee, S. (2019). A simple measure of conditional dependence. https://arxiv.org/pdf/1910.12327.pdf.

See Also

foci, xicor

Examples

    n = 1000
    x <- matrix(runif(n * 2), nrow = n)
    y <- (x[, 1] + x[, 2]) %% 1
    # given x[, 1], y is a function of x[, 2]
    codec(y, x[, 2], x[, 1])
    # y is a function of x
    codec(y, x)
    z <- rnorm(n)
    # y is a function of x given z
    codec(y, x, z)
    # y is independent of z given x
    codec(y, z, x)


foci    Variable selection by the FOCI algorithm

Description

FOCI is a variable selection algorithm based on the measure of conditional dependence codec.

Usage

    foci(Y, X, num_features = NULL, stop = TRUE, na.rm = TRUE,
         standardize = "scale", numCores = parallel::detectCores(),
         parPlat = "none", printIntermed = TRUE)

Arguments

    Y               Vector of responses (length n)
    X               Matrix of predictors (n by p)
    num_features    Number of variables to be selected; cannot be larger than p. The default value is NULL, in which case it is set equal to p. If stop == TRUE (see below), then num_features is irrelevant.
    stop            Stops at the first instance of negative codec, if TRUE.
    na.rm           Removes NAs if TRUE.
    standardize     Standardizes covariates if set equal to "scale" or "bounded"; otherwise the raw inputs are used. The default value is "scale", which normalizes each column of X to have mean zero and variance 1. If set equal to "bounded", the values of each column of X are mapped to [0, 1].
    numCores        Number of cores to be used for parallelizing the variable selection process.
    parPlat         Specifies the parallel platform used to chunk data by rows. It can take three values: (1) the default value 'none', in which case no row chunking is done; (2) the parallel cluster to be used for row chunking; (3) "locThreads", specifying that row chunking will be done via threads on the host machine.
    printIntermed   The default value is TRUE, in which case intermediate results from the cluster nodes are printed before final processing.

Details

FOCI is a forward stepwise algorithm that uses the conditional dependence coefficient (codec) at each step, instead of the multiple correlation coefficient as in ordinary forward stepwise. If stop == TRUE, the process is stopped at the first instance of nonpositive codec, thereby selecting a subset of variables. Otherwise, a set of covariates of size num_features, ordered according to predictive power (as measured by codec), is produced.

Parallel computation: The computation can be lengthy, so the package offers two kinds of parallel computation.

The first, controlled by the argument numCores, specifies the number of cores to be used on the host machine. If at a given step there are k candidate variables under consideration for inclusion, these k tasks are assigned to the various cores.

The second approach, controlled by the argument parPlat ("parallel platform"), involves the user first setting up a cluster via the parallel package. The data are divided into chunks by rows, with each cluster node applying FOCI to its data chunk. The union of the results is then formed and fed through FOCI one more time to adjust the discrepancies. The idea is that this last step will not be too lengthy, as the number of candidate variables has already been reduced. A cluster size of r may actually produce a speedup factor of more than r (Matloff 2016).

Potentially the best speedup is achieved by using the two approaches together.

The first approach cannot be used on Windows platforms, as parallel::mclapply has no effect there. Windows users should thus use the second approach only.

In addition to speed, the second approach is useful for diagnostics, as the results from the different chunks give the user an idea of the degree of sampling variability in the FOCI results.

In the second approach, a random permutation is applied to the rows of the dataset, as many datasets are sorted by one or more columns. Note that if a certain value of a feature is rare in the full dataset, it may be absent entirely in some chunk.

Value

An object of class "foci", with attributes selectedVar, showing the selected variables in decreasing order of (conditional) predictive power, and stepT, listing the 'codec' values. Typically the latter will begin to level off at some point, with additional marginal improvements being small.

Author(s)

Mona Azadkia, Sourav Chatterjee, and Norman Matloff

References

Azadkia, M. and Chatterjee, S. (2019). A simple measure of conditional dependence. https://arxiv.org/pdf/1910.12327.pdf.

Matloff, N. (2016). Software Alchemy: Turning Complex Statistical Computations into Embarrassingly-Parallel Ones. J. of Stat. Software.

See Also

codec, xicor

Examples

    # Example 1
    n = 1000
    p = 100
    x <- matrix(rnorm(n * p), nrow = n)
    colnames(x) = paste0(rep("x", p), seq(1, p))
    y <- x[, 1] * x[, 10] + x[, 20]^2
    # with num_features equal to 3 and stop equal to FALSE, foci will give a list of
    # three selected features
    result1 = foci(y, x, num_features = 3, stop = FALSE, numCores = 1)
    result1

    # Example 2
    # same example, but stop according to the stopping rule
    result2 = foci(y, x, numCores = 1)
    result2

    ## Not run:
    # Windows use of multicore
    library(parallel)
    cls <- makeCluster(parallel::detectCores())
    foci(y, x, parPlat = cls)
    # run on physical cluster (host names are passed as one character vector)
    cls <- makePSOCKcluster(c('machineA', 'machineB'))
    foci(y, x, parPlat = cls)
    ## End(Not run)
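The two documented values of the standardize argument can be illustrated with a small base-R sketch. This is not the package's internal code: the helper name standardize_columns is hypothetical, and the sketch only assumes what the argument description states ("scale" gives mean zero and variance 1 per column, "bounded" maps each column to [0, 1], anything else leaves the inputs raw).

```r
# Hypothetical helper (not part of FOCI): mimics the documented behaviour of
# the `standardize` options on a numeric matrix.
standardize_columns <- function(X, standardize = "scale") {
  X <- as.matrix(X)
  if (identical(standardize, "scale")) {
    # centre each column to mean 0 and rescale it to variance 1
    X <- scale(X, center = TRUE, scale = TRUE)
    attr(X, "scaled:center") <- NULL
    attr(X, "scaled:scale") <- NULL
  } else if (identical(standardize, "bounded")) {
    # map each column linearly onto [0, 1]
    X <- apply(X, 2, function(col) (col - min(col)) / (max(col) - min(col)))
  }
  # any other value: return the raw inputs unchanged
  X
}

set.seed(1)
X  <- matrix(rnorm(200, mean = 5, sd = 3), nrow = 50)
Xs <- standardize_columns(X, "scale")
Xb <- standardize_columns(X, "bounded")
Xr <- standardize_columns(X, "raw")
```

Constant columns are not handled in this sketch: both branches would divide by zero for a column with no variation.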
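The forward stepwise procedure described in the foci Details, including the stopping rule, can be sketched as a generic loop. This is a minimal illustration and not the package's implementation: forward_select and toy_score are hypothetical names, and toy_score is a swapped-in stand-in (the R-squared gain of ordinary forward stepwise minus a small penalty) used only so the sketch runs without FOCI installed. With the package available, a function with the documented codec(Y, Z, X) signature would presumably take its place.

```r
# Hypothetical sketch of a forward stepwise loop with a "stop at the first
# nonpositive score" rule. `score(y, z, x)` plays the role of codec(Y, Z, X)
# and must accept x = NULL for the first step.
forward_select <- function(y, x, score, num_features = ncol(x), stop = TRUE) {
  selected    <- integer(0)
  step_scores <- numeric(0)
  candidates  <- seq_len(ncol(x))
  while (length(selected) < num_features && length(candidates) > 0) {
    cond <- if (length(selected) > 0) x[, selected, drop = FALSE] else NULL
    # score every remaining candidate given the variables selected so far
    vals <- vapply(candidates,
                   function(j) score(y, x[, j, drop = FALSE], cond),
                   numeric(1))
    best <- which.max(vals)
    if (stop && vals[best] <= 0) break   # stopping rule
    selected    <- c(selected, candidates[best])
    step_scores <- c(step_scores, vals[best])
    candidates  <- candidates[-best]
  }
  list(selectedVar = selected, stepT = step_scores)
}

# Toy stand-in for codec (an assumption, NOT the CODEC statistic):
# gain in R-squared from adding z to x, minus a fixed penalty of 0.01.
toy_score <- function(y, z, x = NULL) {
  r2 <- function(m) if (is.null(m)) 0 else summary(lm(y ~ m))$r.squared
  r2(cbind(x, z)) - r2(x) - 0.01
}

set.seed(42)
n <- 200
x <- matrix(rnorm(n * 6), nrow = n)
y <- x[, 1] + 2 * x[, 2] + rnorm(n, sd = 0.1)

res_stop  <- forward_select(y, x, toy_score)                  # uses the stopping rule
res_fixed <- forward_select(y, x, toy_score,
                            num_features = 3, stop = FALSE)   # fixed-size ordering
```

In this toy setup only columns 1 and 2 carry signal, so the stopping rule should end the loop after those two are picked, while stop = FALSE returns an ordering of exactly num_features columns.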
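The row-chunking scheme described for parPlat (permute the rows, select within each chunk, pool the union, then run one final selection pass) can be sketched serially in base R. The names chunked_select and toy_select are hypothetical, lapply stands in for the cluster nodes, and the toy selector (an absolute-correlation threshold) is a placeholder for a FOCI call, not the FOCI algorithm itself.

```r
# Hypothetical serial sketch of the chunk / pool / re-select scheme.
# `select(y, x)` must return column indices relative to the matrix it is given.
chunked_select <- function(y, x, select, n_chunks = 4) {
  n <- nrow(x)
  perm     <- sample.int(n)                      # random row permutation
  chunk_id <- rep_len(seq_len(n_chunks), n)      # split permuted rows into chunks
  per_chunk <- lapply(seq_len(n_chunks), function(k) {
    rows <- perm[chunk_id == k]
    select(y[rows], x[rows, , drop = FALSE])     # selection on one chunk
  })
  pooled <- unique(unlist(per_chunk))            # union of per-chunk results
  if (length(pooled) == 0) {
    return(list(per_chunk = per_chunk, pooled = integer(0), final = integer(0)))
  }
  pooled <- sort(pooled)
  # final pass on the full data, restricted to the pooled candidates
  final <- pooled[select(y, x[, pooled, drop = FALSE])]
  list(per_chunk = per_chunk, pooled = pooled, final = final)
}

# Toy selector (an assumption, NOT FOCI): keep columns whose absolute
# correlation with y exceeds 0.3.
toy_select <- function(y, x) which(abs(as.vector(cor(x, y))) > 0.3)

set.seed(7)
n <- 400
x <- matrix(rnorm(n * 8), nrow = n)
y <- x[, 3] - x[, 5] + rnorm(n, sd = 0.5)
res <- chunked_select(y, x, toy_select, n_chunks = 4)
```

Comparing the entries of res$per_chunk gives a rough picture of how much the selection varies from chunk to chunk, which is the diagnostic use mentioned in the Details.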
