Uploaded On 2015-08-26

PowerA with Arith instead of just PowerA, making all members of Arith accessible within PowerA as well. This is all we need to change.

The main characteristics of lightweight modular staging can be summarized as follows:

• binding-times are distinguished only by types; no special syntactic annotations are required
• given a sufficiently expressive language, the whole framework can be implemented as a library (hence lightweight)
• staged code is “very shallowly” embedded into the program generator; staged expressions inherit the static scope of the generator and if the generator is well-typed so is the generated code
• staged code fragments are composed through explicit operations, in particular lifted variants of the usual operators and control flow statements extended with optimizing symbolic rewritings
• using component technology, operations on staged expressions, data types to represent them, and optimizations (both generic and domain-specific) can be extended and composed in a flexible way (hence modular)
• likewise, different code generation targets can be supported (heterogeneous staging); their implementations can share common code
• in the homogeneous case, objects that are live within the generator's heap can be accessed from generated code (cross-stage persistence)
• data types representing staged expressions can be hidden from client code (making rewrites safe that preserve only semantic equality) but exposed to modules that implement the rewriting
• common subexpression elimination/value numbering is handled globally within the framework; there is no danger of code duplication
• “unstaging”, i.e. compilation and loading of staged functions, is an explicit operation, independent of running the compiled code; program generators have full control over when compilation happens and how compiled code is re-used

Many of the listed points are found in other code generation approaches as well, but to the best of our knowledge, no existing system combines them all. We believe that this combination occupies a “sweet spot” in the design space (see Section 4 for a detailed comparison with related work), most prominently by significantly reducing the effort required to go from a naively implemented algorithm to an optimizing program generator.

Lightweight modular staging provides many of the benefits of using a dedicated
multi-stage programming language [38] such as MetaOCaml, in particular concerning well-formedness and type safety, but goes beyond that in systematically preventing code duplication and providing a clean interface for incorporating generic and customized optimizations.

LMS is a key technique in our work to develop high-performance parallelizable DSLs. In previous work [5], we defined criteria for what we call language virtualization, saying that a general-purpose language is virtualizable iff it can provide an environment to embedded languages that makes them essentially identical to corresponding stand-alone language implementations in terms of expressiveness (being able to express a DSL in a way which is natural to domain specialists), performance (leveraging domain knowledge to produce optimal code), and safety (domain programs are guaranteed to have certain properties implied by the DSL), while at the same time requiring only modestly more development effort than implementing a simple, pure-library embedding.¹

One ingredient of LMS is a finally tagless [4] or polymorphic [20] language embedding, which ensures expressiveness and safety. Hofer et al. [20] show that a polymorphic embedding can be constructed from a pure embedding [21] with acceptable effort. LMS offers a systematic way to also obtain performance (by means of its optimization interface) while keeping the effort under control (by enabling modular composition, re-use and extension of DSL building blocks, including optimizations). The novel aspect is that despite the component architecture, LMS uses a uniform (but extensible) language representation for all DSL components instead of offering a choice of representations between which translations or layerings would need to be defined. This is achieved by solving the resulting “expression problem” [43] of independently adding data type variants and operations via an encoding of multi-methods (open generic functions) into a combination of mixin-composition and pattern matching.

1.1 Organization

We present lightweight modular staging using Scala as the host language. While we use a number of Scala's advanced features extensively (operator overloading, implicits, abstract types and type constructors, pattern matching, mixin-composition), LMS is not inherently tied to Scala and
could be implemented in other expressive languages as well. Features that Scala lacks but other languages provide (e.g. built-in multi-methods or transparent creation of forwarder objects) could even simplify the implementation.

The rest of this paper is structured as follows: Section 2 describes the basic LMS setup in detail for a subset of language features. Section 3 outlines how more features can be added. Section 4 discusses related work. Section 5 concludes.

2. Lightweight Modular Staging

In the same way as the power function shown in the introduction, we can stage far more interesting and practically relevant programs, such as the fast Fourier transform (FFT). A staged FFT, implemented in MetaOCaml, has been presented by Kiselyov et al. [26]. Their work is a very good showcase for how staging makes it possible to transform a simple, unoptimized algorithm into an efficient program generator. Achieving this in the context of MetaOCaml, however, required restructuring the program into monadic style and adding a front-end layer for performing symbolic rewritings. Using our approach of just adding Rep types, we can go from the naive textbook algorithm to the staged version (shown in Figure 1) by changing literally two lines of code:

    trait FFT { this: Arith with Trig =>
      case class Complex(re: Rep[Double], im: Rep[Double])
      ...
    }

All that is needed is adding the self-type annotation to import arithmetic and trigonometric operations and changing the type of the real and imaginary components of complex numbers from Double to Rep[Double].

Merely changing the types will not provide us with the desired optimizations yet. We will see below how we can add the transformations described by Kiselyov et al. to generate the same fixed-size FFT code, corresponding to the famous FFT butterfly networks (see Figure 2). Despite the seemingly naive algorithm, this staged code is free of branches, intermediate data structures and redundant computations. The important point here is that we can add these trans-

¹ A virtualizable language is also a universal language according to the definition of Veldhuizen [42], but virtualization adds the effort criterion.
    trait PowerA { this: Arith =>
      def power(b: Rep[Double], x: Int): Rep[Double] =
        if (x == 0) 1.0 else b * power(b, x - 1)
    }
    new PowerA with ArithStr {
      println {
        power("(x0+x1)", 4)
      }
    }
    // result: ((x0+x1)*((x0+x1)*((x0+x1)*((x0+x1)*1.0))))

    trait PowerB { this: Arith =>
      def power(b: Rep[Double], x: Int): Rep[Double] =
        if (x == 0) 1.0
        else if ((x & 1) == 0) { val y = power(b, x / 2); y * y }
        else b * power(b, x - 1)
    }
    new PowerB with ArithStr {
      println {
        power("(x0+x1)", 4)
      }
    }
    // result: ((((x0+x1)*1.0)*((x0+x1)*1.0))*(((x0+x1)*1.0)*((x0+x1)*1.0)))

Figure 5. Two algorithms to implement the power function. Using strings as code representation results in code duplication and undoes the improvement obtained by re-using intermediate results.

    new PowerA with ExportGraph with ArithExpOpt {
      exportGraph { power(fresh[Double] + fresh[Double], 4) }
    }

    new PowerB with ExportGraph with ArithExpOpt {
      exportGraph { power(fresh[Double] + fresh[Double], 4) }
    }

    trait PowerA2 extends PowerA { this: Compile =>
      val p4 = compile { x: Rep[Double] =>
        power(x + x, 4)
      }
      // use compiled function p4 ...
    }
    new PowerA2 with CompileScala with ArithExpOpt with ScalaGenArith
    // generated code:
    class Anon$1 extends ((Double) => (Double)) {
      def apply(x0: Double): Double = {
        val x1 = x0 + x0
        val x2 = x1 * x1
        val x3 = x1 * x2
        val x4 = x1 * x3
        x4
      }
    }

    trait PowerB2 extends PowerB { this: Compile =>
      val p4 = compile { x: Rep[Double] =>
        power(x + x, 4)
      }
      // use compiled function p4 ...
    }
    new PowerB2 with CompileScala with ArithExpOpt with ScalaGenArith
    // generated code:
    class Anon$2 extends ((Double) => (Double)) {
      def apply(x0: Double): Double = {
        val x1 = x0 + x0
        val x2 = x1 * x1
        val x3 = x2 * x2
        x3
      }
    }

Figure 6. Using expression trees instead of strings and adding symbolic rewritings removes the *1.0 operations, prevents code duplication and mirrors algorithmic improvement in generated code. Code to output graph (top), code to generate and load Scala code (bottom).

[Figure 7 diagram. Columns: Interface, Core Implementation, Optimizations, Specific Opts, Scala Code generation. Units: Expressions; Base, BaseExp, ScalaGenBase, Compile, CompileScala; Arithmetic (Arith, ArithExp, ArithExpOpt, ArithExpOptFFT, ScalaGenArith); Trigonometry (Trig, TrigExp, TrigExpOpt, ScalaGenTrig).]

Figure 7. Component architecture. Arrows denote extends relationships, dashed boxes represent units of functionality.
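The string-based staging behind Figure 5 can be reproduced in a few lines of plain Scala. The sketch below is an illustration under stated assumptions, not the paper's implementation: the interface traits Base and Arith are reconstructed here in reduced form (multiplication only), an implicit class DoubleOps stands in for the infix_* operator hook used in the figures, the literal 1.0 is lifted with an explicit unit call, and the names StrA, StrB and StringDemo are hypothetical.

```scala
// Interface: a staged value of type T is represented by the abstract type Rep[T].
trait Base {
  type Rep[+T]
}

// Reduced arithmetic interface (assumption: only what power needs).
trait Arith extends Base {
  def unit(x: Double): Rep[Double]
  def times(x: Rep[Double], y: Rep[Double]): Rep[Double]

  // Stand-in for the infix_* hook: lets staged code write b * c.
  implicit class DoubleOps(x: Rep[Double]) {
    def *(y: Rep[Double]): Rep[Double] = times(x, y)
  }
}

// String representation: staged code is just program text.
trait ArithStr extends Arith {
  type Rep[+T] = String
  def unit(x: Double): Rep[Double] = x.toString
  def times(x: Rep[Double], y: Rep[Double]): Rep[Double] = "(" + x + "*" + y + ")"
}

// Linear algorithm, as in Figure 5 (top).
trait PowerA { this: Arith =>
  def power(b: Rep[Double], x: Int): Rep[Double] =
    if (x == 0) unit(1.0) else b * power(b, x - 1)
}

// Repeated squaring, as in Figure 5 (bottom).
trait PowerB { this: Arith =>
  def power(b: Rep[Double], x: Int): Rep[Double] =
    if (x == 0) unit(1.0)
    else if ((x & 1) == 0) { val y = power(b, x / 2); y * y }
    else b * power(b, x - 1)
}

// Hypothetical named mixins so the results can be inspected from outside.
class StrA extends PowerA with ArithStr
class StrB extends PowerB with ArithStr

object StringDemo {
  val codeA: String = new StrA().power("(x0+x1)", 4)
  val codeB: String = new StrB().power("(x0+x1)", 4)

  def main(args: Array[String]): Unit = {
    println(codeA) // ((x0+x1)*((x0+x1)*((x0+x1)*((x0+x1)*1.0))))
    println(codeB) // ((((x0+x1)*1.0)*((x0+x1)*1.0))*(((x0+x1)*1.0)*((x0+x1)*1.0)))
  }
}
```

Counting the multiplications in the two strings (4 for PowerA, 7 for PowerB) shows the effect the caption of Figure 5 describes: binding y to a string and writing y * y pastes the text of y twice, so the sharing expressed in the generator is lost in the generated code.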
make matters worse. The repeated-squaring power algorithm in PowerB, which normally reduces the overall number of multiplications to O(log x), generates less efficient code than the linear algorithm in PowerA. Even if the target compiler would remove the trivial *1.0 operations, the seemingly clever algorithm would not have had a positive effect. The problem is that instead of re-using the results of intermediate computations, the computations themselves are duplicated. This effect of “undoing” value binding and memoization is characteristic of all inherently syntactic staging approaches and has been studied in the context of MSP languages at length [12, 36].

Moreover, there is no evident way of implementing more elaborate optimizations that need to analyze staged expressions in a semantic manner.

2.2 Representing Staged Code: as Graphs (good)

Instead of strings we choose a representation based on expression trees, or, more exactly, a “sea of nodes” [7] representation that is in fact a directed (and for the moment, acyclic) graph but can be accessed through a tree-like interface. The necessary infrastructure is defined in trait Expressions, shown in Figure 8.

There are two categories of objects involved: expressions, which are atomic (subclasses of Exp: constants and symbols), and definitions, which represent composite operations (subclasses of Def, to be provided by other components). There is also a “gensym” operator fresh that creates fresh symbols.

The guiding principle is that each definition has an associated symbol and refers to other definitions only via their symbols. In effect, this means that every composite value will be named, similar to administrative normal form (ANF) [17]. Trait Expressions provides methods to find a definition given a symbol or vice versa. The extractor object [14] Def allows pattern-matching on the definition of a given symbol, a facility that is used for implementing rewritings (see below).

Through the implicit conversion method toAtom, a definition can be used anywhere an atomic expression is expected. Doing so will search the already encountered definitions, which are kept in an internal table (omitted in Figure 8), for a structurally equivalent one. If a matching previous definition is found, its symbol will be returned. Otherwise the definition is seen for the first time. It will be associated with a fresh symbol and saved for future reference. This simple scheme provides a powerful global value numbering (common subexpression elimination) optimization that effectively prevents generating duplicate code (this is safe since expressions do not have side effects so far; see Section 3). Since all operations in interface traits such as Arith return Rep types, defining Rep[T] = Exp[T] in trait BaseExp (see Figure 9) means that conversion to symbols will take place already within those methods, making sure that the created definitions are actually registered.

We observe that there are no concrete definition classes provided by trait Expressions. Providing meaningful data types is the responsibility of other traits that implement the interfaces defined previously (Base and its descendants). For each interface trait, there is one corresponding core implementation trait. Shown in Figure 9, we have traits BaseExp, ArithExp and TrigExp for the functionality required by the FFT example. Trait BaseExp installs atomic expressions as the representation of staged values by defining Rep[T] = Exp[T]. Traits ArithExp and TrigExp define one definition class for each operation defined by Arith and Trig, respectively, and implement the corresponding interface methods to create instances of those classes.

2.3 Implementing Optimizations

Some profitable optimizations, such as the global value numbering described above, are very generic. Other optimizations apply only

    trait Expressions {
      // expressions (atomic)
      abstract class Exp[+T]
      case class Const[T](x: T) extends Exp[T]
      case class Sym[T](n: Int) extends Exp[T]
      def fresh[T]: Sym[T]

      // definitions (composite, subclasses provided
      // by other traits)
      abstract class Def[T]
      def findDefinition[T](s: Sym[T]): Option[Def[T]]
      def findDefinition[T](d: Def[T]): Option[Sym[T]]
      def findOrCreateDefinition[T](d: Def[T]): Sym[T]

      // bind definitions to symbols automatically
      implicit def toAtom[T](d: Def[T]): Exp[T] =
        findOrCreateDefinition(d)

      // pattern match on definition of a given symbol
      object Def {
        def unapply[T](s: Sym[T]): Option[Def[T]] =
          findDefinition(s)
      }
    }
Figure 8. Expression representation (method implementations omitted).

    trait BaseExp extends Base with Expressions {
      type Rep[+T] = Exp[T]
    }
    trait ArithExp extends Arith with BaseExp {
      implicit def unit(x: Double) = Const(x)
      case class Plus(x: Exp[Double], y: Exp[Double]) extends Def[Double]
      case class Times(x: Exp[Double], y: Exp[Double]) extends Def[Double]
      def infix_+(x: Exp[Double], y: Exp[Double]) = Plus(x, y)
      def infix_*(x: Exp[Double], y: Exp[Double]) = Times(x, y)
    }
    trait TrigExp extends Trig with BaseExp {
      case class Sin(x: Exp[Double]) extends Def[Double]
      case class Cos(x: Exp[Double]) extends Def[Double]
      def sin(x: Exp[Double]) = Sin(x)
      def cos(x: Exp[Double]) = Cos(x)
    }

Figure 9. Implementing the interface traits from Figure 3 using the expression types from Figure 8.

    trait ArithExpOpt extends ArithExp {
      override def infix_*(x: Exp[Double], y: Exp[Double]) = (x, y) match {
        case (Const(x), Const(y)) => Const(x * y)  // constant folding
        case (x, Const(1.0)) => x                  // x * 1 = x
        case (Const(1.0), y) => y                  // 1 * y = y
        case _ => super.infix_*(x, y)
      }
    }
    trait ArithExpOptFFT extends ArithExp {
      override def infix_*(x: Exp[Double], y: Exp[Double]) = (x, y) match {
        // move constant factors outward
        case (x, Def(Times(Const(k), y))) => Const(k) * (x * y)
        case (Def(Times(Const(k), x)), y) => Const(k) * (x * y)
        ...
        // keep constants on the left
        case (x, Const(y)) => Times(Const(y), x)
        case _ => super.infix_*(x, y)
      }
    }

Figure 10. Extending the implementation from Figure 9 with generic (top) and specific (bottom) optimizations (analog of TrigExp omitted).
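The interplay of value numbering (Figure 8) and rewriting (Figure 10) can be tried out with a small self-contained sketch. It is written under several assumptions and is not the paper's code: the internal table that Figure 8 omits is modeled as a simple buffer of (symbol, definition) pairs, Sym and Def are made covariant so that one table can hold all of them, the interface layer (Base, Arith, Rep) is skipped so that staged code works on Exp directly, an implicit class DoubleOps replaces the infix_ methods, and the names table, definitions, Power, Staged and GraphDemo are hypothetical.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.language.implicitConversions

trait Expressions {
  // Atomic expressions: constants and symbols.
  abstract class Exp[+T]
  case class Const[T](x: T) extends Exp[T]
  case class Sym[+T](n: Int) extends Exp[T]
  // Composite operations (covariant here, an assumption of this sketch).
  abstract class Def[+T]

  // Hypothetical internal table: definitions in creation order.
  private val table = ArrayBuffer.empty[(Sym[Any], Def[Any])]
  private var nextId = 0

  def definitions: List[(Sym[Any], Def[Any])] = table.toList

  def fresh[T]: Sym[T] = { val s = Sym[T](nextId); nextId += 1; s }

  // Value numbering: a structurally equal definition yields the same symbol.
  def findOrCreateDefinition[T](d: Def[T]): Sym[T] =
    table.find(_._2 == d) match {
      case Some((s, _)) => s.asInstanceOf[Sym[T]]
      case None =>
        val s = fresh[T]
        table += ((s, d))
        s
    }

  // Bind definitions to symbols automatically.
  implicit def toAtom[T](d: Def[T]): Exp[T] = findOrCreateDefinition(d)
}

trait ArithExp extends Expressions {
  case class Plus(x: Exp[Double], y: Exp[Double]) extends Def[Double]
  case class Times(x: Exp[Double], y: Exp[Double]) extends Def[Double]

  def plus(x: Exp[Double], y: Exp[Double]): Exp[Double] = Plus(x, y)
  def times(x: Exp[Double], y: Exp[Double]): Exp[Double] = Times(x, y)

  implicit class DoubleOps(x: Exp[Double]) {
    def +(y: Exp[Double]): Exp[Double] = plus(x, y)
    def *(y: Exp[Double]): Exp[Double] = times(x, y)
  }
}

// Generic rewrites in the spirit of Figure 10 (top).
trait ArithExpOpt extends ArithExp {
  override def times(x: Exp[Double], y: Exp[Double]): Exp[Double] = (x, y) match {
    case (Const(a), Const(b)) => Const(a * b)
    case (a, Const(1.0))      => a
    case (Const(1.0), b)      => b
    case _                    => super.times(x, y)
  }
}

trait Power { this: ArithExp =>
  def powerA(b: Exp[Double], x: Int): Exp[Double] =
    if (x == 0) Const(1.0) else b * powerA(b, x - 1)

  def powerB(b: Exp[Double], x: Int): Exp[Double] =
    if (x == 0) Const(1.0)
    else if ((x & 1) == 0) { val y = powerB(b, x / 2); y * y }
    else b * powerB(b, x - 1)
}

class Staged extends Power with ArithExpOpt

object GraphDemo {
  val a = new Staged
  val xa = a.fresh[Double]
  val resultA = a.powerA(a.plus(xa, xa), 4) // linear algorithm

  val b = new Staged
  val xb = b.fresh[Double]
  val resultB = b.powerB(b.plus(xb, xb), 4) // repeated squaring

  val c = new Staged
  val xc = c.fresh[Double]
  val cseHolds = c.plus(xc, xc) == c.plus(xc, xc) // same symbol both times

  def main(args: Array[String]): Unit = {
    a.definitions.foreach(println) // 4 definitions: one Plus, three Times
    b.definitions.foreach(println) // 3 definitions: one Plus, two Times
  }
}
```

In this sketch the linear algorithm registers four definitions and repeated squaring three, which corresponds to the bodies of Anon$1 and Anon$2 in Figure 6: the *1.0 multiplications are rewritten away before a definition is created, and y * y refers to the symbol bound to y instead of copying its computation.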
    trait FFTC extends FFT { this: Arrays with Compile =>
      def fftc(size: Int) = compile { input: Rep[Array[Double]] =>
        assert(<size is power of 2>) // happens at staging time
        val arg = Array.tabulate(size) { i =>
          Complex(input(2*i), input(2*i+1))
        }
        val res = fft(arg)
        updateArray(input, res.flatMap {
          case Complex(re, im) => Array(re, im)
        })
      }
    }

Figure 13. Extending the FFT component from Figure 1 with explicit compilation.

Rep[Double]. When applying compile, however, we will receive input of type Rep[Array[Double]], assuming we want to generate functions that operate on arrays of Double (with the complex numbers flattened into adjacent slots). Thus, we will extend trait FFT to FFTC (see Figure 13), importing support for staged arrays and Compile. The implementation of staged arrays is straightforward and omitted for brevity.

We can define code that uses compiled FFT “codelets” by embedding it in a subtrait of FFTC:

    trait TestFFTC extends FFTC {
      val fft4: Array[Double] => Array[Double] = fftc(4)
      val fft8: Array[Double] => Array[Double] = fftc(8)
      // embedded code using fft4, fft8, ...
    }

Constructing an instance of this subtrait (mixed in with the appropriate LMS traits) will execute the embedded code:

    val OP: TestFFTC = new TestFFTC with CompileScala
      with ArithExpOpt with ArithExpOptFFT with ScalaGenArith
      with TrigExpOpt with ScalaGenTrig
      with ArraysExpOpt with ScalaGenArrays

We can also use the compiled methods from outside the object:

    OP.fft4(Array(1.0, 0.0, 1.0, 0.0, 2.0, 0.0, 2.0, 0.0))
    ↪ Array(6.0, 0.0, -1.0, 1.0, 0.0, 0.0, -1.0, -1.0)

Providing an explicit type in the definition val OP: TestFFTC = ... ensures that the internal representation is not accessible from the outside, only the members defined by TestFFTC.

3. Adding More Features

Up to now we have been working with a very simple language at the staged level. Prominent missing features are side effects, control flow (conditionals, loops) and function definitions. There is not sufficient space to explain their implementations in full detail. Large parts are standard compiler technology and orthogonal to the choice between LMS and a stand-alone compiler. We will visit only the main points in this section to give an overall idea of how implementations can be approached.

3.1 Side Effects and Control Flow

In Section 2, all staged code was pure. Many practical programs, however, need to incur side-effects, especially if the goal of staging is improved performance. We can extend the previous model to include effectful computations in a relatively simple way. The basic idea is to make all effects explicit and include effect-dependencies in the graph-based representation besides the data dependencies. We will maintain a current state in a mutable fashion, taking the view that state is an abstraction of an effect history. How this abstraction is actually defined can be controlled by mixing in a suitable trait. In the simplest case, the current state is a list of previous effects.

    trait Parsers { this: Matching =>
      type Input = List[Char]
      abstract class Parser extends (Input => Result) {
        def ~(p: => Parser) = new Parser { // sequence
          def apply(in: Rep[Input]) = this(in) switch {
            case SuccessR(rest) => p(rest)
          } orElse {
            case _ => FailureR()
          } end
        }
        def |(p: => Parser): Parser = ... // alternative
      }
      implicit def acceptChar(c: Char): Parser = ...
      implicit def acceptString(s: String) =
        s.map(acceptChar).reduceLeft(_ ~ _)
    }
    trait TestParsers extends Parsers {
      val phraseA = "scala" ~ ' ' ~ "rules"
      val phraseB = "scala" ~ ' ' ~ "rocks"
      val main = phraseA | phraseB
    }

Figure 14. Staged parser combinators. Matching alternatives with a common prefix.

A suitable programming model is suggested by the notion of monadic reflection and reification [15, 16]. An effectful operation
needs to be reflected at the point where its effect should occur. Reflection amounts to updating the current state in a mutable fashion to include the new effect. Not surprisingly, an effect can thus be seen as defining a state transition. How exactly this transition works is again customizable. Optimizing rewritings on the effect level can be implemented in the same manner as on the value level.

Figure 15. Resulting computations. Generic value numbering optimization (disabled on the left) prevents unnecessary backtracking.

The counterpart of reflection is reification. Reifying the effects of a block of code amounts to executing the code with an empty current state, and returning a representation of the result value together with the resulting state, e.g.:

    def print(x: Exp[String]): Exp[Unit] = reflect(Print(x))
    reify {
      print("A")
      print("B")
      3 + 4
    }
    ↪ Reified(Const(7), List(Print(Const("A")), Print(Const("B"))))

Control flow can also be described in terms of effects (after all, a jump instruction modifies the program counter). To implement conditionals, we can use the notion of an abort effect, possibly incurred by the operation Test. A conditional expression if (c) a else b will be represented as:

    trait Functions extends Base {
      def lambda[A,B](f: Rep[A] => Rep[B]): Rep[A => B]
      def app[A,B](f: Rep[A => B], x: Rep[A]): Rep[B]
    }
    trait FunctionsExp extends Functions with BaseExp {
      case class Lambda[A,B](f: Exp[A] => Exp[B]) extends Def[A => B]
      case class Apply[A,B](f: Exp[A => B], x: Exp[A]) extends Def[B]
      def lambda[A,B](f: Exp[A] => Exp[B]) = Lambda(f)
      def app[A,B](f: Exp[A => B], x: Exp[A]) = Apply(f, x)
    }
Figure 16. Representing λ-abstractions as Scala function values (higher-order abstract syntax).

    reflect(OrElse(reify { reflect(Test(c)); a }, reify(b)))

The OrElse operation is similar to φ-nodes in customary SSA representations but captures the priority of the then-part.

This notion of representing conditionals extends naturally to more complicated structures such as pattern matching. An interesting aspect is that effect nodes are subject to the same value numbering optimization as data nodes. An example, which we present without going into the details, is a staged implementation of parser combinators (see Figure 14). Using similar combinators in their unstaged form can be very expensive because of unnecessary backtracking. In the example, the grammar consists of two alternatives that share a common prefix. Looking at the computation graph of the staged program (see Figure 15), we observe that backtracking is automatically removed in this case.

Code generation for code including conditionals and side effects is more involved than what is shown in Section 2. The graph representation no longer corresponds to ANF since conditionals can appear conceptually “within” other expressions. This is not a problem if the code generation target language is expressive enough. For targeting simpler languages however, more work needs to be done. It should be fairly straightforward, though by no means trivial, to extract a customary control flow graph (CFG) from the representation described above (this has not been implemented yet). With a CFG and a separation into flat basic blocks at hand, almost any target should be feasible.

3.2 Functions and Recursion

Basic support for staged function definitions and function applications can be defined in terms of a simple higher-order abstract syntax (HOAS) [31] representation, similar to those of Carette et al. [4] and Hofer et al. [20] (see Figure 16). Alternatively, if we are interested mainly in first-order functions (which is often the case, since one goal of staging is to translate away the abstraction offered by higher-order functions at the meta-program level), we can hide function definitions inside the representations of conditionals or pattern matching. In the pattern matching interface described above, pattern alternatives are reified as instances of PartialFunction, which is a subclass of function values. One avenue is to stage these pattern alternatives. The heuristic here is that user-defined functions will do some form of matching on their arguments anyways. If staged functions are implemented that way, lambda and app do not leak into client code. An example is the staged factorial function in Figure 17.

Whether we use lambda and app directly or not, the HOAS representation has the disadvantage of being opaque: there is no immediate way to “look into” a Scala function object. If we want to analyze functions in the same way as other program constructs, we need a way to transform the HOAS encoding into our flat graph representation. For a HOAS term Lambda(f), we can call f(fresh[A]) to “unfold” the function definition. The result is a symbol that represents the entire computation defined by the function.

    trait Fac { this: Matching =>
      def fac(n: Rep[Int]): Rep[Int] = n switch {
        case n if n guard 0 => 1
      } orElse {
        case n => n * fac(n - 1)
      }
    }

Figure 17. Staged factorial function (top). Computation unfolded once (bottom).

But too eagerly expanding function definitions is problematic. For recursive functions, the result would be infinite, i.e. the computation will not terminate. What we would like to do is detect recursion and generate a finite representation that makes the recursive call explicit. However this is difficult because recursion might be very indirect:

    def foo(x: Rep[Int]) = {
      val f = (x: Rep[Int]) => foo(x + 1)
      app(lambda(f), x)
    }

Each incarnation of foo creates a new function f; unfolding will thus create unboundedly many different function objects.

To detect cycles, we have to compare those functions. This, of course, is undecidable in the general case of taking equality to be defined extensionally, i.e. saying that two functions are equal if they map equal inputs to equal outputs. The standard reference equality, on the other hand, is too weak for our purpose:

    def adder(x: Int) = (y: Int) => x + y
    adder(3) == adder(3)
    ↪ false

However, we can approximate extensional equality by intensional (i.e. structural) equality, which in most cases turns out to be sufficient. Testing intensional equality amounts to checking if two functions are defined at the same syntactic location in the source program and whether all data referenced by their free variables is equal. Fortunately, the implementation of first-class functions as closure objects offers (at least
in principle) access to a “defunctionalized” [33] data type representation on which equality can easily be checked. A bit of care must be taken though, because the structure can be cyclic. On the JVM there is a particularly neat trick. We can serialize the function objects into a byte array and compare the serialized representations:

    serialize(adder(3)) == serialize(adder(3))
    ↪ true

With this method of testing equality, we can implement controlled unfolding. The result of unfolding the factorial function once (at the definition site) is again shown in Figure 17 (bottom).

3.3 Cross-Stage Persistence

Cross-stage persistence (CSP) means making objects that are live at the generator stage available to the generated program [38]. In general, this is only applicable if the generated code is to be loaded and executed while the previous stage is still available, and if the code generation target allows calling back into the generator. This would not be the case for, say, OpenCL GPU code produced from a generator written in Scala, but homogeneous setups are fine. Restricted forms of heterogeneous CSP are also feasible, e.g. for immutable data that has a corresponding target-language counterpart.

In terms of implementation, general CSP can be achieved by generalizing the implicit unit method from trait Arith (see Figure 3) to lift arbitrary values into the staged representation instead of just Doubles:

    implicit def unit[T](x: T): Rep[T]

For all lifted objects that are not primitives, we can then create a corresponding definition node of class External:

    case class External[T](x: T) extends Def[T]

Primitives retain their representation as objects of class Const. During code generation, we map each External node to a field in the generated Scala class. When instantiating the code object, these fields are initialized with the corresponding external references. This approach is similar to a classic closure conversion.

The Scala implementation does not currently provide an automatic lifting of all operations for a given type of object. Operations must be “white-listed” by providing staged versions explicitly, which can be tedious if there is no pre-fabricated component that can be readily mixed in. On the other hand this implies that programmers have full control over what operations are available to staged programs.

4. Related Work

Static meta-programming approaches include C++ templates [39] and Template Haskell [34]. Building on C++ templates, customizable generation approaches are possible through Expression Templates [40], e.g. used by Blitz++ [41]. An example of runtime code generation in C++ is the TaskGraph framework [1]. Active libraries were introduced by Veldhuizen [42], telescoping languages by [25]. Specific toolkits using domain-specific code generation and optimization include FFTW [18], SPIRAL [32] and ATLAS [45].

This paper draws a lot of inspiration from the work of Kiselyov et al. [26] on a staged FFT implementation. Performing symbolic rewritings by defining operators on lifted expressions and performing common subexpression elimination on the fly is also central to their approach. LMS takes these ideas one step further by making them a central part of the staging framework itself.

Immediately related work on embedding typed languages includes that of Carette et al. [4] and Hofer et al. [20]. Chafi et al. [5] describe how LMS is used in the development of DSLs for high-performance parallel computing on heterogeneous platforms.

Multi-stage programming languages such as MetaML [38], MetaOCaml [2] and Mint [44] have been proposed as a disciplined approach to building code generators. These languages provide three syntactic annotations, brackets, escape and run, which together provide a syntactic quasi-quotation facility that is similar to that found in LISP but statically scoped and statically typed.

MSP languages make writing program generators easier and safer, but they inherit the essentially syntactic notion of combining program fragments. On one hand, MSP languages transparently support staging of all language constructs, where LMS components have to be provided explicitly. On the other hand, the syntactic MSP approach incurs the risk of duplicating code [3, 8, 12, 36]. Code duplication can be avoided systematically by writing the generator in continuation-passing or monadic style, using appropriate combinators to insert let-bindings in strategic places. Often this is impractical since monadic style or CPS significantly complicates the generator code. The other suggested solution is to make extensive use of side-effects in the meta-program, either in the form of mutable state or by using delimited control operators [10, 11].
However, side-effects pose serious safety problems and invalidate many of the static guarantees of MSP languages. This dilemma is described as an “agonizing trade-off”, due to which one “cannot achieve clarity, safety, and efficiency at the same time” [24]. Only very recently have type systems been devised to handle both staging and effects [23, 24, 44]. They are not excessively restrictive but not without restrictions either. Mint [44], a multi-stage extension of Java, restricts non-local operations within escapes to final classes, which excludes much of the standard Java library.

By contrast, lightweight modular staging prevents code duplication by handling the necessary side effects inside the staging primitives, which are semantic combinators instead of syntactic expanders. Therefore, code generators can usually be written in purely functional direct style and are much less likely to cause scope extrusion or invalidate safety assurances in other ways. Even though less likely, scope extrusion can happen in the LMS setting as well, e.g. if the argument of the function passed to compile escapes its dynamic scope. Combining LMS with the type system of Westbrook et al. [44] would be an interesting avenue for future research, if utmost security is the goal.

Another central characteristic of MSP languages is that staged code cannot be inspected due to safety considerations [37]. This implies that domain-specific optimizations must happen before code generation. One approach is thus to first build an intermediate code representation, upon which symbolic computation is performed, and only then use the MSP primitives to generate code [26]. The burden of choosing and implementing a suitable intermediate representation is on the programmer. It is not clear how different representations can be combined or re-used. In the limit, programmers are tempted to use a representation that resembles classic abstract syntax trees (AST) since that is the most flexible. At that point, one could argue that the benefit of keeping the actual code representation hidden has been largely defeated. Lightweight modular staging provides a systematic interface for adding symbolic rewritings. Safety is maintained by exposing the internal code structure only to rewriting modules but keeping it hidden from the client generator code.

Compiled embedded DSLs, as studied by Leijen and Meijer [27] and Elliott et al. [13], can also be implemented using MSP languages by writing an explicit interpreter and adding staging annotations in a second step [9, 19, 35]. This is simpler than writing a full compiler, but compared to constructing explicit interpreters, purely embedded languages have many advantages [21]. LMS allows a simpler approach, by starting with a pure embedding instead of an explicit interpreter. In simple cases, adding some type annotations in strategic places is all that is needed to get to a staged embedding [20]. If domain-specific optimizations are needed, new AST classes and rewriting rules are easily added.

5. Conclusions

In this paper we have presented lightweight modular staging, a library-based dynamic code generation approach. In particular we have shown how LMS complements the notion of polymorphic DSL embedding [20] with a systematic interface for domain-specific optimizations.
