Optimizing Compilers: Introduction
T. Mowry, Carnegie Mellon

Lecture 1: Introduction
  I.   What would you get out of this course?
  II.  Structure of a Compiler
  III. Optimization Example

Reference: Muchnick 1.3-1.5

What Do Compilers Do?
1. Translate one language into another

What Do We Mean By Optimization?
Informal definition: transform a computation to an equivalent but better form
- in what way is it equivalent?
- in what way is it better?

How Can the Compiler Improve Performance?
[Figure: processor, memory, cache]

Why Study Compilers?
- Crucial for anyone who cares about performance
  - speed of system = hardware + compilers
  - key ingredient in modern processor architecture development
- Compilation: heart of computing
  - maps a high-level abstract machine to a lower-level one
- An example of a large software program
  - Problem solving: find common cases, formulate problem mathematically, develop algorithm, implement, evaluate on real data
  - Software engineering: build layers of abstraction (based on theory) and support with tools
- Silicon compilers
  - CAD tools increasingly rely on optimization
  - optimizing a hardware design is similar to optimizing a program

What Would You Get Out of This Course?
- Basic knowledge of existing compiler optimizations
- Hands-on experience in constructing optimizations within a fully functional research compiler
- Basic principles and theory for the development of new optimizations
- Understanding of the use of theory and abstraction to solve future problems

II. Structure of a Compiler
- Optimizations are performed on an intermediate form similar to a generic RISC instruction set
- Allows easy portability to multiple source languages and target machines
[Figure: Source Code (Java, Verilog, ...) -> Front End -> Intermediate Form -> Optimizer -> Intermediate Form -> Back End -> Object Code (Alpha, SPARC, x86, IA-64)]

Ingredients in a Compiler Optimization
- Formulate optimization problem
  - Identify opportunities of optimization that are
    - applicable across many programs
    - able to affect key parts of the program (loops/recursions)
    - amenable to an "efficient enough" algorithm
- Representation
  - Must abstract essential details relevant to optimization
- Analysis
  - Detect when it is legal and desirable to apply transformation
- Code Transformation
- Experimental Evaluation (and repeat process)

Representation: Instructions
- Three-address code: A := B op C
  - LHS: name of variable, e.g. x, A[t] (address of A + contents of t)
  - RHS: value
- Typical instructions
    A := B op C
    A := unaryop B
    A := B
    GOTO s
    IF A relop B GOTO s
    CALL f

III. Optimization Example
- Bubblesort program that sorts an array A that is allocated in static storage:
  - an element of A requires four bytes of a byte-addressed machine
  - elements of A are numbered 1 through n (n is a variable)
  - A[j] is in location &A+4*(j-1)

    FOR i := n-1 DOWNTO 1 DO
      FOR j := 1 TO i DO
        IF A[j] > A[j+1] THEN BEGIN
          temp := A[j];
          A[j] := A[j+1];
          A[j+1] := temp
        END

Translated Code
        i := n-1
    s5: if i<1 goto s1
        j := 1
    s4: if j>i goto s2
        t1 := j-1
        t2 := 4*t1
        t3 := A[t2]        ; A[j]
        t4 := j+1
        t5 := t4-1
        t6 := 4*t5
        t7 := A[t6]        ; A[j+1]
        if t3<=t7 goto s3
        t8 := j-1
        t9 := 4*t8
        temp := A[t9]      ; temp := A[j]
        t10 := j+1
        t11 := t10-1
        t12 := 4*t11
        t13 := A[t12]      ; A[j+1]
        t14 := j-1
        t15 := 4*t14
        A[t15] := t13      ; A[j] := A[j+1]
        t16 := j+1
        t17 := t16-1
        t18 := 4*t17
        A[t18] := temp     ; A[j+1] := temp
    s3: j := j+1
        goto s4
    s2: i := i-1
        goto s5

Representation: a Basic Block
- Basic block = a sequence of 3-address statements
  - only the first statement can be reached from outside the block (no branches into middle of block)
  - all the statements are executed consecutively if the first one is (no branches out or halts except perhaps at end of block)
- We require basic blocks to be maximal
  - they cannot be made larger without violating the conditions
- Optimizations within a basic block are local optimizations

Flow Graphs
- Nodes: basic blocks
- Edges: Bi -> Bj, iff Bj can follow Bi immediately in some execution
  - Either the first instruction of Bj is the target of a goto at the end of Bi
  - Or Bj physically follows Bi, which does not end in an unconditional goto
- The block led by the first statement of the program is the start, or entry, node.

Example
[Figure: flow graph of the translated bubblesort code]
    B1: i := n-1
    B2: if i<1 goto out
    B3: j := 1
    B4: if j>i goto B5
    B5: i := i-1
        goto B2
    B6: t1 := j-1
        ...
        if t3<=t7 goto B8
    B7: t8 := j-1
        ...
        A[t18] := temp
    B8: j := j+1
        goto B4

Sources of Optimization
- Algorithm optimization
- Algebraic optimization
    A := B+0  =>  A := B
- Local optimizations
  - within a basic block (across instructions)
- Global optimizations
  - within a flow graph (across basic blocks)
- Interprocedural analysis
  - within a program (across procedures, i.e. flow graphs)

Local Optimizations
- Analysis and transformation performed within a basic block
- No control flow information is considered
- Examples of local optimizations:
  - local common subexpression elimination
    - analysis: same expression evaluated more than once in block b
    - transformation: replace with single calculation
  - local constant folding or elimination
    - analysis: expression can be evaluated at compile time
    - transformation: replace by constant, compile-time value
  - dead code elimination

Example
    B1: i := n-1
    B2: if i<1 goto out
    B3: j := 1
    B4: if j>i goto B5
    B6: t1 := j-1
        t2 := 4*t1
        t3 := A[t2]        ; A[j]
        t6 := 4*j
        t7 := A[t6]        ; A[j+1]
        if t3<=t7 goto B8
    B7: t8 := j-1
        t9 := 4*t8
        temp := A[t9]      ; temp := A[j]
        t12 := 4*j
        t13 := A[t12]      ; A[j+1]
        A[t9] := t13       ; A[j] := A[j+1]
        t18 := 4*j
        A[t18] := temp     ; A[j+1] := temp
    B8: j := j+1
        goto B4
    B5: i := i-1
        goto B2
    out:

(Intraprocedural) Global Optimizations
- Global versions of local optimizations
  - global common subexpression elimination
  - global constant propagation
  - dead code elimination
- Loop optimizations
  - reduce code to be executed in each iteration
  - code motion
  - induction variable elimination
- Other control structures
  - Code hoisting: eliminates copies of identical code on parallel paths in a flow graph to reduce code size.

Global Common Subexpression Elimination
    B1: i := n-1
    B2: if i<1 goto out
    B3: j := 1
    B4: if j>i goto B5
    B6: t1 := j-1
        t2 := 4*t1
        t3 := A[t2]        ; A[j]
        t6 := 4*j
        t7 := A[t6]        ; A[j+1]
        if t3<=t7 goto B8
    B7: t8 := j-1
        t9 := 4*t8
        temp := A[t9]      ; temp := A[j]
        t12 := 4*j
        t13 := A[t12]      ; A[j+1]
        A[t9] := t13
        A[t12] := temp
    B8: j := j+1
        goto B4
    B5: i := i-1
        goto B2

Induction Variable Elimination
- Intuitively
  - Loop indices are induction variables (counting iterations)
  - Linear functions of the loop indices are also induction variables (for accessing arrays)
- Analysis: detection of induction variables
- Optimizations
  - strength reduction: replace multiplication by additions
  - elimination of loop index: replace termination test by tests on other induction variables

Example (after cse)
    B1: i := n-1
    B2: if i<1 goto out
    B3: j := 1
    B4: if j>i goto B5
    B6: t1 := j-1
        t2 := 4*t1
        t3 := A[t2]        ; A[j]
        t6 := 4*j
        t7 := A[t6]        ; A[j+1]
        if t3<=t7 goto B8
    B7: A[t2] := t7
        A[t6] := t3
    B8: j := j+1
        goto B4
    B5: i := i-1
        goto B2

Example (after iv)
    B1: i := n-1
    B2: if i<1 goto out
    B3: t2 := 0
        t6 := 4
    B4: t19 := 4*i
        if t6>t19 goto B5
    B6: t3 := A[t2]
        t7 := A[t6]        ; A[j+1]
        if t3<=t7 goto B8
    B7: A[t2] := t7
        A[t6] := t3
    B8: t2 := t2+4
        t6 := t6+4
        goto B4
    B5: i := i-1
        goto B2
    out:

Loop Invariant Code Motion
- Analysis
  - a computation is done within a loop, and
  - the result of the computation is the same as long as we keep going around the loop
- Transformation
  - move the computation outside the loop

Machine Dependent Optimizations
- Register allocation
- Instruction scheduling
- Memory hierarchy optimizations
- etc.
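
To make loop invariant code motion concrete, here is a minimal Python sketch based on the "after iv" version of the bubblesort example. It is an illustration under stated assumptions, not code from the lecture. The function names are hypothetical, the byte-addressed array is emulated by dividing each byte offset by 4, and the swap condition is assumed to be A[j] > A[j+1]. The computation t19 := 4*i does not change while the inner loop runs, so the sketch hoists it out of the inner loop header.

```python
def bubble_sort_reference(values):
    """Direct transcription of the source loop nest (1-based j mapped to 0-based)."""
    a = list(values)
    n = len(a)
    for i in range(n - 1, 0, -1):          # FOR i := n-1 DOWNTO 1
        for j in range(1, i + 1):          # FOR j := 1 TO i
            if a[j - 1] > a[j]:            # IF A[j] > A[j+1] (assumed operator)
                a[j - 1], a[j] = a[j], a[j - 1]
    return a


def bubble_sort_optimized(values):
    """Same loop nest after induction variable elimination, with the
    inner-loop-invariant computation t19 := 4*i moved outside the inner loop."""
    a = list(values)
    n = len(a)
    i = n - 1
    while i >= 1:                          # B2: if i<1 goto out
        t19 = 4 * i                        # hoisted: invariant in the inner loop
        t2 = 0                             # byte offset of A[j] for j = 1
        t6 = 4                             # byte offset of A[j+1] for j = 1
        while t6 <= t19:                   # B4: replaces the test on j
            t3 = a[t2 // 4]                # emulated load A[t2]
            t7 = a[t6 // 4]                # emulated load A[t6]
            if t3 > t7:                    # B7: swap the two elements
                a[t2 // 4] = t7
                a[t6 // 4] = t3
            t2 += 4                        # B8: strength-reduced 4*(j-1)
            t6 += 4                        # B8: strength-reduced 4*j
        i -= 1                             # B5
    return a
```

Both functions should return the same sorted list for any input. The optimized version performs one multiplication per outer iteration instead of one per inner iteration, which is the saving that code motion aims for.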