Register Data-Flow Framework
A brief introduction
Krzysztof Parzyszek, Qualcomm Innovation Center, Inc.

What is RDF?
RDF is a framework that hides the complexity of data-flow analysis between registers after register allocation.
The goal is to enable implementation of arbitrarily detailed data-flow optimizations: the precision of information is key.
Central concept is a data-flow graph (DFG) that abstracts the data flow in a form closely resembling SSA.
  Implemented in RDFGraph.cpp/.h.
Includes utilities to recalculate liveness: block live-ins and “kill” flags.
  Implemented in RDFLiveness.cpp.

Structure of DFG
The graph represents an entire function:
  Nodes: basic blocks, statements, as well as register uses and defs.
  Edges: membership, data-flow.
Nodes:
  Container nodes (aka “code nodes”) reflect the function structure:
    a function contains basic blocks,
    a basic block contains instructions (phi nodes and statements),
    an instruction contains register defs and uses.
  Reference nodes represent the defs and uses of registers.
Edges:
  Structural, e.g. “first member”, “next member”.
    There are helper functions to assist in member traversal.
  Data-flow, e.g. “reaching def”, “first reached use”.
See RDFGraph.h and RDFGraph.cpp for details.
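
The node and edge kinds listed above can be pictured with a small toy model. This is only a sketch: ToyGraph, Node, appendMember, members, linkUse and reachedUses are hypothetical names made up for illustration and are not the RDFGraph.h API. It assumes that every node lives in one table indexed by a numeric id, that containers chain their members through "first member" / "next member" links, and that a def chains the uses it reaches through "first reached use" / sibling links.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Toy model only: these names are illustrative, not the real RDFGraph.h API.
using NodeId = std::uint32_t;  // 0 means "no node"

enum class Kind { Func, Block, Stmt, Phi, Def, Use };

struct Node {
  Kind K;
  std::string Label;       // e.g. an opcode or a register name
  NodeId FirstMember = 0;  // structural edge: container -> first member
  NodeId Next = 0;         // structural edge: next member of the same container
  NodeId ReachingDef = 0;  // data-flow edge: use -> def that reaches it
  NodeId ReachedUse = 0;   // data-flow edge: def -> first reached use
  NodeId Sibling = 0;      // data-flow edge: next use reached by the same def
};

struct ToyGraph {
  // Slot 0 is reserved so that id 0 can mean "none".
  std::vector<Node> Nodes{Node{Kind::Func, "<none>"}};

  NodeId add(Kind K, std::string Label) {
    Nodes.push_back(Node{K, std::move(Label)});
    return static_cast<NodeId>(Nodes.size() - 1);
  }

  // Append Member to the end of Container's member chain.
  void appendMember(NodeId Container, NodeId Member) {
    NodeId *Slot = &Nodes[Container].FirstMember;
    while (*Slot != 0)
      Slot = &Nodes[*Slot].Next;
    *Slot = Member;
  }

  // Walk the structural chain: first member, then next member, and so on.
  std::vector<NodeId> members(NodeId Container) const {
    std::vector<NodeId> Out;
    for (NodeId M = Nodes[Container].FirstMember; M != 0; M = Nodes[M].Next)
      Out.push_back(M);
    return Out;
  }

  // Record that Def reaches Use; the new use goes to the front of the chain.
  void linkUse(NodeId Def, NodeId Use) {
    Nodes[Use].ReachingDef = Def;
    Nodes[Use].Sibling = Nodes[Def].ReachedUse;
    Nodes[Def].ReachedUse = Use;
  }

  // Walk the data-flow chain: first reached use, then its siblings.
  std::vector<NodeId> reachedUses(NodeId Def) const {
    std::vector<NodeId> Out;
    for (NodeId U = Nodes[Def].ReachedUse; U != 0; U = Nodes[U].Sibling)
      Out.push_back(U);
    return Out;
  }
};
```

Storing links as ids rather than pointers keeps every edge a plain integer, which is one plausible reason a dump can print links as short labels such as d14 or u33.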

Structure of DFG: example
Highlighted on the slide: def-to-use and sibling links for R0.

Before Hexagon RDF optimizations:
# Machine code for function foo: [...]
Function Live Ins: %R0, %R1, %R2
BB#0: derived from LLVM BB %entry
    Live Ins: %R0 %R1 %R2
        %P0<def> = C2_cmpgti %R0, 0
        J2_jumpf %P0, <BB#2>, %PC<imp-def>
    Successors according to CFG: BB#1 BB#2
BB#1: derived from LLVM BB %if.then
    Live Ins: %R0 %R1 %R2
    Predecessors according to CFG: BB#0
        S4_storeiri_io %R2, 0, 0; mem:ST4[%p]
        %R0<def> = A2_addi %R0, 1
    Successors according to CFG: BB#2
BB#2: derived from LLVM BB %if.end
    Live Ins: %R0 %R1
    Predecessors according to CFG: BB#1 BB#0
        %R0<def> = A2_add %R0, %R1
        PS_jmpret %R31, %PC<imp-def>, %R0<imp..>
# End machine code for function foo.

DFG dump:
[
 f1: Function: foo
 b2: --- BB#0 --- preds(0): succs(2): BB#1, BB#2
 p25: phi [+d26<R0>(,d14,u34):]
 p27: phi [+d28<R1>(,,u20):]
 p29: phi [+d30<R2>(,,u12):]
 s3: C2_cmpgti [d4<P0>(,,u8):, u5<R0>(+d26):]
 s6: J2_jumpf BB#2 [/+d7<PC>!(,d22,):, u8<P0>(d4):]
 b10: --- BB#1 --- preds(1): BB#0 succs(1): BB#2
 s11: S4_storeiri_io [u12<R2>(+d30):]
 s13: A2_addi [d14<R0>(+d26,,u33):, u15<R0>(+d26):u5]
 b16: --- BB#2 --- preds(2): BB#1, BB#0 succs(0):
 p31: phi [+d32<R0>(,d18,u19):, u33<R0>(d14,b10):, u34<R0>(+d26,b2):u15]
 s17: A2_add [d18<R0>(+d32,,u24):, u19<R0>(+d32):, u20<R1>(+d28):]
 s21: PS_jmpret [d22<PC>!(/+d7,,):, u23<R31>!():, u24<R0>!(d18):]
]

Liveness calculation
Post-RA optimizations can invalidate liveness information.
Liveness class can recalculate it based on the DFG:
  recompute block live-ins (with precise lane masks), and
  recompute “kill” flags.
Other useful routines, such as getAllReachingDefs.
See RDFLiveness.cpp and RDFLiveness.h for more information.
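
To make "recompute block live-ins" concrete, here is a minimal sketch of what a block live-in set is. It uses the classic iterative backward data-flow formulation over a toy CFG. This is a swapped-in textbook technique, not the DFG-based algorithm in RDFLiveness.cpp, and it ignores lane masks and "kill" flags. ToyBlock, RegSet and computeLiveIns are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy model only: not the RDFLiveness.h API, and no lane masks.
using RegSet = std::set<std::string>;

struct ToyBlock {
  std::vector<std::size_t> Succs;  // indices of successor blocks
  RegSet Uses;                     // registers read before any write in the block
  RegSet Defs;                     // registers written in the block
};

// LiveIn(B)  = Uses(B) + (LiveOut(B) - Defs(B))
// LiveOut(B) = union of LiveIn(S) over all successors S of B
// Iterate until no live-in set changes.
std::vector<RegSet> computeLiveIns(const std::vector<ToyBlock> &Blocks) {
  std::vector<RegSet> LiveIn(Blocks.size());
  bool Changed = true;
  while (Changed) {
    Changed = false;
    // Visiting blocks in reverse index order tends to converge faster.
    for (std::size_t I = Blocks.size(); I-- > 0;) {
      const ToyBlock &B = Blocks[I];
      RegSet LiveOut;
      for (std::size_t S : B.Succs)
        LiveOut.insert(LiveIn[S].begin(), LiveIn[S].end());
      RegSet In = B.Uses;
      for (const std::string &R : LiveOut)
        if (B.Defs.count(R) == 0)
          In.insert(R);
      if (In != LiveIn[I]) {
        LiveIn[I] = std::move(In);
        Changed = true;
      }
    }
  }
  return LiveIn;
}
```

Feeding it a three-block function shaped like foo from the example slide (with R31 and PC left out to keep the toy small) gives live-ins of R0, R1, R2 for the first two blocks and R0, R1 for the last one.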

Current uses
Two generic optimizations: copy propagation and dead code elimination.
  Both have target-specific specializations (via callbacks).
  CP allows non-“COPY” instructions which can still transfer a register value.
  DCE can delete dead references, e.g. auto-update in post-increment instructions.
  See RDFCopy.cpp and RDFDeadCode.cpp.
Addressing mode optimization: composing a single load/store with a complex addressing mode out of elementary instructions scattered over multiple blocks.
  Implemented in HexagonOptAddrMode.cpp.
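
The difference between a dead instruction and a dead reference can be sketched on a toy instruction list. The code below is a generic mark-and-sweep over def-use links, written under assumptions and not taken from RDFDeadCode.cpp. ToyInstr, DeadInfo and findDead are hypothetical names, and each def is identified by a made-up integer id.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Toy model only: not the RDFDeadCode.h API.
struct ToyInstr {
  bool HasSideEffects;        // e.g. a store or a return: always live
  std::vector<int> Defs;      // ids of the defs this instruction creates
  std::vector<int> UsedDefs;  // ids of the defs that reach its uses
};

struct DeadInfo {
  std::set<std::size_t> DeadInstrs;  // instructions that can be removed
  std::set<int> DeadDefs;            // defs whose value is never needed
};

DeadInfo findDead(const std::vector<ToyInstr> &Code) {
  // Map each def id to the instruction that owns it.
  std::map<int, std::size_t> Owner;
  for (std::size_t I = 0; I < Code.size(); ++I)
    for (int D : Code[I].Defs)
      Owner[D] = I;

  // Mark: start from instructions with side effects and follow reaching defs.
  std::vector<bool> Live(Code.size(), false);
  std::set<int> LiveDefs;
  std::vector<std::size_t> Work;
  for (std::size_t I = 0; I < Code.size(); ++I) {
    if (Code[I].HasSideEffects) {
      Live[I] = true;
      Work.push_back(I);
    }
  }
  while (!Work.empty()) {
    std::size_t I = Work.back();
    Work.pop_back();
    for (int D : Code[I].UsedDefs) {
      if (!LiveDefs.insert(D).second)
        continue;  // already visited
      auto It = Owner.find(D);
      if (It == Owner.end())
        continue;  // no owner in this list, e.g. a function live-in
      if (!Live[It->second]) {
        Live[It->second] = true;
        Work.push_back(It->second);
      }
    }
  }

  // Sweep: report unmarked instructions and unmarked defs.
  DeadInfo Out;
  for (std::size_t I = 0; I < Code.size(); ++I) {
    if (!Live[I])
      Out.DeadInstrs.insert(I);
    for (int D : Code[I].Defs)
      if (LiveDefs.count(D) == 0)
        Out.DeadDefs.insert(D);
  }
  return Out;
}
```

A post-increment load whose loaded value is used but whose updated address is not would come out as a live instruction with one dead def, which is the "dead reference" case mentioned above.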

Trivial copy propagation
An illustration of RDF use in a simple copy propagation:

    for (NodeAddr<BlockNode*> BA : DFG.getFunc().Addr->members(DFG)) {
      for (NodeAddr<StmtNode*> SA : BA.Addr->members_if(DFG, IsStmt)) {
        for (NodeAddr<UseNode*> UA : SA.Addr->members_if(DFG, IsUse)) {
          // Find the instruction that defines the value read by this use.
          auto D = DFG.addr<DefNode*>(UA.Addr->getReachingDef());
          NodeAddr<InstrNode*> I = D.Addr->getOwner(DFG);
          if (IsPhi(I))
            continue;
          MachineInstr *MI = NodeAddr<StmtNode*>(I).Addr->getCode();
          if (!MI->isCopy())
            continue;
          NodeAddr<UseNode*> U = /* first use in I */;
          // Rewrite the use only if the copy source has the same reaching def here.
          if (U.Addr->getReachingDef() == /* reaching def of reg at SA */)
            UA.Addr->getOperand()->setReg(U.Addr->getOperand()->getReg());
        }
      }
    }

Future work
Ensuring that it works correctly on all targets.
  From the functional perspective the code supports all required features, but the multi-target testing has been very limited so far.
Developing more optimizations using it.
  So far most of the effort was on making the framework flexible and robust.