Specialized HW units complement diminishing single-thread performance improvements.
(Figure labels: perf. of CPU; w/o specialized HW units)
… significant machine instructions to IR (intermediate representation) degraded the performance.
- This is typical of binary translators based on general compiler engines.
[Figure: performance chart, higher is better. Legend: original binary; naive translation w/o optimizations; naive translation with optimizations; source code re-compilation.]
Achieve competitive performance relative to the state-of-the-art source code compiler.
… naive translation.
- Resulting in limited optimization opportunities.
(Diagram labels: existing binary; IR; limited optimization opportunities)
Poor HLI (high-level information):
• variables
• data types
Binary optimizers typically do not assume source languages, which results in better applicability.
- e.g. Dynamo [Bala00], Dynamo…
(Diagram labels: register usage; inspection of literal pool)
… register usages of COBOL.
COBOL data structures & register usage:
Machine instructions (abstract interpretation):
(a) Load R2,(R9+0x200)
(b) Load R3,(R2+0x100)
(c) Load R4,(R2+0x200)
(d) Load R5,(R2+R4)
There are four significant optimizations which are newly introduced to the latest source code compiler:
1. Type Reduction of Decimal Type.
2. Strength Reduction of Edited Moves.
3. Skipping Truncations of Binary Numbers.
4. Inlining Efficient Version of Routines.
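The abstract interpretation over loads (a) to (d) above can be sketched as follows. This is a minimal illustration, not the optimizer's actual algorithm: it assumes R9 holds a known anchor address (labelled ANCHOR here, a hypothetical name) and tracks, for each register, a symbolic description of where its value was loaded from. The names Mem, interpret and show are also illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mem:
    """Abstract value: the contents of memory at base + offset."""
    base: object    # abstract value of the base register
    offset: object  # integer displacement or abstract value of an index register


def interpret(loads, state):
    """Symbolically execute 'Load dst,(base+off)' instructions in order."""
    state = dict(state)
    for dst, base_reg, off in loads:
        base = state.get(base_reg, base_reg)
        # An offset given as a register name is replaced by its abstract value.
        offset = state.get(off, off) if isinstance(off, str) else off
        state[dst] = Mem(base, offset)
    return state


def show(value):
    """Render an abstract value as a nested address expression."""
    if isinstance(value, Mem):
        off = hex(value.offset) if isinstance(value.offset, int) else show(value.offset)
        return f"[{show(value.base)}+{off}]"
    return str(value)


# Loads (a)-(d) from the slide; R9 is ASSUMED to be a known anchor register.
LOADS = [
    ("R2", "R9", 0x200),
    ("R3", "R2", 0x100),
    ("R4", "R2", 0x200),
    ("R5", "R2", "R4"),
]

final = interpret(LOADS, {"R9": "ANCHOR"})
for reg in ("R2", "R3", "R4", "R5"):
    print(reg, "=", show(final[reg]))
```

With symbolic addresses like these, an optimizer could in principle match each load against known COBOL data-structure layouts to recover which variable a register refers to.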
[Figure: performance chart. Legend: original binary; naive translation with optimizations; source code.]
Convert decimal types to decimal floating-point (DFP) types.
- Computation on decimal type: memory-to-memory
- Computation on DFP type: register-to-register
Representation of integer number 87:
- zoned decimal type → 0xF8F7
- binary integer type → 0x087F (as extracted; 0x087F is the packed-decimal encoding of 87, while its binary value is 0x57)
HLI about variables is necessary to enable the type reduction.
Generate a specialized instruction sequence for a string format operation in COBOL.
Simple example:
COBOL:
  FOO: string of the format 9999.
  MOVE 123 TO FOO.
  → FOO=0123
Original instruction:
  EDIT control-bytes,numeric
EDIT is a millicode instruction which interprets each byte of the control-bytes.
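The strength reduction of the edited move above can be sketched like this. It is a simplified stand-in, not the real EDIT semantics: the control string uses "9" for a digit position and copies every other character literally, which is an assumption made for illustration. The function names are hypothetical, and values are assumed non-negative.

```python
def edit_interpreted(control: str, value: int) -> str:
    """Generic path: interpret the control string one character at a time."""
    n = control.count("9")
    # Keep only the low-order n digits, zero-padded on the left.
    digits = str(value % 10**n).rjust(n, "0")
    out = []
    pos = 0
    for c in control:
        if c == "9":        # digit position: emit the next source digit
            out.append(digits[pos])
            pos += 1
        else:               # any other character is copied literally
            out.append(c)
    return "".join(out)


def edit_9999(value: int) -> str:
    """Specialized path: the format '9999' is known at optimization time,
    so the per-character interpretation loop disappears."""
    return f"{value % 10000:04d}"


print(edit_interpreted("9999", 123))  # generic interpretation
print(edit_9999(123))                 # specialized sequence
```

The point of the sketch is only that the format becomes a constant once HLI about FOO is known, so the interpreter loop can be replaced by straight-line code.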
Re-construct HLI about the …
Generate guard code to skip unnecessary truncation of …
Replace a runtime routine with the equivalent code which exploits a more efficient algorithm and newer instructions.
Simple example:
COBOL:
  FOO, BAR: large packed decimal numbers (e.g. 18 digits).
  FOO=FOO/BAR
Original instruction:
  CALL LargeDivide
  Runtime path length is more than 100 instructions.
Optimized instructions:
  PackedToDFP FPR1,FOO
  PackedToDFP FPR2,BAR
  DFPDivide   FPR1,FPR2
  DFPRound    FPR1
  DFPToPacked FPR1,FOO
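The optimized sequence above can be mimicked with Python's decimal module as a rough model. This is a sketch under assumptions, not the generated code: the helper names mirror the slide's pseudo-instructions, the rounding step is assumed to truncate toward zero (the slide does not state the rounding mode), and the precision of 34 digits is chosen to match a 128-bit DFP format.

```python
from decimal import Decimal, ROUND_DOWN, localcontext


def packed_to_dfp(packed: bytes) -> Decimal:
    """Decode packed decimal: two digits per byte, last nibble is the sign."""
    nibbles = []
    for b in packed:
        nibbles += [b >> 4, b & 0x0F]
    *digits, sign = nibbles
    value = Decimal(int("".join(map(str, digits))))
    # 0xB and 0xD are the negative sign codes; other sign codes are positive.
    return -value if sign in (0x0B, 0x0D) else value


def dfp_to_packed(value: Decimal, length: int) -> bytes:
    """Encode an integral Decimal into `length` bytes of packed decimal."""
    sign = 0x0D if value < 0 else 0x0C
    ndigits = 2 * length - 1
    digits = str(abs(int(value))).rjust(ndigits, "0")[-ndigits:]
    nibbles = [int(d) for d in digits] + [sign]
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))


def divide_packed(foo: bytes, bar: bytes) -> bytes:
    """FOO = FOO / BAR via decimal floating point instead of a runtime call."""
    with localcontext() as ctx:
        ctx.prec = 34                                      # ASSUMED: 128-bit DFP precision
        q = packed_to_dfp(foo) / packed_to_dfp(bar)        # DFPDivide
        q = q.quantize(Decimal(1), rounding=ROUND_DOWN)    # DFPRound (ASSUMED truncation)
    return dfp_to_packed(q, len(foo))                      # DFPToPacked


foo = dfp_to_packed(Decimal(123456789012345678), 10)  # 18 significant digits
bar = dfp_to_packed(Decimal(1000), 10)
print(packed_to_dfp(divide_packed(foo, bar)))
```

The decoder also accepts the unsigned sign code 0xF, so the bytes 0x087F decode to 87.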
• 5.5 GHz zEC12 running z/OS.
• Benchmark:
  - IBM internal benchmark suite
• Development version of IBM Automatic Binary Optimizer (2015).
• Original source code compiler:
  - IBM Enterprise COBOL V4R2 (2009).
• Latest source code compiler:
  - Development version of IBM Enterprise COBOL V5 (2015).