Scalable Detailed Placement Legalization for Complex Sub-14nm Constraints
Kwangsoo Han, Andrew B. Kahng and Hyein Lee
{kwhan, abk, hyeinlee}@ucsd.edu
http://vlsicad.ucsd.edu/
ECE Department, UC San Diego

Outline
- Motivation & Previous Work
- Problem Formulation
- Our Approach
- Experimental Setup and Results
- Conclusion

Motivation
- In old technology nodes, once the library cells were correctly designed, design rule violations (DRVs) could not occur during placement
- Limitations of patterning resolution lead to complex design rules for front-end-of-line (FEOL) layers
- Placing several 'legal' standard cells next to each other may cause violations of FEOL layer rules
- A final detailed cell placement phase is needed to maintain placement legality with respect to new N10 FEOL rules

Cell Layout in N10 Node
- The FEOL layers that affect legal placement include the implant layer, the oxide diffusion layer and poly
- Implant layers decide the threshold voltage (Vt) of transistors
- Oxide diffusion (OD) defines the active region of transistors
- Dummy poly gates are inserted at the (vertical) standard cell boundaries to avoid edge device variability
[Figure: cell layout with pins A and Y. Legend: M2 power/ground; cell boundary, implant region; oxide diffusion (OD); poly; M1; fin; middle of line.]

(1) Minimum implant width (IW)
- Limitation of the current optical lithography technology ⇒ new design rule (i.e., minimum implant width)
- Two same-Vt cells are misaligned vertically ⇒ a narrow, "staircase" implant layer shape ⇒ inter-row IW (IW1) violation
- A narrow cell is surrounded by different-Vt cells ⇒ intra-row IW (IW2) violation
[Figure: rows of HVT and LVT cells with an inter-row violation (IW1) and an intra-row violation (IW2) marked.]
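
The intra-row case (IW2) can be illustrated with a small check. This is only a sketch under stated assumptions: cells in a row abut with no gaps, each cell is given as a hypothetical `(vt, width_in_sites)` pair, and the 4-site minimum is taken from the backup slide on experimental setup. The function name is illustrative and is not part of DFPlacer.

```python
from itertools import groupby

def intra_row_iw_violations(row, min_width=4):
    """Return (vt, start_site, width) for implant runs that are too narrow.

    row: list of (vt, width_in_sites) for abutting cells, left to right.
    Only runs with a different-Vt neighbor on both sides are reported.
    """
    runs = []
    start = 0
    # Adjacent cells with the same Vt share one implant shape, so merge them.
    for vt, group in groupby(row, key=lambda cell: cell[0]):
        width = sum(w for _, w in group)
        runs.append((vt, start, width))
        start += width
    # Interior runs narrower than the minimum implant width are violations.
    return [
        run for i, run in enumerate(runs)
        if 0 < i < len(runs) - 1 and run[2] < min_width
    ]

row = [("HVT", 5), ("LVT", 2), ("HVT", 3), ("HVT", 4), ("LVT", 6)]
violations = intra_row_iw_violations(row)
print(violations)
```

In this example the 2-site LVT cell sits between HVT cells and is flagged, while the two neighboring HVT cells merge into a 7-site run that satisfies the rule.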

(2) Minimum OD jog length (OW)
- Cells can have different oxide diffusion (OD) region heights
- Lithographic corner rounding ⇒ minimum OD jog length rule
- Abutment of cells with different OD heights ⇒ OD jog length violation
[Figure: two abutting cells with different OD heights. Labels: OD, cell boundary, OD jog.]

(3) Drain-drain abutment (DDA)
- Dummy poly gates create extra dummy transistors
- Dummy transistors can induce leakage power
- Dummy transistors must be tied off to power/ground rails
- Two drain nodes are abutted ⇒ extra dummy poly gate tied up with power/ground rails
- Cell flipping/displacement
[Figure: abutting cells with source (S) and drain (D) nodes at their boundaries; one pair shows a drain-drain abutment, another arrangement is marked legal (√).]
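
A minimal sketch of how cell flipping can remove a drain-drain abutment. It assumes each cell is described only by the node type at its left and right boundary (`"S"` or `"D"`) and that only a D-D abutment is illegal; both the representation and the helper names are illustrative assumptions.

```python
def flip(cell):
    """Mirror a cell horizontally: its left and right boundary nodes swap."""
    left, right = cell
    return (right, left)

def has_dda(left_cell, right_cell):
    """True if the two facing boundary nodes are both drains."""
    return left_cell[1] == "D" and right_cell[0] == "D"

def legal_orientations(left_cell, right_cell):
    """List the (flip_left, flip_right) choices that avoid a D-D abutment."""
    options = []
    for flip_left in (False, True):
        for flip_right in (False, True):
            a = flip(left_cell) if flip_left else left_cell
            b = flip(right_cell) if flip_right else right_cell
            if not has_dda(a, b):
                options.append((flip_left, flip_right))
    return options

a = ("S", "D")  # source on the left boundary, drain on the right
b = ("D", "S")  # drain on the left boundary, source on the right
print(has_dda(a, b))
print(legal_orientations(a, b))
```

Here the unflipped pair violates the rule, and flipping either cell (or both) resolves it. When no flip combination is legal, displacement is the remaining option.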

Previous Works
- Dynamic programming-based approaches
  - Optimal interleaving for intra-row optimization [Hur and Lillis, ICCAD00]
  - Row-based placement [Kahng et al., ASPDAC99, GLSVLSI04]
- Integer Linear Programming (ILP)-based approaches
  - Placement by branch-and-price [Ramachandaran et al., ASPDAC05]
  - MIP-based detailed placement [Li and Koh, ISPD12]
- DDA-aware placement
  - [Du and Wong, DATE14] propose a graph model with a shortest-path algorithm
  - Uses cell flipping and adjacent-cell swapping
  - No consideration of inter-row constraints (e.g., IW constraint)
- Our work: MILP-based optimization to provide comprehensive support of N10-relevant FEOL rules

Our Contributions
- Develop a mixed integer linear programming (MILP)-based placer, called DFPlacer
- Address new DRVs caused by complex N10 FEOL rules
- Propose a scalable partitioning-based optimization method
- Incorporate our flow into a commercial tool-based placement and routing (P&R) flow for evaluation
- Provide insight into timing and area impacts of the dummy poly gate library cell strategy
  - Standard cells with dummy poly gates (DDA and OW violation free)
  - Standard cells without dummy poly gates

Outline
- Motivation & Previous Work
- Problem Formulation
- Our Approach
- Experimental Setup and Results
- Conclusion

Detailed Placement Problem Formulation
- Input: placement with design rule violations
- Objective: legal placement with minimum cell displacements
- Subject to:
  - Minimum implant width (IW) constraint
  - Minimum oxide diffusion jog length (OW) constraint
  - Drain-drain abutment (DDA) constraint
[Figure: one example per constraint: IW (HVT and LVT cells), OW (OD jog at a cell boundary), DDA (two abutting drain nodes).]

Outline
- Motivation & Previous Work
- Problem Formulation
- Our Approach
- Experimental Setup and Results
- Conclusion

Mixed-ILP Model [Li12]
- Single-cell-placement binary variable $\lambda_c^k$
  - Placement state k (location and orientation) of cell c
- Site occupation variable $s_{crq}^k$
  - Represents whether site (r,q) is occupied by cell c with placement state k
- $\lambda_c^k = \{x_c, y_c, f_c\}$, where $x_c$ ($y_c$) is the x (y) location of cell c and $f_c$ is an indicator of whether c is flipped
[Figure: site grid from (0, 0) to (1,7) showing two placement states of cell c, $\lambda_c^1 = \{0, 1, 0\}$ and $\lambda_c^2 = \{4, 0, 1\}$, with $s_{c21}^1 = 1$, $s_{c11}^1 = 1$ and $s_{c31}^1 = 0$.]
[Li12] S. Li and C.-K. Koh, "Mixed Integer Programming Models for Detailed Placement", Proc. ISPD, 2012, pp. 87-94.
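
To make the relationship between a placement state and site occupation concrete, here is a minimal sketch. The indexing convention (sites addressed as `(row, column)`, with `x` as the leftmost column) and the 2-site cell width are assumptions made for illustration; the transcript does not fix the convention used in the figure, so the values below are not meant to reproduce it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    x: int         # leftmost site column occupied by the cell
    y: int         # row index
    flipped: bool  # orientation flag (does not change the occupied sites)

def occupied_sites(state, width):
    """Sites (row, column) covered by a cell of `width` sites in `state`."""
    return {(state.y, state.x + i) for i in range(width)}

def site_occupation(states, width, row, col):
    """Entry k is 1 if site (row, col) is covered when state k is selected."""
    return [int((row, col) in occupied_sites(s, width)) for s in states]

# Two candidate states for one cell, mirroring the {x, y, f} triples.
states = [State(0, 1, False), State(4, 0, True)]
s = site_occupation(states, width=2, row=1, col=1)
print(s)
```

Because the occupation values follow directly from the candidate state, they can be precomputed once per cell and used as constants when the selection variables are optimized.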

Placement Problem Formulation
- Objective ⇒ minimize displacements
- Placement constraints
  - For each cell c ⇒ select one placement state per cell
  - Orientation, x/y location and site occupation are determined by $\lambda_c^k$
  - No overlap
- Plus more constraints to support IW, OW and DDA
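
The equations on this slide did not survive in the transcript. As a hedged sketch only, a formulation consistent with the bullets could look like the following, where $d_c^k$ is a displacement cost for placing cell c in state k (a symbol introduced here for illustration) and $s_{crq}^k$ is treated as a precomputed 0/1 occupation indicator. The exact objective and constraints used by DFPlacer may differ.

```latex
\begin{align}
\min \quad & \sum_{c} \sum_{k} d_c^k \, \lambda_c^k
  && \text{(total displacement)} \\
\text{s.t.} \quad & \sum_{k} \lambda_c^k = 1
  && \forall c \quad \text{(one placement state per cell)} \\
& \sum_{c} \sum_{k} s_{crq}^k \, \lambda_c^k \le 1
  && \forall (r,q) \quad \text{(no overlap)} \\
& \lambda_c^k \in \{0, 1\}
  && \forall c, k
\end{align}
```

The IW, OW and DDA rules would then be added as further linear constraints on the same variables.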

IW Constraints Formulation
- New: a binary vector indicating the Vt of the site (r,q)
- Vt boundaries are checked with inter-/intra-row variables
- At a Vt boundary, at least |W| consecutive sites must be the same Vt
- At a Vt boundary where two vertically neighboring sites are the same Vt, that Vt must be kept for at least |W| sites in both the upper and lower rows
[Figure: two examples with |W| = 3, each with the Vt boundary marked.]
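
OW and DDA Constraints Formulation
- Pre-characterize all adjacency conditions that violate OW and/or DDA for each cell pair
- Add mutual exclusion constraints: $\lambda_{c1}^{i} + \lambda_{c2}^{j} \le 1$ if ($\lambda_{c1}^{i}$, $\lambda_{c2}^{j}$) is a forbidden pair
[Figure: cells c1 and c2 in placement states $\lambda_{c1}^{i}$ and $\lambda_{c2}^{j}$.]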
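
The pre-characterization step can be sketched as an enumeration over candidate state pairs. The example below covers only DDA, only the case where c1 sits immediately to the left of c2, and uses hypothetical `(x, y, flipped)` tuples and `(left_node, right_node)` boundary descriptions; a real flow would also need the OW conditions from the cell library.

```python
from itertools import product

def facing_nodes(state1, nodes1, state2, nodes2):
    """Boundary nodes that face each other when cell 1 is left of cell 2."""
    right_of_c1 = nodes1[0] if state1[2] else nodes1[1]  # flipping swaps sides
    left_of_c2 = nodes2[1] if state2[2] else nodes2[0]
    return right_of_c1, left_of_c2

def forbidden_pairs(states1, width1, nodes1, states2, nodes2):
    """Index pairs (i, j) that need lambda_c1^i + lambda_c2^j <= 1."""
    pairs = []
    for (i, s1), (j, s2) in product(enumerate(states1), enumerate(states2)):
        abutting = s1[1] == s2[1] and s1[0] + width1 == s2[0]
        if abutting and facing_nodes(s1, nodes1, s2, nodes2) == ("D", "D"):
            pairs.append((i, j))
    return pairs

def satisfies(i_chosen, j_chosen, pairs):
    """Check a selected state pair against the mutual exclusion constraints."""
    return (i_chosen, j_chosen) not in pairs

states_c1 = [(0, 0, False), (0, 0, True)]
states_c2 = [(2, 0, False), (2, 0, True), (3, 0, False)]
pairs = forbidden_pairs(states_c1, 2, ("S", "D"), states_c2, ("D", "S"))
print(pairs)
```

Each returned pair becomes one constraint of the form λ + λ ≤ 1, so the solver may pick at most one of the two states.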

Distributable Global Optimization
- Limitation of MILP-based approach ⇒ runtime
- Distributable optimization of many windows of cells
  - Split the post-route layout into small clips
  - Run optimization for each clip with fixed boundaries
  - Cells on boundaries are handled by shifting windows
[Figure: layout split into clips, with fixed cells on the clip boundaries; the partitioning is shifted between the 1st and 2nd iterations.]
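
A minimal sketch of the shifting-window idea, assuming rectangular clips on a site grid and single-row cells given as hypothetical `(x, y, width)` tuples. The function names and the half-window shift are illustrative assumptions, not DFPlacer's actual partitioning.

```python
def make_windows(width, height, win_w, win_h, shift_x=0, shift_y=0):
    """Tile the layout with clips; cut lines are offset by the given shift.

    Assumes 0 <= shift_x < win_w and 0 <= shift_y < win_h.
    Each window is (x0, y0, x1, y1), clipped to the layout.
    """
    x_starts = range(shift_x - win_w if shift_x else 0, width, win_w)
    y_starts = range(shift_y - win_h if shift_y else 0, height, win_h)
    return [
        (max(x0, 0), max(y0, 0), min(x0 + win_w, width), min(y0 + win_h, height))
        for y0 in y_starts
        for x0 in x_starts
    ]

def movable_cells(cells, window):
    """Cells fully inside the window; cells crossing a cut line stay fixed."""
    x0, y0, x1, y1 = window
    return [
        name for name, (x, y, w) in cells.items()
        if x0 <= x and x + w <= x1 and y0 <= y < y1
    ]

cells = {"a": (0, 0, 2), "b": (3, 0, 2), "c": (6, 1, 2)}
first = [movable_cells(cells, w) for w in make_windows(8, 4, 4, 4)]
second = [movable_cells(cells, w) for w in make_windows(8, 4, 4, 4, shift_x=2)]
print(first)
print(second)
```

Cell `b` straddles the cut line at x = 4 and is fixed in the first pass; after the cut lines shift by half a window it falls inside a clip and can be optimized. Since the windows of one pass are disjoint, they can be solved in parallel.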

Overall Flow
[Flowchart, summarized from its labels]
- Inputs: routed layout w/ DRVs; complex constraints for N10 (DDA, OW, IW)
- DFPlacer
  - Global optimization: make new windows; ILP formulation; optimization for each window with the ILP solver (CPLEX); solve multiple windows in parallel
  - Cell location solution; check #DRVs < δ?
    - N: shift partitioning lines and repeat
    - Y: continue
  - Local optimization: remaining DRVs; removed overlapping windows; solve multiple windows in parallel
- ECO routing
- Output: routed layout with #DRVs < δ

Outline
- Motivation & Previous Work
- Problem Formulation
- Our Approach
- Experimental Setup and Results
- Conclusion

Experimental Setup
- SP&R tools: Synopsys Design Compiler H-2013.03-SP3 and Cadence Encounter Digital Implementation System XL 13.1
- Technology: two kinds of 7nm dual Vt libraries
  - 62 standard cells without dummy poly gates (CWOD)
  - 62 standard cells with dummy poly gates (CWD)
- Design: AES, JPEG [OpenCores], ARM Cortex M0, ARM Cortex M0 x 3
  - *_d: implemented with CWD library
  - *_nd: implemented with CWOD library

Design    #Inst   LVT (%)  Util. (%)  WL (um)  Area (um^2)
M0_nd      8260   52       77         114685    7668
AES_nd    12147   54       78         142294    8894
M0x3_nd   27248   56       80         392540   24463
JPEG_nd   47948   51       77         694624   49629
M0_d       8238   51       77         116866    8668
AES_d     12491   54       80         150632   10596
M0x3_d    26690   55       79         409579   27400
JPEG_d    48317   52       77         764738   55824

[OpenCores] http://opencores.com/
[Figure: inverter cell layout in CWOD library and inverter cell layout in CWD library, pins A and Y. Legend: M2 power/ground; cell boundary, implant region; oxide diffusion (OD); poly; M1; fin; middle of line.]

Experimental Results (1)
- Report wirelength (WL) and worst setup slack (WSS)
- Up to 3.42% wirelength increase
  - *_nd cases show similar or slightly larger WL% than *_d
- WSS ranges from -19ps to 68ps
  - Positive WSS ⇒ there is room to improve timing

Experimental Results (2)
- Global optimization fixes ~90% of DRVs
- Runtimes of global optimization using the CWOD library are 1.8x larger than those using the CWD library (except for Cortex M0)
- The runtime of the global optimization phase can be further reduced with more computing resources
[Figure annotations: global optimization (3rd iteration), 90% of violations are fixed; 1.8x.]

Experimental Results (3)
- DFPlacer fixes 99% of design rule violations
[Figure: example solution with a DDA violation, two IW violations and an OW violation marked; one cell is flipped and cells are moved.]

Design    Init. IW #vio.  Init. DDA/OW #vio.  Final total #vio.
M0_nd       926            1611                 25
AES_nd     1771            1900                 34
M0x3_nd    3514            4230                 65
JPEG_nd    4056           12024                164
M0_d        988               0                 10
AES_d      1566               0                 11
M0x3_d     2810               0                 27
JPEG_d     6296               0                 43
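
The 99% figure can be checked directly from the violation counts on this slide. The sketch below assumes the fix rate is computed as (initial IW + initial DDA/OW - final) / (initial IW + initial DDA/OW); the transcript does not state the formula, so this definition is an assumption.

```python
# (initial IW, initial DDA/OW, final total) violation counts per design
results = {
    "M0_nd": (926, 1611, 25),
    "AES_nd": (1771, 1900, 34),
    "M0x3_nd": (3514, 4230, 65),
    "JPEG_nd": (4056, 12024, 164),
    "M0_d": (988, 0, 10),
    "AES_d": (1566, 0, 11),
    "M0x3_d": (2810, 0, 27),
    "JPEG_d": (6296, 0, 43),
}

def fix_rate(init_iw, init_dda_ow, final_total):
    """Percentage of the initial violations that were removed."""
    initial = init_iw + init_dda_ow
    return 100.0 * (initial - final_total) / initial

rates = {design: fix_rate(*counts) for design, counts in results.items()}
for design, rate in rates.items():
    print(f"{design}: {rate:.2f}%")
```

Under this definition every design lands within about 0.4 percentage points of 99%.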

Outline
- Motivation & Previous Work
- Problem Formulation
- Our Approach
- Experimental Setup and Results
- Conclusion

Conclusion and Future Work
- Propose a scalable detailed placement legalization flow for complex FEOL constraints arising at the foundry 10nm node
  - Constraints include minimum implant width, minimum OD jog rules and drain-drain abutment
  - Fixes 99% of DRVs with a 3% increase in wirelength and minimal impact on timing
- Future work
  - Timing- and wirelength-driven placement legalization
  - "Smart ECO" method for the few remaining DRVs after global placement legalization

Thank you!

Experimental Setup: Designs and Technologies
- Minimum OD jog length = 4 sites width
- Minimum implant width = 4 sites width
- Number of violations of cell pair
  - Minimum implant width rule violation: 7172 out of 15376 (= 62 x 62 x 2 x 2)
  - Minimum OD jog length rule violation: 280 out of 15376
- 7nm cell library with scaled 28nm BEOL (back-end-of-line) LEF
  - Site width/height: 0.136/0.9 um
[Figure: OAI22 in 7nm node (pins A1, A0, B0, B1, Y), scaled by 2.5x to the min M1 pitch and min M2 pitch of the 28nm node.]
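
The pair count on this slide is simple to reproduce. In the sketch below the two factors of 2 are read as the two orientations (unflipped and flipped) of each cell in the pair; that reading is an assumption, since the transcript only gives the product.

```python
CELLS = 62        # standard cells per library
ORIENTATIONS = 2  # assumed: unflipped and flipped

# Ordered cell pairs times the orientation choices of both cells.
total_pairs = CELLS * CELLS * ORIENTATIONS * ORIENTATIONS

iw_share = 100 * 7172 / total_pairs  # pairs violating minimum implant width
ow_share = 100 * 280 / total_pairs   # pairs violating minimum OD jog length

print(total_pairs)
print(f"IW: {iw_share:.1f}%  OW: {ow_share:.1f}%")
```

Roughly 46.6% of the adjacency conditions violate the implant width rule and about 1.8% violate the OD jog rule.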

Scaling of 7nm Cells
- Scale 7nm cells by 2.5x
- Left figure is the scaled OAI22_X1
  - All the pins are on track with 0.135um M1 vertical pitch
  - However, Encounter does not work with 0.135um M1 vertical pitch
- Right figure shows the modified OAI22_X1 (fit into 0.136um M1 vertical pitch)
  - Increase width from 0.81um ⇒ 0.816um (= 0.81 + (0.81/135))
  - Shift the pins to be aligned to the vertical track with 0.136um pitch
[Figure, left: scaled OAI22_X1, width 0.81, height 0.9; horizontal spacings 0.067, 0.135 (x5), 0.068; other dimensions 0.1 and 0.05; pins A1, A0, B0, B1, Y.]
[Figure, right: modified OAI22_X1, width 0.816, height 0.9; horizontal spacings 0.068, 0.136 (x5), 0.068; other dimensions 0.1 and 0.05; pins A1, A0, B0, B1, Y.]
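
The width adjustment follows from keeping the same number of M1 tracks at the new pitch. The sketch below works in integer nanometers to avoid floating-point noise; the half-pitch offset of the first track is read from the right-hand figure (0.068), and the variable names are illustrative.

```python
OLD_PITCH_NM = 135  # scaled M1 vertical pitch
NEW_PITCH_NM = 136  # pitch used for the modified cell

old_width_nm = 810  # scaled OAI22_X1 width (0.81 um)

# The cell spans a whole number of tracks at the old pitch.
tracks, remainder = divmod(old_width_nm, OLD_PITCH_NM)

# Keep the same track count at the new pitch.
new_width_nm = tracks * NEW_PITCH_NM

# Track positions in the modified cell: first track half a pitch from the edge.
track_positions_nm = [NEW_PITCH_NM // 2 + i * NEW_PITCH_NM for i in range(tracks)]

print(tracks, new_width_nm)
print(track_positions_nm)
```

Each of the 6 tracks adds 1 nm, which matches the slide's 0.81 + (0.81/135) = 0.816 um.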