PuReMD: A Reactive (ReaxFF) Molecular Dynamics Package
Hassan Metin Aktulga, Sagar Pandit, Joseph Fogarty, Ananth Grama
ayg@cs.purdue.edu
Acknowledgements: Department of Energy, National Science Foundation

Outline
- Sequential Realization: SerialReax
  - Algorithms and Numerical Techniques
  - Validation
  - Performance Characterization
- Parallel Realization: PuReMD
  - Parallelization: Algorithms and Techniques
  - Performance and Scalability
- Applications
  - Strain Relaxation in Si/Ge/Si Nanobars
  - Water-Silica Interface
  - Oxidative Stress in Lipid Bilayers

Sequential Realization: SerialReax
- Excellent per-timestep running time:
  - efficient generation of neighbor lists
  - elimination of bond order derivative lists
  - cubic spline interpolation for non-bonded interactions
  - highly optimized linear solver for charge equilibration
- Linear scaling memory footprint:
  - fully dynamic and adaptive interaction lists
Related publication: Reactive Molecular Dynamics: Numerical Methods and Algorithmic Techniques. H. M. Aktulga, S. A. Pandit, A. C. T. van Duin, A. Y. Grama. SIAM Journal on Scientific Computing, 2012.

SerialReax Components
- Inputs: system geometry, control parameters, force field parameters
- Outputs: trajectory snapshots, system status updates, program log file
- Initialization: read input data, initialize data structures
- Neighbor Generation: 3D-grid based O(n) neighbor generation; several optimizations for improved performance
- Init Forces: initialize the QEq coefficient matrix, compute uncorrected bond orders, generate H-bond lists
- Compute Bonds: corrections are applied after all uncorrected bonds are computed
- Bonded Interactions: bonds, lone pairs, over/under-coordination, valence angles, hydrogen bonds, dihedrals/torsions
- QEq: large sparse linear system; PGMRES(50) or PCG with ILUT-based preconditioners give good performance
- van der Waals & Electrostatics: single pass over the far-neighbor list after charges are updated; interpolation with cubic splines for a nice speed-up
- Evolve the System: F_total = F_nonbonded + F_bonded; update x and v with velocity Verlet; NVE, NVT and NPT ensembles
- Reallocate: fully dynamic and adaptive memory management; efficient use of resources enables large systems on a single processor

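The "Evolve the System" stage above updates positions and velocities with velocity Verlet. A minimal sketch of that integrator, using a 1D harmonic oscillator as a stand-in force model (the force function, masses, and timestep here are toy values, not ReaxFF):

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Advance (x, v) with velocity Verlet:
    half-kick, drift, recompute force, half-kick."""
    f = force(x)
    for _ in range(steps):
        v += 0.5 * dt * f / mass      # half-step velocity update
        x += dt * v                   # full-step position update
        f = force(x)                  # force at the new position
        v += 0.5 * dt * f / mass      # second half-step velocity update
    return x, v

# Toy stand-in force: harmonic oscillator, F = -k x (not a ReaxFF force).
k, m, dt = 1.0, 1.0, 0.01
x, v = velocity_verlet(1.0, 0.0, lambda x: -k * x, m, dt, 1000)
energy = 0.5 * m * v ** 2 + 0.5 * k * x ** 2   # stays near the initial 0.5
```

The scheme is time-reversible and, as the energy check suggests, conserves energy well over long runs, which is why it is the standard choice for the NVE ensemble.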
Generation of Neighbors List
(figure: the atom list is binned into a 3D neighbors grid, and the atom list is reordered accordingly)

Neighbors List Optimizations
- Size of the neighbor grid cell is important: with r_nbrs the neighbor cut-off distance, set the cell size to half of r_nbrs
- Reordering reduces look-ups significantly
- Atom-to-cell distance: if d > r_nbrs, there is no need to look into that cell
- Verlet lists for delayed re-neighboring
(figure: when looking for the neighbors of atoms in a given box, only nearby grid cells need to be scanned; with reordering, some cells need not be checked at all)

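The grid-based generation above can be sketched as follows; this is a simplified illustration with a cell size of r_nbrs / 2 as the slide suggests, no periodic boundaries, and a brute-force O(n^2) search as a correctness check (all coordinates and parameters are made up):

```python
import random
from itertools import product

def cell_list_neighbors(coords, box, r_nbrs):
    """O(n) neighbor generation on a 3D grid. With cell edge ~ r_nbrs / 2,
    any pair within r_nbrs lies at most two cells apart per dimension."""
    cell = r_nbrs / 2.0
    dims = [max(1, int(box[d] // cell)) for d in range(3)]
    edge = [box[d] / dims[d] for d in range(3)]
    grid = {}
    for i, p in enumerate(coords):                 # bin atoms into cells
        key = tuple(min(int(p[d] / edge[d]), dims[d] - 1) for d in range(3))
        grid.setdefault(key, []).append(i)
    pairs, r2 = set(), r_nbrs * r_nbrs
    for (cx, cy, cz), atoms in grid.items():
        for dx, dy, dz in product(range(-2, 3), repeat=3):  # cells within 2
            for i in atoms:
                for j in grid.get((cx + dx, cy + dy, cz + dz), []):
                    if i < j:
                        d2 = sum((coords[i][d] - coords[j][d]) ** 2 for d in range(3))
                        if d2 <= r2:
                            pairs.add((i, j))
    return pairs

random.seed(0)
coords = [[random.uniform(0.0, 10.0) for _ in range(3)] for _ in range(200)]
pairs = cell_list_neighbors(coords, [10.0, 10.0, 10.0], 2.5)
brute = {(i, j) for i in range(len(coords)) for j in range(i + 1, len(coords))
         if sum((coords[i][d] - coords[j][d]) ** 2 for d in range(3)) <= 2.5 ** 2}
```

Each atom scans a bounded number of nearby cells rather than all n atoms, which is what makes the generation O(n); the reordering optimization in the slide additionally makes those scans cache-friendly.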
Elimination of Bond Order Derivative Lists
- BO_ij: bond order between atoms i and j; at the heart of all bonded interactions
- dBO_ij/dr_k: derivative of BO_ij with respect to atom k; non-zero for all k sharing a bond with i or j; costly force computations and large memory space needed
- Remedy: eliminate the dBO list by delaying the computation of the dBO terms:
  - accumulate force-related coefficients into CdBO_ij
  - compute dBO_ij/dr_k once at the end of the step
  - apply f_k += CdBO_ij x dBO_ij/dr_k

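The bookkeeping behind this elimination can be sketched with toy numbers (the bond-order derivatives and interaction coefficients below are made-up values, not the ReaxFF forms): every bonded interaction only contributes a scalar coefficient to CdBO_ij, and the expensive derivative entries are touched exactly once per step.

```python
# Toy model: each bond (i, j) has dBO_ij/dr_k entries for atoms k it touches.
bonds = {
    (0, 1): {0: -0.3, 1: 0.3, 2: 0.1},
    (1, 2): {1: -0.2, 2: 0.2, 0: 0.05},
}
# Bonded interactions each contribute a coefficient dE/dBO_ij for some bond.
interactions = [((0, 1), 1.5), ((0, 1), -0.4), ((1, 2), 2.0)]

# Scheme 1: apply every interaction immediately; the dBO entries are
# visited once per interaction (costly if recomputed each time).
f_direct = {k: 0.0 for k in range(3)}
for bond, coef in interactions:
    for k, dbo in bonds[bond].items():
        f_direct[k] += coef * dbo

# Scheme 2 (SerialReax): accumulate coefficients into CdBO_ij first,
# then sweep the bond list once at the end of the step.
cdbo = {}
for bond, coef in interactions:
    cdbo[bond] = cdbo.get(bond, 0.0) + coef
f_delayed = {k: 0.0 for k in range(3)}
for bond, coef in cdbo.items():
    for k, dbo in bonds[bond].items():
        f_delayed[k] += coef * dbo
```

Both schemes produce identical forces; the delayed scheme simply factors the sum, trading the per-interaction derivative lists for one coefficient per bond.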
Look-up Tables
- Non-bonded force computations are expensive: too many interactions (r_nonb ~ 10 A vs. r_bond ~ 4-5 A), but the interactions are simple and pair-wise
- Reduce non-bonded force computations: create look-up tables; cubic spline interpolation for non-bonded energies and forces
- Experiment with a 6540-atom bulk water system: significant performance gain, high accuracy

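A sketch of the tabulation idea: precompute the pair potential on a fine grid and interpolate between knots. For brevity this uses cubic Hermite interpolation from tabulated values and derivatives (a simplification of the spline setup in the slides), with a Lennard-Jones potential standing in for the real non-bonded terms; all parameters are illustrative.

```python
def make_table(f, df, r_min, r_max, n):
    """Tabulate f and its derivative at n + 1 evenly spaced knots."""
    h = (r_max - r_min) / n
    return [(r_min + i * h, f(r_min + i * h), df(r_min + i * h))
            for i in range(n + 1)], h

def interp(table, h, r_min, r):
    """Cubic Hermite interpolation between the two bracketing knots."""
    i = min(int((r - r_min) / h), len(table) - 2)
    r0, f0, d0 = table[i]
    _, f1, d1 = table[i + 1]
    t = (r - r0) / h
    h00 = (1 + 2 * t) * (1 - t) ** 2      # Hermite basis polynomials
    h10 = t * (1 - t) ** 2
    h01 = t * t * (3 - 2 * t)
    h11 = t * t * (t - 1)
    return h00 * f0 + h10 * h * d0 + h01 * f1 + h11 * h * d1

# Stand-in pair potential: Lennard-Jones with sigma = epsilon = 1.
lj = lambda r: 4 * (r ** -12 - r ** -6)
dlj = lambda r: 4 * (-12 * r ** -13 + 6 * r ** -7)
table, h = make_table(lj, dlj, 0.9, 10.0, 5000)
err = max(abs(interp(table, h, 0.9, r) - lj(r))
          for r in [0.95, 1.2, 2.3, 5.7, 9.9])
```

A look-up replaces the expensive power evaluations with a handful of multiply-adds per pair, which is where the performance gain on the bulk water benchmark comes from.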
Linear Solver for Charge Equilibration
- QEq method: used for equilibrating charges; the original QEq paper has been cited 600+ times; an approximation for distributing partial charges
- Solution using Lagrange multipliers yields two linear systems, Hs = -chi and Ht = -1, where N is the number of atoms, H is an NxN sparse matrix, and the fictitious charges s and t determine the actual charge q_i = s_i - (sum_k s_k / sum_k t_k) t_i
- Too expensive for direct methods; iterative (Krylov subspace) methods are used

Basic Solvers for QEq
- Sample systems:
  - bulk water: 6540 atoms, liquid
  - lipid bilayer: 56,800 atoms, biological system
  - PETN crystal: 48,256 atoms, solid
- Solvers: CG and GMRES
  - H has a heavy diagonal: diagonal preconditioning
  - slowly evolving environment: extrapolation from previous solutions
- Poor performance at a tolerance level of 10^-6, which is fairly satisfactory; much worse at a 10^-10 tolerance level
- Cache effects are more pronounced here
(figure: iteration counts, where # of iterations = # of matrix-vector multiplications; actual running times in seconds; fraction of total computation time)

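The slides measure solver cost in matrix-vector products; a minimal sketch of conjugate gradient with diagonal (Jacobi) preconditioning, the combination used above for the heavy-diagonal H, follows. The matrix below is a small dense stand-in with a heavy diagonal, not a real QEq matrix.

```python
def pcg(A, b, M_inv, tol=1e-6, max_iter=200):
    """Preconditioned CG on a dense SPD matrix. Returns the solution and
    the number of matrix-vector products (the cost metric in the slides)."""
    n = len(b)
    matvec = lambda v: [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    x = [0.0] * n
    r = b[:]                                     # r = b - A*0
    z = [M_inv[i] * r[i] for i in range(n)]      # apply diagonal preconditioner
    p = z[:]
    rz = sum(r[i] * z[i] for i in range(n))
    matvecs = 0
    for _ in range(max_iter):
        Ap = matvec(p); matvecs += 1
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        if sum(ri * ri for ri in r) ** 0.5 < tol:
            break
        z = [M_inv[i] * r[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, matvecs

# Stand-in for H: symmetric, diagonally dominant (hence SPD), heavy diagonal.
n = 50
A = [[10.0 if i == j else 1.0 / (1 + abs(i - j)) for j in range(n)]
     for i in range(n)]
b = [1.0] * n
x, matvecs = pcg(A, b, [1.0 / A[i][i] for i in range(n)])
```

Because the diagonal dominates, the preconditioned system is well conditioned and CG converges in a few dozen matvecs; the slides' point is that even so, the matvecs dominate QEq cost at tight tolerances.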
ILU-based Preconditioning
- ILU-based preconditioners: no fill-in, 10^-2 drop tolerance
- Effective considering only the solve time; no fill-in + threshold gives nice scaling with system size
- ILU factorization itself is expensive
- Cache effects are still evident (bulk water vs. bilayer system)

system/solver               | preconditioner time (s) | solve time (s) | total time (s)
bulk water / GMRES+ILU      | 0.50                    | 0.04           | 0.54
bulk water / GMRES+diagonal | ~0                      | 0.11           | 0.11

ILU-based Preconditioning
- Observation: the ILU factorization cost can be amortized; a slowly changing simulation environment yields re-usable preconditioners
- PETN crystal: solid, re-usable for 1000s of steps!
- Bulk water: liquid, re-usable for 10-100s of steps!

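The bulk-water timings from the table above make the amortization argument concrete: reusing one factorization over N steps, GMRES+ILU beats GMRES+diagonal per step as soon as 0.50/N + 0.04 < 0.11.

```python
# Timings from the bulk water table (seconds per QEq solve):
ilu_factor = 0.50      # one-time ILUT factorization cost
ilu_solve = 0.04       # GMRES+ILU solve time per step
diag_solve = 0.11      # GMRES+diagonal solve time per step

# Smallest reuse horizon N at which the amortized ILU cost wins:
break_even = next(n for n in range(1, 1000)
                  if ilu_factor / n + ilu_solve < diag_solve)
```

The break-even point is only 8 steps, so with the reuse horizons quoted above (10-100s of steps for water, 1000s for PETN) the factorization cost becomes negligible.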
Memory Management
- Compact data structures
- Dynamic and adaptive lists: initially, allocate after estimation; at every step, monitor and re-allocate if necessary
- Low memory footprint, linear scaling with system size
- Neighbors list, QEq matrix, and 3-body interactions: stored in CSR format, with atom n's data immediately following atom n-1's data
- Bonds list and hbonds list: stored in a modified CSR format, with space reserved after each atom's data

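The modified-CSR idea for the bonds list can be sketched as follows: each atom's row is followed by reserved slack, so a few new bonds per atom never force a reallocation, while a full row triggers a monitored re-allocation as in the adaptive scheme above. The class, sizes, and growth factor are illustrative, not PuReMD's actual layout.

```python
class ModifiedCSR:
    """Bond list in 'modified CSR': each atom's entries are followed by
    reserved slack so that new bonds rarely force a reallocation."""
    def __init__(self, est_per_atom, n_atoms, slack=2):
        self.cap = est_per_atom + slack            # reserved row width
        self.start = [i * self.cap for i in range(n_atoms)]
        self.count = [0] * n_atoms
        self.data = [None] * (self.cap * n_atoms)
        self.reallocs = 0

    def add(self, i, j):
        if self.count[i] == self.cap:              # slack exhausted: grow
            self._grow()
        self.data[self.start[i] + self.count[i]] = j
        self.count[i] += 1

    def _grow(self):
        """Double every row's capacity and repack the existing entries."""
        old_cap, old_start, old_data = self.cap, self.start, self.data
        self.cap = old_cap * 2
        n = len(self.count)
        self.start = [i * self.cap for i in range(n)]
        self.data = [None] * (self.cap * n)
        for i in range(n):
            for k in range(self.count[i]):
                self.data[self.start[i] + k] = old_data[old_start[i] + k]
        self.reallocs += 1

    def row(self, i):
        return self.data[self.start[i]: self.start[i] + self.count[i]]

lst = ModifiedCSR(est_per_atom=2, n_atoms=3)
for i, j in [(0, 1), (0, 2), (1, 0), (0, 3), (0, 4), (0, 5)]:
    lst.add(i, j)
```

Here atom 0 overflows its estimated capacity only on its fifth bond, so the whole run costs a single reallocation; with a good initial estimate, per-step monitoring keeps reallocations rare.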
Validation
- Hexane (C6H14) structure comparison: excellent agreement!

Comparison to MD Methods
- Comparisons using hexane systems of various sizes
- Ab-initio MD: CPMD; classical MD: GROMACS; ReaxFF: SerialReax

Comparison to LAMMPS-Reax
- Time per time-step comparison
- QEq solver performance: different QEq formulations give similar results; LAMMPS uses CG with no preconditioner
- Memory footprint

Outline
- Sequential Realization: SerialReax
  - Algorithms and Numerical Techniques
  - Validation
  - Performance Characterization
- Parallel Realization: PuReMD
  - Parallelization: Algorithms and Techniques
  - Performance and Scalability
- Applications
  - Strain Relaxation in Si/Ge/Si Nanobars
  - Water-Silica Interface
  - Oxidative Stress in Lipid Bilayers

Parallel Realization: PuReMD
- Built on the SerialReax platform: excellent per-timestep running time, linear scaling memory footprint
- Extends its capabilities to large systems and longer time-scales
- Scalable algorithms and techniques; demonstrated scaling to over 3K cores
Related publication: Parallel Reactive Molecular Dynamics: Numerical Methods and Algorithmic Techniques. H. M. Aktulga, J. C. Fogarty, S. A. Pandit, A. Y. Grama. Parallel Computing, 2012.

Parallelization: Decomposition
- Domain decomposition over a 3D torus

Parallelization: Outer-Shell
(figure: full-shell, half-shell, midpoint-shell, and tower-plate shell schemes; each sub-domain of width b is surrounded by an outer shell of width r, or r/2 for the midpoint scheme)
- The full-shell scheme is chosen due to dynamic bonding, despite the communication overhead

Parallelization: Boundary Interactions
- r_shell = max(3 x r_bond_cut, r_hbond_cut, r_nonb_cut)

Parallelization: Messaging

Parallelization: Messaging Performance
- Performance comparison: PuReMD with direct vs. staged messaging

Parallelization: Optimizations
- Truncate bond-related computations:
  - double computation of bonds at the outer shell hurts scalability as the sub-domain size shrinks
  - trim all bonds that are 3 or more hops away
- Scalable parallel solver for the QEq problem:
  - GMRES with ILU factorization is not scalable
  - use CG with diagonal preconditioning and a good initial guess
  - redundant computations to avoid reverse communication
  - iterate both systems together

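"Iterate both systems together" refers to advancing the two QEq solves, Hs = -chi and Ht = -1, in the same iteration loop and then combining the fictitious charges. A sketch with plain Jacobi iterations standing in for the fused CG loop (the matrix H and electronegativities chi below are toy, diagonally dominant values):

```python
def qeq_charges(H, chi, tol=1e-10, max_iter=2000):
    """Solve H s = -chi and H t = -1 in one fused iteration loop
    (Jacobi here for brevity; the parallel solver fuses both inside CG),
    then combine: q_i = s_i - (sum_k s_k / sum_k t_k) * t_i,
    which enforces total charge neutrality."""
    n = len(chi)
    s, t = [0.0] * n, [0.0] * n
    for _ in range(max_iter):
        s_new = [(-chi[i] - sum(H[i][j] * s[j] for j in range(n) if j != i))
                 / H[i][i] for i in range(n)]
        t_new = [(-1.0 - sum(H[i][j] * t[j] for j in range(n) if j != i))
                 / H[i][i] for i in range(n)]
        done = max(max(abs(s_new[i] - s[i]), abs(t_new[i] - t[i]))
                   for i in range(n)) < tol
        s, t = s_new, t_new
        if done:
            break
    ratio = sum(s) / sum(t)
    return [s[i] - ratio * t[i] for i in range(n)]

# Toy electronegativities and a small diagonally dominant stand-in for H.
H = [[8.0, 1.0, 0.5], [1.0, 9.0, 1.0], [0.5, 1.0, 8.5]]
chi = [0.3, -0.2, 0.1]
q = qeq_charges(H, chi)
```

Fusing the two solves halves the number of synchronization points per iteration, which is what matters at scale; by construction the returned charges sum to zero.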
PuReMD: Performance and Scalability
- Weak scaling test, strong scaling test, comparison to LAMMPS-REAX
- Platform: Hera cluster at LLNL
  - 4 AMD Opterons per node, 16 cores per node
  - 800 batch nodes, 10,800 cores, 127 TFLOP/s
  - 32 GB memory per node
  - InfiniBand interconnect
  - 42nd on the TOP500 list as of Nov 2009

PuReMD: Weak Scaling
- Bulk water: 6540 atoms in a 40x40x40 A^3 box per core
- QEq scaling and efficiency

PuReMD: Strong Scaling
- Bulk water: 52,320 atoms in an 80x80x80 A^3 box
- Efficiency and throughput

Outline
- Sequential Realization: SerialReax
  - Algorithms and Numerical Techniques
  - Validation
  - Performance Characterization
- Parallel Realization: PuReMD
  - Parallelization: Algorithms and Techniques
  - Performance and Scalability
- Applications
  - Strain Relaxation in Si/Ge/Si Nanobars
  - Water-Silica Interface
  - Oxidative Stress in Lipid Bilayers

LAMMPS-Reax/C User Community
- Konstantin Shefov, Saint Petersburg State University
- Camilo Calderon, Boston University
- Ricardo Paupitz Barbosa dos Santos, Universidade Estadual de Maringa
- Shawn Coleman, University of Arkansas
- Paolo Valentini, University of Minnesota
- Hengji Zhang, University of Texas at Dallas
- Benjamin Jensen, Michigan Technological University
- Xiao Dong Han, Beijing University
- Robert Meissner, Fraunhofer Institute for Manufacturing Technology and Advanced Materials, Bremen
- James Larentzos, High Performance Technologies, Inc. (HPTi)
- Hegoi Manzano, Concrete Sustainability Hub, MIT
- Olivier Politano, Laboratoire Interdisciplinaire Carnot de Bourgogne
- Ti Leggett, University of Chicago

Si/Ge/Si Nanoscale Bars
Motivation:
- Si/Ge/Si nanobars are ideal for MOSFETs
- As produced they carry biaxial strain, while uniaxial strain is desirable
- Understanding strain behavior is important for design and production
(figure: a Si/Ge/Si bar of width W and height H with periodic boundary conditions; axes [100] transverse, [001] vertical, [010] longitudinal)
Related publication: Strain relaxation in Si/Ge/Si nanoscale bars from molecular dynamics simulations. Y. Park, H. M. Aktulga, A. Y. Grama, A. Strachan. Journal of Applied Physics 106, 1 (2009).

Si/Ge/Si Nanoscale Bars
Key result: when the Ge section is roughly square shaped, it carries almost uniaxial strain!
(figure: W = 20.09 nm bar; average transverse Ge strain; average strains for Si and Ge in each dimension; a simple strain model derived from the MD results)

Water-Silica Interface
Motivation:
- a-SiO2 is widely used in nano-electronic devices, and also in devices for in-vivo screening
- Understanding its interaction with water is critical for the reliability of such devices
Related publication: A Reactive Simulation of the Silica-Water Interface. J. C. Fogarty, H. M. Aktulga, A. van Duin, A. Y. Grama, S. A. Pandit. Journal of Chemical Physics 132, 174704 (2010).

Water-Silica Interface
- Water model validation
- Silica model validation

Water-Silica Interface
(figure: silica slab in water, with Si, O, and H atoms at the interface)
Key result: silica surface hydroxylation, as evidenced by experiments, is observed.
Proposed reaction: H2O + 2Si + O -> 2SiOH

Water-Silica Interface
Key result: hydrogen penetration is observed; some H atoms penetrate through the slab, whereas ab-initio simulations predict whole-molecule penetration. We propose that water diffuses through a thin silica film via hydrogen hopping: rather than diffusing as whole units, water molecules dissociate at the surface, hydrogens diffuse through, and they combine with other dissociated water molecules at the far surface.

Oxidative Damage in Lipid Bilayers
Motivation: modeling reactive processes in biological systems; ROS -> oxidative stress -> cancer & aging
System preparation: 200 POPC lipids + 10,000 water molecules, and the same system with 1% H2O2 mixed in; mass spectrograph: a lipid molecule weighs 760 u
Key result: oxidative damage observed; 40% of lipids damaged in pure water vs. 75% damaged in 1% peroxide

Conclusions
- An efficient and scalable realization of ReaxFF through algorithmic and numerical techniques
- Detailed performance characterization and comparison to other methods
- Applications on various systems
- LAMMPS/Reax/C and the stand-alone PuReMD Reax implementations are publicly available

References
- Reactive Molecular Dynamics: Numerical Methods and Algorithmic Techniques. H. M. Aktulga, S. A. Pandit, A. C. T. van Duin, A. Y. Grama. SIAM Journal on Scientific Computing (2012).
- Parallel Reactive Molecular Dynamics: Numerical Methods and Algorithmic Techniques. H. M. Aktulga, J. C. Fogarty, S. A. Pandit, A. Y. Grama. Parallel Computing (2012).
- Strain relaxation in Si/Ge/Si nanoscale bars from molecular dynamics simulations. Y. Park, H. M. Aktulga, A. Y. Grama, A. Strachan. Journal of Applied Physics 106, 1 (2009).
- A Reactive Simulation of the Silica-Water Interface. J. C. Fogarty, H. M. Aktulga, A. van Duin, A. Y. Grama, S. A. Pandit. Journal of Chemical Physics 132, 174704 (2010).