Quantum Chemistry
with GAMESS
Brett M. Bode
Scalable Computing Laboratory
Department of Electrical and Computer Engineering
Iowa State University
Outline
Introduction to GAMESS
GAMESS history
GAMESS capabilities
Novel capabilities
Running GAMESS
GAMESS
General Atomic and Molecular Electronic Structure System
General purpose electronic structure code
Primary focus is on ab initio quantum chemistry
calculations
Can also do
Density functional calculations
Semiempirical calculations (AM1, PM3)
QM/MM calculations
It is free and in wide use on everything from laptops to
supercomputers.
Obtaining GAMESS
It is free, but not “open source” in the usual sense.
Group license: you get the source and can do anything
you want with it, except distribute it.
See http://wwwmsg.fi.ameslab.gov/GAMESS/ for more
information and the registration page link.
Distribution is source code, with prebuilt binaries also
available for Macintosh, Linux and Windows.
Full manual also on web site. See section 2 for complete
keyword description for the input file, section 4 for
references for all of the methods.
GAMESS People
GAMESS is a product of Dr.
Mark Gordon’s research group
at Iowa State University.
Dr. Mike Schmidt coordinates
the development efforts and is
the gatekeeper for the code.
GAMESS History
The code base was begun in 1980 from parts
of other codes. Some code still goes back to
that version!
Currently stands at about 750,000 lines of
mostly Fortran 77 compatible code.
Runs on nearly any system with a
working Fortran compiler.
GAMESS Parallelization
Began in 1991 with the parallelization of the
SCF Energy and Gradient computations
(almost trivially parallel).
Initial parallel work was done on the Intel Touchstone
Delta.
In 1996 the Distributed Data Interface (DDI)
was developed to support the new parallel
MP2 energy and gradient code.
In 2004 DDI was rewritten, and
optimizations for SMP nodes using System V
shared memory were added. The focus remains
on distributed-memory systems!
Subgroup support was also added to enable
the Fragment Molecular Orbital method.
DDI
... memory is the memory reserved by all the remaining
parallel processes for their portions of the distributed
data. Every process in a parallel job is allowed to
access/modify any element in the distributed-memory
segment (regardless of its physical location); however,
access to local distributed memory is assumed to be
faster than access to remote distributed memory. Thus
the DDI programming strategy aims to maximize the
use of local distributed data while minimizing remote
data requests. Note that the performance penalty for
accessing distributed memory (local or remote) is
completely dependent on the underlying machine ...
... with high-performance and potentially intelligent
interconnect networks like Gigabit Ethernet [9], Myrinet
[10], SCI [11], or InfiniBand [12]. A similar trend is
also evident in dedicated supercomputers, where, for
example, large-scale IBM SP and HP SC systems now
use SMP nodes. Indeed, very large shared-memory
computers, like the SGI Origin 3000 or HP GS, usually
have Non-Uniform Memory Access (NUMA)
architectures that can be viewed as a cluster of uniform
memory SMPs linked via a network, albeit a very good
network.
With this move away from single-processor toward
multiprocessor-based clusters we are confronted with a
considerably more complicated memory model than that
which was present when either DDI or GA was
originally conceived. Now small groups of processes
have equally fast access to chunks of memory, while
accessing memory between groups of processes is
slower. Recognizing this, plus the success and popularity
of these programming models, it is pertinent to consider
how these models might be extended to better exploit
SMP clusters. The aim of this paper is to begin to
address this issue, presenting an enhanced version of
DDI that includes new functionality specifically
targeting SMP clusters. Using both the new and original
versions of DDI, performance results are presented and
discussed for a typical GAMESS computation run on a
variety of MPP systems. First, however, we begin with a
brief discussion of the existing DDI data server model
used in GAMESS.
Figure 1: The virtual shared-memory model. Each large box
(grey) represents the memory available to a given CPU. The
inner boxes represent the memory used by the parallel processes
(rank in lower right). The gold region depicts the memory
reserved for the storage of distributed data. The arrows indicate
memory access (through any means) for the distributed
operations: get, put and accumulate.
Modeled on the Global Array Framework.
The Distributed Data Interface provides a pseudo-global
shared-memory interface for a portion of a node's memory.
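The get, put and accumulate operations named above can be illustrated with a toy model. The sketch below is not the DDI API: the class ToyDistributedArray, its method signatures, and the caller argument are all hypothetical, and everything lives in one Python process. It only shows the idea that any rank may touch any element, while local and remote accesses are distinguished so that remote traffic can be counted and minimized.

```python
class ToyDistributedArray:
    """Toy model of a distributed array (hypothetical; not the real DDI API)."""

    def __init__(self, length, nproc):
        self.nproc = nproc
        # Block distribution: each rank owns one contiguous chunk of indices.
        self.chunk = -(-length // nproc)  # ceiling division
        self.segments = [
            [0.0] * min(self.chunk, max(0, length - rank * self.chunk))
            for rank in range(nproc)
        ]
        self.local_ops = 0
        self.remote_ops = 0

    def _locate(self, index, caller):
        """Find the owning rank and offset, and record local vs remote access."""
        owner, offset = divmod(index, self.chunk)
        if owner == caller:
            self.local_ops += 1
        else:
            self.remote_ops += 1
        return owner, offset

    def get(self, index, caller):
        owner, offset = self._locate(index, caller)
        return self.segments[owner][offset]

    def put(self, index, value, caller):
        owner, offset = self._locate(index, caller)
        self.segments[owner][offset] = value

    def accumulate(self, index, value, caller):
        owner, offset = self._locate(index, caller)
        self.segments[owner][offset] += value


array = ToyDistributedArray(length=8, nproc=4)
array.put(0, 1.5, caller=0)         # rank 0 writes its own element: local
array.accumulate(0, 2.0, caller=3)  # rank 3 adds to rank 0's element: remote
value = array.get(0, caller=0)      # rank 0 reads it back: local
```

In a real distributed-memory run the remote branch would involve a message to a data server or a one-sided memory operation; here it is only counted.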
Normal MPI version uses 2 processes per processor: 1
compute, 1 data server.
Sockets are used for interrupts on data servers because
MPI often polls in receive.
SHMEM and LAPI versions also available...
Also provides processor subgroup support.
Program Capabilities
Types of wavefunctions
Hartree-Fock (RHF, ROHF, UHF, GVB)
CASSCF
CI, MRCI
Coupled cluster methods (closed shells)
Second order perturbation theory
MP2 (closed shells)
ROMP2 (spin-correct open shells)
UMP2 (unrestricted open shells)
MCQDPT (CASSCF - MRMP2)
Localized orbitals (SCF, MCSCF)
Energy-related properties
Total energy as function of nuclear coordinates (PES):
All wavefunction types
Analytic energy gradient
RHF, ROHF, UHF, MCSCF, CI, MP2, UMP2,
DFT
ROMP2 in progress
Analytic hessian
RHF, ROHF, TCSCF/GVB
MCSCF just completed
Energy-related properties (cont’d)
Numerical hessians from finite differences of analytic
gradients
Fully numerical derivatives for all methods
Saddle point (TS) search (requires hessian)
Minimum energy path = Intrinsic Reaction Coordinate (IRC)
Several IRC options; GS2 is most effective
Requires frequency input, gradients along path
Follow reaction path from reactants through TS to
products
Build reaction path Hamiltonian (RPH):
dynamics
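The numerical hessian mentioned at the top of this list can be sketched with central differences of an analytic gradient. This is a minimal illustration, not GAMESS code: the function names, the step size, and the two-variable model surface are assumptions chosen so that the exact answer is known.

```python
def numerical_hessian(gradient, x, step=1e-3):
    """Central-difference Hessian built from an analytic gradient function."""
    n = len(x)
    hessian = [[0.0] * n for _ in range(n)]
    for j in range(n):
        forward = list(x)
        backward = list(x)
        forward[j] += step
        backward[j] -= step
        g_plus = gradient(forward)
        g_minus = gradient(backward)
        # Column j holds the derivative of every gradient component
        # with respect to coordinate j.
        for i in range(n):
            hessian[i][j] = (g_plus[i] - g_minus[i]) / (2.0 * step)
    # Average the off-diagonal pairs so the result is exactly symmetric.
    for i in range(n):
        for j in range(i + 1, n):
            average = 0.5 * (hessian[i][j] + hessian[j][i])
            hessian[i][j] = hessian[j][i] = average
    return hessian


def model_gradient(x):
    """Analytic gradient of the toy surface E = x0**2 + 3*x0*x1 + 2*x1**2."""
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0] + 4.0 * x[1]]


hessian = numerical_hessian(model_gradient, [0.1, -0.2])
```

For this quadratic surface the exact Hessian is [[2, 3], [3, 4]], and the finite-difference result agrees to rounding error. Each coordinate costs two gradient evaluations, which is why displacing only symmetry-unique atoms saves time.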
Energy-related properties (cont’d)
Dynamic reaction coordinate (DRC)
Add kinetic energy to system at any geometry
Add photon(s) to any vibrational mode
Classical trajectory using QM-derived energies
Requires gradients
Monte Carlo sampling: find global minimum
Molecular dynamics (in progress)
Other functionalities
Spin-orbit coupling
Any spin states, any number of states
Full two-electron Breit-Pauli
Partial two-electron (P2e): very efficient, accurate
Semiempirical one-electron Zeff
RESC
Averaging over vibrational states
Derivative (vibronic) coupling: planned
Interpretive tools
Localized molecular orbitals (LMO)
Localized charge distributions (LCD)
Nuclear and spectroscopic properties
Spin densities at nucleus (ESR)
NMR spin-spin couplings (in progress)
NMR chemical shifts
Polarizabilities, hyperpolarizabilities
IR and Raman intensities
Transition probabilities, Franck-Condon overlaps
QM/MM Methods
Effective fragment potential (EFP) method for
Cluster studies of liquids
Cluster studies of solvent effects
Interfaced with continuum methods for study of
liquids and solvation in bulk
Covalent link for study of enzymes, proteins,
materials
SIMOMM: QM/MM method for surface chemistry
QM part can be any method in GAMESS
MM part from Tinker (Jay Ponder)
Current Capabilities
                    SCF Type
Run Type            RHF    ROHF   UHF    GVB    MCSCF
Energy              CDFP   CDP    CDP    CDP    CDFP
Analytic Gradient   CDFP   CDP    CDP    CDP    CDFP
Numerical Hessian   CDP    CDP    CDP    CDP    CDP
Analytic Hessian    CDP    CDP    -      CDP    CDP
MP2 energy          CDFP   CDP    CDP    -      CP
MP2 gradient        CDFP   DP     CDP    -      -
CC Energy           CDF    -      -      -      -
EOMCC               CD     -      -      -      -
CI energy           CDP    CDP    -      CDP    CDP
CI gradient         CD     -      -      -      -
DFT energy          CDFP   CDP    CDP    -      -
DFT gradient        CDFP   CDP    CDP    -      -
MOPAC Energy        yes    yes    yes    yes    -
MOPAC gradient      yes    yes    yes    -      -
C = conventional storage of AO integrals on disk
D = direct evaluation of AO integrals
F = Fragment Molecular Orbital enabled
P = parallel execution
Solvation
Solvation Methods
Explicit vs. implicit methods
Explicit Methods
TIP3P, TIP4P
SPC, SPC/E
EFP Method for Solvation
Summary of EFP1 method for water
Generalized EFP Method (EFP2)
General Effective Fragment Potential
Discrete solvation method
Fragment potential is a one-electron contribution to the ab
initio Hamiltonian
Potentials
are obtained by separate ab initio calculations
depend on properties of isolated molecules
can be systematically improved
System is divided into
an ab initio region for the “solute” and
a fragment region for the solvent molecules.
$E = E_{\text{ab initio}} + E_{\text{interaction}}$
Hartree-Fock based EFP
Interaction energy consists of electrostatic, polarization
and exchange repulsion/charge transfer terms:
$E_{\text{interaction}} = E_{\text{coulomb}} + E_{\text{polarization}} + E_{\text{exchange repulsion/charge transfer}}$
$E_{\text{interaction}} = \sum_{k=1}^{K} V_k^{\text{Elec}}(\mu, s) + \sum_{l=1}^{L} V_l^{\text{Pol}}(\mu, s) + \sum_{m=1}^{M} V_m^{\text{Rep}}(\mu, s)$
Elec: distributed multipolar expansion
Pol: LMO polarizability expansion
Rep: fit to functional form
EFP results
gOH(r): EFP1/HF, EFP1/DFT, SPC/E
62 waters
[Figure: gOH(r) versus r (Angstroms) for EFP1/HF, EFP1/DFT, SPC/E and Exp (ND)]
Exp (ND): Neutron Diffraction; Soper et al.
Generalized EFP2 Method
Interaction energy consists of electrostatic, polarization
and exchange repulsion terms:
$E_{\text{interaction}} = E_{\text{electrostatic}} + E_{\text{polarization}} + E_{\text{exchange repulsion}}$
Electrostatic: distributed multipolar expansion
Polarization: LMO polarizability expansion
Exchange repulsion: from first principles using LMO overlaps
EFP Performance: Energy + Gradient Calculation
Method (1)      20 water    62 water    122 water   512 water
                molecules   molecules   molecules   molecules
Ab initio (2)   3.19 hrs    -           -           ~157 yrs (3)
EFP2            3.3 sec     26.1 sec    95.3 sec    26.8 min
EFP1/HF         0.2 sec     2.6 sec     5.1 sec     97.8 sec
SPC/E (4)       0.02 sec    0.02 sec    0.1 sec     0.7 sec
(1) Run on 1200 MHz Athlon/Linux machine
(2) Ab initio: DZP basis set
(3) Assuming N^4 scaling
(4) SPC/E = Extended Simple Point Charge model
Fragment Molecular Orbital
Basic idea: work by Kitaura, Ishida and Fedorov at AIST
Divide up the system into fragments.
For each fragment, ignore exchange and self-consistency due to
other fragments but retain the Coulomb field due to the whole system.
Otherwise, do ab initio calculations of the fragments.
Likewise, compute pairs and triples of fragments (n-mers).
FMO Features
No hydrogen caps.
All n-mer calculations are ab initio.
Interfragment charge transfer, dispersion and exchange are
included.
Systematic many-body effects.
Total properties closely reproduce ab initio values.
No fitted parameters.
FMO
Can also add in electron correlation.
MP2
Coupled Cluster
DFT
MCSCF
Can be multilayer, i.e. MCSCF for the active
site, RHF everywhere else.
FMO-MCSCF
One monomer is defined as MCSCF.
Other monomers are RHF.
Dimers including the MCSCF monomer are MCSCF (red lines).
Other dimers are RHF (blue lines).
Applications of FMO
Geometry optimization (1,000 atoms).
Ab initio single point energies (10,000 atoms).
Pair interaction analysis (10,000 atoms):
drug design,
ligand docking,
polymer chemistry,
molecular clusters.
FMO results
Quantum chemistry ...
FMO-RHF/6-31G*, 72.5 hours on 600 Opterons
20,581 atoms, 77,754 electrons, 164,442 basis functions.
T. Ikegami et al., Supercomputing 2005 (best technical paper award).
Running GAMESS
GAMESS runs on
Any Unix-based system available in the U.S.
Any Linux-based system
Any Macintosh
Windows-based systems using WinGAMESS or
PC GAMESS
Obtained from www.msg.ameslab.gov
GAMESS
GAMESS is a back-end program, i.e. no GUI.
Typically it is run via a script.
Input is taken from a file (usually .inp)
Output appears in .log file (stdout)
This is intended to be human readable
MO vectors, coordinates, hessians, etc. appear
in .dat file. Can be used for restarts.
IRC and DRP data and numerical hessian restart
information appear in .irc file.
These are all ASCII text files.
GAMESS Input file
Input files are modular, arranged in $groups
Most common input groups
$SYSTEM: specifies memory, time limit
$CONTRL: specifies basics of calculation
$BASIS: specifies basis set if standard
$DATA: specifies nuclear coordinates, basis set if nonstandard
Other important groups:
$GUESS, $SCF, $FORCE, $HESS, $VEC, $IRC, $VIB
The input file is mostly free-format (i.e. flexible
spacing) except:
The ‘$’ sign specifying a group must be in column 2!
All groups must terminate with a $END (this ‘$’
can be anywhere except column 1).
Anything in column 1 indicates a comment line.
Some key groups
$SYSTEM group:
TIMLIM= (default=525600 min = 1 yr)
MWORDS= (default=1, i.e. 1 million words = 8 MB)
MEMDDI=
relevant for parallel runs
Total required memory (divide by number of processors to get memory requested/node)
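The memory arithmetic for the $SYSTEM settings above can be sketched as follows. This assumes that MWORDS and MEMDDI are both given in millions of 8-byte words, that MWORDS is needed by every process, and that MEMDDI is the total shared out evenly; the function name and the example numbers are hypothetical.

```python
WORD_BYTES = 8  # one word is 8 bytes, so 1 million words is 8 MB


def memory_per_process_mb(mwords, memddi, nproc):
    """Approximate memory per process in MB (10**6 bytes).

    mwords: replicated memory per process, in millions of words
    memddi: total distributed memory for the whole job, in millions of words
    nproc:  number of processes sharing the distributed memory
    """
    words_in_millions = mwords + memddi / nproc
    return words_in_millions * WORD_BYTES


default_mb = memory_per_process_mb(mwords=1, memddi=0, nproc=1)
parallel_mb = memory_per_process_mb(mwords=10, memddi=400, nproc=16)

# The default TIMLIM of 525600 minutes is one 365-day year.
minutes_per_year = 365 * 24 * 60
```

With the default MWORDS=1 this gives 8 MB, matching the slide. In the second case each of 16 processes holds 10 million replicated words plus a 25 million word share of MEMDDI.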
$CONTRL
ICHARG= (specifies charge on system)
MULT= (specifies spin multiplicity)
1 for singlet, 2 for doublet, ...
EXETYP=
Check: checks input for errors
Run: actual run
UNITS=
angs (default)
bohr
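ICHARG and MULT have to agree with each other: an even number of electrons allows only odd multiplicities (singlet, triplet, ...) and an odd number only even ones (doublet, quartet, ...). The check below is a small sketch of that rule; the function names are hypothetical and nothing here is part of GAMESS itself.

```python
def electron_count(atomic_numbers, charge):
    """Number of electrons for the given nuclei and total charge (ICHARG)."""
    return sum(atomic_numbers) - charge


def multiplicity_is_consistent(atomic_numbers, charge, mult):
    """True if MULT = 2S + 1 is possible for this electron count."""
    electrons = electron_count(atomic_numbers, charge)
    unpaired = mult - 1  # MULT = (number of unpaired electrons) + 1
    if mult < 1 or electrons < 0 or unpaired > electrons:
        return False
    # The remaining electrons must be able to pair up.
    return (electrons - unpaired) % 2 == 0


water = [8, 1, 1]  # atomic numbers of O, H, H
neutral_singlet = multiplicity_is_consistent(water, charge=0, mult=1)
neutral_doublet = multiplicity_is_consistent(water, charge=0, mult=2)
cation_doublet = multiplicity_is_consistent(water, charge=1, mult=2)
```

Neutral water has 10 electrons, so MULT=1 is consistent and MULT=2 is not; removing one electron (ICHARG=1) makes the doublet the consistent choice.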
$CONTRL
RUNTYP= (type of run)
Energy (single point energy run)
Gradient (energy 1st derivative wrt coordinates)
Optimize (optimize geometry)
Hessian (energy second derivative, vibrational frequencies,
thermodynamic properties; generates $HESS group in .dat file)
Sadpoint (saddle point search; requires hessian in $HESS
group)
IRC (performs IRC calculation; usually requires $IRC group,
$HESS group)
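The group requirements listed for the run types above can be turned into a quick pre-flight check on an input file. This is only a sketch of the rule of thumb given here (Sadpoint needs $HESS; IRC usually needs $IRC and $HESS); the table REQUIRED_GROUPS and the function missing_groups are hypothetical helpers, not a GAMESS feature.

```python
import re

# Run types and the extra groups they are said to need in the notes above.
REQUIRED_GROUPS = {
    "SADPOINT": ["$HESS"],
    "IRC": ["$IRC", "$HESS"],
}


def missing_groups(runtyp, input_text):
    """Return the required groups that do not appear in the input text."""
    present = {name.upper() for name in re.findall(r"\$[A-Za-z]+", input_text)}
    required = REQUIRED_GROUPS.get(runtyp.upper(), [])
    return [group for group in required if group not in present]


incomplete = " $CONTRL RUNTYP=SADPOINT $END\n"
complete = incomplete + " $HESS\n(hessian data copied from a .dat file)\n $END\n"

before = missing_groups("Sadpoint", incomplete)
after = missing_groups("Sadpoint", complete)
```

Run types with no entry in the table, such as Energy, return an empty list.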
$CONTRL
SCFTYP= (type of wavefunction)
RHF
ROHF
UHF
MCSCF
GVB
MPLEVL=
0 (default, no perturbation theory)
2 (MP2: valid for RHF, ROHF, UHF, MCSCF)
$CONTRL
CCTYP=
NONE (no coupled cluster, default)
LCCD (linearized doubles CC)
CCD (doubles CC)
CCSD (singles+doubles)
CCSD(T) adds perturbative triples to CCSD
Most popular method
Triples essential for accurate calculations
R-CC, CR-CC
Specialized methods to approximate bond breaking
$BASIS: used to select among the built-in basis sets
GBASIS=
STO
N21
N31
TZV...
NGAUSS= (# gaussians for STO, N21, N31)
NDFUNC= (# sets of d’s on heavy atoms)
NPFUNC= (# sets of p’s on hydrogens)
NFFUNC= (# sets of f’s on TM’s)
DIFFSP=.T. (diffuse sp functions on heavy atoms)
DIFFS=.T. (diffuse s functions on hydrogens)
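As an illustration of how the $BASIS keywords above combine, the sketch below maps a few familiar Pople-style basis set names onto keyword sets and formats them as a $BASIS group. The dictionary name, the function name, and the selection of basis sets are choices made for this example; only the keywords themselves come from the list above.

```python
# Common basis set names expressed with the $BASIS keywords listed above.
POPLE_BASIS_KEYWORDS = {
    "STO-3G": {"GBASIS": "STO", "NGAUSS": 3},
    "3-21G": {"GBASIS": "N21", "NGAUSS": 3},
    "6-31G": {"GBASIS": "N31", "NGAUSS": 6},
    "6-31G(d)": {"GBASIS": "N31", "NGAUSS": 6, "NDFUNC": 1},
    "6-31G(d,p)": {"GBASIS": "N31", "NGAUSS": 6, "NDFUNC": 1, "NPFUNC": 1},
    "6-31+G(d)": {"GBASIS": "N31", "NGAUSS": 6, "NDFUNC": 1, "DIFFSP": ".T."},
}


def basis_group(name):
    """Format a one-line $BASIS group; the '$' lands in column 2."""
    keywords = POPLE_BASIS_KEYWORDS[name]
    body = " ".join(f"{key}={value}" for key, value in keywords.items())
    return f" $BASIS {body} $END"


line = basis_group("6-31G(d)")
```

The leading space in the returned string is deliberate, since the group name must start in column 2.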
$DATA: gives the molecular geometry
Title line (will be printed in output)
Symmetry group
C1
CS
CNV 2 (C2V), ...
Blank line except C1
Symbol Z xcoord ycoord zcoord
Symbol = atomic symbol
Z = atomic number
xcoord, ycoord, zcoord = Cartesian coordinates
Internal coordinates are another option
$DATA (continued)
Repeat this line for each symmetry unique atom (see below)
Need to specify basis set after each coordinate line if $BASIS is not
present
Symmetry unique atoms
H2O: O and 1 H
NH3: N and 1 H
Saves CPU time (e.g., numerical hessians only displace symmetry
unique atoms)
Need to follow conventions in GAMESS manual
Cs, Cnh: plane is XY
Cnv: axis is Z
For Cinfv, use C4v
For Dinfh, use D4h
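Putting the groups together, the sketch below assembles a small single-point input for water and checks the column rules for the ‘$’ sign. The geometry (in Angstroms, with the C2 axis along Z and only O plus one H listed) is an illustrative assumption, and both function names are hypothetical; the manual remains the authority on the exact input format.

```python
def water_input():
    """Return an illustrative RHF/STO-3G single-point input for water."""
    lines = [
        " $CONTRL SCFTYP=RHF RUNTYP=ENERGY ICHARG=0 MULT=1 $END",
        " $SYSTEM MWORDS=1 $END",
        " $BASIS GBASIS=STO NGAUSS=3 $END",
        " $DATA",
        "Water, RHF/STO-3G single point",  # title line
        "CNV 2",                           # point group C2v
        "",                                # blank line needed unless group is C1
        "O 8.0 0.0 0.0 0.0",
        "H 1.0 0.0 0.757 0.587",           # the second H is generated by symmetry
        " $END",
    ]
    return "\n".join(lines) + "\n"


def check_dollar_columns(text):
    """List (line number, message) pairs for lines breaking the '$' column rules.

    Only the first '$' on each line is examined.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.lstrip()
        if not stripped.startswith("$"):
            continue
        column = len(line) - len(stripped) + 1
        if stripped.upper().startswith("$END"):
            if column == 1:
                problems.append((number, "$END in column 1"))
        elif column != 2:
            problems.append((number, "group name not in column 2"))
    return problems


problems = check_dollar_columns(water_input())
bad = check_dollar_columns("$CONTRL SCFTYP=RHF $END\n")
```

The generated input passes the check, while a group name typed in column 1 is reported.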
$GUESS: initial MO guess
Built-in guess (default) works much of the time
$GUESS GUESS=MOREAD NORB=xx $END
Requires $VEC group (usually from .dat file)
NORB=# MO’s to be read in
Useful when SCF convergence is difficult
Necessary for MCSCF, CI
GAMESS output
The log file output is intended to be human readable:
 RHF SCF CALCULATION
 NUCLEAR ENERGY = 8.9064898741
 MAXIT = 30 NPUNCH= 2
 EXTRAP=T DAMP=F SHIFT=F RSTRCT=F DIIS=F DEM=F SOSCF=F
 DENSITY MATRIX CONV= 1.00E-05
 MEMORY REQUIRED FOR RHF STEP= 30441 WORDS.
 ITER EX DEM  TOTAL ENERGY     E CHANGE         DENSITY CHANGE  DIIS ERROR
    1  0   0  -74.7936151096   -74.7936151096   .595010038      .000000000
    2  1   0  -74.9519661838   -.1583510742     .180249713      .000000000
 ...
   11  6   0  -74.9659012167   -.0000000014     .000018538      .000000000
   12  7   0  -74.9659012170   -.0000000003     .000008228      .000000000
   13  8   0  -74.9659012171   -.0000000001     .000003650      .000000000
 DENSITY CONVERGED
 TIME TO FORM FOCK OPERATORS= .0 SECONDS ( .0 SEC/ITER)
 TIME TO SOLVE SCF EQUATIONS= .0 SECONDS ( .0 SEC/ITER)
 FINAL RHF ENERGY IS -74.9659012171 AFTER 13 ITERATIONS
The Dat file
The dat file contains formatted numerical data.
Useful, sometimes required, for restarts.
Contains items such as:
MO vectors ($VEC)
Gradient ($GRAD) and Hessian ($HESS)
When copying a group, make sure you copy everything from the beginning $ sign through the
corresponding $END.
GAMESS output
You will need to look at the log file to
verify the results.
Did the run finish correctly?
Was the input specified correctly?
Were there errors in the
computation?
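Before one run's results are fed into the next, part of the log-file check described above can be automated. The sketch below is a minimal example that assumes the "FINAL ... ENERGY IS ... AFTER ... ITERATIONS" and "DENSITY CONVERGED" lines look like the RHF sample shown earlier, and that a successful job prints a line containing "TERMINATED NORMALLY" (an assumption about the log format). The function name and the returned keys are hypothetical, and a script like this does not replace reading the log.

```python
import re

FINAL_ENERGY = re.compile(
    r"FINAL\s+(\S+)\s+ENERGY\s+IS\s+(-?\d*\.\d+)\s+AFTER\s+(\d+)\s+ITERATIONS"
)


def summarize_log(log_text):
    """Collect a few pass/fail indicators from the text of a .log file."""
    matches = FINAL_ENERGY.findall(log_text)
    summary = {
        "converged": "DENSITY CONVERGED" in log_text,
        # Assumed marker of a clean finish; adjust to the actual log wording.
        "terminated_normally": "TERMINATED NORMALLY" in log_text,
        "scf_type": None,
        "energy": None,
        "iterations": None,
    }
    if matches:
        scf_type, energy, iterations = matches[-1]  # last one wins
        summary["scf_type"] = scf_type
        summary["energy"] = float(energy)
        summary["iterations"] = int(iterations)
    return summary


sample = """ DENSITY CONVERGED
 FINAL RHF ENERGY IS -74.9659012171 AFTER 13 ITERATIONS
"""
summary = summarize_log(sample)
```

Taking the last match matters for multi-step runs such as optimizations, where an energy line can appear once per geometry.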
You frequently need the results from one run as
input to another run.
Restarting incomplete runs
Multi-step problems
A saddle point search might take several
optimization and hessian computations
followed by IRC computations.
Multireference computations often require multiple
runs to get the orbital guess correct.
Visualization
A number of programs can visualize GAMESS results to varying degrees.
MacMolPlt is one such program that has
been specifically designed for visualizing
GAMESS output.
Demo
This afternoon I will present a demo of running GAMESS and using MacMolPlt.
Acknowledgments
Mark Gordon
Dmitri Fedorov
the rest of the Gordon group in Ames
Financial Support
Air Force Office of Scientific Research
National Science Foundation
DoD CHSSI Software Development
DOE SciDAC Program
Ames Laboratory
DoD HPC Grand Challenge Program