Use the Local Density Approximation (LDA) for $V_{XC}[\rho(\vec r)]$ (good for Si, C)


  • Use the Local Density Approximation (LDA) for $V_{XC}[\rho(\vec r)]$ (good for Si, C)
  • Many-body Schrödinger equation (exact, but exponential scaling)
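Selfconsistent calculation

$$\left\{ -\tfrac{1}{2}\nabla^2 + V(\vec r, \rho) \right\} \psi_i(\vec r) = E_i\, \psi_i(\vec r)$$

$$\rho(\vec r) = \sum_{i}^{N} |\psi_i(\vec r)|^2$$

  • Selfconsistency: the orbitals $\{\psi_i\}_{i=1,..,N}$ give the density $\rho(\vec r)$, the density gives the potential $V(\vec r, \rho)$, and the potential gives the orbitals again
  • N electrons, N wave functions: the lowest N eigenfunctions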
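The selfconsistency cycle above can be sketched on a toy problem. The following is a minimal illustration, not PARATEC's method: it assumes a 1D grid, a harmonic external potential, a made-up local potential `V = v_ext + g * rho`, and simple linear density mixing. The function name `scf_1d` and all of its parameters are hypothetical.

```python
import numpy as np

def scf_1d(n_electrons=2, n_grid=200, box=10.0, g=1.0, mix=0.3,
           tol=1e-8, max_iter=500):
    """Toy self-consistent loop on a 1D grid (illustrative model only)."""
    x = np.linspace(-box / 2, box / 2, n_grid)
    h = x[1] - x[0]
    # Kinetic operator -1/2 d^2/dx^2 by second-order finite differences.
    off = np.full(n_grid - 1, 1.0)
    lap = (np.diag(off, -1) - 2.0 * np.eye(n_grid) + np.diag(off, 1)) / h**2
    kinetic = -0.5 * lap
    v_ext = 0.5 * x**2                          # assumed external potential
    rho = np.full(n_grid, n_electrons / box)    # flat initial guess
    for it in range(1, max_iter + 1):
        v = v_ext + g * rho                     # assumed V(r, rho), not LDA
        energies, psi = np.linalg.eigh(kinetic + np.diag(v))
        # Keep the lowest N eigenfunctions, normalised on the grid.
        psi = psi[:, :n_electrons] / np.sqrt(h)
        rho_new = np.sum(np.abs(psi) ** 2, axis=1)
        change = np.max(np.abs(rho_new - rho))
        rho = (1.0 - mix) * rho + mix * rho_new  # linear mixing for stability
        if change < tol:
            break
    return x, rho, energies[:n_electrons], it, change

x, rho, levels, n_iter, change = scf_1d()
print(n_iter, levels)
```

The dense `eigh` call plays the role of a direct eigensolver; a production plane-wave code would instead use an iterative solver for only the lowest N states.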
Choice of Basis for DFT (LDA)

  • Increasing basis size M: Gaussian, FLAPW, Fourier, grid
  • Percentage of eigenpairs (M/N): from 30% down to 2%
  • Eigensolvers: direct (ScaLAPACK) for the small bases, iterative for the large ones
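Plane-wave Pseudopotential Method in DFT

Solve the Kohn-Sham equations self-consistently for the electron wavefunctions within the Local Density Approximation:

$$\left\{ -\tfrac{1}{2}\nabla^2 + \int \frac{\rho(\vec r')}{|\vec r - \vec r'|}\, d^3r' - \sum_I \frac{Z}{|\vec r - \vec R_I|} + V_{XC}(\rho(\vec r)) \right\} \psi_j(\vec r) = E_j\, \psi_j(\vec r)$$

1. Plane-wave expansion for the wavefunctions:

$$\psi_{j,\vec k}(\vec r) = \sum_{\vec g} C^{j}_{\vec g}(\vec k)\, e^{i(\vec g + \vec k)\cdot \vec r}$$

2. Replace the "frozen" core by a pseudopotential.

Different parts of the Hamiltonian are calculated in different spaces (Fourier and real), with a 3D FFT used to move between them.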
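The split between Fourier and real space can be illustrated with a small sketch. This is a 1D toy, not PARATEC code: the kinetic term is applied to the plane-wave coefficients directly, while the local potential is applied on the real-space grid between an inverse and a forward FFT. The function `apply_hamiltonian`, the grid size, and the example potentials are assumptions made for illustration.

```python
import numpy as np

def apply_hamiltonian(c_g, v_real, g_vectors, k=0.0):
    """Apply H = -1/2 d^2/dx^2 + V(x) to plane-wave coefficients C_g (1D toy)."""
    # Kinetic term is diagonal in Fourier space: 1/2 |g + k|^2 C_g.
    kinetic = 0.5 * (g_vectors + k) ** 2 * c_g
    # Local potential is diagonal in real space: transform, multiply, transform back.
    psi_real = np.fft.ifft(c_g)
    potential = np.fft.fft(v_real * psi_real)
    return kinetic + potential

# Assumed periodic cell of length 2*pi sampled on n points, so g takes integer values.
n, length = 32, 2 * np.pi
x = np.arange(n) * length / n
g_vectors = 2 * np.pi * np.fft.fftfreq(n, d=length / n)

# A single plane wave with g = 3 in a constant potential is an eigenstate.
c = np.zeros(n, dtype=complex)
c[3] = 1.0
h_c = apply_hamiltonian(c, np.full(n, 0.7), g_vectors)
print(h_c[3].real)  # 0.5 * 3**2 + 0.7
```

With a non-constant potential such as `np.cos(x)`, the same call couples the g = 3 coefficient to g = 2 and g = 4, which is the convolution that the FFT round trip avoids computing explicitly.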
PARATEC (PARAllel Total Energy Code)

  • PARATEC performs first-principles quantum mechanical total energy calculations using pseudopotentials and a plane-wave basis set
  • Written in F90 and MPI
  • Designed to run on large parallel machines (IBM SP etc.) but also runs on PCs
  • PARATEC uses an all-band CG approach to obtain the wavefunctions of the electrons
  • Generally obtains a high percentage of peak on different platforms
  • Developed with Louie and Cohen's groups (UCB, LBNL), Raczkowski
PARATEC: Code Details

  • Code written in F90 and MPI (~50,000 lines)
  • 33% 3D FFT, 33% BLAS3, 33% hand-coded F90
  • Global communications in the 3D FFT (transpose)
  • Parallel 3D FFT is handwritten to minimize communications and reduce latency (written on top of the vendor-supplied 1D complex FFT)
PARATEC: Parallel Data Distribution and 3D FFT

  • Load balance the sphere by giving columns to different processors
  • 3D FFT done via 3 sets of 1D FFTs and 2 transposes
  • Most communication is in the global transpose, (b) to (c); little communication in (d) to (e)
  • Flops/Comms ~ log N
  • Many FFTs done at the same time to avoid latency issues
  • Only non-zero elements communicated/calculated
  • Much faster than the vendor-supplied 3D FFT

[Figure: panels (a) through (f) of the parallel data distribution and 3D FFT]
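The decomposition of a 3D FFT into three sets of 1D FFTs and two transposes can be checked on a single process. The sketch below is an illustration in NumPy, not the PARATEC implementation: the function name `fft3d_by_transposes` is hypothetical, and the transposes here are local axis permutations, whereas in a distributed code they would be communication steps.

```python
import numpy as np

def fft3d_by_transposes(a):
    """3D FFT from three sets of 1D FFTs and two transposes (single process)."""
    a = np.fft.fft(a, axis=2)      # 1D FFTs along z; data indexed (x, y, z)
    a = a.transpose(0, 2, 1)       # transpose 1: data now indexed (x, z, y)
    a = np.fft.fft(a, axis=2)      # 1D FFTs along y
    a = a.transpose(2, 1, 0)       # transpose 2: data now indexed (y, z, x)
    a = np.fft.fft(a, axis=2)      # 1D FFTs along x
    # Final reordering back to (x, y, z) is only for comparison with fftn.
    return a.transpose(2, 0, 1)

rng = np.random.default_rng(0)
data = rng.standard_normal((4, 6, 8)) + 1j * rng.standard_normal((4, 6, 8))
print(np.allclose(fft3d_by_transposes(data), np.fft.fftn(data)))
```

Each 1D transform always runs along the last, contiguous axis, which is the reason for transposing the data rather than striding through it.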
PARATEC: Performance

  • All architectures generally achieve high performance due to the computational intensity of the code (BLAS3, FFT)
  • ES achieves the highest overall performance to date: 5.5 Tflop/s on 2048 procs
  • Main ES advantage for this code is its fast interconnect
  • SX8 achieves the highest per-processor performance
  • X1 shows the lowest % of peak; non-vectorizable code is much more expensive on X1
  • IBM Power5: 4.8 Gflops/P (63% of peak on 64 procs)
  • BGL: 478 Mflops/P (17% of peak on 512 procs)

Table header (no data rows survive): Problem | P | then Gflops/P and %peak for each of NERSC (Power3), Jacquard (Opteron), Thunder (Itanium2), ORNL Cray (X1), NEC ES (SX6*), NEC SX8
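The per-processor figures quoted above can be cross-checked with simple arithmetic. This short sketch uses only numbers from the slide; the helper name `implied_peak` is hypothetical, and the results are implied values, not vendor specifications.

```python
def implied_peak(rate_per_proc, fraction_of_peak):
    """Peak rate implied by an achieved rate and its stated fraction of peak."""
    return rate_per_proc / fraction_of_peak

power5_peak = implied_peak(4.8, 0.63)    # Gflop/s per processor
bgl_peak = implied_peak(478.0, 0.17)     # Mflop/s per processor
es_per_proc = 5.5e3 / 2048               # 5.5 Tflop/s spread over 2048 procs, in Gflop/s

print(round(power5_peak, 2), round(bgl_peak), round(es_per_proc, 2))
# prints: 7.62 2812 2.69
```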

  • Fall '19
  • Electron, density functional theory, Y. Wang
