RT-TDDFT GPU Acceleration: RT-TD now fully support GPU computation #5773

Merged 45 commits on Jan 22, 2025
Changes from 3 commits
Commits
eee8b75
Phase 1 of RT-TDDFT GPU Acceleration: Rewriting existing code using T…
AsTonyshment Dec 26, 2024
aa4ceb1
[pre-commit.ci lite] apply automatic fixes
pre-commit-ci-lite[bot] Dec 26, 2024
069c434
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Dec 27, 2024
e45398a
Initialize int info in bandenergy.cpp
AsTonyshment Dec 27, 2024
a6040ec
Initialize double aa, bb in bandenergy.cpp
AsTonyshment Dec 27, 2024
0bebb32
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Dec 30, 2024
ac4e737
Merge branch 'TDDFT_GPU_phase_1' of github.com:AsTonyshment/abacus-de…
AsTonyshment Dec 30, 2024
8ed6407
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Dec 31, 2024
e67b42f
Merge branch 'TDDFT_GPU_phase_1' of github.com:AsTonyshment/abacus-de…
AsTonyshment Dec 31, 2024
9e4b889
Fix a bug where CopyFrom caused shared data between tensors, using =(…
AsTonyshment Dec 31, 2024
9ca053d
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Jan 2, 2025
ba12e92
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Jan 3, 2025
3110720
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Jan 3, 2025
eda3add
RT-TDDFT GPU Acceleration (Phase 2): Adding needed BLAS and LAPACK su…
AsTonyshment Jan 3, 2025
4685fb8
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Jan 6, 2025
717c164
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Jan 6, 2025
e3c493d
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Jan 10, 2025
d89f9a3
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Jan 11, 2025
7f94b4d
LAPACK wrapper functions: change const basic-type input parameters fr…
AsTonyshment Jan 13, 2025
0e458b9
Did nothing, just formatting esolver.cpp
AsTonyshment Jan 13, 2025
824168d
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Jan 14, 2025
b9f8ca4
Merge branch 'TDDFT_GPU_phase_1' of github.com:AsTonyshment/abacus-de…
AsTonyshment Jan 14, 2025
bdc6cf6
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Jan 15, 2025
fbe01cd
Merge branch 'TDDFT_GPU_phase_1' of github.com:AsTonyshment/abacus-de…
AsTonyshment Jan 15, 2025
5044ac5
Merge branch 'develop' into TDDFT_GPU_phase_1
AsTonyshment Jan 17, 2025
d732808
Merge branch 'TDDFT_GPU_phase_1' of github.com:AsTonyshment/abacus-de…
AsTonyshment Jan 17, 2025
0e6c42c
Core algorithm: RT-TD now has preliminary support for GPU computation
AsTonyshment Jan 17, 2025
20fd170
Fix GitHub CI CUDA build bug due to deleted variable
AsTonyshment Jan 17, 2025
1d9e60f
Refactor some files
AsTonyshment Jan 18, 2025
c6559dd
Getting ready for gathering MPI processes
AsTonyshment Jan 18, 2025
698bec2
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Jan 18, 2025
38ad956
Merge branch 'TDDFT_GPU_phase_1' of github.com:AsTonyshment/abacus-de…
AsTonyshment Jan 18, 2025
4f24415
MPI multi-process compatibility
AsTonyshment Jan 19, 2025
cca5fa9
Fix GitHub CI MPI compilation bug
AsTonyshment Jan 19, 2025
62df525
Minor fix and refactor
AsTonyshment Jan 20, 2025
8b526a9
Merge branch 'deepmodeling:develop' into TDDFT_GPU_phase_1
AsTonyshment Jan 20, 2025
fde9d05
Initialize double aa, bb and one line for one variable
AsTonyshment Jan 21, 2025
87893a9
Rename bandenergy.cpp to band_energy.cpp and corresponding adjustments
AsTonyshment Jan 21, 2025
a02a352
Fix compile error and change CMakeLists accordingly
AsTonyshment Jan 21, 2025
2bdc83f
Merge branch 'TDDFT_GPU_phase_1' of github.com:AsTonyshment/abacus-de…
AsTonyshment Jan 21, 2025
214bdb8
Initialize int naroc
AsTonyshment Jan 21, 2025
e4ab72a
Initialize MPI related variables: myid, num_procs and root_proc
AsTonyshment Jan 21, 2025
dc54ffd
Refactor Propagator class implementation into multiple files for bett…
AsTonyshment Jan 21, 2025
079f791
Remove all GlobalV::ofs_running from RT-TDDFT core algorithms and pas…
AsTonyshment Jan 21, 2025
c0ca245
Add assert in some places and optimize redundant index calculations i…
AsTonyshment Jan 21, 2025
9 changes: 7 additions & 2 deletions source/module_esolver/esolver_ks_lcao_tddft.cpp
@@ -24,12 +24,12 @@
 #include "module_io/print_info.h"

 //-----HSolver ElecState Hamilt--------
+#include "module_elecstate/cal_ux.h"
 #include "module_elecstate/elecstate_lcao.h"
 #include "module_hamilt_lcao/hamilt_lcaodft/hamilt_lcao.h"
 #include "module_hsolver/hsolver_lcao.h"
 #include "module_parameter/parameter.h"
 #include "module_psi/psi.h"
-#include "module_elecstate/cal_ux.h"

 //-----force& stress-------------------
 #include "module_hamilt_lcao/hamilt_lcaodft/FORCE_STRESS.h"
@@ -290,7 +290,12 @@ void ESolver_KS_LCAO_TDDFT::after_scf(UnitCell& ucell, const int istep)
         {
             std::stringstream ss_dipole;
             ss_dipole << PARAM.globalv.global_out_dir << "SPIN" << is + 1 << "_DIPOLE";
-            ModuleIO::write_dipole(ucell,pelec->charge->rho_save[is], pelec->charge->rhopw, is, istep, ss_dipole.str());
+            ModuleIO::write_dipole(ucell,
+                                   pelec->charge->rho_save[is],
+                                   pelec->charge->rhopw,
+                                   is,
+                                   istep,
+                                   ss_dipole.str());
         }
     }
     if (TD_Velocity::out_current == true)
138 changes: 134 additions & 4 deletions source/module_hamilt_lcao/module_tddft/bandenergy.cpp
@@ -1,12 +1,12 @@
 #include "bandenergy.h"

-#include <complex>
-#include <iostream>
-
 #include "evolve_elec.h"
 #include "module_base/lapack_connector.h"
 #include "module_base/scalapack_connector.h"

+#include <complex>
+#include <iostream>
+
 namespace module_tddft
 {
 #ifdef __MPI
@@ -133,14 +133,144 @@ void compute_ekb(const Parallel_Orbitals* pv,
                 }
             }
         } // loop ipcol
-    } // loop iprow
+    } // loop iprow
     info = MPI_Allreduce(Eii, ekb, nband, MPI_DOUBLE, MPI_SUM, pv->comm());

     delete[] tmp1;
     delete[] Eij;
     delete[] Eii;
 }

+void compute_ekb_tensor(const Parallel_Orbitals* pv,
+                        const int nband,
+                        const int nlocal,
+                        const container::Tensor& Htmp,
+                        const container::Tensor& psi_k,
+                        container::Tensor& ekb)
+{
+    // Create Tensor objects for temporary data
+    container::Tensor tmp1(container::DataType::DT_COMPLEX_DOUBLE,
+                           container::DeviceType::CpuDevice,
+                           container::TensorShape({pv->nloc_wfc}));
+    tmp1.zero();
+
+    container::Tensor Eij(container::DataType::DT_COMPLEX_DOUBLE,
+                          container::DeviceType::CpuDevice,
+                          container::TensorShape({pv->nloc}));
+    Eij.zero();
+
+    // Perform matrix multiplication: tmp1 = Htmp * psi_k
+    ScalapackConnector::gemm('N',
+                             'N',
+                             nlocal,
+                             nband,
+                             nlocal,
+                             1.0,
+                             Htmp.data<std::complex<double>>(),
+                             1,
+                             1,
+                             pv->desc,
+                             psi_k.data<std::complex<double>>(),
+                             1,
+                             1,
+                             pv->desc_wfc,
+                             0.0,
+                             tmp1.data<std::complex<double>>(),
+                             1,
+                             1,
+                             pv->desc_wfc);
+
+    // Perform matrix multiplication: Eij = psi_k^dagger * tmp1
+    ScalapackConnector::gemm('C',
+                             'N',
+                             nband,
+                             nband,
+                             nlocal,
+                             1.0,
+                             psi_k.data<std::complex<double>>(),
+                             1,
+                             1,
+                             pv->desc_wfc,
+                             tmp1.data<std::complex<double>>(),
+                             1,
+                             1,
+                             pv->desc_wfc,
+                             0.0,
+                             Eij.data<std::complex<double>>(),
+                             1,
+                             1,
+                             pv->desc_Eij);
+
+    if (Evolve_elec::td_print_eij >= 0.0)
+    {
+        GlobalV::ofs_running
+            << "------------------------------------------------------------------------------------------------"
+            << std::endl;
+        GlobalV::ofs_running << " Eij:" << std::endl;
+        for (int i = 0; i < pv->nrow_bands; i++)
+        {
+            for (int j = 0; j < pv->ncol_bands; j++)
+            {
+                double aa, bb;
+                aa = Eij.data<std::complex<double>>()[i * pv->ncol + j].real();
+                bb = Eij.data<std::complex<double>>()[i * pv->ncol + j].imag();
+                if (std::abs(aa) < Evolve_elec::td_print_eij)
+                    aa = 0.0;
+                if (std::abs(bb) < Evolve_elec::td_print_eij)
+                    bb = 0.0;
+                if (aa > 0.0 || bb > 0.0)
+                {
+                    GlobalV::ofs_running << i << " " << j << " " << aa << "+" << bb << "i " << std::endl;
+                }
+            }
+        }
+        GlobalV::ofs_running << std::endl;
+        GlobalV::ofs_running
+            << "------------------------------------------------------------------------------------------------"
+            << std::endl;
+    }
+
+    int info;
+    int naroc[2];
+
+    // Create a Tensor for Eii
+    container::Tensor Eii(container::DataType::DT_DOUBLE,
+                          container::DeviceType::CpuDevice,
+                          container::TensorShape({nband}));
+    Eii.zero();
+
+    for (int iprow = 0; iprow < pv->dim0; ++iprow)
+    {
+        for (int ipcol = 0; ipcol < pv->dim1; ++ipcol)
+        {
+            if (iprow == pv->coord[0] && ipcol == pv->coord[1])
+            {
+                naroc[0] = pv->nrow;
+                naroc[1] = pv->ncol;
+                for (int j = 0; j < naroc[1]; ++j)
+                {
+                    int igcol = globalIndex(j, pv->nb, pv->dim1, ipcol);
+                    if (igcol >= nband)
+                        continue;
+                    for (int i = 0; i < naroc[0]; ++i)
+                    {
+                        int igrow = globalIndex(i, pv->nb, pv->dim0, iprow);
+                        if (igrow >= nband)
+                            continue;
+                        if (igcol == igrow)
+                        {
+                            Eii.data<double>()[igcol] = Eij.data<std::complex<double>>()[j * naroc[0] + i].real();
+                        }
+                    }
+                }
+            }
+        } // loop ipcol
+    } // loop iprow
+
+    // Perform MPI reduction to compute ekb
+    info = MPI_Allreduce(Eii.data<double>(), ekb.data<double>(), nband, MPI_DOUBLE, MPI_SUM, pv->comm());
+}
+
 #endif

} // namespace module_tddft
8 changes: 8 additions & 0 deletions source/module_hamilt_lcao/module_tddft/bandenergy.h
@@ -6,6 +6,7 @@
 #ifndef BANDENERGY_H
 #define BANDENERGY_H

+#include "module_base/module_container/ATen/core/tensor.h" // container::Tensor
 #include "module_basis/module_ao/parallel_orbitals.h"

 #include <complex>
@@ -29,6 +30,13 @@ void compute_ekb(const Parallel_Orbitals* pv,
                  const std::complex<double>* Htmp,
                  const std::complex<double>* psi_k,
                  double* ekb);
+
+void compute_ekb_tensor(const Parallel_Orbitals* pv,
+                        const int nband,
+                        const int nlocal,
+                        const container::Tensor& Htmp,
+                        const container::Tensor& psi_k,
+                        container::Tensor& ekb);
 #endif
 } // namespace module_tddft
#endif
65 changes: 63 additions & 2 deletions source/module_hamilt_lcao/module_tddft/evolve_elec.cpp
@@ -10,8 +10,8 @@

 namespace module_tddft
 {
-Evolve_elec::Evolve_elec(){};
-Evolve_elec::~Evolve_elec(){};
+Evolve_elec::Evolve_elec() {};
+Evolve_elec::~Evolve_elec() {};

 double Evolve_elec::td_force_dt;
 bool Evolve_elec::td_vext;
@@ -73,6 +73,67 @@ void Evolve_elec::solve_psi(const int& istep,
                        &(ekb(ik, 0)),
                        htype,
                        propagator);
+
+            const bool use_tensor = false;
+            if (use_tensor)
+            {
+                std::cout << "Print ekb: " << std::endl;
+                ekb.print(std::cout);
+                std::cout << "nband = " << nband << std::endl;
+                std::cout << "psi->get_nbands() = " << psi->get_nbands() << std::endl;
+                std::cout << "nlocal = " << nlocal << std::endl;
+                std::cout << "psi->get_nbasis() = " << psi->get_nbasis() << std::endl;
+                std::cout << "ekb.nr = " << ekb.nr << std::endl;
+                std::cout << "ekb.nc = " << ekb.nc << std::endl;
+
+                // Create TensorMap for psi_k, psi_k_laststep, H_laststep, S_laststep, ekb
+                container::TensorMap psi_k_tensor(psi[0].get_pointer(),
+                                                  container::DataType::DT_COMPLEX_DOUBLE,
+                                                  container::DeviceType::CpuDevice,
+                                                  container::TensorShape({psi->get_nbands(), psi->get_nbasis()}));
+                container::TensorMap psi_k_laststep_tensor(
+                    psi_laststep[0].get_pointer(),
+                    container::DataType::DT_COMPLEX_DOUBLE,
+                    container::DeviceType::CpuDevice,
+                    container::TensorShape({psi->get_nbands(), psi->get_nbasis()}));
+                container::TensorMap H_laststep_tensor(Hk_laststep[ik],
+                                                       container::DataType::DT_COMPLEX_DOUBLE,
+                                                       container::DeviceType::CpuDevice,
+                                                       container::TensorShape({para_orb.nloc}));
+                container::TensorMap S_laststep_tensor(Sk_laststep[ik],
+                                                       container::DataType::DT_COMPLEX_DOUBLE,
+                                                       container::DeviceType::CpuDevice,
+                                                       container::TensorShape({para_orb.nloc}));
+                container::TensorMap ekb_tensor(&(ekb(ik, 0)),
+                                                container::DataType::DT_DOUBLE,
+                                                container::DeviceType::CpuDevice,
+                                                container::TensorShape({nband}));
+
+                evolve_psi_tensor(nband,
+                                  nlocal,
+                                  &(para_orb),
+                                  phm,
+                                  psi_k_tensor,
+                                  psi_k_laststep_tensor,
+                                  H_laststep_tensor,
+                                  S_laststep_tensor,
+                                  ekb_tensor,
+                                  htype,
+                                  propagator);
+                // evolve_psi_tensor(nband,
+                //                   nlocal,
+                //                   &(para_orb),
+                //                   phm,
+                //                   psi[0].get_pointer(),
+                //                   psi_laststep[0].get_pointer(),
+                //                   Hk_laststep[ik],
+                //                   Sk_laststep[ik],
+                //                   &(ekb(ik, 0)),
+                //                   htype,
+                //                   propagator);
+                std::cout << "Print ekb tensor: " << std::endl;
+                ekb.print(std::cout);
+            }
         }
         else
{
2 changes: 2 additions & 0 deletions source/module_hamilt_lcao/module_tddft/evolve_elec.h
@@ -3,6 +3,8 @@

 #include "module_base/global_function.h"
 #include "module_base/global_variable.h"
+#include "module_base/module_container/ATen/core/tensor.h" // container::Tensor
+#include "module_base/module_container/ATen/core/tensor_map.h" // TensorMap
 #include "module_esolver/esolver_ks_lcao.h"
 #include "module_esolver/esolver_ks_lcao_tddft.h"
 #include "module_hamilt_lcao/hamilt_lcaodft/hamilt_lcao.h"