diff options
author | 2020-04-14 15:51:10 -0700 | |
---|---|---|
committer | 2020-04-14 15:51:10 -0700 | |
commit | d61cdd0fbef823f5135f7e9c4251ec09ee586651 (patch) | |
tree | 7fc39e980f69013840402c9419c7e2134b3d3f6c /Source/FieldSolver/SpectralSolver/SpectralFieldDataRZ.cpp | |
parent | b28820fe545779411bfb32ca3c574ad89a2a611a (diff) | |
download | WarpX-d61cdd0fbef823f5135f7e9c4251ec09ee586651.tar.gz WarpX-d61cdd0fbef823f5135f7e9c4251ec09ee586651.tar.zst WarpX-d61cdd0fbef823f5135f7e9c4251ec09ee586651.zip |
Implementation of the RZ spectral solver (#816)
* For diagnostics, added RZ modes of scalars, allowed different centerings
* For RZ, generalized the centering of the inverse volume scaling of J and rho
* Fixed spacing in ConstructTotalRZScalarField
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* Added Python wrapper of charge density arrays
* Add assert ensuring that Jr is never node-centered
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* Small fixes to Python to better handle particle weights
* Implementation of the RZ spectral solver
* Removed k-space filtering code
* Removed more k-space filter code from RZ spectral solver
* For RZ spectral, added _rt for literals and cleaned up namespace use
* In RZ spectral solver, cleaned up some member names
* Update Docs/source/building/rzgeometry.rst
Small fix for clarity.
Co-Authored-By: Remi Lehe <remi.lehe@normalesup.org>
* Update Source/Evolve/WarpXEvolve.cpp
Fix macro indentation
Co-Authored-By: Remi Lehe <remi.lehe@normalesup.org>
* Update Source/Evolve/WarpXEvolve.cpp
Fix more macro indentation
Co-Authored-By: Remi Lehe <remi.lehe@normalesup.org>
* Update Source/Evolve/WarpXEvolve.cpp
Fix another macro indentation
Co-Authored-By: Remi Lehe <remi.lehe@normalesup.org>
* New diagnostics support RZ (#836)
* Start implementation of new averaging with staggering:
- face-to-cell-center and edge-to-cell-center replaced so far;
- TODO: node-to-cell-center and 1D behavior (AMREX_SPACEDIM=1).
* first implementation of Diags base classes
* add example, temporarily
* Continue implementation of new averaging with staggering:
- new function takes reference to single MultiFab (no vector);
- TODO: node-to-cell-center still in progress.
* Fix small bug and clean up
* Fix bug in loop over n=0,...,ncomp-1 and clean up
* add more functions
* Add Doxygen documentation and clean up
* Small clean-up in Doxygen documentation
* Compile in single precision: add _rt suffix to avoid unnecessary conversions
* Avoid accessing staggering index directly from IntVect in innermost loops
* Replace do-while loop with for loop (default ncomp=1)
* Remove temporary pointer and pass reference to MultiFab (instead of MultiFab*)
* Replace AMREX_LAUNCH_HOST_DEVICE_LAMBDA with ParallelFor
* cleaning and initialize output mf
* use general average routine
* move flush in new class, and implemented the Plotfile derived class
* add comments
* eol
* free memory in destructor
* typo
* typo
* no need to clear MF pointers there
* though shalt not break existing tests
* FlushRaw doesnt have to be virtual for now
* The importance of being constant
* Capability to select fields in output files
* EOL
* revert to old inputs
* const in right place
* avoid brace initializer there
* oops, fix logic error in is_in
* user can choose flush interval, same behavior as plot_int
* Add option to plot raw fields
* eol
* replace ter flush with dump to avoid confusion
* add options
* Diagnostics stores a vector of functors to compute diags on the fly
* eol
* Field gather from diags to sync particle quantities
* New diagnostics handle RZ with same behavior as old ones
* cleaning and doc
* const ref for string
* smarter for loop from Axel and typo fix from Reva
* simplify code following Dave's comments
* rename mode_avg to convertRZmodes2cartesian
* Update CellCenterFunctor.H
* fill varnames and vector of functors at the same time
* WarpX instance not needed here
* add const
Co-authored-by: Edoardo Zoni <ezoni@lbl.gov>
* Add load balance options documentation (#842)
* Add load balance options documentation
* Add load balance options documentations
* EOL
* Replace tilebox by growntilebox (#849)
* Updated Profiling information in running_cpp (#776)
* Fixed link that was pointing to 404 error page
* Added motivation for profiling and TINYPROFILERS explanation
* Moved section on NERSC profiling to developers Docs
* Update Docs/source/running_cpp/profiling.rst
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* Update tiny profilers suggestion Docs/source/running_cpp/profiling.rst
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* Fix typo Docs/source/running_cpp/profiling.rst
Co-authored-by: MaxThevenet <mthevenet@lbl.gov>
* Add a few additional diags (divE etc.) (#844)
* Start implementation of new averaging with staggering:
- face-to-cell-center and edge-to-cell-center replaced so far;
- TODO: node-to-cell-center and 1D behavior (AMREX_SPACEDIM=1).
* first implementation of Diags base classes
* add example, temporarily
* Continue implementation of new averaging with staggering:
- new function takes reference to single MultiFab (no vector);
- TODO: node-to-cell-center still in progress.
* Fix small bug and clean up
* Fix bug in loop over n=0,...,ncomp-1 and clean up
* add more functions
* Add Doxygen documentation and clean up
* Small clean-up in Doxygen documentation
* Compile in single precision: add _rt suffix to avoid unnecessary conversions
* Avoid accessing staggering index directly from IntVect in innermost loops
* Replace do-while loop with for loop (default ncomp=1)
* Remove temporary pointer and pass reference to MultiFab (instead of MultiFab*)
* Replace AMREX_LAUNCH_HOST_DEVICE_LAMBDA with ParallelFor
* cleaning and initialize output mf
* use general average routine
* move flush in new class, and implemented the Plotfile derived class
* add comments
* eol
* free memory in destructor
* typo
* typo
* no need to clear MF pointers there
* though shalt not break existing tests
* FlushRaw doesnt have to be virtual for now
* The importance of being constant
* Capability to select fields in output files
* EOL
* revert to old inputs
* const in right place
* avoid brace initializer there
* oops, fix logic error in is_in
* user can choose flush interval, same behavior as plot_int
* Add option to plot raw fields
* eol
* replace ter flush with dump to avoid confusion
* add options
* Diagnostics stores a vector of functors to compute diags on the fly
* eol
* Field gather from diags to sync particle quantities
* New diagnostics handle RZ with same behavior as old ones
* cleaning and doc
* const ref for string
* smarter for loop from Axel and typo fix from Reva
* Functors to compute some fields
* simplify code following Dave's comments
* Create subfolders and add more output options (divE etc.)
* eol
* rename mode_avg to convertRZmodes2cartesian
* Update CellCenterFunctor.H
* fill varnames and vector of functors at the same time
* output rho_new, not rho_old
* WarpX instance not needed here
* add const
* little bit more of reorganization
* Apply suggestions from code review
Co-Authored-By: Axel Huebl <axel.huebl@plasma.ninja>
* add a bunch of const
* make derived classes final
Co-authored-by: Edoardo Zoni <ezoni@lbl.gov>
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
* Add Initial Distribution Test (#735)
* Add Histogram
* Add normalization
* Add doc
* Minor
* Minor
* Fix a bug
* Add gaussian distribution test
* Fix alert and change amr.plot_int
* Add maxwell-boltzmann distribution test
* Add maxwell-boltzmann distribution test
* Add maxwell-boltzmann distribution test
* Add maxwell-juttner
* Minor
* Typo
* Minor
* Minor
* Add const
* Apply suggestions from code review
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* Modify based on suggestions.
* Add histogram name
* Add bin values
* Don't add histogram name
* Modify read_raw_data.py
* Add doc
* Change ux,uy,uz units
* Change ux,uy,uz units
* Change if format
* Save some variables
* Change more
* Minor
* Fix a bug on GPU
* Fix a bug on GPU
* Add wrong species name abort
* Minor doc
* Change #include format
* Apply suggestions from code review
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* Add const
* Change to member variables
* revert
* Change units based on changes of PR#727
* merge
* Add Gaussian position distribution test
* Minor
* Change based on suggestions
* Use read_raw_data.py
* Minor
* Change to no normalization
* Add more in doc
* doc
* doc
* Use relative error
* Don't divide by bin_size
* Change based on suggestions
Co-authored-by: MaxThevenet <mthevenet@lbl.gov>
* Tests: Fix Bool Switch Typo OMP (#854)
useOMP is 0 (False) or 1 (True)
* Costs vector of (pointer to) vector (#829)
* [WIP] costs mf --> costs vector
* [WIP] costs vector
* [WIP] vector costs
* formatting
* makefile
* [WIP] costs vector
* [WIP] *= costs
* wts do not need to divide by num cells
* Tiling safety on CPU
* Add tests
* EOL
* Remove unneeded input
* Update Source/WarpX.H costs documentation
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* Update timers with times only if user Timers update
* warpx.-->WarpX::
* warpx.-->WarpX::
* warpx.-->WarpX::
* warpx.-->WarpX::
* warpx.-->WarpX::
* add dev synch
* Update Regression/WarpX-tests.ini
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* Delete inputs_loadbalance_costs_heuristic
* Update and rename inputs_loadbalancecosts_timers to inputs_loadbalancecosts
* Update WarpX-tests.ini
Co-authored-by: MaxThevenet <mthevenet@lbl.gov>
* [mini-PR] Read species distribution from OPMD file (#847)
* Added <species>.profile=external_file and .profile_file
* Added description of input parameters to Docs
* Changed from profile to injection option for external file
* Fix typo in amrex abort message (due to copy paste)
* Added the OpenPMD use amrex abort message
* Minor fix - not sure how to remove EOL issue
* Tried to add AddExternalFileBeam functon to PhysicalParticleContainer
* Trued to fix EOL white space issue
* Added read/print species name from OPMD file
* Update Source/Initialization/PlasmaInjector.cpp
Co-Authored-By: Axel Huebl <axel.huebl@plasma.ninja>
* Update Source/Particles/PhysicalParticleContainer.cpp
Co-Authored-By: Axel Huebl <axel.huebl@plasma.ninja>
* Update Source/Particles/PhysicalParticleContainer.cpp
Co-Authored-By: Axel Huebl <axel.huebl@plasma.ninja>
* Update Source/Particles/PhysicalParticleContainer.H
Co-Authored-By: Axel Huebl <axel.huebl@plasma.ninja>
* Update Source/Particles/PhysicalParticleContainer.cpp
Co-Authored-By: Axel Huebl <axel.huebl@plasma.ninja>
* No need to include openPMD header yet
* Fix EOL according to @ax3l's recommendation in #845
* Remove commented out AbortMessage
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* Removed commented out part initialization (used only in branch for next PR)
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* Added warning that this is WIP
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* Changed function name to AddPlasmaFromFile
* Removed AMReX warning from loop
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
Co-authored-by: MaxThevenet <mthevenet@lbl.gov>
* Ignore python build/dist and egg folders (#850)
* Travis CI: set max numprocs=2 and do not overwrite (#860)
* [mini-PR] Fix bug in Breit-Wheeler engine (#852)
* fixed bug in BW engine
* fixed bug
* fixed bug
* fixed bug
* fixed bug
* fixed bug
* eliminate useless variable
* updated test
* updated inputfile
* Updated tests
* increase tolerance from .04 to .07 in QED 3D BW test
* do plot pos_bw and ele_bw
Co-authored-by: MaxThevenet <mthevenet@lbl.gov>
* Documentation update - towards full SI (#301)
* Added blank line after list. Changed characters in link to Q. H. Liu paper so hyoerlink works with sphinx-build 2.1.2.
* Added line cut unintentionally.
* Removed line added unintentionally.
* Same as before.
* Same as before, but hopefully successfully
* Same as before, but hopefully successfully
* Minor changes to description of PWFA example run
* Revert "Minor changes to description of PWFA example run"
This reverts commit a4d7fa969c906959b018efe683a3e585cbd741f9.
* Revert "Profiler wrapper to allow for cudaDeviceSynchronize (#738)"
This reverts commit bbefc3dad687f4370afd5bc85386d983201cb321.
* Revert "Revert "Minor changes to description of PWFA example run""
This reverts commit 965982d35361ff54d0ad10101ffc31605117e5ac.
* Revert "Minor changes to description of PWFA example run"
This reverts commit a4d7fa969c906959b018efe683a3e585cbd741f9.
* I am making a huge mess with merging
* Minor changes to description of PWFA example run
* Added explanation PWFA simulation section
* Re-structuring. Adding sections for each choice.
* Minor fix to note
* Minor changes to text
* Time step description + fixed line length
* Added FDTD and CKC selection
* Added max time step calculations
* Trying to fix EOL issue
* Added mesh refinement and moving window
* Fixed minor issues
* Fix EOL issues again
* Fixed typo - auxiliary
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
Co-authored-by: Diana Amorim <diana@henrivincenti.dhcp.lbl.gov>
Co-authored-by: MaxThevenet <mthevenet@lbl.gov>
* Remove compiler warnings (#843)
* Fix compiler warnings with DIM=2
* Fix compiler warnings with USE_RZ=TRUE
* Fix compiler warnings with USE_PSATD=TRUE and DIM=2
* Fix compiler warnings with USE_PSATD=TRUE and DIM=3
* Fix bug: discard only return value when calling DefineAndReturnParticleTile
* Remove unused variables not triggering warnings
* [mini-PR] Fix energy calculation for photons in reduced diagnostics (#861)
* fix energy calculation for photons
* fixed typo and mada calculation clearer
* added photon particles in reduced diags test
* Update Source/Diagnostics/ReducedDiags/ParticleEnergy.cpp
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* Update Source/Diagnostics/ReducedDiags/ParticleEnergy.cpp
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
Co-authored-by: MaxThevenet <mthevenet@lbl.gov>
* Port rigid injection to the gpu (#862)
* port rigid injection to the gpu
* eol
* Apply suggestions from code review
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* define csqi
Co-authored-by: MaxThevenet <mthevenet@lbl.gov>
* Added blocking factor to 2d and RZ geometries (#864)
* doc: fix formatting for ascent yaml examples (#865)
* [mini-PR] Clarifying ionizable particle charge (#863)
* Added documentation note on ionization particle charge
* Added correct charge of ion to be ionized
* Corrected multiplication symbol
Co-Authored-By: MaxThevenet <mthevenet@lbl.gov>
* Testing doxygen issue
* Charge correction only to ionizable species
* Trying to fix doxygen url fetch issue
Co-authored-by: MaxThevenet <mthevenet@lbl.gov>
* In HankelTransform, added explicit matrix multiply for GPU
* In RZ spectral solver, update setval to be on device
* Removed CEXE_headers from FieldSolver/SpectralSolver/Make.package
* In HankelTransform, added check of the Bessel root finder
* Updated includes to RZ spectral solver
* Added comments on how Hankel transform matrix is calculated
* Added more comments to Hankel transform calculation
* For RZ spectral solver, cleaned up naming and add subdirectory for Hankel transform files
* Cleaned up code in PsatdAlgorithmRZ.cpp
* Updated comment for fields in SpectralFieldDataRZ.cpp
* Changed HankelTransformer to MultiSpectralHankelTransformer
* Updates comments in RZ spectral solver
* Removed code for k-space filtering that will be added in a later PR
* For RZ spectral solver, passed lev in
* Fixed comment in SpectralFieldDataRZ.cpp
Co-authored-by: MaxThevenet <mthevenet@lbl.gov>
Co-authored-by: Remi Lehe <remi.lehe@normalesup.org>
Co-authored-by: Edoardo Zoni <ezoni@lbl.gov>
Co-authored-by: Michael E Rowan <38045958+mrowan137@users.noreply.github.com>
Co-authored-by: NeilZaim <49716072+NeilZaim@users.noreply.github.com>
Co-authored-by: L. Diana Amorim <LDianaAmorim@lbl.gov>
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
Co-authored-by: Yinjian Zhao <yinjianzhao@lbl.gov>
Co-authored-by: Edoardo Zoni <59625522+EZoni@users.noreply.github.com>
Co-authored-by: Luca Fedeli <luca.fedeli@cea.fr>
Co-authored-by: Diana Amorim <diana@henrivincenti.dhcp.lbl.gov>
Co-authored-by: Andrew Myers <atmyers@lbl.gov>
Co-authored-by: Cyrus Harrison <cyrush@llnl.gov>
Diffstat (limited to 'Source/FieldSolver/SpectralSolver/SpectralFieldDataRZ.cpp')
-rw-r--r-- | Source/FieldSolver/SpectralSolver/SpectralFieldDataRZ.cpp | 444 |
1 files changed, 444 insertions, 0 deletions
diff --git a/Source/FieldSolver/SpectralSolver/SpectralFieldDataRZ.cpp b/Source/FieldSolver/SpectralSolver/SpectralFieldDataRZ.cpp new file mode 100644 index 000000000..4a66d924a --- /dev/null +++ b/Source/FieldSolver/SpectralSolver/SpectralFieldDataRZ.cpp @@ -0,0 +1,444 @@ +/* Copyright 2019-2020 David Grote + * + * This file is part of WarpX. + * + * License: BSD-3-Clause-LBNL + */ +#include "SpectralFieldDataRZ.H" + +#include "WarpX.H" + +using amrex::operator""_rt; + +/* \brief Initialize fields in spectral space, and FFT plans + * + * \param realspace_ba Box array that corresponds to the decomposition + * * of the fields in real space (cell-centered ; includes guard cells only in z) + * \param k_space Defined the domain of the k space + * \param dm Indicates which MPI proc owns which box, in realspace_ba + * \param n_field_required Specifies the number of fields that will be transformed + * \param n_modes Number of cylindrical modes + * */ +SpectralFieldDataRZ::SpectralFieldDataRZ (amrex::BoxArray const & realspace_ba, + SpectralKSpaceRZ const & k_space, + amrex::DistributionMapping const & dm, + int const n_field_required, + int const n_modes, + int const lev) + : n_rz_azimuthal_modes(n_modes) +{ + amrex::BoxArray const & spectralspace_ba = k_space.spectralspace_ba; + + // Allocate the arrays that contain the fields in spectral space. + // SpectralField is comparable to a MultiFab but stores complex numbers. + // This stores all of the transformed fields in one place, with the last dimension + // being the list of fields, defined by SpectralFieldIndex, for all of the modes. + // The fields of each mode are grouped together, so that the index of a + // field for a specific mode is given by field_index + mode*n_fields. + fields = SpectralField(spectralspace_ba, dm, n_rz_azimuthal_modes*n_field_required, 0); + + // Allocate temporary arrays - in real space and spectral space. + // These complex arrays will store the data just before/after the z FFT. + // Note that the realspace_ba should not include the radial guard cells. + tempHTransformed = SpectralField(realspace_ba, dm, n_rz_azimuthal_modes, 0); + tmpSpectralField = SpectralField(spectralspace_ba, dm, n_rz_azimuthal_modes, 0); + + // By default, we assume the z FFT is done from/to a nodal grid in real space. + // It the FFT is performed from/to a cell-centered grid in real space, + // a correcting "shift" factor must be applied in spectral space. + zshift_FFTfromCell = k_space.getSpectralShiftFactor(dm, 1, + ShiftType::TransformFromCellCentered); + zshift_FFTtoCell = k_space.getSpectralShiftFactor(dm, 1, + ShiftType::TransformToCellCentered); + + // Allocate and initialize the FFT plans and Hankel transformer. + forward_plan = FFTplans(spectralspace_ba, dm); +#ifndef AMREX_USE_GPU + // The backward plan is not needed with GPU since it would be the same + // as the forward plan anyway. + backward_plan = FFTplans(spectralspace_ba, dm); +#endif + multi_spectral_hankel_transformer = MultiSpectralHankelTransformer(spectralspace_ba, dm); + + // Loop over boxes and allocate the corresponding plan + // for each box owned by the local MPI proc. + for (amrex::MFIter mfi(spectralspace_ba, dm); mfi.isValid(); ++mfi){ + amrex::IntVect grid_size = realspace_ba[mfi].length(); +#ifdef AMREX_USE_GPU + // Create cuFFT plan. + // This is alway complex to complex. + // This plan is for one azimuthal mode only. + cufftResult result; + int fft_length[] = {grid_size[1]}; + int inembed[] = {grid_size[1]}; + int istride = grid_size[0]; + int idist = 1; + int onembed[] = {grid_size[1]}; + int ostride = grid_size[0]; + int odist = 1; + int batch = grid_size[0]; // number of ffts + result = cufftPlanMany(&forward_plan[mfi], 1, fft_length, inembed, istride, idist, + onembed, ostride, odist, CUFFT_Z2Z, batch); + if (result != CUFFT_SUCCESS) { + amrex::Print() << " cufftPlanMany failed! \n"; + } + // The backward plane is the same as the forward since the direction is passed when executed. + +#else + // Create FFTW plans. + fftw_iodim dims[1]; + fftw_iodim howmany_dims[2]; + dims[0].n = grid_size[1]; + dims[0].is = grid_size[0]; + dims[0].os = grid_size[0]; + howmany_dims[0].n = n_rz_azimuthal_modes; + howmany_dims[0].is = grid_size[0]*grid_size[1]; + howmany_dims[0].os = grid_size[0]*grid_size[1]; + howmany_dims[1].n = grid_size[0]; + howmany_dims[1].is = 1; + howmany_dims[1].os = 1; + forward_plan[mfi] = + // Note that AMReX FAB are Fortran-order. + fftw_plan_guru_dft(1, // int rank + dims, + 2, // int howmany_rank, + howmany_dims, + reinterpret_cast<fftw_complex*>(tempHTransformed[mfi].dataPtr()), // fftw_complex *in + reinterpret_cast<fftw_complex*>(tmpSpectralField[mfi].dataPtr()), // fftw_complex *out + FFTW_FORWARD, // int sign + FFTW_ESTIMATE); // unsigned flags + backward_plan[mfi] = + fftw_plan_guru_dft(1, // int rank + dims, + 2, // int howmany_rank, + howmany_dims, + reinterpret_cast<fftw_complex*>(tmpSpectralField[mfi].dataPtr()), // fftw_complex *in + reinterpret_cast<fftw_complex*>(tempHTransformed[mfi].dataPtr()), // fftw_complex *out + FFTW_BACKWARD, // int sign + FFTW_ESTIMATE); // unsigned flags +#endif + + // Create the Hankel transformer for each box. + std::array<amrex::Real,3> xmax = WarpX::UpperCorner(mfi.tilebox(), lev); + multi_spectral_hankel_transformer[mfi] = SpectralHankelTransformer(grid_size[0], n_rz_azimuthal_modes, xmax[0]); + } +} + + +SpectralFieldDataRZ::~SpectralFieldDataRZ() +{ + if (fields.size() > 0){ + for (amrex::MFIter mfi(fields); mfi.isValid(); ++mfi){ +#ifdef AMREX_USE_GPU + // Destroy cuFFT plans. + cufftDestroy(forward_plan[mfi]); + // cufftDestroy(backward_plan[mfi]); // This was never allocated. +#else + // Destroy FFTW plans. + fftw_destroy_plan(forward_plan[mfi]); + fftw_destroy_plan(backward_plan[mfi]); +#endif + } + } +} + +/* \brief Z Transform the FAB to spectral space, + * and store the corresponding result internally + * (in the spectral field specified by `field_index`) + * The input, tempHTransformedSplit, is the complex, Hankel transformed + * data, which is stored wih the real and imaginary parts split. + * The input should include the imaginary component of mode 0 + * (even though it is all zeros). */ +void +SpectralFieldDataRZ::FABZForwardTransform (amrex::MFIter const & mfi, + amrex::MultiFab const & tempHTransformedSplit, + int const field_index, const bool is_nodal_z) +{ + // Copy the split complex to the interleaved complex. + + amrex::Box const& realspace_bx = tempHTransformed[mfi].box(); + + amrex::Array4<const amrex::Real> const& split_arr = tempHTransformedSplit[mfi].array(); + amrex::Array4<Complex> const& complex_arr = tempHTransformed[mfi].array(); + + int const modes = n_rz_azimuthal_modes; + ParallelFor(realspace_bx, modes, + [=] AMREX_GPU_DEVICE(int i, int j, int k, int mode) noexcept { + int const mode_r = 2*mode; + int const mode_i = 2*mode + 1; + complex_arr(i,j,k,mode) = Complex{split_arr(i,j,k,mode_r), split_arr(i,j,k,mode_i)}; + }); + + // Perform Fourier transform from `tempHTransformed` to `tmpSpectralField`. +#ifdef AMREX_USE_GPU + // Perform Fast Fourier Transform on GPU using cuFFT. + // Make sure that this is done on the same + // GPU stream as the above copy. + cufftResult result; + cudaStream_t stream = amrex::Gpu::Device::cudaStream(); + cufftSetStream(forward_plan[mfi], stream); + for (int mode=0 ; mode < n_rz_azimuthal_modes ; mode++) { + result = cufftExecZ2Z(forward_plan[mfi], + reinterpret_cast<cuDoubleComplex*>(tempHTransformed[mfi].dataPtr(mode)), // cuDoubleComplex *in + reinterpret_cast<cuDoubleComplex*>(tmpSpectralField[mfi].dataPtr(mode)), // cuDoubleComplex *out + CUFFT_FORWARD); + if (result != CUFFT_SUCCESS) { + amrex::Print() << " forward transform using cufftExecZ2Z failed ! \n"; + } + } +#else + fftw_execute(forward_plan[mfi]); +#endif + + // Copy the spectral-space field `tmpSpectralField` to the appropriate + // index of the FabArray `fields` (specified by `field_index`) + // and apply correcting shift factor if the real space data comes + // from a cell-centered grid in real space instead of a nodal grid. + amrex::Array4<const Complex> const& tmp_arr = tmpSpectralField[mfi].array(); + amrex::Array4<Complex> const& fields_arr = fields[mfi].array(); + Complex const* zshift_arr = zshift_FFTfromCell[mfi].dataPtr(); + + // Loop over indices within one box, all components. + // The fields are organized so that the fields for each mode + // are grouped together in memory. + amrex::Box const& spectralspace_bx = tmpSpectralField[mfi].box(); + int const nz = spectralspace_bx.length(1); + amrex::Real inv_nz = 1._rt/nz; + constexpr int n_fields = SpectralFieldIndex::n_fields; + + ParallelFor(spectralspace_bx, modes, + [=] AMREX_GPU_DEVICE(int i, int j, int k, int mode) noexcept { + Complex spectral_field_value = tmp_arr(i,j,k,mode); + // Apply proper shift. + if (is_nodal_z==false) spectral_field_value *= zshift_arr[j]; + // Copy field into the correct index. + int const ic = field_index + mode*n_fields; + fields_arr(i,j,k,ic) = spectral_field_value*inv_nz; + }); +} + +/* \brief Backward Z Transform the data from the fields + * (in the spectral field specified by `field_index`) + * to physical space, and return the resulting FArrayBox. + * The output, tempHTransformedSplit, is the complex, Hankel transformed + * data, which is stored wih the real and imaginary parts split. + * The output includes the imaginary component of mode 0 + * (even though it is all zeros). */ +void +SpectralFieldDataRZ::FABZBackwardTransform (amrex::MFIter const & mfi, int const field_index, + amrex::MultiFab & tempHTransformedSplit, + const bool is_nodal_z) +{ + // Copy the spectral-space field from the appropriate index of the FabArray + // `fields` (specified by `field_index`) to field `tmpSpectralField` + // and apply correcting shift factor if the real space data is on + // a cell-centered grid in real space instead of a nodal grid. + amrex::Array4<const Complex> const& fields_arr = fields[mfi].array(); + amrex::Array4<Complex> const& tmp_arr = tmpSpectralField[mfi].array(); + Complex const* zshift_arr = zshift_FFTtoCell[mfi].dataPtr(); + + // Loop over indices within one box, all components. + amrex::Box const& spectralspace_bx = tmpSpectralField[mfi].box(); + + int const modes = n_rz_azimuthal_modes; + constexpr int n_fields = SpectralFieldIndex::n_fields; + ParallelFor(spectralspace_bx, modes, + [=] AMREX_GPU_DEVICE(int i, int j, int k, int mode) noexcept { + int const ic = field_index + mode*n_fields; + Complex spectral_field_value = fields_arr(i,j,k,ic); + // Apply proper shift. + if (is_nodal_z==false) spectral_field_value *= zshift_arr[j]; + // Copy field into the right index. + tmp_arr(i,j,k,mode) = spectral_field_value; + }); + + // Perform Fourier transform from `tmpSpectralField` to `tempHTransformed`. +#ifdef AMREX_USE_GPU + // Perform Fast Fourier Transform on GPU using cuFFT. + // Make sure that this is done on the same + // GPU stream as the above copy. + cufftResult result; + cudaStream_t stream = amrex::Gpu::Device::cudaStream(); + cufftSetStream(forward_plan[mfi], stream); + for (int mode=0 ; mode < n_rz_azimuthal_modes ; mode++) { + result = cufftExecZ2Z(forward_plan[mfi], + reinterpret_cast<cuDoubleComplex*>(tmpSpectralField[mfi].dataPtr(mode)), // cuDoubleComplex *in + reinterpret_cast<cuDoubleComplex*>(tempHTransformed[mfi].dataPtr(mode)), // cuDoubleComplex *out + CUFFT_INVERSE); + if (result != CUFFT_SUCCESS) { + amrex::Print() << " backwardtransform using cufftExecZ2Z failed ! \n"; + } + } +#else + fftw_execute(backward_plan[mfi]); +#endif + + // Copy the interleaved complex to the split complex. + amrex::Box const& realspace_bx = tempHTransformed[mfi].box(); + + amrex::Array4<amrex::Real> const& split_arr = tempHTransformedSplit[mfi].array(); + amrex::Array4<const Complex> const& complex_arr = tempHTransformed[mfi].array(); + + ParallelFor(realspace_bx, modes, + [=] AMREX_GPU_DEVICE(int i, int j, int k, int mode) noexcept { + int const mode_r = 2*mode; + int const mode_i = 2*mode + 1; + split_arr(i,j,k,mode_r) = complex_arr(i,j,k,mode).real(); + split_arr(i,j,k,mode_i) = complex_arr(i,j,k,mode).imag(); + }); + +} + +/* \brief Transform the component `i_comp` of MultiFab `field_mf` + * to spectral space, and store the corresponding result internally + * (in the spectral field specified by `field_index`) */ +void +SpectralFieldDataRZ::ForwardTransform (amrex::MultiFab const & field_mf, int const field_index, + int const i_comp) +{ + // Check field index type, in order to apply proper shift in spectral space. + // Only cell centered in r is supported. + bool const is_nodal_z = field_mf.is_nodal(1); + + int const ncomp = 2*n_rz_azimuthal_modes - 1; + + // This will hold the Hankel transformed data, with the real and imaginary parts split. + // A full multifab is created so that each GPU stream has its own temp space. + amrex::MultiFab tempHTransformedSplit(tempHTransformed.boxArray(), tempHTransformed.DistributionMap(), 2*n_rz_azimuthal_modes, 0); + + // Loop over boxes. + for (amrex::MFIter mfi(field_mf); mfi.isValid(); ++mfi){ + + // Perform the Hankel transform first. + // tempHTransformedSplit includes the imaginary component of mode 0. + // field_mf does not. + amrex::Box const& realspace_bx = tempHTransformed[mfi].box(); + amrex::FArrayBox field_comp(field_mf[mfi], amrex::make_alias, i_comp*ncomp, ncomp); + multi_spectral_hankel_transformer[mfi].PhysicalToSpectral_Scalar(realspace_bx, field_comp, tempHTransformedSplit[mfi]); + + FABZForwardTransform(mfi, tempHTransformedSplit, field_index, is_nodal_z); + + } +} + +/* \brief Transform the coupled components of MultiFabs `field_mf_r` and `field_mf_t` + * to spectral space, and store the corresponding result internally + * (in the spectral fields specified by `field_index_r` and `field_index_t`) */ +void +SpectralFieldDataRZ::ForwardTransform (amrex::MultiFab const & field_mf_r, int const field_index_r, + amrex::MultiFab const & field_mf_t, int const field_index_t) +{ + // Check field index type, in order to apply proper shift in spectral space. + // Only cell centered in r is supported. + bool const is_nodal_z = field_mf_r.is_nodal(1); + + // Create copies of the input multifabs. The copies will include the imaginary part of mode 0. + // Also note that the Hankel transform will overwrite the copies. + // Full multifabs are created for the temps so that each GPU stream has its own temp space. + amrex::MultiFab field_mf_r_copy(field_mf_r.boxArray(), field_mf_r.DistributionMap(), 2*n_rz_azimuthal_modes, field_mf_r.nGrowVect()); + amrex::MultiFab field_mf_t_copy(field_mf_t.boxArray(), field_mf_t.DistributionMap(), 2*n_rz_azimuthal_modes, field_mf_t.nGrowVect()); + amrex::MultiFab::Copy(field_mf_r_copy, field_mf_r, 0, 0, 1, field_mf_r.nGrowVect()); // Real part of mode 0 + amrex::MultiFab::Copy(field_mf_t_copy, field_mf_t, 0, 0, 1, field_mf_t.nGrowVect()); // Real part of mode 0 + field_mf_r_copy.setVal(0._rt, 1, 1, field_mf_r.nGrowVect()); // Imaginary part of mode 0 + field_mf_t_copy.setVal(0._rt, 1, 1, field_mf_t.nGrowVect()); // Imaginary part of mode 0 + amrex::MultiFab::Copy(field_mf_r_copy, field_mf_r, 1, 2, 2*n_rz_azimuthal_modes-2, field_mf_r.nGrowVect()); + amrex::MultiFab::Copy(field_mf_t_copy, field_mf_t, 1, 2, 2*n_rz_azimuthal_modes-2, field_mf_t.nGrowVect()); + + amrex::MultiFab tempHTransformedSplit_p(tempHTransformed.boxArray(), tempHTransformed.DistributionMap(), 2*n_rz_azimuthal_modes, 0); + amrex::MultiFab tempHTransformedSplit_m(tempHTransformed.boxArray(), tempHTransformed.DistributionMap(), 2*n_rz_azimuthal_modes, 0); + + // Loop over boxes. + for (amrex::MFIter mfi(field_mf_r); mfi.isValid(); ++mfi){ + + // Perform the Hankel transform first. + amrex::Box const& realspace_bx = tempHTransformed[mfi].box(); + multi_spectral_hankel_transformer[mfi].PhysicalToSpectral_Vector(realspace_bx, + field_mf_r_copy[mfi], field_mf_t_copy[mfi], + tempHTransformedSplit_p[mfi], tempHTransformedSplit_m[mfi]); + + FABZForwardTransform(mfi, tempHTransformedSplit_p, field_index_r, is_nodal_z); + FABZForwardTransform(mfi, tempHTransformedSplit_m, field_index_t, is_nodal_z); + + } +} + +/* \brief Transform spectral field specified by `field_index` back to + * real space, and store it in the component `i_comp` of `field_mf` */ +void +SpectralFieldDataRZ::BackwardTransform (amrex::MultiFab& field_mf, int const field_index, + int const i_comp) +{ + // Check field index type, in order to apply proper shift in spectral space. + bool const is_nodal_z = field_mf.is_nodal(1); + + int const ncomp = 2*n_rz_azimuthal_modes - 1; + + // A full multifab is created so that each GPU stream has its own temp space. + amrex::MultiFab tempHTransformedSplit(tempHTransformed.boxArray(), tempHTransformed.DistributionMap(), 2*n_rz_azimuthal_modes, 0); + + // Loop over boxes. + for (amrex::MFIter mfi(field_mf); mfi.isValid(); ++mfi){ + + FABZBackwardTransform(mfi, field_index, tempHTransformedSplit, is_nodal_z); + + // Perform the Hankel inverse transform last. + // tempHTransformedSplit includes the imaginary component of mode 0. + // field_mf does not. + amrex::Box const& realspace_bx = tempHTransformed[mfi].box(); + amrex::FArrayBox field_comp(field_mf[mfi], amrex::make_alias, i_comp*ncomp, ncomp); + multi_spectral_hankel_transformer[mfi].SpectralToPhysical_Scalar(realspace_bx, tempHTransformedSplit[mfi], field_comp); + + } +} + +/* \brief Transform spectral fields specified by `field_index_r` and + * `field_index_t` back to real space, and store them in `field_mf_r` and `field_mf_t` */ +void +SpectralFieldDataRZ::BackwardTransform (amrex::MultiFab& field_mf_r, int const field_index_r, + amrex::MultiFab& field_mf_t, int const field_index_t) +{ + // Check field index type, in order to apply proper shift in spectral space. + bool const is_nodal_z = field_mf_r.is_nodal(1); + + // Full multifabs are created for the temps so that each GPU stream has its own temp space. + amrex::MultiFab tempHTransformedSplit_p(tempHTransformed.boxArray(), tempHTransformed.DistributionMap(), 2*n_rz_azimuthal_modes, 0); + amrex::MultiFab tempHTransformedSplit_m(tempHTransformed.boxArray(), tempHTransformed.DistributionMap(), 2*n_rz_azimuthal_modes, 0); + + // Create copies of the input multifabs. The copies will include the imaginary part of mode 0. + amrex::MultiFab field_mf_r_copy(field_mf_r.boxArray(), field_mf_r.DistributionMap(), 2*n_rz_azimuthal_modes, field_mf_r.nGrowVect()); + amrex::MultiFab field_mf_t_copy(field_mf_t.boxArray(), field_mf_t.DistributionMap(), 2*n_rz_azimuthal_modes, field_mf_t.nGrowVect()); + + // Loop over boxes. + for (amrex::MFIter mfi(field_mf_r); mfi.isValid(); ++mfi){ + + FABZBackwardTransform(mfi, field_index_r, tempHTransformedSplit_p, is_nodal_z); + FABZBackwardTransform(mfi, field_index_t, tempHTransformedSplit_m, is_nodal_z); + + // Perform the Hankel inverse transform last. + // tempHTransformedSplit includes the imaginary component of mode 0. + // field_mf_[ri] do not. + amrex::Box const& realspace_bx = tempHTransformed[mfi].box(); + multi_spectral_hankel_transformer[mfi].SpectralToPhysical_Vector(realspace_bx, + tempHTransformedSplit_p[mfi], tempHTransformedSplit_m[mfi], + field_mf_r_copy[mfi], field_mf_t_copy[mfi]); + + amrex::Array4<amrex::Real> const & field_mf_r_array = field_mf_r[mfi].array(); + amrex::Array4<amrex::Real> const & field_mf_t_array = field_mf_t[mfi].array(); + amrex::Array4<amrex::Real> const & field_mf_r_copy_array = field_mf_r_copy[mfi].array(); + amrex::Array4<amrex::Real> const & field_mf_t_copy_array = field_mf_t_copy[mfi].array(); + + ParallelFor(realspace_bx, 2*n_rz_azimuthal_modes-1, + [=] AMREX_GPU_DEVICE(int i, int j, int k, int icomp) noexcept { + if (icomp == 0) { + // mode 0 + field_mf_r_array(i,j,k,icomp) = field_mf_r_copy_array(i,j,k,icomp); + field_mf_t_array(i,j,k,icomp) = field_mf_t_copy_array(i,j,k,icomp); + } else { + field_mf_r_array(i,j,k,icomp) = field_mf_r_copy_array(i,j,k,icomp+1); + field_mf_t_array(i,j,k,icomp) = field_mf_t_copy_array(i,j,k,icomp+1); + } + }); + + } + +} |