Age | Commit message | Author | Files | Lines |
|
We already saw C++17 issues with the earlier CUDA releases in the
11.0 line and found on Summit (and with ImpactX) that CUDA 11.3+
is more reliable - and now widely available.
|
|
Update ADIOS2 on Summit to a more modern release.
|
|
* Implement legacy mode
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix file_name initialization for RZ test
* Clear commented-out code
* Change permission
* import sys in analysis_2d_binary.py & add checksum test
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add explanations in class file and update the docs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* ignore y_min and y_max allocation in 2D
* fix syntax
* Implement warnings, improving docs, fix indentation
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Remi Lehe <remi.lehe@normalesup.org>
|
|
* break application of PEC to rho into separate loops over boxes
* round #1: fix failing CI tests
* also apply the fix to the `ApplyPECtoJfield`
* round 2: fix failing CI tests
* round 3: fix failing CI tests
* refactor PEC handling for charge and current density
* removed unused variable
* round 4: fix failing CI tests
* Fix `mirrorfac` calculation for `rho`
* only apply rho and J PEC boundaries for Cartesian grid (for now) in `SyncCurrentAndRho()`
* use the same kernel to apply PEC boundary to rho and J
* perform J and rho PEC application for RZ but warn about incorrect results if r_max is PEC
* do not apply PEC boundary for rho and J to r-max in RZ
* set warning level to medium and increase abort_on_warning_threshold to high for failing CI tests
|
|
* Doc: LASY 0.1.1
Document the LASY dependency for installs.
* Keep Py versions simple
|
|
Use the same suffix logic as for the install to create the
`libwarpx.ND.[so|dll]` in the build path.
|
|
* Add links to CMake docs in cmake.rst
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
|
|
* Doc: ux,uy,uz Momenta
The scales of these quantities in user-facing inputs are the
scale of a momentum (gamma*beta or p/mc), not of a velocity.
* Fix Math Formatting
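To make the momentum-versus-velocity distinction concrete, here is a minimal sketch that converts a normalized momentum u = gamma*beta into a velocity. The names `Velocity`, `velocity_from_u` and `c_light` are assumptions made for this illustration and are not WarpX API.

```cpp
#include <cassert>
#include <cmath>

// Speed of light in vacuum [m/s] (exact SI value).
constexpr double c_light = 299792458.0;

// Velocity components in m/s (illustrative type, not part of WarpX).
struct Velocity { double x, y, z; };

// Convert a normalized momentum u = gamma*beta = p/(m*c) (dimensionless)
// into a velocity. Hypothetical helper for illustration only.
Velocity velocity_from_u(double ux, double uy, double uz)
{
    // gamma^2 = 1 + |gamma*beta|^2, hence gamma = sqrt(1 + |u|^2)
    double const gamma = std::sqrt(1.0 + ux*ux + uy*uy + uz*uz);
    assert(gamma >= 1.0);

    // v = beta * c = (u / gamma) * c
    double const scale = c_light / gamma;
    return {ux * scale, uy * scale, uz * scale};
}
```

A value such as `ux = 10` is a perfectly valid normalized momentum, whereas the same number read as v/c would be superluminal; after the conversion the speed stays below `c_light`.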
|
|
* Moving window: check pointers to F,G when applying shift
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Apply suggestions from code review
---------
Co-authored-by: Bensoubaya <adam@bc-d0-74-0-dd-e4.dhcp.lbnl.us>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Edoardo Zoni <59625522+EZoni@users.noreply.github.com>
|
|
Add the `boost` package to the conda developer environment.
Used for QED table generation.
|
|
Replace explicit array with little formula for the range of the
support function of shape factors.
|
|
Newer versions fixed the include issue that required us to use
`<cstddef>` includes before including rocFFT. Fixed with >4.3.
|
|
* Enable field ionization from PICMI
* Implement Dave's suggestions and add automated test
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update test benchmark
* Add docstring for FieldIonization class
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
|
* rocFFT: 5.2+ Compatible
More careful include for old versions of rocFFT/ROCm.
* ROCm: 5.2+
|
|
* Imported openPMD
* Unable to add E
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update Source/Laser/LaserProfilesImpl/LaserProfileFromTXYEFile.cpp
* Updated read data t chunk
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Began modifying parse_tyxe_file
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* set coordinates
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* corrected errors
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Clean up .H file
* Extract grid coordinates
* Corrected field name
* Implement laser oscillations
* More correct position
* More correct position
* Updated test script, corrected laser normalization
* Cleaned up code
* Began editing analysis.py (needs debugging)
* Support reading complex data
* Update AMReX dependency to use Bcast with complex data
* Update test
* Update Source/Laser/LaserProfilesImpl/LaserProfileFromTXYEFile.cpp
* Correct the implementation of the test with Gaussian profile, fix and clean up previous code
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Add `lasy` in dependencies
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update documentation
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Add checksum regression test
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Add enabling condition if WarpX is compiled with OpenPMD support
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update lasy download link
* Updating analysis.py
* Update LaserProfileFromTXYEFile.cpp
* Fix compilation in 1D
* Update LaserProfileFromTXYEFile.cpp for enabling condition if WarpX is compiled with OpenPMD support
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update LaserProfileFromTXYEFile.cpp
* Change unused vectors to scalars and cleanup code
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update automated test
* Implement reading chunks of openPMD file as an input parameter
* Update checksum
* Update warning message if WarpX is not compiled with openPMD support
* Do not output By (too sensitive to noise)
* Remove By from checksum
* Update Source/Laser/LaserProfilesImpl/LaserProfileFromTXYEFile.cpp
* Update Source/Laser/LaserProfilesImpl/LaserProfileFromTXYEFile.cpp
* Update Source/Laser/LaserProfilesImpl/LaserProfileFromTXYEFile.cpp
* Update inputs file for chunks reading
* Correct dimensions assignment
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Code consistent with axis labels
* Start support for 2D and 1D
* Update LaserProfileFromTXYEFile.cpp for 2D reading
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Implementation of a 2D test
* Implementation of 1D reading & 1D test
* Fix call to trilinear_interp
* Fix indentation
* Implement RZ reading & test & fix 2D Checksum test
* Update docs
* Update WarpX-test.ini
---------
Co-authored-by: Camille Woicekowski <camille.meala@berkeley.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Remi Lehe <remi.lehe@normalesup.org>
|
|
* Interpolate on GPU
* another fix in Diagnostics
|
|
Use a newer Ubuntu that ships a recent OpenSSL when building
Sphinx with RTD. This mitigates a broken dependency for urllib3
in version 3+.
|
|
Guarantee improved plasma stability for 2D with very low initial
target temperature when using Esirkepov current deposition with
energy-conserving field gather (default).
|
|
* replace NULL with nullptr everywhere
* fix bug
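As background for this change, the sketch below shows one reason `nullptr` is generally preferred over `NULL` in C++11 and later: it has its own type, `std::nullptr_t`, and therefore never selects an integer overload. The function names are made up for this example and do not come from the WarpX sources.

```cpp
#include <cstddef>
#include <type_traits>

// Two overloads that differ only in taking an integer or a pointer.
// Illustrative names, not WarpX code.
constexpr bool selects_pointer_overload(int) { return false; }
constexpr bool selects_pointer_overload(double*) { return true; }

// nullptr is a keyword whose type is std::nullptr_t. It converts to any
// pointer type, but never to an integer.
static_assert(std::is_same<decltype(nullptr), std::nullptr_t>::value,
              "nullptr has type std::nullptr_t");
static_assert(selects_pointer_overload(nullptr),
              "nullptr picks the pointer overload");

// A literal 0, which is one common definition of NULL, is an int first
// and therefore picks the integer overload.
static_assert(!selects_pointer_overload(0),
              "a literal 0 picks the int overload");
```

Passing `NULL` itself to such an overload set is deliberately left out: depending on how the standard library defines the macro, the call can be ambiguous or silently pick the integer overload.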
|
|
* AMReX: 23.05
* PICSAR: 23.05
* WarpX: 23.05
* Update ES EB RZ Tests
- slightly relaxed Phi
- better selection of r outside cutcells
|
|
```
/opt/rocm-5.5.0/include/rocfft.h:16:2: error: "This file is deprecated. Use the header file from /opt/rocm-5.5.0/include/rocfft/rocfft.h by using #include <rocfft/rocfft.h>" [-Werror,-W#warnings]
warning "This file is deprecated. Use the header file from /opt/rocm-5.5.0/include/rocfft/rocfft.h by using #include <rocfft/rocfft.h>"
^
```
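One way to avoid this deprecation error while still supporting older ROCm releases is to probe for the new header location first. The sketch below assumes a compiler that provides `__has_include` (standard since C++17); the macro `EXAMPLE_ROCFFT_HEADER` and the constant `rocfft_header_kind` are names invented for this illustration, and the snippet is not necessarily the exact change made in this commit.

```cpp
// Prefer the new header location and fall back to the legacy path.
// The nested #if keeps compilers without __has_include from choking.
#if defined(__has_include)
#   if __has_include(<rocfft/rocfft.h>)
#       include <rocfft/rocfft.h>   // location named in the deprecation message
#       define EXAMPLE_ROCFFT_HEADER 2
#   elif __has_include(<rocfft.h>)
#       include <rocfft.h>          // legacy location on older ROCm releases
#       define EXAMPLE_ROCFFT_HEADER 1
#   endif
#endif

// Neither header was found (or __has_include is unavailable).
#ifndef EXAMPLE_ROCFFT_HEADER
#   define EXAMPLE_ROCFFT_HEADER 0
#endif

// 0: no rocFFT header found, 1: legacy path, 2: new path.
constexpr int rocfft_header_kind = EXAMPLE_ROCFFT_HEADER;
```

On a machine without ROCm the snippet still compiles and `rocfft_header_kind` is 0, which makes the probe easy to try in isolation.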
|
|
(#3852)
* add documentation for Adastra supercomputer
* add Adastra to toc
* Update Docs/source/install/hpc/adastra.rst
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
---------
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
|
|
amrex::Abort(Utils::TextMsg::Err("msg")) (#3879)
* use WARPX_ABORT_WITH_MSG instead of amrex::Abort(Utils::TextMsg::Err(msg)) [WIP]
* use WARPX_ABORT_WITH_MESSAGE
* fix typo
* fix missing parenthesis
* remove spaces to prevent automatic text wrapping
* remove wrong parenthesis
|
|
* Doc: Latest HDF5 on Summit
Update the Summit docs to use the latest available HDF5
module (v1.12.1).
* Summit: hdf5/1.12.2
|
|
Outdated docs: No, all inputs are lab-frame, besides `stop_time`.
|
|
Cache only selected brew paths in macOS, not all of
`/usr/local`, which contains more runner specific software.
|
|
* Add new paper using WarpX
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
|
PML (#3884)
|
|
generated (#3873)
Add thin wrapper around amrex::Gpu::DeviceVector in order to be able to use it in lookup tables also when managed memory is not used.
|
|
* Use GPU shared memory to accelerate charge deposition (#66)
* WIP Apply charge deposition unconditionally in scratch memory
* Ensure enough threads to touch every value in the array, even if there are no particles
* Zero out the shared memory before accumulating into it
* Replace box-aware accumulation of final results with simple pointers
* Remove unused code
* WIP
* Account for shared memory being allocated per-block, not per grid/kernel
* Wording
* Fall back to non-shared memory for cases where the grid size is too big to fit, for now
* Filter out additions of 0.0 from atomic accumulation
* Restore non-GPU code path
* Pick apart #if stuff to allow better formatting and comprehension
* Fix egregious whitespace failure
* Abort on insufficient shared memory, rather than falling back to global memory
* Fix silly whitespace
* Fix stray tab character
* Sort and bin particles, pass bins to charge deposition
* Contribute on a binned tile basis; memory errors now
* Initialize array to the extent we actually allocated
* Make sure we initialize the vector of tboxes with invalid Box objects
* in 2D, we make sure we use the same particle position to tile box mapping as amrex
* go ahead and skip empty bins in deposit charge
* Quiet warning from HIP
* Avoid signed/unsigned comparison
* Code compiles for CPU ...
* Rename intermediate buffer back to reduce extraneous diff bits
* Remove another extraneous diff bit
* Leave DPC++ out in the cold, since it expects different syntax for GPU-specific code
* Reset failing value of rho that only slightly changes
* Try tiling over ng_rho to capture particles moved into guard cells
* Match box expansion in both call sites - maybe the third is also necessary
* Grow last box by ng_rho as well.
* Match macro syntax
* Use WarpX dimensionality macros instead of AMREX_SPACEDIM
* Add support for 1D case
* Update benchmark checksums for ME tests
* Fix macro used for 1D
* Fix CUDA compilation
* Rename variable to simplify diff
* Clear up assertions now that stuff is working
* Fix comment referring to current in ChargeDeposition.H
* Fix warning about unused variable after assertion change
* add runtime option
* Convert flag variable from int to bool
* Switch AMREX_SPACEDIM conditions to use WARPX_DIM_* macros
* Once again, leave DPC++ out in the cold, as it doesn't support the same syntax as CUDA and HIP
* Grow charge deposition boxes by the necessary amount
* Mark a variable only used for assertions to suppress warnings
* Fix compilation error for 1D
* Re-add missing 1D support
* Fix other bits of code specific to CUDA and HIP, and not DPC++
* restore missing accumulation of thread local charge into main fab.
* reset benchmark for background_mcc because randomization makes it very sensitive
* reset benchmark for Langmuir_multi_psatd_div_cleaning because diffing field is a numerical artifact
* Calm nvcc about function missing a return
* reset benchmark for background_mcc because it's randomized and numerically chaotic
* reset benchmark for LaserAccelerationBoost because of numerical shift in momentum from charge deposition order
* Remove extra nesting level
* Skip sorting the particles and just access them according to the binned permutation
* Load permutation pointer outside GPU kernel
* Revert background_mcc benchmark values
* Loosen overly-strict checksum tolerances in single-precision tests, rather than changing target values
* Revert embedded_circle
* Convert AMREX_ALWAYS_ASSERT to AMREX_ASSERT for particle bounds checking
* Match assertion macro change from #2939
* Fix indentation
* Disable shared memory charge deposition by default
* Ignore variable only used in assertion
* Add documentation of added input parameter warpx.do_shared_mem_charge_deposition
* Add comments as suggested by Remi
* Docs: Fix syntax issues in parameters.rst
* Convert error check to unconditional assertion as requested
* Make some arguments const to ease refactoring
* Finished DepositCurrent function
Ready to call the function from CurrentDeposition.H, but currently there
is only a dummy function there
* AMReX: Weekly Update
* Reset: `reduced_diags_single_precision`
* Reset: `background_mcc_dp_psp`
* Merged with develop. runs on mpi no gpu
* All funcs implemented. Compiles with bugs
* Fixed typo in CurrentDeposition
* Working on 2D version. there is bug
jz doesn't line up correctly
* Fixed 2d bug
* Removed some debugging lines
* Cleaning up comments
* Added an input param for threads per block
* Added a variable NS and START/STOP
* Added a region for kernel
* Not working on tilesize > 1 1 1
* Implemented Andrew's new algo for max tilesize
* Reduce the amount of shared memory needed by re-using the same buffer for all three components
* Made default tilesize sort_bin_size LAST V1 COMMIT
* bugfix - don't add 0.0 cells back to global memory.
* Need to take abs before checking > 0.0
* Ran Whitespace Fixer As instructed
* Updated Comments
* change default tpb for current deposition to 128
* clean up comments
* quiet compiler warning
* remove unused variables
* refactor shared current depo code
* forgot to check in file
* fix cpu compilation
* fix uninitialized
* fix typo
* fix bad merge
* Fixed default tilesize bug
Previously had defaulted shared_tilesize to sort_bin_size. This was
overwriting the shared_tilesize. Some scanning shows that sort_bin_size
isn't a very good default for tilesize anyways, so the new default is 1
1 1.
* changed shared tilesize default to 6 6 8
Decision based on scan over tilesizes and ppc by @atmyers and @kaplannp
* Put in switches for default tilesize 288 3d 144 2d
Tested correctness in 2 and 3 d
* Simplified particle contribution section
In accordance with @AlexanderSinn feedback, and tested RZ, 2D, 3D
* Cleanup tbox construction and depos->(depos+1)/2
in accordance with changes proposed by @AlexanderSinn. Tested on 3D 2D and
RZ
* Restored shared to previous version from 6acab48
Changes from before broke single precision
* Found new spot to benefit from (depos_order+1)/2
* Cleaned up sloppy comments
* Throw error on shared if no hip or cuda
This commit makes the assumption that if you use shared, you must be
using HIP or CUDA. This allows us to remove a bunch of macros that tried
to quietly revert to non shared if you didn't use HIP/CUDA, and we now
throw error if you try to run without HIP/CUDA
* More cleanup to compile and test
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add cost to GPU clock conditional
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
* Update Source/Particles/Deposition/CurrentDeposition.H
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
* add cost to GPU clock conditional
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
* whitespace fix
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
* Updated tilesize docs, and change 1d/rz default
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Added docs for tpb
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Changed grow (depos_order+1)/2->depos_order
Turns out, above fails for shape 2
* Change default to non share, and add error check
throws errors if you try vay or esirkepov with shared, and defaults to
not using shared for all algos.
* update to use ablaster kernel timer
Compiles and runs, but at step 80 diverges from dev
---------
Co-authored-by: Phil Miller <unmobile+gh@gmail.com>
Co-authored-by: Phil Miller <phil@intensecomputing.com>
Co-authored-by: Tools <warpx@lbl.gov>
Co-authored-by: kaplannp <kaplannp@gmail.com>
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
Co-authored-by: kaplannp <56896283+kaplannp@users.noreply.github.com>
Co-authored-by: kaplannp <kaplannp@whitman.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
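The pattern described in the bullets above (zero a per-tile scratch buffer, accumulate into it, then add only non-zero entries back to global memory) can be sketched on the CPU as follows. This is a plain C++ stand-in under simplifying assumptions: a 1D grid, nearest-grid-point deposition, and a `std::vector` in place of GPU shared memory and atomics. The function `deposit_tile` and its parameters are hypothetical and do not mirror the WarpX kernels.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Deposit the charges of the particles binned into one tile.
//   global_rho : charge density on the full grid (stand-in for global memory)
//   cell[i]    : global cell index of particle i (all inside the tile)
//   q[i]       : charge of particle i
//   tile_lo    : first global cell of the tile
//   tile_size  : number of cells in the tile
void deposit_tile(std::vector<double>& global_rho,
                  std::vector<int> const& cell,
                  std::vector<double> const& q,
                  int tile_lo, int tile_size)
{
    assert(cell.size() == q.size());
    assert(tile_lo >= 0 && tile_size >= 0);
    assert(static_cast<std::size_t>(tile_lo) + static_cast<std::size_t>(tile_size)
           <= global_rho.size());

    // Scratch buffer, zeroed before accumulating into it
    // (plays the role of per-block shared memory).
    std::vector<double> scratch(static_cast<std::size_t>(tile_size), 0.0);

    // Accumulate every particle of this tile into the scratch buffer.
    for (std::size_t i = 0; i < cell.size(); ++i) {
        int const local = cell[i] - tile_lo;
        assert(local >= 0 && local < tile_size);
        scratch[static_cast<std::size_t>(local)] += q[i];
    }

    // Flush to the global array, skipping cells that received nothing.
    // The absolute value keeps negative contributions from being dropped.
    for (int j = 0; j < tile_size; ++j) {
        double const value = scratch[static_cast<std::size_t>(j)];
        if (std::abs(value) > 0.0) {
            global_rho[static_cast<std::size_t>(tile_lo + j)] += value;
        }
    }
}
```

On a GPU the final loop would use atomic additions, which is where skipping zero entries saves work; the check on the absolute value corresponds to the "Need to take abs before checking > 0.0" fix listed above.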
|
|
* Don't rely on managed memory in SmartUtils
* use different copy
* Policies Vector: Explicit Device Copy
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
|
|
* suppress sphinx warning about duplicate bibliography labels
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
|
|
|