Age | Commit message | Author | Files | Lines |
|
* Docs: Reorder Summit Files
* Docs: Reorder Spock Files
* Docs: Reorder Cori Files
* Docs: Reorder Perlmutter Files
* Docs: Reorder Juwels Files
* Docs: Reorder Lassen Files
* Docs: Reorder Quartz Files
* Docs: Reorder Ookami Files
* Docs: Also Move Summit Profile Script
* Listing Captions: Location in Source
|
|
* Docs: Perlmutter Early Science
Document how to perform large runs.
* Change comment style
Co-authored-by: Edoardo Zoni <59625522+EZoni@users.noreply.github.com>
|
|
* Fix conflict with upstream
* Apply suggestions from code review
* Remove space in the end of lines
* Include suggestions from PR review
* Generalize ROMIO Hints in Batch Scripts
* Fix Comment
* Fix Comment
* Remove duplication
* Formatting
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
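
A minimal sketch of what generalized ROMIO hints in a batch script could look like. The hint keys are standard ROMIO hints and `ROMIO_HINTS` is the environment variable ROMIO reads, but the values, the file name `romio-hints`, and the fallback to one node are assumptions for illustration, not taken from the commit.

```shell
# Assumption: inside a Slurm job, SLURM_JOB_NUM_NODES holds the node count.
# Fall back to 1 so the snippet also runs outside of a job.
num_nodes="${SLURM_JOB_NUM_NODES:-1}"

# Write the hints file; values are illustrative, tune them per machine.
cat > romio-hints << EOL
romio_cb_write enable
romio_ds_write enable
cb_buffer_size 16777216
cb_nodes ${num_nodes}
EOL

# Point ROMIO at the hints file.
export ROMIO_HINTS="$(pwd)/romio-hints"
```

Deriving `cb_nodes` from the scheduler's node count avoids hard-coding a job size in the script.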
|
|
Document work-arounds for libfabric 1.6+ on Cray systems when using
data staging / streaming with ADIOS2 SST.
|
|
Add the size of GPU HBM on documented HPC machines.
This helps users to quickly transition node hours between them for
memory-size-limited problems.
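
As a rough illustration of why the HBM size matters, the sketch below estimates how many nodes a memory-limited problem needs on two machines. The problem size is hypothetical and the node layouts (6 GPUs with 16 GB each, 4 GPUs with 40 GB each) are assumptions to be checked against each machine's documentation.

```shell
# Ceiling division: nodes needed to hold a problem in GPU memory.
# Arguments: problem size [GB], GPUs per node, HBM per GPU [GB]
nodes_needed() {
    local problem_gb=$1 gpus_per_node=$2 hbm_gb=$3
    local node_gb=$(( gpus_per_node * hbm_gb ))
    echo $(( (problem_gb + node_gb - 1) / node_gb ))
}

problem_gb=3072   # hypothetical memory footprint of the simulation

# Assumed node layouts; verify before planning an allocation.
summit_nodes=$(nodes_needed "$problem_gb" 6 16)       # 6 GPUs, 16 GB each
perlmutter_nodes=$(nodes_needed "$problem_gb" 4 40)   # 4 GPUs, 40 GB each

echo "Summit: ${summit_nodes} nodes, Perlmutter: ${perlmutter_nodes} nodes"
```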
|
|
* Docs: Cori V100 GPU Job Script
Add a job script template for Cori V100 GPU nodes.
* Add concrete cgpu allocation
* Update Tools/BatchScripts/batch_cori_gpu.sh
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
* Update: 1 Rank per GPU
- 1 rank per GPU
- more doc hints
* Slurm: CUDA_VISIBLE_DEVICES
Set `--gpus-per-task=1`
* Visible devices for real
* Doc for GPU Binding
Co-authored-by: Remi Lehe <remi.lehe@normalesup.org>
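
The one-rank-per-GPU setup from the bullets above can be sketched as follows. This is not the content of `batch_cori_gpu.sh`: the file name `batch_gpu_demo.sh`, the executable `./warpx`, the inputs file, and the count of 8 GPUs per node are placeholders or assumptions.

```shell
# Write a hypothetical job script; names and counts are placeholders.
cat > batch_gpu_demo.sh << 'EOF'
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8   # assumption: 8 GPUs per node, one MPI rank each
#SBATCH --gpus-per-task=1     # bind exactly one GPU to every rank

# Placeholder executable and inputs file
srun ./warpx inputs
EOF

# On a Slurm system, submit with: sbatch batch_gpu_demo.sh
```

With `--gpus-per-task=1`, each rank is meant to see only its own device, so the application does not have to select a GPU itself.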
|
|
Fix crashes on Summit at scale (>~224 nodes) due to failing Barriers
in IBM's MPI stack.
Seen mostly with I/O routines and only since the RHEL8 upgrade.
We see no significant performance impact from this work-around, which
stays in place until OLCF & IBM fix the problem (OLCFHELP-3545).
An alternative work-around via
```
export OMPI_MCA_coll_ibm_collselect_mode_barrier=failsafe
```
was tested and is a tiny bit slower than just falling back to HCOLL
or OMPI's barrier implementations via
```
export OMPI_MCA_coll_ibm_skip_barrier=true
```
Thanks to Brian Smith at OLCF for the support!
|
|
We see problems again at runtime with missing libs. We don't need Darshan by
default, so let's unload it to make our stack more stable.
We also avoid sourcing the profile again in the batch script: we inherit the
environment and modules loaded at submission.
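
A hedged sketch of the unload step for a profile script. The module name `darshan-runtime` is an assumption; check `module list` for the name used on the system. The guard only exists so the snippet does not fail on machines without environment modules.

```shell
# Assumption: the Darshan module is called `darshan-runtime`.
if type module >/dev/null 2>&1; then
    module unload darshan-runtime
fi
```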
|
|
After the update to RHEL8, Summit changed the default
permissions of newly created files and directories (for the
worse).
This has been reported as OLCFHELP-3442 but we face some
resistance from support to triage this properly at the moment.
Since we need to keep working, we change the `umask` (i.e., the
default permissions for new files & dirs) manually so that group
members of the same project can read files and access dirs.
For files and dirs created since this update and not yet using
this `umask`, please use the following fix.
```
find . -type d -exec chmod a+rx {} \;
find . -type f -exec chmod a+r {} \;
```
Replace `.` (current directory) with another path if needed.
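
A minimal sketch of the `umask` change itself. The message above does not give the value, so `0027` is an assumption: it is one setting that lets group members read files and enter directories. The directory and file names are throwaway examples.

```shell
# Assumption: 0027 keeps group read/enter access and drops access for others.
umask 0027

# Show the effect on newly created entries in a temporary directory.
demo_root="$(mktemp -d)"
mkdir "${demo_root}/run_dir"
touch "${demo_root}/run_dir/output.txt"
ls -ld "${demo_root}/run_dir" "${demo_root}/run_dir/output.txt"
```

Placing the `umask` line in the profile script means every new login shell picks it up.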
|
|
The adios2 system module is now fixed on Summit.
We don't need to load the openpmd-api module for CMake, as we
build it on-the-fly at the moment against the ADIOS2 and HDF5
modules.
|
|
* Docs: Perlmutter
Start a documentation page for Perlmutter.
* Cleaning
- better links to docs
- clean submission script
* Perlmutter: Add I/O
|
|
Update CUDA from 11.0 (default) to 11.3.
|
|
* Docs: Spock (OLCF)
Add initial instructions on how to build on Spock (OLCF)
for AMD ROCm GPUs (HIP).
This works around the missing Cray `PrgEnv-hip` that could be used
with the compiler wrappers.
* Missing -L: Via $CRAYLIBS_X86_64
Co-authored-by: Weiqun Zhang <WeiqunZhang@lbl.gov>
|
|
- fix minor typos in example commands
- simplify/clarify `git clone` to be uniform
- modernize Juwels section to CMake
|
|
* [Draft] Doc: Lassen (LLNL)
Document installation and usage on Lassen (LLNL).
* [Draft] Doc: Quartz (LLNL)
Document installation and usage on Quartz (LLNL).
|
|
Thread multiple support is not the default on Cori.
One needs to set this with an environment variable.
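
For illustration, and assuming the variable meant here is Cray MPICH's `MPICH_MAX_THREAD_SAFETY` (the message does not name it), the setting would go into the job script before the launch line:

```shell
# Assumption: Cray MPICH reads this variable to raise the maximum
# thread level it grants, up to MPI_THREAD_MULTIPLE.
export MPICH_MAX_THREAD_SAFETY=multiple
```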
|
|
* 1st
* modify .sh
* Apply suggestions from code review
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
|
|
* Power9 CPU docs
* Update Tools/BatchScripts/batch_summit_power9.sh
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
* Subsection convention
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
|
|
* Doc how to compile and run on Juwels
* add a bunch of AMReX-specific options
* Update juwels.rst
* Update Docs/source/building/juwels.rst
Co-authored-by: MaxThevenet <mthevenet@lbl.gov>
* Juwels: Filesystem Note
* Juwels Docs: Missing newline
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
|
|
* Summit: no `-n` in jsrun
The calculation of `-n` does not seem to work anymore. Luckily, the
parameter seems to be picked up automatically from some LSF setting
in the job.
* Summit Runs: OMP, GPU-Aware MPI, Latency
Refs.:
- https://jsrunvisualizer.olcf.ornl.gov/?s4f0o11n6c7g1r11d1b1l0=
- https://docs.olcf.ornl.gov/systems/summit_user_guide.html#cuda-aware-mpi
|
|
Document explicit steps to build WarpX with openPMD support on
Summit.
|
|
* reorganize Tools/ into subfolders, in anticipation of LibEnsemble scripts
* Oops, also need to let git know some files have been deleted
* caps for consistency
* few paths to fix
|