author    Axel Huebl <axel.huebl@plasma.ninja>  2022-08-31 11:29:07 -0700
committer GitHub <noreply@github.com>  2022-08-31 11:29:07 -0700
commit    19dba606b11391c67e33857926db8b94ee60829c (patch)
tree      c8681b8ced83bcbf35e00bb7c68d46c785f5681e /Tools/machines/perlmutter-nersc/perlmutter.sbatch
parent    3e47534613e02fd9bedbdda32892a2e0a7b76817 (diff)
Perlmutter: Work-Around CUDA-Aware MPI & Slurm (#3349)
* Perlmutter: Work-Around CUDA-Aware MPI & Slurm

  There are known HPE bugs on Perlmutter that can blow up simulations (segfault)
  with CUDA-aware MPI. We avoid the respective Slurm options now and just
  manually control the exposed GPUs per MPI rank.

* Add: `gpus-per-node`
Diffstat
-rw-r--r--  Tools/machines/perlmutter-nersc/perlmutter.sbatch  6
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/Tools/machines/perlmutter-nersc/perlmutter.sbatch b/Tools/machines/perlmutter-nersc/perlmutter.sbatch
index 2c085364d..65777f304 100644
--- a/Tools/machines/perlmutter-nersc/perlmutter.sbatch
+++ b/Tools/machines/perlmutter-nersc/perlmutter.sbatch
@@ -16,8 +16,7 @@
#SBATCH -C gpu
#SBATCH -c 32
#SBATCH --ntasks-per-node=4
-#SBATCH --gpus-per-task=1
-#SBATCH --gpu-bind=single:1
+#SBATCH --gpus-per-node=4
#SBATCH -o WarpX.o%j
#SBATCH -e WarpX.e%j
@@ -42,6 +41,9 @@
# GPU-aware MPI
export MPICH_GPU_SUPPORT_ENABLED=1
+# expose one GPU per MPI rank
+export CUDA_VISIBLE_DEVICES=$SLURM_LOCALID
+
EXE=./warpx
#EXE=../WarpX/build/bin/warpx.3d.MPI.CUDA.DP.OPMD.QED
#EXE=./main3d.gnu.TPROF.MPI.CUDA.ex
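
Taken together, the relevant portion of perlmutter.sbatch after this change reads
roughly as follows (only the lines visible in the diff above are shown; the rest of
the script is unchanged):

    #SBATCH -C gpu
    #SBATCH -c 32
    #SBATCH --ntasks-per-node=4
    #SBATCH --gpus-per-node=4
    #SBATCH -o WarpX.o%j
    #SBATCH -e WarpX.e%j

    # GPU-aware MPI
    export MPICH_GPU_SUPPORT_ENABLED=1

    # expose one GPU per MPI rank
    export CUDA_VISIBLE_DEVICES=$SLURM_LOCALID

    EXE=./warpx

The script is then submitted in the usual way, e.g.
`sbatch Tools/machines/perlmutter-nersc/perlmutter.sbatch`, assuming the built
`warpx` executable is present in the submission directory (since `EXE=./warpx`).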