author | 2019-08-26 21:55:53 -0700 | |
---|---|---|
committer | 2019-08-26 21:55:53 -0700 | |
commit | 17eefa31d683117b365c7c272f2e3631c64c71a3 (patch) | |
tree | ad389f30c3b1956eb502e018499d1bcf3cefa748 /Docs/source/running_cpp | |
parent | 3d44362029fb39476f9542a10af9e0fc5eb1ef9b (diff) | |
parent | 83d451a02b3fc493e3e48d992599e60319042860 (diff) | |
Merge pull request #291 from ECP-WarpX/doc_platforms
Docs: System Submission & Helper Scripts
Diffstat (limited to 'Docs/source/running_cpp')
-rw-r--r-- | Docs/source/running_cpp/parallelization.rst | 24 |
-rw-r--r-- | Docs/source/running_cpp/platforms.rst | 69 |
-rw-r--r-- | Docs/source/running_cpp/running_cpp.rst | 1 |
3 files changed, 75 insertions, 19 deletions
diff --git a/Docs/source/running_cpp/parallelization.rst b/Docs/source/running_cpp/parallelization.rst
index 440c17235..a8c89f340 100644
--- a/Docs/source/running_cpp/parallelization.rst
+++ b/Docs/source/running_cpp/parallelization.rst
@@ -61,22 +61,8 @@ and MPI decomposition and computer architecture used for the run:
 
 * Amount of high-bandwidth memory.
 
-Below is a list of experience-based parameters
-that were observed to give good performance on given supercomputers.
-
-Rule of thumb for 3D runs on NERSC Cori KNL
--------------------------------------------
-
-For a 3D simulation with a few (1-4) particles per cell using FDTD Maxwell
-solver on Cori KNL for a well load-balanced problem (in our case laser
-wakefield acceleration simulation in a boosted frame in the quasi-linear
-regime), the following set of parameters provided good performance:
-
-* ``amr.max_grid_size=64`` and ``amr.blocking_factor=64`` so that the size of
-  each grid is fixed to ``64**3`` (we are not using load-balancing here).
-
-* **8 MPI ranks per KNL node**, with ``OMP_NUM_THREADS=8`` (that is 64 threads
-  per KNL node, i.e. 1 thread per physical core, and 4 cores left to the
-  system).
-
-* **2 grids per MPI**, *i.e.*, 16 grids per KNL node.
+Because these parameters put additional constraints on the domain size for a
+simulation, it can be cumbersome to calculate the number of cells and the
+physical size of the computational domain for a given resolution. This
+:download:`Python script<../../../Tools/compute_domain.py>` does it
+automatically.
diff --git a/Docs/source/running_cpp/platforms.rst b/Docs/source/running_cpp/platforms.rst
new file mode 100644
index 000000000..fc4e2b1fb
--- /dev/null
+++ b/Docs/source/running_cpp/platforms.rst
@@ -0,0 +1,69 @@
+Running on specific platforms
+=============================
+
+Running on Cori KNL at NERSC
+----------------------------
+
+The batch script below can be used to run a WarpX simulation on 2 KNL nodes on
+the supercomputer Cori at NERSC. Replace descriptions between chevrons ``<>``
+by relevant values, for instance ``<job name>`` could be ``laserWakefield``.
+
+.. literalinclude:: ../../../Examples/batchScripts/batch_cori.sh
+   :language: bash
+
+To run a simulation, copy the lines above to a file ``batch_cori.sh`` and
+run
+::
+
+  sbatch batch_cori.sh
+
+to submit the job.
+
+For a 3D simulation with a few (1-4) particles per cell using FDTD Maxwell
+solver on Cori KNL for a well load-balanced problem (in our case laser
+wakefield acceleration simulation in a boosted frame in the quasi-linear
+regime), the following set of parameters provided good performance:
+
+* ``amr.max_grid_size=64`` and ``amr.blocking_factor=64`` so that the size of
+  each grid is fixed to ``64**3`` (we are not using load-balancing here).
+
+* **8 MPI ranks per KNL node**, with ``OMP_NUM_THREADS=8`` (that is 64 threads
+  per KNL node, i.e. 1 thread per physical core, and 4 cores left to the
+  system).
+
+* **2 grids per MPI**, *i.e.*, 16 grids per KNL node.
+
+Running on Summit at OLCF
+-------------------------
+
+The batch script below can be used to run a WarpX simulation on 2 nodes on
+the supercomputer Summit at OLCF. Replace descriptions between chevrons ``<>``
+by relevant values, for instance ``<input file>`` could be
+``plasma_mirror_inputs``. Note that the only option so far is to run with one
+MPI rank per GPU.
+
+.. literalinclude:: ../../../Examples/batchScripts/batch_summit.sh
+   :language: bash
+
+To run a simulation, copy the lines above to a file ``batch_summit.sh`` and
+run
+::
+
+  bsub batch_summit.sh
+
+to submit the job.
+
+For a 3D simulation with a few (1-4) particles per cell using FDTD Maxwell
+solver on Summit for a well load-balanced problem (in our case laser
+wakefield acceleration simulation in a boosted frame in the quasi-linear
+regime), the following set of parameters provided good performance:
+
+* ``amr.max_grid_size=256`` and ``amr.blocking_factor=128``.
+
+* **One MPI rank per GPU** (e.g., 6 MPI ranks for the 6 GPUs on each Summit
+  node)
+
+* **Two `128x128x128` grids per GPU**, or **one `128x128x256` grid per GPU**.
+
+A batch script with more options regarding profiling on Summit can be found at
+:download:`Summit batch script<../../../Examples/Tests/gpu_test/script_profiling.sh>`
\ No newline at end of file
diff --git a/Docs/source/running_cpp/running_cpp.rst b/Docs/source/running_cpp/running_cpp.rst
index 7d82e55f1..31cecb12f 100644
--- a/Docs/source/running_cpp/running_cpp.rst
+++ b/Docs/source/running_cpp/running_cpp.rst
@@ -9,3 +9,4 @@ Running WarpX as an executable
    parameters
    profiling
    parallelization
+   platforms
\ No newline at end of file
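
The rules of thumb added in ``platforms.rst`` (MPI ranks per node, grids per rank, grid size) fix how many grids a given node count can hold, and therefore how many cells the domain should contain. The sketch below is only an illustration of that arithmetic. It is not the ``Tools/compute_domain.py`` script referenced in the diff, and the names ``CORI_KNL``, ``SUMMIT``, ``total_grids``, ``n_cell`` and ``fits`` are hypothetical helpers introduced here. It assumes every grid has exactly the quoted shape and does not model how the code actually chops the domain into grids.

```python
from math import prod

# Rule-of-thumb values quoted in the platforms.rst text of this commit.
# The dictionary layout itself is an assumption made for this sketch.
CORI_KNL = {"ranks_per_node": 8, "grids_per_rank": 2, "grid_shape": (64, 64, 64)}
SUMMIT = {"ranks_per_node": 6, "grids_per_rank": 2, "grid_shape": (128, 128, 128)}


def total_grids(n_nodes, platform):
    """Number of grids that fills n_nodes at the quoted grids-per-rank ratio."""
    return n_nodes * platform["ranks_per_node"] * platform["grids_per_rank"]


def n_cell(grids_per_dim, platform):
    """Cells per direction for a domain tiled by grids_per_dim grids."""
    return tuple(g * s for g, s in zip(grids_per_dim, platform["grid_shape"]))


def fits(n_nodes, grids_per_dim, platform):
    """True if the grid layout uses exactly the grids available on n_nodes."""
    return prod(grids_per_dim) == total_grids(n_nodes, platform)


if __name__ == "__main__":
    # 2 Cori KNL nodes -> 2 * 8 * 2 = 32 grids, laid out here as 2 x 2 x 8.
    layout = (2, 2, 8)
    print(total_grids(2, CORI_KNL))
    print(n_cell(layout, CORI_KNL))
    print(fits(2, layout, CORI_KNL))
```

Under these assumptions, the tuple returned by ``n_cell`` is the kind of value one would try for the number of cells per direction in the input file. The grid layout along each direction remains a free choice, as long as the product of the three counts matches the number of grids the nodes can hold.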