Diffstat (limited to 'Docs/source/running_cpp')
-rw-r--r-- Docs/source/running_cpp/profiling.rst | 57
1 files changed, 23 insertions, 34 deletions
diff --git a/Docs/source/running_cpp/profiling.rst b/Docs/source/running_cpp/profiling.rst
index 4ab311295..fd00a7c0d 100644
--- a/Docs/source/running_cpp/profiling.rst
+++ b/Docs/source/running_cpp/profiling.rst
@@ -1,46 +1,35 @@
Profiling the code
==================
-Profiling with AMREX's built-in profiling tools
------------------------------------------------
-See `this page <https://amrex-codes.github.io/amrex/docs_html/Chapter12.html>`__ in the AMReX documentation.
-
-
-Profiling the code with Intel Advisor on NERSC
-----------------------------------------------
-
-Follow these steps:
-
-- Instrument the code during compilation
+Profiling allows us to find the bottlenecks of the code as it is currently implemented.
+Bottlenecks are the parts of the code that may slow down the simulation, making it more computationally expensive.
+Once found, we can update the related code sections and improve their efficiency.
+Profiling tools can also be used to check how well load balanced the simulation is, i.e., whether the work is well distributed across all MPI ranks used.
+Load balancing can be activated in WarpX by setting input parameters; see the parallelization section at :doc:`parameters`.
- ::
-
- module swap craype-haswell craype-mic-knl
- make -j 16 COMP=intel USE_VTUNE=TRUE
-
- (where the first line is only needed for KNL)
-
-- In your SLURM submission script, use the following
- lines in order to run the executable. (In addition
- to setting the usual ``OMP`` environment variables.)
+Profiling with AMReX's built-in profiling tools
+-----------------------------------------------
- ::
+By default, WarpX uses the AMReX baseline tool, the TINYPROFILER, to measure the time spent in different parts of the code (functions) across the MPI ranks.
+The results (timers) are written as four tables in the standard output (stdout), located below the simulation step information and above the warnings regarding unused input file parameters (if there were any).
- module load advisor
- export ADVIXE_EXPERIMENTAL=roofline
- srun -n <n_mpi> -c <n_logical_cores_per_mpi> --cpu_bind=cores advixe-cl -collect survey -project-dir advisor -trace-mpi -- <warpx_executable> inputs
- srun -n <n_mpi> -c <n_logical_cores_per_mpi> --cpu_bind=cores advixe-cl -collect tripcounts -flop -project-dir advisor -trace-mpi -- <warpx_executable> inputs
+The columns of each table correspond to:
- where ``<n_mpi>`` and ``<n_logical_cores_per_mpi>`` should be replaced by
- the proper values, and ``<warpx_executable>`` should be replaced by the
- name of the WarpX executable.
+* name of the function
+* number of times it is called in total
+* minimum time spent exclusively/inclusively in it, across all ranks
+* average time, across all ranks
+* maximum time, across all ranks
+* maximum percentage of time spent, across all ranks
-- Launch the Intel Advisor GUI
+If the simulation is well load balanced, the minimum, average and maximum times should be nearly identical.
- ::
+The top two tables refer to the complete simulation information.
+The bottom two are related to the Evolve() section of the code (where each time step is computed).
- module load advisor
- advixe-gui
+Each pair of tables shows the exclusive (top) and inclusive (bottom) timings, depending on whether the time spent in nested sections of the code is included.
- (Note: this requires to use ``ssh -XY`` when connecting to Cori.)
+For more detailed information, please visit the `AMReX profiling documentation <https://amrex-codes.github.io/amrex/docs_html/AMReX_Profiling_Tools_Chapter.html>`__.
+.. note::
+ When creating performance-related issues on the WarpX GitHub repo, please include Tiny Profiler tables (besides the usual issue description, input file and submission script), or (even better) the whole standard output.
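
As an illustration of how the timer columns described in this change can be read, the following is a minimal Python sketch that computes a load-imbalance ratio (maximum time divided by average time) per function. The row layout, the helper names ``parse_timer_row`` and ``imbalance``, and the sample function names are all assumptions made for this example; they are not taken from the actual TinyProfiler output, whose exact formatting may differ.

```python
# Hypothetical sketch: the row layout (name, ncalls, min, avg, max, max%)
# mirrors the column list in the documentation, but the exact text format
# of real TinyProfiler tables is an assumption here.

def parse_timer_row(line):
    """Split one whitespace-separated row into its six assumed columns."""
    fields = line.split()
    name = fields[0]
    ncalls = int(fields[1])
    tmin, tavg, tmax = (float(x) for x in fields[2:5])
    max_percent = float(fields[5].rstrip("%"))
    return name, ncalls, tmin, tavg, tmax, max_percent


def imbalance(tavg, tmax):
    """Ratio of maximum to average time across ranks; 1.0 means balanced."""
    return tmax / tavg if tavg > 0 else float("nan")


# Made-up rows for illustration only (not real profiler output).
sample_rows = [
    "FunctionA()  100  2.0  2.0  2.0  40.00%",
    "FunctionB()  100  1.0  2.0  4.0  50.00%",
]

for row in sample_rows:
    name, ncalls, tmin, tavg, tmax, pct = parse_timer_row(row)
    print(f"{name}: max/avg = {imbalance(tavg, tmax):.2f}")
```

A ratio close to 1 suggests the work for that function is evenly spread over the ranks, while a larger ratio points to at least one rank spending noticeably more time than the average.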