diff --git a/docs.it4i/lumi/pytorch.md b/docs.it4i/lumi/pytorch.md
new file mode 100644
index 0000000000000000000000000000000000000000..fa3e54afde555a7d886a2890ac7e29cbf5a87def
--- /dev/null
+++ b/docs.it4i/lumi/pytorch.md
@@ -0,0 +1,194 @@
+# PyTorch
+
+## PyTorch Highlight
+
+* Official page: [https://pytorch.org/][1]
+* Code: [https://github.com/pytorch/pytorch][2]
+* Python-based framework for machine learning
+  * Auto-differentiation on tensor types
+* Official LUMI page: [https://lumi-supercomputer.github.io/LUMI-EasyBuild-docs/p/PyTorch/][3]
+  * **Warning:** Be careful where the SIF image is installed or copied; `$HOME` is not recommended because of its small quota. For EasyBuild you must specify the installation path, e.g. `eb PyTorch.eb -r --prefix=$PWD/easybuild`, as in the sketch below.
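+
+    For example, a minimal sketch of an out-of-`$HOME` install (the target directory is a placeholder, adjust to your project):
+
+    ```console
+    cd /scratch/project_XXX/$USER
+    eb PyTorch.eb -r --prefix=$PWD/easybuild
+    ```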
+
+## PyTorch Install
+
+### Base Environment
+
+```console
+module purge
+module load CrayEnv
+module load PrgEnv-cray/8.3.3
+module load craype-accel-amd-gfx90a
+module load cray-python
+
+# Default ROCm; more recent versions (e.g. ROCm 5.6.0) are preferable
+# when available (see "Unofficial Versions of ROCm" below)
+module load rocm/5.2.3.lua
+```
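+
+A quick sanity check of the loaded environment (a sketch; `rocminfo` lists GPUs only on a compute node):
+
+```console
+rocminfo | grep gfx90a   # the MI250X GPUs should be listed
+python3 --version        # interpreter provided by cray-python
+```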
+
+### Scripts
+
+* natively
+    * [01-install-direct-torch1.13.1-rocm5.2.3.sh](scripts/install/01-install-direct-torch1.13.1-rocm5.2.3.sh)
+    * [01-install-direct-torch2.1.2-rocm5.5.3.sh](scripts/install/01-install-direct-torch2.1.2-rocm5.5.3.sh)
+* virtual env (see the install sketch after this list)
+    * [02-install-venv-torch1.13.1-rocm5.2.3.sh](scripts/install/02-install-venv-torch1.13.1-rocm5.2.3.sh)
+    * [02-install-venv-torch2.1.2-rocm5.5.3.sh](scripts/install/02-install-venv-torch2.1.2-rocm5.5.3.sh)
+* conda env
+    * [03-install-conda-torch1.13.1-rocm5.2.3.sh](scripts/install/03-install-conda-torch1.13.1-rocm5.2.3.sh)
+    * [03-install-conda-torch2.1.2-rocm5.5.3.sh](scripts/install/03-install-conda-torch2.1.2-rocm5.5.3.sh)
+    * from source: [04-install-source-torch1.13.1-rocm5.2.3.sh](scripts/install/04-install-source-torch1.13.1-rocm5.2.3.sh)
+* containers (singularity)
+    * [05-install-container-torch2.0.1-rocm5.5.1.sh](scripts/install/05-install-container-torch2.0.1-rocm5.5.1.sh)
+    * [05-install-container-torch2.1.0-rocm5.6.1.sh](scripts/install/05-install-container-torch2.1.0-rocm5.6.1.sh)
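+
+For illustration, a minimal sketch of what a venv-based install looks like (wheel index as documented on pytorch.org; the scripts above pin the exact versions):
+
+```console
+python3 -m venv $PWD/torch-venv
+source $PWD/torch-venv/bin/activate
+pip3 install torch==1.13.1+rocm5.2 --extra-index-url https://download.pytorch.org/whl/rocm5.2
+```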
+
+## PyTorch Tests
+
+### Run Interactive Job on Single Node
+
+```console
+salloc -A project_XXX --partition=standard-g -N 1 -n 1 --gpus 8 -t 01:00:00
+```
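+
+Once the allocation is granted, a quick check that all GPUs are visible (assuming one of the PyTorch environments above is active):
+
+```console
+srun -n 1 --gpus 8 python3 -c 'import torch; print(torch.cuda.device_count())'
+# Expected output: 8
+```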
+
+### Scripts
+
+* natively
+    * [01-simple-test-direct-torch1.13.1-rocm5.2.3.sh](scripts/tests/01-simple-test-direct-torch1.13.1-rocm5.2.3.sh)
+    * [01-simple-test-direct-torch2.1.2-rocm5.5.3.sh](scripts/tests/01-simple-test-direct-torch2.1.2-rocm5.5.3.sh)
+* virtual env
+    * [02-simple-test-venv-torch1.13.1-rocm5.2.3.sh](scripts/tests/02-simple-test-venv-torch1.13.1-rocm5.2.3.sh)
+    * [02-simple-test-venv-torch2.1.2-rocm5.5.3.sh](scripts/tests/02-simple-test-venv-torch2.1.2-rocm5.5.3.sh)
+* conda env
+    * [03-simple-test-conda-torch1.13.1-rocm5.2.3.sh](scripts/tests/03-simple-test-conda-torch1.13.1-rocm5.2.3.sh)
+    * [03-simple-test-conda-torch2.1.2-rocm5.5.3.sh](scripts/tests/03-simple-test-conda-torch2.1.2-rocm5.5.3.sh)
+    * from source: [04-simple-test-source-torch1.13.1-rocm5.2.3.sh](scripts/tests/04-simple-test-source-torch1.13.1-rocm5.2.3.sh)
+* containers (singularity)
+    * [05-simple-test-container-torch2.0.1-rocm5.5.1.sh](scripts/tests/05-simple-test-container-torch2.0.1-rocm5.5.1.sh)
+    * [05-simple-test-container-torch2.1.0-rocm5.6.1.sh](scripts/tests/05-simple-test-container-torch2.1.0-rocm5.6.1.sh)
+
+### Run Interactive Job on Multiple Nodes
+
+```console
+salloc -A project_XXX --partition=standard-g -N 2 -n 16 --gpus 16 -t 01:00:00
+```
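+
+A hedged sketch of the environment setup for a `torch.distributed` run inside such an allocation (`train.py` is a placeholder; per-rank setup is left to the script):
+
+```console
+export MASTER_ADDR=$(scontrol show hostnames $SLURM_NODELIST | head -n 1)
+export MASTER_PORT=29500
+srun python3 train.py
+```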
+
+### Scripts
+
+* containers (singularity)
+    * [07-mnist-distributed-learning-container-torch2.0.1-rocm5.5.1.sh](scripts/tests/07-mnist-distributed-learning-container-torch2.0.1-rocm5.5.1.sh)
+    * [07-mnist-distributed-learning-container-torch2.1.0-rocm5.6.1.sh](scripts/tests/07-mnist-distributed-learning-container-torch2.1.0-rocm5.6.1.sh)
+    * [08-cnn-distributed-container-torch2.0.1-rocm5.5.1.sh](scripts/tests/08-cnn-distributed-container-torch2.0.1-rocm5.5.1.sh)
+    * [08-cnn-distributed-container-torch2.1.0-rocm5.6.1.sh](scripts/tests/08-cnn-distributed-container-torch2.1.0-rocm5.6.1.sh)
+
+## Tips
+
+### Official Containers
+
+```console
+ls -la /appl/local/containers/easybuild-sif-images/
+```
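+
+To try one of these images (a sketch; the image name is illustrative):
+
+```console
+singularity shell /appl/local/containers/easybuild-sif-images/<image>.sif
+```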
+
+### Unofficial Versions of ROCm
+
+```console
+module use /pfs/lustrep2/projappl/project_462000125/samantao-public/mymodules
+# Load one of the available versions, e.g.:
+ml rocm/5.4.3
+ml rocm/5.6.0
+```
+
+### Unofficial Containers
+
+```console
+ls -la /pfs/lustrep2/projappl/project_462000125/samantao-public/containers/
+```
+
+### Installing Python Modules in Containers
+
+```console
+#!/bin/bash
+
+wd=$(pwd)
+SIF=/pfs/lustrep2/projappl/project_462000125/samantao-public/containers/lumi-pytorch-rocm-5.6.1-python-3.10-pytorch-v2.1.0-dockerhash-aa8dbea5e0e4.sif
+
+rm -rf $wd/setup-me.sh
+cat > $wd/setup-me.sh << EOF
+#!/bin/bash -e
+
+\$WITH_CONDA
+pip3 install scipy h5py tqdm
+EOF
+chmod +x $wd/setup-me.sh
+
+mkdir -p $wd/pip_install
+
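+# Run the setup script once inside the container; the -B binds expose the
+# Slurm spool directory, the Cray software stack, the host libcxi, and a
+# persistent target for pip-installed packages.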
+srun -n 1 --gpus 8 singularity exec \
+-B /var/spool/slurmd:/var/spool/slurmd \
+-B /opt/cray:/opt/cray \
+-B /usr/lib64/libcxi.so.1:/usr/lib64/libcxi.so.1 \
+-B $wd:/workdir \
+-B $wd/pip_install:$HOME/.local/lib \
+$SIF /workdir/setup-me.sh
+
+# When later running the container, add the same pip_install bind to
+# singularity exec:
+# -B $wd/pip_install:$HOME/.local/lib \
+```
+
+### Controlling Device Visibility
+
+* `HIP_VISIBLE_DEVICES=0,1,2,3 python -c 'import torch; print(torch.cuda.device_count())'`
+* `ROCR_VISIBLE_DEVICES=0,1,2,3 python -c 'import torch; print(torch.cuda.device_count())'`
+* Slurm sets `ROCR_VISIBLE_DEVICES` for each job step (see the check below)
+* The two variables act at different levels: `HIP_VISIBLE_DEVICES` filters devices in the HIP runtime, while `ROCR_VISIBLE_DEVICES` filters them in the lower-level ROCr runtime, with implications for how inter-GPU transfers are implemented (blit kernels and/or DMA)
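+
+To inspect what Slurm sets inside a job step (assuming an active allocation):
+
+```console
+srun -n 1 --gpus 2 bash -c 'echo $ROCR_VISIBLE_DEVICES'
+# Typically prints: 0,1
+```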
+
+### RCCL
+
+* The problem: on startup you may see:
+    * `NCCL error in: /pfs/lustrep2/projappl/project_462000125/samantao/pytorchexample/pytorch/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1269, unhandled system error, NCCL version 2.12.12`
+* To check the error origin:
+    * `export NCCL_DEBUG=INFO`
+    * `NCCL INFO NET/Socket : Using [0]nmn0:10.120.116.65<0> [1]hsn0:10.253.6.67<0> [2]hsn1:10.253.6.68<0> [3]hsn2:10.253.2.12<0> [4]hsn3:10.253.2.11<0>`
+    * `NCCL INFO /long_pathname_so_that_rpms_can_package_the_debug_info/data/driver/rccl/src/init.cc:1292`
+* The fix:
+    * `export NCCL_SOCKET_IFNAME=hsn0,hsn1,hsn2,hsn3`
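+
+In a job script, the fix is typically applied just before the launch (a sketch; `train.py` is a placeholder):
+
+```console
+export NCCL_SOCKET_IFNAME=hsn0,hsn1,hsn2,hsn3
+export NCCL_DEBUG=INFO   # optional, to confirm which interfaces are used
+srun python3 train.py
+```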
+
+### RCCL AWS-CXI Plugin
+
+* RCCL relies on runtime plug-ins to connect to some transport layers
+    * Libfabric – the provider for Slingshot
+* A HIPified plugin adapted from the AWS OpenFabrics (OFI) plugin is available:
+* [https://github.com/ROCmSoftwarePlatform/aws-ofi-rccl][7]
+* It provides 3-4x faster collectives
+* The plugin must be discoverable from the loaded environment:
+
+```console
+module use /pfs/lustrep2/projappl/project_462000125/samantao-public/mymodules
+module load aws-ofi-rccl/rocm-5.2.3.lua
+# Or
+export LD_LIBRARY_PATH=/pfs/lustrep2/projappl/project_462000125/samantao-public/apps-rocm-5.2.3/aws-ofirccl:$LD_LIBRARY_PATH
+# (RCCL will then detect librccl-net.so)
+```
+
+* Verify that the plugin is detected:
+
+```console
+export NCCL_DEBUG=INFO
+export NCCL_DEBUG_SUBSYS=INIT
+# and search the logs for:
+# [0] NCCL INFO NET/OFI Using aws-ofi-rccl 1.4.0
+```
+
+### amdgpu.ids Issue
+
+For the known `amdgpu.ids` issue affecting ROCm PyTorch builds, see the upstream report:
+
+[https://github.com/pytorch/builder/issues/1410][4]
+
+## References
+
+* Samuel Antao (AMD), LUMI Courses
+* [https://lumi-supercomputer.github.io/LUMI-training-materials/4day-20230530/extra_4_10_Best_Practices_GPU_Optimization/][5]
+* [https://lumi-supercomputer.github.io/LUMI-training-materials/4day-20231003/extra_4_10_Best_Practices_GPU_Optimization/][6]
+
+[1]: https://pytorch.org/
+[2]: https://github.com/pytorch/pytorch
+[3]: https://lumi-supercomputer.github.io/LUMI-EasyBuild-docs/p/PyTorch/
+[4]: https://github.com/pytorch/builder/issues/1410
+[5]: https://lumi-supercomputer.github.io/LUMI-training-materials/4day-20230530/extra_4_10_Best_Practices_GPU_Optimization/
+[6]: https://lumi-supercomputer.github.io/LUMI-training-materials/4day-20231003/extra_4_10_Best_Practices_GPU_Optimization/
+[7]: https://github.com/ROCmSoftwarePlatform/aws-ofi-rccl
diff --git a/docs.it4i/lumi/software.md b/docs.it4i/lumi/software.md
index bc414036e2a123a2599c049be2f3f21ee1e86759..ce11667840e101a3b1cf94cabd303f648dc32469 100644
--- a/docs.it4i/lumi/software.md
+++ b/docs.it4i/lumi/software.md
@@ -1,11 +1,19 @@
 # LUMI Software
 
-Below are the guides for selected [LUMI Software modules][1]:
+Below are links to LUMI guides for selected [LUMI Software modules][1]:
 
-## How to Run PyTorch on Lumi-G AMD GPU Accelerators
+## PyTorch
 
 [PyTorch][8] is an optimized tensor library for deep learning using GPUs and CPUs.
 
+### Comprehensive PyTorch Guide
+
+See the [PyTorch][a] section for guides on installing PyTorch and running interactive jobs.
+
+### How to Run PyTorch on Lumi-G AMD GPU Accelerators
+
+The LUMI guide on running PyTorch on LUMI GPUs:
+
 [https://docs.lumi-supercomputer.eu/software/packages/pytorch/][2]
 
 ## How to Run Gromacs on Lumi-G AMD GPU Accelerators
@@ -53,3 +61,5 @@ Conda is an open-source, cross-platform,language-agnostic package manager and en
 [6]: https://docs.lumi-supercomputer.eu/software/local/csc/
 [7]: https://docs.lumi-supercomputer.eu/software/installing/container-wrapper/
 [8]: https://pytorch.org/docs/stable/index.html
+
+[a]: pytorch.md
diff --git a/mkdocs.yml b/mkdocs.yml
index c33781664b100c4bbcd0385ed240fca6ae478125..9ed42908f3b81b34a856cc4630c1699418f64ec9 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -302,7 +302,9 @@ nav:
       - VESTA: software/viz/vesta.md
   - LUMI:
       - About LUMI: lumi/about.md
-      - LUMI Software: lumi/software.md
+      - LUMI Software:
+        - General: lumi/software.md
+        - PyTorch: lumi/pytorch.md
       - LUMI Support: lumi/support.md
   - Clouds:
     - e-INFRA CZ Cloud: cloud/einfracz-cloud.md