diff --git a/docs.it4i/cs/amd.md b/docs.it4i/cs/amd.md index 0cc12f3934c632359b6bbc25c5dfaeaa16c4ad9c..423f081df1ecc77103bc95bf8b4e5d521dc2f1b5 100644 --- a/docs.it4i/cs/amd.md +++ b/docs.it4i/cs/amd.md @@ -6,15 +6,18 @@ you need to prepare a job script for that partition or use the interactive job: ``` salloc -N 1 -c 64 -A PROJECT-ID -p p03-amd --gres=gpu:4 --time=08:00:00 ``` -where: -- -N 1 means allocating one server, -- -c 64 means allocation 64 cores, -- -A is your project, -- -p p03-amd is AMD partition, -- --gres=gpu:4 means allcating all 4 GPUs of the node, -- --time=08:00:00 means allocation for 8 hours. -You have also an option to allocate subset of the resources only, by reducing the -c and --gres=gpu to smaller values. +where: + +- `-N 1` means allocating one server, +- `-c 64` means allocating 64 cores, +- `-A` is your project, +- `-p p03-amd` is the AMD partition, +- `--gres=gpu:4` means allocating all 4 GPUs of the node, +- `--time=08:00:00` means allocation for 8 hours. + +You also have the option to allocate only a subset of the resources +by reducing `-c` and `--gres=gpu` to smaller values. ``` salloc -N 1 -c 48 -A PROJECT-ID -p p03-amd --gres=gpu:3 --time=08:00:00 @@ -22,22 +25,26 @@ salloc -N 1 -c 32 -A PROJECT-ID -p p03-amd --gres=gpu:2 --time=08:00:00 salloc -N 1 -c 16 -A PROJECT-ID -p p03-amd --gres=gpu:1 --time=08:00:00 ``` -### Note: - -p03-amd01 server has hyperthreading enabled therefore htop shows 128 cores. +!!! Note -p03-amd02 server has hyperthreading dissabled therefore htop shows 64 cores. + The p03-amd01 server has hyperthreading **enabled**, so htop shows 128 cores. + The p03-amd02 server has hyperthreading **disabled**, so htop shows 64 cores. ## Using AMD MI100 GPUs -The AMD GPUs can be programmed using the ROCm open-source platform (see: https://docs.amd.com/ for more information.) +The AMD GPUs can be programmed using the ROCm open-source platform +(for more information, see [https://docs.amd.com/][1]).
+ +ROCm and related libraries are installed directly in the system. +You can find them here: -ROCm and related libraries are installed directly in the system. You can find it here: ``` /opt/rocm/ ``` -The actual version can be found here: + +The installed version can be found here: + ``` [user@p03-amd02.cs]$ cat /opt/rocm/.info/version @@ -46,9 +53,11 @@ The actual version can be found here: ## Basic HIP code -The first way how to program AMD GPUs is to use HIP. +The first way to program AMD GPUs is to use HIP. -The basic vector addition code in HIP looks like this. This a full code and you can copy and paste it into a file. For this example we use `vector_add.hip.cpp` . +The basic vector addition code in HIP looks like this. +This is the full code, and you can copy and paste it into a file. +For this example, we use `vector_add.hip.cpp`. ``` #include <cstdio> @@ -123,44 +132,46 @@ int main() } ``` -To compile the code we use `hipcc` compiler. The compiler information can be found like this: +To compile the code, we use the `hipcc` compiler.
+The compiler information can be found like this: -```` -[user@p03-amd02.cs ~]$ hipcc --version +``` +[user@p03-amd02.cs ~]$ hipcc --version HIP version: 5.5.30202-eaf00c0b AMD clang version 16.0.0 (https://github.com/RadeonOpenCompute/llvm-project roc-5.5.1 23194 69ef12a7c3cc5b0ccf820bc007bd87e8b3ac3037) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /opt/rocm-5.5.1/llvm/bin -```` +``` -The code is compiled a follows: +The code is compiled as follows: ``` hipcc vector_add.hip.cpp -o vector_add.x ``` -The correct output of the code is: +The correct output of the code is: + ``` -[user@p03-amd02.cs ~]$ ./vector_add.x +[user@p03-amd02.cs ~]$ ./vector_add.x X: 0.00 1.00 2.00 3.00 4.00 5.00 6.00 7.00 8.00 9.00 Y: 0.00 10.00 20.00 30.00 40.00 50.00 60.00 70.00 80.00 90.00 Y: 0.00 110.00 220.00 330.00 440.00 550.00 660.00 770.00 880.00 990.00 ``` -## HIP and ROCm libraries - -The list of official AMD libraries can be found here: https://docs.amd.com/category/libraries. +## HIP and ROCm Libraries +The list of official AMD libraries can be found here: [https://docs.amd.com/category/libraries][2]. +The libraries are installed in the same directory as ROCm: -The libraries are installed in the same directory is ROCm ``` /opt/rocm/ ``` -Following libraries are installed: +The following libraries are installed: + ``` drwxr-xr-x 4 root root 44 Jun 7 14:09 hipblas drwxr-xr-x 3 root root 17 Jun 7 14:09 hipblas-clients @@ -172,7 +183,7 @@ drwxr-xr-x 4 root root 44 Jun 7 14:09 hipsolver drwxr-xr-x 4 root root 44 Jun 7 14:09 hipsparse ``` -and +and ``` drwxr-xr-x 4 root root 32 Jun 7 14:09 rocalution @@ -185,11 +196,11 @@ drwxr-xr-x 4 root root 44 Jun 7 14:09 rocsparse drwxr-xr-x 3 root root 29 Jun 7 14:09 rocthrust ``` +### Using hipBlas Library - -### Using hipBlas library - -The basic code in HIP that uses hipBlas looks like this. This a full code and you can copy and paste it into a file. For this example we use `hipblas.hip.cpp` .
+The basic code in HIP that uses hipBlas looks like this. +This is the full code, and you can copy and paste it into a file. +For this example, we use `hipblas.hip.cpp`. ``` #include <cstdio> @@ -304,14 +315,17 @@ int main() } ``` -The code compilation can be done as follows: +The code compilation can be done as follows: + ``` hipcc hipblas.hip.cpp -o hipblas.x -lhipblas ``` -### Using hipSolver library +### Using hipSolver Library -The basic code in HIP that uses hipSolver looks like this. This a full code and you can copy and paste it into a file. For this example we use `hipsolver.hip.cpp` . +The basic code in HIP that uses hipSolver looks like this. +This is the full code, and you can copy and paste it into a file. +For this example, we use `hipsolver.hip.cpp`. ``` #include <cstdio> @@ -439,18 +453,18 @@ int main() } ``` -The code compilation can be done as follows: +The code compilation can be done as follows: + ``` hipcc hipsolver.hip.cpp -o hipsolver.x -lhipblas -lhipsolver ``` -### Other AMD libraries and frameworks - - - - +### Other AMD Libraries and Frameworks Please see [gcc options](https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html) for more advanced compilation settings. No complications are expected as long as the application does not use any intrinsic for `x64` architecture. If you want to use intrinsic, -[SVE](https://developer.arm.com/documentation/102699/0100/Optimizing-with-intrinsics) instruction set is available. \ No newline at end of file +[SVE](https://developer.arm.com/documentation/102699/0100/Optimizing-with-intrinsics) instruction set is available. + +[1]: https://docs.amd.com/ +[2]: https://docs.amd.com/category/libraries