Commit 986b613f authored by Jan Siwiec

Update amd.md

parent c6ddeed1
Pipeline #32348 failed
@@ -6,15 +6,18 @@ you need to prepare a job script for that partition or use the interactive job:
``` ```
salloc -N 1 -c 64 -A PROJECT-ID -p p03-amd --gres=gpu:4 --time=08:00:00 salloc -N 1 -c 64 -A PROJECT-ID -p p03-amd --gres=gpu:4 --time=08:00:00
``` ```
where:
- -N 1 means allocating one server,
- -c 64 means allocation 64 cores,
- -A is your project,
- -p p03-amd is AMD partition,
- --gres=gpu:4 means allcating all 4 GPUs of the node,
- --time=08:00:00 means allocation for 8 hours.
You have also an option to allocate subset of the resources only, by reducing the -c and --gres=gpu to smaller values. where:
- `-N 1` means allocating one server,
- `-c 64` means allocating 64 cores,
- `-A` is your project,
- `-p p03-amd` is AMD partition,
- `--gres=gpu:4` means allocating all 4 GPUs of the node,
- `--time=08:00:00` means allocation for 8 hours.
You have also an option to allocate a subset of the resources only,
by reducing the `-c` and `--gres=gpu` to smaller values.
``` ```
salloc -N 1 -c 48 -A PROJECT-ID -p p03-amd --gres=gpu:3 --time=08:00:00 salloc -N 1 -c 48 -A PROJECT-ID -p p03-amd --gres=gpu:3 --time=08:00:00
@@ -22,22 +25,26 @@ salloc -N 1 -c 32 -A PROJECT-ID -p p03-amd --gres=gpu:2 --time=08:00:00
salloc -N 1 -c 16 -A PROJECT-ID -p p03-amd --gres=gpu:1 --time=08:00:00 salloc -N 1 -c 16 -A PROJECT-ID -p p03-amd --gres=gpu:1 --time=08:00:00
``` ```
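The allocation examples above keep a fixed ratio of 16 cores per GPU (64/4, 48/3, 32/2, 16/1). A minimal shell sketch, assuming that ratio is the intended rule, prints the matching `salloc` command for a given GPU count. The function name `salloc_cmd` is hypothetical, and `PROJECT-ID` remains a placeholder for your project:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) the salloc command for 1-4 GPUs,
# assuming 16 cores per GPU as in the examples above.
salloc_cmd() {
    local gpus="$1"
    case "$gpus" in
        1|2|3|4) ;;
        *) echo "usage: salloc_cmd <1-4>" >&2; return 1 ;;
    esac
    local cores=$(( gpus * 16 ))
    echo "salloc -N 1 -c ${cores} -A PROJECT-ID -p p03-amd --gres=gpu:${gpus} --time=08:00:00"
}

salloc_cmd 2
```

The helper only echoes the command, so it can be inspected before it is pasted into a terminal.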
### Note: !!! Note
p03-amd01 server has hyperthreading enabled therefore htop shows 128 cores.
p03-amd02 server has hyperthreading dissabled therefore htop shows 64 cores. p03-amd01 server has hyperthreading **enabled** therefore htop shows 128 cores.
p03-amd02 server has hyperthreading **disabled** therefore htop shows 64 cores.
## Using AMD MI100 GPUs ## Using AMD MI100 GPUs
The AMD GPUs can be programmed using the ROCm open-source platform (see: https://docs.amd.com/ for more information.) The AMD GPUs can be programmed using the ROCm open-source platform
(for more information, see [https://docs.amd.com/][1].)
ROCm and related libraries are installed directly in the system.
You can find it here:
ROCm and related libraries are installed directly in the system. You can find it here:
``` ```
/opt/rocm/ /opt/rocm/
``` ```
The actual version can be found here:
The actual version can be found here:
``` ```
[user@p03-amd02.cs]$ cat /opt/rocm/.info/version [user@p03-amd02.cs]$ cat /opt/rocm/.info/version
@@ -46,9 +53,11 @@ The actual version can be found here:
## Basic HIP code ## Basic HIP code
The first way how to program AMD GPUs is to use HIP. The first way how to program AMD GPUs is to use HIP.
The basic vector addition code in HIP looks like this. This a full code and you can copy and paste it into a file. For this example we use `vector_add.hip.cpp` . The basic vector addition code in HIP looks like this.
This is a full code and you can copy and paste it into a file.
For this example, we use `vector_add.hip.cpp`.
``` ```
#include <cstdio> #include <cstdio>
@@ -123,44 +132,46 @@ int main()
} }
``` ```
To compile the code we use `hipcc` compiler. The compiler information can be found like this: To compile the code, we use `hipcc` compiler.
The compiler information can be found like this:
```` ```
[user@p03-amd02.cs ~]$ hipcc --version [user@p03-amd02.cs ~]$ hipcc --version
HIP version: 5.5.30202-eaf00c0b HIP version: 5.5.30202-eaf00c0b
AMD clang version 16.0.0 (https://github.com/RadeonOpenCompute/llvm-project roc-5.5.1 23194 69ef12a7c3cc5b0ccf820bc007bd87e8b3ac3037) AMD clang version 16.0.0 (https://github.com/RadeonOpenCompute/llvm-project roc-5.5.1 23194 69ef12a7c3cc5b0ccf820bc007bd87e8b3ac3037)
Target: x86_64-unknown-linux-gnu Target: x86_64-unknown-linux-gnu
Thread model: posix Thread model: posix
InstalledDir: /opt/rocm-5.5.1/llvm/bin InstalledDir: /opt/rocm-5.5.1/llvm/bin
```` ```
The code is compiled a follows: The code is compiled as follows:
``` ```
hipcc vector_add.hip.cpp -o vector_add.x hipcc vector_add.hip.cpp -o vector_add.x
``` ```
The correct output of the code is: The correct output of the code is:
``` ```
[user@p03-amd02.cs ~]$ ./vector_add.x [user@p03-amd02.cs ~]$ ./vector_add.x
X: 0.00 1.00 2.00 3.00 4.00 5.00 6.00 7.00 8.00 9.00 X: 0.00 1.00 2.00 3.00 4.00 5.00 6.00 7.00 8.00 9.00
Y: 0.00 10.00 20.00 30.00 40.00 50.00 60.00 70.00 80.00 90.00 Y: 0.00 10.00 20.00 30.00 40.00 50.00 60.00 70.00 80.00 90.00
Y: 0.00 110.00 220.00 330.00 440.00 550.00 660.00 770.00 880.00 990.00 Y: 0.00 110.00 220.00 330.00 440.00 550.00 660.00 770.00 880.00 990.00
``` ```
## HIP and ROCm libraries ## HIP and ROCm Libraries
The list of official AMD libraries can be found here: https://docs.amd.com/category/libraries.
The list of official AMD libraries can be found here: [https://docs.amd.com/category/libraries][2].
The libraries are installed in the same directory as ROCm
The libraries are installed in the same directory is ROCm
``` ```
/opt/rocm/ /opt/rocm/
``` ```
Following libraries are installed: Following libraries are installed:
``` ```
drwxr-xr-x 4 root root 44 Jun 7 14:09 hipblas drwxr-xr-x 4 root root 44 Jun 7 14:09 hipblas
drwxr-xr-x 3 root root 17 Jun 7 14:09 hipblas-clients drwxr-xr-x 3 root root 17 Jun 7 14:09 hipblas-clients
@@ -172,7 +183,7 @@ drwxr-xr-x 4 root root 44 Jun 7 14:09 hipsolver
drwxr-xr-x 4 root root 44 Jun 7 14:09 hipsparse drwxr-xr-x 4 root root 44 Jun 7 14:09 hipsparse
``` ```
and and
``` ```
drwxr-xr-x 4 root root 32 Jun 7 14:09 rocalution drwxr-xr-x 4 root root 32 Jun 7 14:09 rocalution
@@ -185,11 +196,11 @@ drwxr-xr-x 4 root root 44 Jun 7 14:09 rocsparse
drwxr-xr-x 3 root root 29 Jun 7 14:09 rocthrust drwxr-xr-x 3 root root 29 Jun 7 14:09 rocthrust
``` ```
### Using hipBlas Library
The basic code in HIP that uses hipBlas looks like this.
### Using hipBlas library This is a full code and you can copy and paste it into a file.
For this example we use `hipblas.hip.cpp`.
The basic code in HIP that uses hipBlas looks like this. This a full code and you can copy and paste it into a file. For this example we use `hipblas.hip.cpp` .
``` ```
#include <cstdio> #include <cstdio>
@@ -304,14 +315,17 @@ int main()
} }
``` ```
The code compilation can be done as follows: The code compilation can be done as follows:
``` ```
hipcc hipblas.hip.cpp -o hipblas.x -lhipblas hipcc hipblas.hip.cpp -o hipblas.x -lhipblas
``` ```
### Using hipSolver library ### Using hipSolver Library
The basic code in HIP that uses hipSolver looks like this. This a full code and you can copy and paste it into a file. For this example we use `hipsolver.hip.cpp` . The basic code in HIP that uses hipSolver looks like this.
This is a full code and you can copy and paste it into a file.
For this example we use `hipsolver.hip.cpp`.
``` ```
#include <cstdio> #include <cstdio>
@@ -439,18 +453,18 @@ int main()
} }
``` ```
The code compilation can be done as follows: The code compilation can be done as follows:
``` ```
hipcc hipsolver.hip.cpp -o hipsolver.x -lhipblas -lhipsolver hipcc hipsolver.hip.cpp -o hipsolver.x -lhipblas -lhipsolver
``` ```
### Other AMD libraries and frameworks ### Other AMD Libraries and Frameworks
Please see [gcc options](https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html) for more advanced compilation settings. Please see [gcc options](https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html) for more advanced compilation settings.
No complications are expected as long as the application does not use any intrinsic for `x64` architecture. No complications are expected as long as the application does not use any intrinsic for `x64` architecture.
If you want to use intrinsic, If you want to use intrinsic,
[SVE](https://developer.arm.com/documentation/102699/0100/Optimizing-with-intrinsics) instruction set is available. [SVE](https://developer.arm.com/documentation/102699/0100/Optimizing-with-intrinsics) instruction set is available.
\ No newline at end of file
[1]: https://docs.amd.com/
[2]: https://docs.amd.com/category/libraries