From 591495cb97d6e704d74ef92ad1637668f51cdd7e Mon Sep 17 00:00:00 2001
From: Jan Siwiec <jan.siwiec@vsb.cz>
Date: Thu, 15 Feb 2024 08:59:58 +0100
Subject: [PATCH] Capitalize "For" in Clang (For Grace) Toolchain headings in grace.md

---
 docs.it4i/cs/guides/grace.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs.it4i/cs/guides/grace.md b/docs.it4i/cs/guides/grace.md
index d51729b31..a2fba21d7 100644
--- a/docs.it4i/cs/guides/grace.md
+++ b/docs.it4i/cs/guides/grace.md
@@ -40,7 +40,7 @@ for(int i = 0; i < 1000000; ++i) {
 ```
 may emit scalar code for the inner loop leading to no vectorization being used at all.
 
-### Clang (for Grace) Toolchain
+### Clang (For Grace) Toolchain
 
 The Clang/LLVM tends to behave similarly, but can be guided to properly vectorize the inner loop with either flags `-O3 -ffast-math -march=native -fno-unroll-loops -mllvm -force-vector-width=8` or pragmas such as `#pragma clang loop vectorize_width(8)` and `#pragma clang loop unroll(disable)`.
 
@@ -257,7 +257,7 @@ OMP_NUM_THREADS=144 OMP_PROC_BIND=spread ./main
 !!! note
     It may be advantageous to use NVPL libraries instead NVHPC ones. For example DGEMM BLAS 3 routine from NVPL is almost 30% faster than NVHPC one.
 
-### Using Clang (for Grace) Toolchain
+### Using Clang (For Grace) Toolchain
 
 Similarly Clang for Grace toolchain with NVPL BLAS can be used to compile C++ version of the example.
 
-- 
GitLab