diff --git a/docs.it4i/anselm/software/nvidia-cuda.md b/docs.it4i/anselm/software/nvidia-cuda.md
index 406c5e4b6d3faddf2dd142a9815a644d9aedd1ed..b87e4cc626a0721e87c12a86af6fe223236c6679 100644
--- a/docs.it4i/anselm/software/nvidia-cuda.md
+++ b/docs.it4i/anselm/software/nvidia-cuda.md
@@ -197,11 +197,11 @@ $ ./test.cuda
 
 ### cuBLAS
 
-The NVIDIA CUDA Basic Linear Algebra Subroutines (cuBLAS) library is a GPU-accelerated version of the complete standard BLAS library with 152 standard BLAS routines. A basic description of the library together with basic performance comparisons with MKL can be found [here](https://developer.nvidia.com/cublas "Nvidia cuBLAS").
+The NVIDIA CUDA Basic Linear Algebra Subroutines (cuBLAS) library is a GPU-accelerated version of the complete standard BLAS library with 152 standard BLAS routines. A basic description of the library together with basic performance comparisons with MKL can be found [here][a].
 
 #### cuBLAS Example: SAXPY
 
-The SAXPY function multiplies the vector x by the scalar alpha, and adds it to the vector y, overwriting the latest vector with the result. A description of the cuBLAS function can be found in [NVIDIA CUDA documentation](http://docs.nvidia.com/cuda/cublas/index.html#cublas-lt-t-gt-axpy "Nvidia CUDA documentation "). Code can be pasted in the file and compiled without any modification.
+The SAXPY function multiplies the vector x by the scalar alpha, and adds it to the vector y, overwriting the latest vector with the result. A description of the cuBLAS function can be found in [NVIDIA CUDA documentation][b]. Code can be pasted in the file and compiled without any modification.
 
 ```cpp
 /* Includes, system */
@@ -283,8 +283,8 @@ int main(int argc, char **argv)
 
 !!! note
     cuBLAS has its own function for data transfers between CPU and GPU memory:
-    - [cublasSetVector](http://docs.nvidia.com/cuda/cublas/index.html#cublassetvector) - transfers data from CPU to GPU memory
-    - [cublasGetVector](http://docs.nvidia.com/cuda/cublas/index.html#cublasgetvector) - transfers data from GPU to CPU memory
+    - [cublasSetVector][c] - transfers data from CPU to GPU memory
+    - [cublasGetVector][d] - transfers data from GPU to CPU memory
 
 To compile the code using the NVCC compiler a "-lcublas" compiler flag has to be specified:
 
@@ -307,3 +307,8 @@ $ ml cuda
 $ ml intel
 $ icc -std=c99 test_cublas.c -o test_cublas_icc -lcublas -lcudart
 ```
+
+[a]: https://developer.nvidia.com/cublas
+[b]: http://docs.nvidia.com/cuda/cublas/index.html#cublas-lt-t-gt-axpy
+[c]: http://docs.nvidia.com/cuda/cublas/index.html#cublassetvector
+[d]: http://docs.nvidia.com/cuda/cublas/index.html#cublasgetvector