Commit e4dc6645 authored by David Hrbáč

Please-notes go to /dev/null

parent 201844d6
Showing 86 additions and 69 deletions
@@ -216,17 +216,18 @@ $ qsub -N JOBNAME jobscript

In this example, we submit a job of 101 tasks. 16 input files will be processed in parallel. The 101 tasks on 16 cores are assumed to complete in less than 2 hours.
!!! Hint
    Use the #PBS directives at the beginning of the jobscript file; don't forget to set your valid PROJECT_ID and the desired queue.
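A minimal sketch of such a jobscript header, assuming a 2-hour walltime on one 16-core node; the project ID and queue name below are placeholders, not values taken from this page:

```bash
#PBS -A PROJECT_ID                            # placeholder: your accounting project
#PBS -q qprod                                 # placeholder: the queue you intend to use
#PBS -l select=1:ncpus=16,walltime=02:00:00   # one full 16-core node, 2 hours
```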
## Job Arrays and GNU Parallel

!!! Note
    Combine the job arrays and GNU parallel for the best throughput of single-core jobs.

While job arrays are able to utilize all available computational nodes, GNU parallel can be used to efficiently run multiple single-core jobs on a single node. The two approaches may be combined to utilize all available (current and future) resources to execute single-core jobs.

!!! Note
    Every subjob in an array runs GNU parallel to utilize all cores on the node.

### GNU Parallel, Shared jobscript
@@ -281,7 +282,7 @@ cp output $PBS_O_WORKDIR/$TASK.out

In this example, the jobscript executes in multiple instances in parallel, on all cores of a computing node. The variable $TASK expands to one of the input filenames from tasklist. We copy the input file to local scratch, execute myprog.x, and copy the output file back to the submit directory, under the $TASK.out name. The numtasks file controls how many tasks will be run per subjob. Once a task is finished, a new task starts, until the number of tasks in the numtasks file is reached.
!!! Note
    Select the subjob walltime and the number of tasks per subjob carefully.

When deciding these values, think about the following guiding rules:
@@ -300,7 +301,8 @@ $ qsub -N JOBNAME -J 1-992:32 jobscript

In this example, we submit a job array of 31 subjobs. Note the -J 1-992:**32**; this must be the same as the number sent to the numtasks file. Each subjob will run on a full node and process 16 input files in parallel, 32 in total per subjob. Every subjob is assumed to complete in less than 2 hours.
!!! Hint
    Use the #PBS directives at the beginning of the jobscript file; don't forget to set your valid PROJECT_ID and the desired queue.
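A minimal sketch of keeping the numtasks value and the array step in sync, using the file and option names from the example above:

```bash
echo 32 > numtasks                      # tasks consumed by each subjob
qsub -N JOBNAME -J 1-992:32 jobscript   # array step must equal the numtasks value
```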
## Examples
......
@@ -233,9 +233,12 @@ The resources that are currently subject to accounting are the core hours. The c

PRACE users should check their project accounting using the [PRACE Accounting Tool (DART)](http://www.prace-ri.eu/accounting-report-tool/).

Users who have undergone the full local registration procedure (including signing the IT4Innovations Acceptable Use Policy) and who have received a local password may check at any time how many core-hours have been consumed by themselves and their projects using the "it4ifree" command.
!!! Note
    You need to know your user password to use the command. The displayed core hours are "system core hours", which differ from PRACE "standardized core hours".

!!! Hint
    The **it4ifree** command is a part of the it4i.portal.clients package, located here: <https://pypi.python.org/pypi/it4i.portal.clients>

```bash
......
@@ -192,7 +192,7 @@ $ module load virtualgl/2.4

$ vglrun glxgears
```

If you want to run an OpenGL application which is available through modules, you need to first load the respective module. E.g. to run the **Mentat** OpenGL application from the **MARC** software package, use:
```bash
$ module load marc/2013.1
......
@@ -102,7 +102,10 @@ To use the Berkley UPC compiler and runtime environment to run the binaries use

The "smp" UPC network is used by default. This is a very quick and easy way for testing/debugging, but it is limited to one node only.

For production runs, it is recommended to use the native InfiniBand implementation of the UPC network, "ibv". For testing/debugging using multiple nodes, the "mpi" UPC network is recommended.
!!! Warning
    Selection of the network is done at compile time, not at runtime (as expected)!
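A minimal compile-time sketch with the Berkeley UPC driver, assuming `upcc` accepts the usual `-network` option; source and binary names are illustrative:

```bash
upcc -network=smp -o hello.smp hello.upc   # single-node default, quick testing
upcc -network=mpi -o hello.mpi hello.upc   # multi-node testing/debugging
upcc -network=ibv -o hello.ibv hello.upc   # native InfiniBand, production runs
```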
Example UPC code:
......
@@ -91,8 +91,8 @@ To debug a serial code use:

To debug a parallel code compiled with **OpenMPI**, you need to set up your TotalView environment:

!!! Hint
    To be able to run the parallel debugging procedure from the command line without stopping the debugger in the mpiexec source code, you have to add the following function to your `~/.tvdrc` file:
```bash
proc mpi_auto_run_starter {loaded_id} {
......
@@ -103,7 +103,10 @@ For debugging purposes it is also recommended to set environment variable "OFFLO

export OFFLOAD_REPORT=3
```

A very basic example of code that employs the offload programming technique is shown in the next listing.
!!! Note
    This code is sequential and utilizes only a single core of the accelerator.
```bash
$ vim source-offload.cpp
@@ -327,7 +330,7 @@ Following example show how to automatically offload an SGEMM (single precision -

```

!!! Note
    This example is a simplified version of an example from MKL. The expanded version can be found here: `$MKL_EXAMPLES/mic_ao/blasc/source/sgemm.c`.
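As a hedged aside, MKL automatic offload can also be steered at runtime through standard MKL environment variables; a minimal sketch (the binary name and work-division value are illustrative only):

```bash
export MKL_MIC_ENABLE=1            # enable MKL automatic offload to the coprocessor
export OFFLOAD_REPORT=2            # report what gets offloaded
export MKL_HOST_WORKDIVISION=0.2   # illustrative: keep roughly 20% of the work on the host
./sgemm-ao                         # illustrative binary name
```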
To compile the code using the Intel compiler, use:
@@ -370,7 +373,7 @@ To compile a code user has to be connected to a compute with MIC and load Intel

```

!!! Note
    A particular version of the Intel module is specified. This information is used later to specify the correct library paths.
To produce a binary compatible with the Intel Xeon Phi architecture, the user has to specify the "-mmic" compiler flag. Two compilation examples are shown below. The first example shows how to compile the OpenMP parallel code "vect-add.c" for the host only:
@@ -413,7 +416,7 @@ If the code is parallelized using OpenMP a set of additional libraries is requir

```

!!! Note
    The path exported in the previous example contains a path to a specific compiler (here the version is 5.192). This version number has to match the version number of the Intel compiler module that was used to compile the code on the host computer.

For your information, the list of libraries and their locations required for execution of an OpenMP parallel code on Intel Xeon Phi is:
@@ -538,8 +541,8 @@ To see the performance of Intel Xeon Phi performing the DGEMM run the example as

...
```

!!! Warning
    The GNU compiler is used to compile the OpenCL codes for Intel MIC. You do not need to load the Intel compiler module.
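A minimal compilation sketch, assuming the OpenCL headers and the libOpenCL loader from the installed OpenCL runtime are visible to the compiler; the source name and install paths are placeholders:

```bash
g++ example.cpp -o example \
    -I/path/to/opencl/include \
    -L/path/to/opencl/lib64 -lOpenCL   # placeholder paths to the OpenCL runtime
```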
## MPI
@@ -648,9 +651,8 @@ Similarly to execution of OpenMP programs in native mode, since the environmenta

```

!!! Note
    - This file sets up the environmental variables for both the MPI and OpenMP libraries.
    - This file sets up the paths to a particular version of the Intel MPI library and a particular version of the Intel compiler. These versions have to match the loaded modules.
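A sketch of what such an environment file might contain; the install paths below are placeholders and must match the modules you actually loaded:

```bash
# hypothetical ~/.profile sourced on the MIC card
# OpenMP runtime of the Intel compiler used on the host (placeholder path)
export LD_LIBRARY_PATH=/path/to/intel/composer_xe/compiler/lib/mic:$LD_LIBRARY_PATH
# Intel MPI runtime and tools built for MIC (placeholder path)
export LD_LIBRARY_PATH=/path/to/intel/impi/mic/lib:$LD_LIBRARY_PATH
export PATH=/path/to/intel/impi/mic/bin:$PATH
```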
To access a MIC accelerator located on the node that the user is currently connected to, use:
@@ -702,9 +704,8 @@ or using mpirun

```

!!! Note
    - The full path to the binary has to be specified (here: `~/mpi-test-mic`).
    - The `LD_LIBRARY_PATH` has to match the Intel MPI module used to compile the MPI code.
The output should be again similar to:

@@ -716,7 +717,9 @@ The output should be again similar to:

```

!!! Note
    `mpiexec.hydra` requires a file on the MIC filesystem. If the file is missing, please contact the system administrators.

A simple test to see if the file is present is to execute:
```bash
$ ssh mic0 ls /bin/pmi_proxy
@@ -749,11 +752,10 @@ For example:

This output means that PBS allocated nodes cn204 and cn205, which means that the user has direct access to the "**cn204-mic0**" and "**cn205-mic0**" accelerators.
!!! Note
    At this point, the user can connect to any of the allocated nodes or any of the allocated MIC accelerators using ssh:

    - to connect to the second node: `$ ssh cn205`
    - to connect to the accelerator on the first node from the first node: `$ ssh cn204-mic0` or `$ ssh mic0`
    - to connect to the accelerator on the second node from the first node: `$ ssh cn205-mic0`
At this point we expect that the correct modules are loaded and the binary is compiled. For parallel execution, mpiexec.hydra is used. Again, the first step is to tell mpiexec that MPI can be executed on the MIC accelerators by setting up the environmental variable "I_MPI_MIC".
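A minimal sketch; `I_MPI_MIC` is a standard Intel MPI variable, the rest is only an illustration of the following step:

```bash
export I_MPI_MIC=1                              # let mpiexec.hydra start ranks on the MIC
# illustrative launch, reusing the binary name from above:
# mpiexec.hydra -host mic0 -n 4 ~/mpi-test-mic
```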
@@ -882,7 +884,7 @@ A possible output of the MPI "hello-world" example executed on two hosts and two

```

!!! Note
    At this point, the MPI communication between MIC accelerators on different nodes uses 1 Gb Ethernet only.
**Using the PBS automatically generated node-files**

@@ -895,7 +897,7 @@ PBS also generates a set of node-files that can be used instead of manually crea

- /lscratch/${PBS_JOBID}/nodefile-mic

Host and MIC node-file:

- /lscratch/${PBS_JOBID}/nodefile-mix
Each host or accelerator is listed only once per file. The user has to specify how many jobs should be executed per node using the `-n` parameter of the mpirun command.
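A hedged sketch of launching against one of the generated node-files; the process count and the binary name are illustrative only:

```bash
mpirun -n 4 -machinefile /lscratch/${PBS_JOBID}/nodefile-mix ~/mpi-test-mic   # illustrative count and binary
```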
## Optimization
......
@@ -134,7 +134,7 @@ The last part of the configuration is done directly in the user Matlab script be

This script creates the scheduler object "cluster" of type "local" that starts workers locally.

!!! Note
    Every Matlab script that needs to initialize/use matlabpool has to contain these three lines prior to calling the parpool(sched, ...) function.

The last step is to start matlabpool with the "cluster" object and the correct number of workers. We have 24 cores per node, so we start 24 workers.
@@ -217,7 +217,8 @@ You can start this script using batch mode the same way as in Local mode example

This method is a "hack" invented by us to emulate the mpiexec functionality found in previous MATLAB versions. We leverage the MATLAB Generic Scheduler interface, but instead of submitting the workers to PBS, we launch the workers directly within the running job, thus avoiding the issues with the master script and workers running in separate jobs (issues with licenses not being available, waiting for the worker's job to spawn, etc.).

!!! Warning
    This method is experimental.

For this method, you need to use the SalomonDirect profile; import it [the same way as SalomonPBSPro](matlab/#running-parallel-matlab-using-distributed-computing-toolbox---engine).
......
@@ -66,13 +66,11 @@ To test if the MAGMA server runs properly we can run one of examples that are pa

10304 10304 --- ( --- ) 500.70 ( 1.46) ---
```

!!! Hint
    MAGMA contains several benchmarks and examples in `$MAGMAROOT/testing/`.
!!! Note
    MAGMA relies on the performance of all CPU cores as well as on the performance of the accelerator. Therefore, on Anselm, the number of CPU OpenMP threads has to be set to 16 with `export OMP_NUM_THREADS=16`.
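A minimal sketch of re-running one of the bundled tests with the thread count set accordingly; the test binary name is hypothetical, list the directory for the tests actually installed:

```bash
export OMP_NUM_THREADS=16        # all 16 Anselm CPU cores feed MAGMA
cd $MAGMAROOT/testing
./testing_dgetrf                 # hypothetical test name; pick any test from this directory
```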
See more details at the [MAGMA home page](http://icl.cs.utk.edu/magma/).
......
@@ -281,9 +281,8 @@ SAXPY function multiplies the vector x by the scalar alpha and adds it to the ve

```

!!! Note
    cuBLAS has its own functions for data transfers between CPU and GPU memory:

    - [cublasSetVector](http://docs.nvidia.com/cuda/cublas/index.html#cublassetvector) - transfers data from CPU to GPU memory
    - [cublasGetVector](http://docs.nvidia.com/cuda/cublas/index.html#cublasgetvector) - transfers data from GPU to CPU memory
To compile the code using the NVCC compiler, a "-lcublas" compiler flag has to be specified:
......
@@ -218,7 +218,8 @@ $ qsub -N JOBNAME jobscript

In this example, we submit a job of 101 tasks. 24 input files will be processed in parallel. The 101 tasks on 24 cores are assumed to complete in less than 2 hours.

!!! Note
    Use the #PBS directives at the beginning of the jobscript file; don't forget to set your valid PROJECT_ID and the desired queue.
## Job Arrays and GNU Parallel
@@ -302,7 +303,8 @@ $ qsub -N JOBNAME -J 1-992:32 jobscript

In this example, we submit a job array of 31 subjobs. Note the -J 1-992:**48**; this must be the same as the number sent to the numtasks file. Each subjob will run on a full node and process 24 input files in parallel, 48 in total per subjob. Every subjob is assumed to complete in less than 2 hours.

!!! Note
    Use the #PBS directives at the beginning of the jobscript file; don't forget to set your valid PROJECT_ID and the desired queue.
## Examples
......
@@ -202,7 +202,8 @@ Generally both shared file systems are available through GridFTP:

More information about the shared file systems is available [here](storage/).

!!! Hint
    A `prace` directory is also used for PRACE users on the SCRATCH file system.
| Data type | Default path |
| ---------------------------- | ------------------------------- |
@@ -245,7 +246,7 @@ The resources that are currently subject to accounting are the core hours. The c

PRACE users should check their project accounting using the [PRACE Accounting Tool (DART)](http://www.prace-ri.eu/accounting-report-tool/).

Users who have undergone the full local registration procedure (including signing the IT4Innovations Acceptable Use Policy) and who have received a local password may check at any time how many core-hours have been consumed by themselves and their projects using the "it4ifree" command. You need to know your user password to use the command; the displayed core hours are "system core hours", which differ from PRACE "standardized core hours".
!!! Note
    The **it4ifree** command is a part of the it4i.portal.clients package, located here: <https://pypi.python.org/pypi/it4i.portal.clients>
......
@@ -138,7 +138,10 @@ To use the Berkley UPC compiler and runtime environment to run the binaries use

The "smp" UPC network is used by default. This is a very quick and easy way for testing/debugging, but it is limited to one node only.

For production runs, it is recommended to use the native InfiniBand implementation of the UPC network, "ibv". For testing/debugging using multiple nodes, the "mpi" UPC network is recommended.

!!! Warning
    Selection of the network is done at compile time, not at runtime (as expected)!
Example UPC code:
......
@@ -80,8 +80,8 @@ To debug a serial code use:

To debug a parallel code compiled with **OpenMPI**, you need to set up your TotalView environment:

!!! Hint
    To be able to run the parallel debugging procedure from the command line without stopping the debugger in the mpiexec source code, you have to add the following function to your **~/.tvdrc** file:
```bash
proc mpi_auto_run_starter {loaded_id} {
......
@@ -103,7 +103,10 @@ For debugging purposes it is also recommended to set environment variable "OFFLO

export OFFLOAD_REPORT=3
```

A very basic example of code that employs the offload programming technique is shown in the next listing.

!!! Note
    This code is sequential and utilizes only a single core of the accelerator.
```bash
$ vim source-offload.cpp
@@ -326,7 +329,7 @@ Following example show how to automatically offload an SGEMM (single precision -

```

!!! Note
    This example is a simplified version of an example from MKL. The expanded version can be found here: **$MKL_EXAMPLES/mic_ao/blasc/source/sgemm.c**

To compile the code using the Intel compiler, use:
@@ -369,7 +372,7 @@ To compile a code user has to be connected to a compute with MIC and load Intel

```

!!! Note
    A particular version of the Intel module is specified. This information is used later to specify the correct library paths.

To produce a binary compatible with the Intel Xeon Phi architecture, the user has to specify the "-mmic" compiler flag. Two compilation examples are shown below. The first example shows how to compile the OpenMP parallel code "vect-add.c" for the host only:
@@ -412,7 +415,7 @@ If the code is parallelized using OpenMP a set of additional libraries is requir

```

!!! Note
    The path exported contains a path to a specific compiler (here the version is 5.192). This version number has to match the version number of the Intel compiler module that was used to compile the code on the host computer.

For your information, the list of libraries and their locations required for execution of an OpenMP parallel code on Intel Xeon Phi is:
@@ -537,8 +540,8 @@ To see the performance of Intel Xeon Phi performing the DGEMM run the example as

...
```

!!! Hint
    The GNU compiler is used to compile the OpenCL codes for Intel MIC. You do not need to load the Intel compiler module.
## MPI
@@ -647,8 +650,6 @@ Similarly to execution of OpenMP programs in native mode, since the environmenta

```

!!! Note
    - This file sets up the environmental variables for both the MPI and OpenMP libraries.
    - This file sets up the paths to a particular version of the Intel MPI library and a particular version of the Intel compiler. These versions have to match the loaded modules.
@@ -702,9 +703,8 @@ or using mpirun

```

!!! Note
    - The full path to the binary has to be specified (here: **~/mpi-test-mic**).
    - The LD_LIBRARY_PATH has to match the Intel MPI module used to compile the MPI code.

The output should be again similar to:
@@ -715,8 +715,10 @@ The output should be again similar to:

Hello world from process 0 of 4 on host cn207-mic0
```

!!! Hint
    **mpiexec.hydra** requires a file on the MIC filesystem. If the file is missing, please contact the system administrators.

A simple test to see if the file is present is to execute:
```bash
$ ssh mic0 ls /bin/pmi_proxy
@@ -749,11 +751,10 @@ For example:

This output means that PBS allocated nodes cn204 and cn205, which means that the user has direct access to the "**cn204-mic0**" and "**cn205-mic0**" accelerators.
!!! Note
    At this point, the user can connect to any of the allocated nodes or any of the allocated MIC accelerators using ssh:

    - to connect to the second node: `$ ssh cn205`
    - to connect to the accelerator on the first node from the first node: `$ ssh cn204-mic0` or `$ ssh mic0`
    - to connect to the accelerator on the second node from the first node: `$ ssh cn205-mic0`
At this point we expect that the correct modules are loaded and the binary is compiled. For parallel execution, mpiexec.hydra is used. Again, the first step is to tell mpiexec that MPI can be executed on the MIC accelerators by setting up the environmental variable "I_MPI_MIC".
@@ -882,7 +883,7 @@ A possible output of the MPI "hello-world" example executed on two hosts and two

```

!!! Note
    At this point, the MPI communication between MIC accelerators on different nodes uses 1 Gb Ethernet only.

**Using the PBS automatically generated node-files**
@@ -895,7 +896,7 @@ PBS also generates a set of node-files that can be used instead of manually crea

- /lscratch/${PBS_JOBID}/nodefile-mic

Host and MIC node-file:

- /lscratch/${PBS_JOBID}/nodefile-mix
Each host or accelerator is listed only once per file. The user has to specify how many jobs should be executed per node using the "-n" parameter of the mpirun command.
## Optimization
......
@@ -129,7 +129,8 @@ The last part of the configuration is done directly in the user Matlab script be

This script creates the scheduler object "cluster" of type "local" that starts workers locally.

!!! Hint
    Every Matlab script that needs to initialize/use matlabpool has to contain these three lines prior to calling the parpool(sched, ...) function.

The last step is to start matlabpool with the "cluster" object and the correct number of workers. We have 24 cores per node, so we start 24 workers.
@@ -212,7 +213,8 @@ You can start this script using batch mode the same way as in Local mode example

This method is a "hack" invented by us to emulate the mpiexec functionality found in previous MATLAB versions. We leverage the MATLAB Generic Scheduler interface, but instead of submitting the workers to PBS, we launch the workers directly within the running job, thus avoiding the issues with the master script and workers running in separate jobs (issues with licenses not being available, waiting for the worker's job to spawn, etc.).

!!! Warning
    This method is experimental.

For this method, you need to use the SalomonDirect profile; import it [the same way as SalomonPBSPro](matlab.md#running-parallel-matlab-using-distributed-computing-toolbox---engine).
......