Octave
======
Introduction
------------
GNU Octave is a high-level interpreted language, primarily intended for
numerical computations. It provides capabilities for the numerical
solution of linear and nonlinear problems, and for performing other
numerical experiments. It also provides extensive graphics capabilities
for data visualization and manipulation. Octave is normally used through
its interactive command line interface, but it can also be used to write
non-interactive programs. The Octave language is quite similar to Matlab,
so most programs are easily portable. Read more at
<http://www.gnu.org/software/octave/>.
The following versions of Octave are available on Anselm via modules:

| Version | Module |
| --- | --- |
| Octave 3.8.2, compiled with GCC and multithreaded MKL | Octave/3.8.2-gimkl-2.11.5 |
| Octave 4.0.1, compiled with GCC and multithreaded MKL | Octave/4.0.1-gimkl-2.11.5 |
| Octave 4.0.0, compiled with GCC and OpenBLAS | Octave/4.0.0-foss-2015g |

Modules and execution
----------------------
$ module load Octave
Octave on Anselm is linked to the highly optimized MKL mathematical
library. This provides threaded parallelization for many Octave kernels,
notably the linear algebra subroutines, so Octave runs these heavy
calculation kernels without any penalty. By default, Octave
parallelizes to 16 threads. You may control the number of threads by
setting the OMP_NUM_THREADS environment variable.
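As a minimal sketch of the thread control described above, the following restricts the threaded kernels to 4 threads for the current shell session. The value 4 is an arbitrary illustration, and the final check only runs if an `octave` executable is found on the path.

```shell
#!/bin/bash
# Limit the threaded MKL kernels to 4 threads (4 is an arbitrary example value).
export OMP_NUM_THREADS=4

# Confirm the setting that child processes such as octave will inherit.
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS}"

# Optional check: print the value as seen from inside Octave, if it is available.
if command -v octave >/dev/null 2>&1; then
    octave -q --eval 'disp(getenv("OMP_NUM_THREADS"))'
fi
```

Because the variable is exported, it applies to every Octave process started from that shell until it is changed or unset.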
To run Octave interactively, log in with the ssh -X option for X11
forwarding, then run octave:
$ octave
To run octave in batch mode, write an octave script, then write a bash
jobscript and execute via the qsub command. By default, octave will use
16 threads when running MKL kernels.
#!/bin/bash
# change to local scratch directory
cd "/lscratch/$PBS_JOBID" || exit
# copy input file to scratch
cp "$PBS_O_WORKDIR/octcode.m" .
# load Octave module
module load Octave
# execute the calculation
octave -q --eval octcode > output.out
# copy output file to home
cp output.out "$PBS_O_WORKDIR/."
# exit
exit
This script may be submitted directly to the PBS workload manager via
the qsub command. The inputs are in the octcode.m file and the outputs
are written to the output.out file. See the single node jobscript
example in the [Job execution
section](http://support.it4i.cz/docs/anselm-cluster-documentation/resource-allocation-and-job-execution).
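The jobscript above expects an octcode.m file in the submission directory. As a hedged illustration, the sketch below writes a small script of that name. The matrix size and the computation are assumptions chosen for the example, and the jobscript file name `octave_job.sh` in the closing comment is hypothetical.

```shell
#!/bin/bash
# Write a small example Octave script named octcode.m (contents are illustrative).
cat > octcode.m <<'EOF'
% octcode.m: solve a random linear system and report the residual norm
n = 1000;
A = rand(n);
b = rand(n, 1);
x = A \ b;
printf("residual norm: %g\n", norm(A*x - b));
EOF

# Show what was written.
cat octcode.m

# On the cluster, submit the jobscript (hypothetical file name) with:
#   qsub ./octave_job.sh
```

The linear solve is one of the operations that runs in the threaded linear algebra kernels, so it is a reasonable first test of a batch run.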
The Octave C compiler mkoctfile calls GNU gcc 4.8.1 to compile
native C code. This is very useful for running native C subroutines in
the Octave environment.
$ mkoctfile -v
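One way to call a native C subroutine from Octave is through the MEX interface, which mkoctfile supports via its `--mex` option. The sketch below is an illustration under that assumption. The file name `hello_mex.c` and its contents are hypothetical, and the compile step runs only if the Octave tools are found on the path.

```shell
#!/bin/bash
# Write a minimal C MEX source file (the file name hello_mex.c is hypothetical).
cat > hello_mex.c <<'EOF'
#include "mex.h"

/* Entry point called by Octave when hello_mex is invoked. */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mexPrintf("Hello from native C code, called with %d argument(s)\n", nrhs);
}
EOF

# Compile and call it only if the Octave tools are available.
if command -v mkoctfile >/dev/null 2>&1 && command -v octave >/dev/null 2>&1; then
    mkoctfile --mex hello_mex.c
    octave -q --eval 'hello_mex(1, 2)'
fi
```

The compiled file is placed in the current directory, so Octave started from the same directory can call it by name.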
Octave may use MPI for interprocess communication. This functionality
is currently not supported on the Anselm cluster. If you require the
Octave interface to MPI, please contact [Anselm
support](https://support.it4i.cz/rt/).
Xeon Phi Support
----------------
Octave may take advantage of the Xeon Phi accelerators. This will only
work on the [Intel Xeon Phi](../intel-xeon-phi.html)
[accelerated nodes](../../compute-nodes.html).
### Automatic offload support
Octave can accelerate BLAS-type operations (in particular
matrix-matrix multiplications) on the Xeon Phi accelerator via [Automatic
Offload using the MKL
library](../intel-xeon-phi.html#section-3).
Example
$ export OFFLOAD_REPORT=2
$ export MKL_MIC_ENABLE=1
$ module load Octave
$ octave -q
octave:1> A=rand(10000); B=rand(10000);
octave:2> tic; C=A*B; toc
[MKL] [MIC --] [AO Function] DGEMM
[MKL] [MIC --] [AO DGEMM Workdivision] 0.32 0.68
[MKL] [MIC 00] [AO DGEMM CPU Time] 2.896003 seconds
[MKL] [MIC 00] [AO DGEMM MIC Time] 1.967384 seconds
[MKL] [MIC 00] [AO DGEMM CPU->MIC Data] 1347200000 bytes
[MKL] [MIC 00] [AO DGEMM MIC->CPU Data] 2188800000 bytes
Elapsed time is 2.93701 seconds.
In this example, the calculation was automatically divided among the CPU
cores and the Xeon Phi MIC accelerator, reducing the total runtime from
6.3 seconds to 2.9 seconds.
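To reproduce such a comparison non-interactively, a small helper can run the same matrix product with automatic offload switched on or off. This is a sketch only: the function name `run_dgemm` and its arguments are hypothetical, and it assumes that setting MKL_MIC_ENABLE to 0 leaves offload disabled.

```shell
#!/bin/bash
# Time one N x N matrix product in Octave with MKL automatic offload
# switched on (1) or off (0). Function and argument names are hypothetical.
run_dgemm() {
    local mic_enable="$1"    # value for MKL_MIC_ENABLE: 1 = offload, 0 = CPU only
    local n="${2:-10000}"    # matrix dimension, default as in the session above
    MKL_MIC_ENABLE="$mic_enable" OFFLOAD_REPORT=2 \
        octave -q --eval "A=rand(${n}); B=rand(${n}); tic; C=A*B; toc"
}
```

On an accelerated node with the Octave module loaded, calling `run_dgemm 0` and then `run_dgemm 1` prints the two elapsed times to compare.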
### Native support
A version of [native](../intel-xeon-phi.html#section-4)
Octave is compiled for Xeon Phi accelerators. Some limitations apply for
this version:
- Only the command line is supported. The GUI, graph plotting, etc. are
not supported.
- Command history in interactive mode is not supported.
Octave is linked with parallel Intel MKL, so it is best suited for batch
processing of tasks that utilize BLAS, LAPACK and FFT operations. By
default, the number of threads is set to 120. You can control this
with the OMP_NUM_THREADS environment
variable.
Calculations that do not employ parallelism (whether through parallel
MKL, e.g. via matrix operations, the fork()
function, the [parallel
package](http://octave.sourceforge.net/parallel/) or
another mechanism) will actually run slower than on the host CPU.
To use Octave on a node with Xeon Phi:
$ ssh mic0 # login to the MIC card
$ source /apps/tools/octave/3.8.2-mic/bin/octave-env.sh # set up environment variables
$ octave -q /apps/tools/octave/3.8.2-mic/example/test0.m # run an example