Hardware Overview 
=================

  

The Anselm cluster consists of 209 computational nodes named cn[1-209],
of which 180 are regular compute nodes, 23 are GPU-accelerated nodes
with NVIDIA Kepler K20, 4 are MIC-accelerated nodes with Intel Xeon Phi
5110P, and 2 are fat nodes. Each node is a powerful x86-64 computer,
equipped with 16 cores (two eight-core Intel Sandy Bridge processors),
at least 64 GB of RAM, and a local hard drive. User access to the Anselm
cluster is provided by two login nodes, login[1,2]. The nodes are
interlinked by high-speed InfiniBand and Ethernet networks. All nodes
share 320 TB of /home disk storage for user files. The 146 TB shared
/scratch storage is available for scratch data.

The fat nodes are equipped with a large amount (512 GB) of memory. The
virtualization infrastructure provides resources to run long-term
servers and services in virtual mode. Fat nodes and virtual servers may
access 45 TB of dedicated block storage. Accelerated nodes, fat nodes,
and the virtualization infrastructure are available [upon
request](https://support.it4i.cz/rt) made by a PI.

The layout of the Anselm cluster is summarized below: the user-oriented
infrastructure, storage, and management infrastructure, followed by the
assignment of compute nodes to racks and InfiniBand switches:

**User-oriented infrastructure:** login nodes login1 and login2, data mover node dm1

**Storage:** Lustre /home (320 TB), Lustre /scratch (146 TB)

**Management infrastructure:** management nodes, block storage (45 TB), virtualization infrastructure servers

|Rack|Switch|Nodes|
|---|---|---|
|Rack 01|isw0|cn[1-18]|
|Rack 01|isw4|cn[19-36]|
|Rack 01|isw5|cn[181-189]|
|Rack 02|isw6|cn[37-54]|
|Rack 02|isw9|cn[55-72]|
|Rack 02|isw10|cn[73-80], cn[190-192], cn[205-206]|
|Rack 03|isw11|cn[81-98]|
|Rack 03|isw14|cn[99-116]|
|Rack 03|isw15|cn[117-126], cn[193-195], cn207|
|Rack 04|isw16|cn[127-144]|
|Rack 04|isw19|cn[145-162]|
|Rack 04|isw20|cn[163-180]|
|Rack 05|isw21|cn[196-204]|

**Fat nodes:** cn208, cn209
The cluster compute nodes cn[1-207] are organized within 13 chassis. 

There are four types of compute nodes:

-   180 compute nodes without an accelerator
-   23 compute nodes with a GPU accelerator - equipped with NVIDIA Tesla
    Kepler K20
-   4 compute nodes with a MIC accelerator - equipped with Intel Xeon Phi
    5110P
-   2 fat nodes - equipped with 512 GB RAM and two 100 GB SSD drives

[More about Compute nodes](compute-nodes.html).
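
As a quick cross-check of the figures above, the four node types sum to the
209-node total and, with 16 cores per node, give the overall core count. The
following is a minimal Python sketch; the counts simply restate this page and
are not queried from the cluster:

```python
# Node counts per type, as listed on this page.
node_counts = {
    "regular": 180,  # compute nodes without an accelerator
    "gpu": 23,       # NVIDIA Tesla Kepler K20
    "mic": 4,        # Intel Xeon Phi 5110P
    "fat": 2,        # 512 GB RAM, two 100 GB SSDs
}

CORES_PER_NODE = 16  # two eight-core Intel Sandy Bridge processors

total_nodes = sum(node_counts.values())
total_cores = total_nodes * CORES_PER_NODE

print(total_nodes)  # 209
print(total_cores)  # 3344
```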

The GPU- and MIC-accelerated nodes are available upon request; see the
[Resources Allocation
Policy](resource-allocation-and-job-execution/resources-allocation-policy.html).

All these nodes are interconnected by a fast InfiniBand QDR network and an
Ethernet network. [More about the Network](network.html).
Every chassis provides an InfiniBand switch, marked **isw**, connecting all
nodes in the chassis, as well as connecting the chassis to the upper-level
switches.

All nodes share 320 TB of /home disk storage for user files. The 146 TB
shared /scratch storage is available for scratch data. These file
systems are provided by the Lustre parallel file system. Local disk
storage /lscratch is also available on all compute nodes. [More about
Storage](storage.html).

User access to the Anselm cluster is provided by the two login nodes,
login1 and login2, and by the data mover node dm1. [More about accessing
the cluster](accessing-the-cluster.html).

The parameters are summarized in the following tables:

**In general**

|Parameter|Value|
|---|---|
|Primary purpose|High Performance Computing|
|Architecture of compute nodes|x86-64|
|Operating system|Linux|

**[Compute nodes](compute-nodes.html)**

|Parameter|Value|
|---|---|
|Total|209|
|Processor cores|16 (2x8 cores)|
|RAM|min. 64 GB, min. 4 GB per core|
|Local disk drive|yes - usually 500 GB|
|Compute network|InfiniBand QDR, fully non-blocking, fat-tree|
|w/o accelerator|180, cn[1-180]|
|GPU accelerated|23, cn[181-203]|
|MIC accelerated|4, cn[204-207]|
|Fat compute nodes|2, cn[208-209]|

**In total**

|Parameter|Value|
|---|---|
|Total theoretical peak performance (Rpeak)|94 Tflop/s|
|Total max. LINPACK performance (Rmax)|73 Tflop/s|
|Total amount of RAM|15.136 TB|
|Node|Processor|Memory|Accelerator|
|---|---|---|---|
|w/o accelerator|2x Intel Sandy Bridge E5-2665, 2.4 GHz|64 GB|-|
|GPU accelerated|2x Intel Sandy Bridge E5-2470, 2.3 GHz|96 GB|NVIDIA Kepler K20|
|MIC accelerated|2x Intel Sandy Bridge E5-2470, 2.3 GHz|96 GB|Intel Xeon Phi 5110P|
|Fat compute node|2x Intel Sandy Bridge E5-2665, 2.4 GHz|512 GB|-|
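
The aggregate values in the **In total** table can be reproduced from the
per-node figures above. The Python sketch below is only a sanity check; the
per-core FLOP rate and the accelerator peak values (roughly 1.17 Tflop/s for a
Tesla K20 and 1.01 Tflop/s for a Xeon Phi 5110P in double precision) are
assumptions not stated on this page:

```python
# Per-node-type parameters from the tables above:
# (node count, cores per node, CPU clock in GHz, RAM in GB, accelerator peak in Gflop/s)
FLOP_PER_CYCLE = 8      # assumed: Sandy Bridge AVX, double precision, per core
K20_PEAK_GF = 1170      # assumed NVIDIA Tesla K20 peak, double precision
PHI_PEAK_GF = 1011      # assumed Intel Xeon Phi 5110P peak, double precision

node_types = {
    "w/o accelerator": (180, 16, 2.4, 64, 0),
    "GPU accelerated": (23, 16, 2.3, 96, K20_PEAK_GF),
    "MIC accelerated": (4, 16, 2.3, 96, PHI_PEAK_GF),
    "fat": (2, 16, 2.4, 512, 0),
}

total_ram_gb = sum(count * ram for count, _, _, ram, _ in node_types.values())
rpeak_gflops = sum(
    count * (cores * ghz * FLOP_PER_CYCLE + accel)
    for count, cores, ghz, _, accel in node_types.values()
)

print(total_ram_gb / 1000)   # 15.136 TB, matching the table
print(rpeak_gflops / 1000)   # ~94.8 Tflop/s, close to the quoted 94 Tflop/s
```

The small difference from the published 94 Tflop/s figure comes from rounding
in the assumed accelerator peak values.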

For more details please refer to [Compute nodes](compute-nodes.html),
[Storage](storage.html), and [Network](network.html).