# Storage

There are two main shared file systems on Anselm cluster, the [HOME](#home) and [SCRATCH](#scratch). All login and compute nodes may access same data on shared file systems. Compute nodes are also equipped with local (non-shared) scratch, ramdisk and tmp file systems.

## Archiving

Please do not use the shared filesystems as a backup for large amounts of data or as a long-term archiving solution. The academic staff and students of research institutions in the Czech Republic can use the [CESNET storage service](#cesnet-data-storage), which is available via SSHFS.

## Shared Filesystems

The Anselm cluster provides two main shared filesystems, the [HOME filesystem](#home) and the [SCRATCH filesystem](#scratch). Both are realized as parallel Lustre filesystems and are accessible via the InfiniBand network. Extended ACLs are provided on both Lustre filesystems for the purpose of sharing data with other users using fine-grained control.

### Understanding the Lustre Filesystems

(source <http://www.nas.nasa.gov>)

A user file on the Lustre filesystem can be divided into multiple chunks (stripes) and stored across a subset of the object storage targets (OSTs) (disks). The stripes are distributed among the OSTs in a round-robin fashion to ensure load balancing.

When a client (a compute node from your job) needs to create or access a file, the client queries the metadata server (MDS) and the metadata target (MDT) for the layout and location of the [file's stripes](http://www.nas.nasa.gov/hecc/support/kb/Lustre_Basics_224.html#striping). Once the file is opened and the client obtains the striping information, the MDS is no longer involved in the file I/O process. The client interacts directly with the object storage servers (OSSes) and OSTs to perform I/O operations such as locking, disk allocation, storage, and retrieval.

If multiple clients try to read and write the same part of a file at the same time, the Lustre distributed lock manager enforces coherency so that all clients see consistent results.

There is a default stripe configuration for Anselm Lustre filesystems. However, users can set the following stripe parameters for their own directories or files to get optimum I/O performance:

1. stripe_size: the size of the chunk in bytes; specify with k, m, or g to use units of KB, MB, or GB, respectively; the size must be an even multiple of 65,536 bytes; default is 1MB for all Anselm Lustre filesystems
1. stripe_count: the number of OSTs to stripe across; default is 1 for Anselm Lustre filesystems; one can specify -1 to use all OSTs in the filesystem.
1. stripe_offset: the index of the OST where the first stripe is to be placed; default is -1, which results in random selection; using a non-default value is NOT recommended.

!!! note
    Setting stripe size and stripe count correctly for your needs may significantly impact the I/O performance you experience.

Use the lfs getstripe command to inspect the current stripe parameters. Use the lfs setstripe command to set the stripe parameters for optimal I/O performance. The correct stripe setting depends on your needs and file access patterns.

```console
$ lfs getstripe dir|filename
$ lfs setstripe -s stripe_size -c stripe_count -o stripe_offset dir|filename
```

Example:

```console
$ lfs getstripe /scratch/username/
/scratch/username/
stripe_count:   1 stripe_size:    1048576 stripe_offset:  -1

$ lfs setstripe -c -1 /scratch/username/
$ lfs getstripe /scratch/username/
/scratch/username/
stripe_count:  10 stripe_size:    1048576 stripe_offset:  -1
```

In this example, we view the current stripe setting of the /scratch/username/ directory. The stripe count is then changed to use all OSTs, and verified. All files written to this directory will be striped over 10 OSTs.

Use the lfs check osts command to see the number and status of active OSTs for each filesystem on Anselm. Learn more by reading the man page:

```console
$ lfs check osts
$ man lfs
```

### Hints on Lustre Striping

!!! note
    Increase the stripe_count for parallel I/O to the same file.

When multiple processes are writing blocks of data to the same file in parallel, the I/O performance for large files will improve when the stripe_count is set to a larger value. The stripe count sets the number of OSTs the file will be written to. By default, the stripe count is set to 1. While this default setting provides for efficient access of metadata (for example to support the ls -l command), large files should use stripe counts of greater than 1. This will increase the aggregate I/O bandwidth by using multiple OSTs in parallel instead of just one. A rule of thumb is to use a stripe count approximately equal to the number of gigabytes in the file.

Another good practice is to make the stripe count be an integral factor of the number of processes performing the write in parallel, so that you achieve load balance among the OSTs. For example, set the stripe count to 16 instead of 15 when you have 64 processes performing the writes.
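As an illustrative sketch (the directory name and stripe count are hypothetical), a directory prepared for a 64-process parallel write could be striped over 16 OSTs, so that each OST receives writes from exactly 4 processes:

```console
$ lfs setstripe -c 16 /scratch/username/parallel_out/
$ lfs getstripe /scratch/username/parallel_out/
```

All files subsequently created in that directory inherit the stripe count of 16.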

!!! note
    Using a large stripe size can improve performance when accessing very large files.

Large stripe size allows each client to have exclusive access to its own part of a file. However, it can be counterproductive in some cases if it does not match your I/O pattern. The choice of stripe size has no effect on a single-stripe file.
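For example (path and values are illustrative only), a directory holding very large files written in big sequential chunks might combine a larger stripe size with multiple OSTs:

```console
$ lfs setstripe -s 4m -c 8 /scratch/username/large_files/
```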

Read more on <http://doc.lustre.org/lustre_manual.xhtml#managingstripingfreespace>

### Lustre on Anselm

The architecture of Lustre on Anselm is composed of two metadata servers (MDS) and four data/object storage servers (OSS). Two object storage servers are used for file system HOME and another two object storage servers are used for file system SCRATCH.

Configuration of the storages:

* HOME Lustre object storage
  * One disk array NetApp E5400
  * 22 OSTs
  * 227 2TB NL-SAS 7.2krpm disks
  * 22 groups of 10 disks in RAID6 (8+2)
  * 7 hot-spare disks
* SCRATCH Lustre object storage
  * Two disk arrays NetApp E5400
  * 10 OSTs
  * 106 2TB NL-SAS 7.2krpm disks
  * 10 groups of 10 disks in RAID6 (8+2)
  * 6 hot-spare disks
* Lustre metadata storage
  * One disk array NetApp E2600
  * 12 300GB SAS 15krpm disks
  * 2 groups of 5 disks in RAID5
  * 2 hot-spare disks

### HOME

The HOME filesystem is mounted in the /home directory. Users' home directories /home/username reside on this filesystem. Accessible capacity is 320 TB, shared among all users. Individual users are restricted by a filesystem usage quota, set to 250 GB per user. If 250 GB proves insufficient for a particular user, contact [support](https://support.it4i.cz/rt); the quota may be lifted upon request.

!!! note
    The HOME filesystem is intended for preparation, evaluation, processing and storage of data generated by active Projects.

The HOME filesystem should not be used to archive data of past Projects or other unrelated data.

The files on the HOME filesystem will not be deleted until the end of the [user's lifecycle](../general/obtaining-login-credentials/obtaining-login-credentials/).

The filesystem is backed up, so that it can be restored in case of a catastrophic failure resulting in significant data loss. This backup, however, is not intended to restore old versions of user data or to restore (accidentally) deleted files.

The HOME filesystem is realized as a Lustre parallel filesystem and is available on all login and computational nodes.
Default stripe size is 1 MB, stripe count is 1. There are 22 OSTs dedicated to the HOME filesystem.

!!! note
    Setting stripe size and stripe count correctly for your needs may significantly impact the I/O performance you experience.

| HOME filesystem      |        |
| -------------------- | ------ |
| Mountpoint           | /home  |
| Capacity             | 320 TB |
| Throughput           | 2 GB/s |
| User quota           | 250 GB |
| Default stripe size  | 1 MB   |
| Default stripe count | 1      |
| Number of OSTs       | 22     |

### SCRATCH

The SCRATCH filesystem is mounted in the /scratch directory. Users may freely create subdirectories and files on the filesystem. Accessible capacity is 146 TB, shared among all users. Individual users are restricted by a filesystem usage quota, set to 100 TB per user. The purpose of this quota is to prevent runaway programs from filling the entire filesystem and denying service to other users. If 100 TB proves insufficient for a particular user, contact [support](https://support.it4i.cz/rt); the quota may be lifted upon request.

!!! note
    The Scratch filesystem is intended for temporary scratch data generated during the calculation as well as for high performance access to input and output files. All I/O intensive jobs must use the SCRATCH filesystem as their working directory.

    Users are advised to save the necessary data from the SCRATCH filesystem to HOME filesystem after the calculations and clean up the scratch files.

!!! warning
    Files on the SCRATCH filesystem that are **not accessed for more than 90 days** will be automatically **deleted**.
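You can list files at risk of removal with standard tools; for example (the path is illustrative, use your own scratch directory), this finds files not accessed for more than 80 days:

```console
$ find /scratch/username/ -type f -atime +80
```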

The SCRATCH filesystem is realized as a Lustre parallel filesystem and is available from all login and computational nodes. Default stripe size is 1 MB, stripe count is 1. There are 10 OSTs dedicated to the SCRATCH filesystem.

!!! note
    Setting stripe size and stripe count correctly for your needs may significantly impact the I/O performance you experience.

| SCRATCH filesystem   |          |
| -------------------- | -------- |
| Mountpoint           | /scratch |
| Capacity             | 146 TB   |
| Throughput           | 6 GB/s   |
| User quota           | 100 TB   |
| Default stripe size  | 1 MB     |
| Default stripe count | 1        |
| Number of OSTs       | 10       |

### Disk Usage and Quota Commands

User quotas on the file systems can be checked and reviewed using the following command:

```console
$ lfs quota dir
```

Example for Lustre HOME directory:

```console
$ lfs quota /home
Disk quotas for user user001 (uid 1234):
    Filesystem kbytes   quota   limit   grace   files   quota   limit   grace
         /home 300096       0 250000000       -    2102       0 500000    -
Disk quotas for group user001 (gid 1234):
    Filesystem kbytes   quota   limit   grace   files   quota   limit   grace
        /home 300096       0       0       -    2102       0       0       -
```

In this example, we see the current quota limit of 250 GB, with 300 MB currently used by user001.

Example for Lustre SCRATCH directory:

```console
$ lfs quota /scratch
Disk quotas for user user001 (uid 1234):
     Filesystem kbytes   quota   limit   grace   files   quota   limit   grace
      /scratch       8       0 100000000000       -       3       0       0       -
Disk quotas for group user001 (gid 1234):
     Filesystem kbytes   quota   limit   grace   files   quota   limit   grace
      /scratch       8       0       0       -       3       0       0       -
```

In this example, we see the current quota limit of 100 TB, with 8 KB currently used by user001.

To better understand where the space is used, you can use the following command:

```console
$ du -hs dir
```

Example for your HOME directory:

```console
$ cd /home
$ du -hs * .[a-zA-Z0-9]* | grep -E "[0-9]*G|[0-9]*M" | sort -hr
258M     cuda-samples
15M      .cache
13M      .mozilla
5,5M     .eclipse
2,7M     .idb_13.0_linux_intel64_app
```

This lists all files and directories consuming megabytes or gigabytes of space in your current (in this example HOME) directory. The list is sorted in descending order, from largest to smallest.

To better understand the previous commands, you can read the man pages:

```console
$ man lfs
```

```console
$ man du
```

### Extended ACLs

Extended ACLs provide another security mechanism besides the standard POSIX ACLs, which are defined by three entries (for owner/group/others). Extended ACLs have more than the three basic entries. In addition, they also contain a mask entry and may contain any number of named user and named group entries.

ACLs on a Lustre file system work exactly like ACLs on any Linux file system. They are manipulated with the standard tools in the standard manner. Below, we create a directory and allow a specific user access.

```console
[vop999@login1.anselm ~]$ umask 027
[vop999@login1.anselm ~]$ mkdir test
[vop999@login1.anselm ~]$ ls -ld test
drwxr-x--- 2 vop999 vop999 4096 Nov 5 14:17 test

[vop999@login1.anselm ~]$ getfacl test
# file: test
# owner: vop999
# group: vop999
user::rwx
group::r-x
other::---

[vop999@login1.anselm ~]$ setfacl -m user:johnsm:rwx test
[vop999@login1.anselm ~]$ ls -ld test
drwxrwx---+ 2 vop999 vop999 4096 Nov 5 14:17 test

[vop999@login1.anselm ~]$ getfacl test
# file: test
# owner: vop999
# group: vop999
user::rwx
user:johnsm:rwx
group::r-x
mask::rwx
other::---
```

The default ACL mechanism can be used to replace setuid/setgid permissions on directories. Setting a default ACL on a directory (the -d flag to setfacl) causes the ACL permissions to be inherited by any newly created file or subdirectory within the directory. Refer to this page for more information on Linux ACLs:

[http://www.vanemery.com/Linux/ACL/POSIX_ACL_on_Linux.html](http://www.vanemery.com/Linux/ACL/POSIX_ACL_on_Linux.html)
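For illustration (reusing the directory and user from the session above), a default ACL that newly created files inside test will inherit could be set as follows:

```console
$ setfacl -d -m user:johnsm:rwx test
$ getfacl test
```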

## Local Filesystems

### Local Scratch

!!! note
    Every computational node is equipped with a 330 GB local scratch disk.

Use the local scratch in case you need to access a large number of small files during your calculation.

The local scratch disk is mounted as /lscratch and is accessible to the user at the /lscratch/$PBS_JOBID directory.

The local scratch filesystem is intended for temporary scratch data generated during the calculation as well as for high performance access to input and output files. All I/O intensive jobs that access a large number of small files within the calculation must use the local scratch filesystem as their working directory. This is required for performance reasons, as frequent access to a large number of small files may overload the metadata servers (MDS) of the Lustre filesystem.

!!! note
    The local scratch directory /lscratch/$PBS_JOBID will be deleted immediately after the calculation ends. Users should take care to save the output data from within the jobscript.
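A minimal jobscript sketch (the application name and file names are hypothetical) that stages data through the local scratch and saves results before the job ends:

```bash
#!/bin/bash
SCRDIR=/lscratch/$PBS_JOBID
cp $PBS_O_WORKDIR/input.dat $SCRDIR/   # stage input into local scratch
cd $SCRDIR
./mysolver input.dat > output.dat      # hypothetical I/O intensive application
cp output.dat $PBS_O_WORKDIR/          # save results before automatic cleanup
```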

| local SCRATCH filesystem |                      |
| ------------------------ | -------------------- |
| Mountpoint               | /lscratch            |
| Accesspoint              | /lscratch/$PBS_JOBID |
| Capacity                 | 330 GB               |
| Throughput               | 100 MB/s             |
| User quota               | none                 |

### RAM Disk

Every computational node is equipped with a filesystem realized in memory, the so-called RAM disk.

!!! note
    Use the RAM disk in case you need really fast access to your data of limited size during your calculation. Be very careful: the use of the RAM disk filesystem is at the expense of operational memory.

The local RAM disk is mounted as /ramdisk and is accessible to the user at the /ramdisk/$PBS_JOBID directory.

The local RAM disk filesystem is intended for temporary scratch data generated during the calculation as well as for high performance access to input and output files. The size of the RAM disk filesystem is limited. Be very careful: the use of the RAM disk filesystem is at the expense of operational memory. It is not recommended to allocate a large amount of memory and use a large amount of data in the RAM disk filesystem at the same time.

!!! note
    The local RAM disk directory /ramdisk/$PBS_JOBID will be deleted immediately after the calculation ends. Users should take care to save the output data from within the jobscript.

| RAM disk    |                                                                                                          |
| ----------- | -------------------------------------------------------------------------------------------------------- |
| Mountpoint  | /ramdisk                                                                                                 |
| Accesspoint | /ramdisk/$PBS_JOBID                                                                                      |
| Capacity    | 60 GB at compute nodes without accelerator, 90 GB at compute nodes with accelerator, 500 GB at fat nodes |
| Throughput  | over 1.5 GB/s write, over 5 GB/s read, single thread; over 10 GB/s write, over 50 GB/s read, 16 threads  |
| User quota  | none                                                                                                     |

### Tmp

Each node is equipped with a local /tmp directory of a few GB capacity. The /tmp directory should be used to work with small temporary files. Old files in the /tmp directory are automatically purged.
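As a small illustrative sketch (the names are arbitrary), create a private subdirectory for temporary files and remove it when done:

```bash
#!/bin/bash
# create a private temporary directory under /tmp
MYTMP=$(mktemp -d /tmp/myjob.XXXXXX)
echo "intermediate result" > "$MYTMP/part1"
# ... work with small temporary files ...
rm -rf "$MYTMP"   # clean up when finished
```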

## Summary

| Mountpoint | Usage                     | Protocol | Net Capacity   | Throughput | Limitations  | Access                  | Services                         |
| ---------- | ------------------------- | -------- | -------------- | ---------- | ------------ | ----------------------- | -------------------------------- |
| /home      | home directory            | Lustre   | 320 TiB        | 2 GB/s     | Quota 250 GB | Compute and login nodes | backed up                        |
| /scratch   | cluster shared jobs' data | Lustre   | 146 TiB        | 6 GB/s     | Quota 100 TB | Compute and login nodes | files older than 90 days removed |
| /lscratch  | node local jobs' data     | local    | 330 GB         | 100 MB/s   | none         | Compute nodes           | purged after job ends            |
| /ramdisk   | node local jobs' data     | local    | 60, 90, 500 GB | 5-50 GB/s  | none         | Compute nodes           | purged after job ends            |
| /tmp       | local temporary files     | local    | 9.5 GB         | 100 MB/s   | none         | Compute and login nodes | auto purged                      |

## CESNET Data Storage

Do not use the shared filesystems at IT4Innovations as a backup for large amounts of data or for long-term archiving purposes.

!!! note
    IT4Innovations does not provide storage capacity for data archiving. Academic staff and students of research institutions in the Czech Republic can use the [CESNET Storage service](https://du.cesnet.cz/).

The CESNET Storage service can be used for research purposes, mainly by academic staff and students of research institutions in the Czech Republic.

Users of the CESNET data storage (DU) can be organizations or individuals who are in a current employment relationship (employees) or a current study relationship (students) with a legal entity (organization) that meets the "Principles for access to CESNET Large infrastructure (Access Policy)".

Users may only use the CESNET data storage for data transfer and storage associated with activities in science, research, development, the spread of education, culture and prosperity. For details, see the "Acceptable Use Policy CESNET Large Infrastructure (Acceptable Use Policy, AUP)".

The service is documented [here](https://du.cesnet.cz/en/start). For special requirements, please contact the CESNET Storage Department directly via e-mail at [du-support(at)cesnet.cz](mailto:du-support@cesnet.cz).

The procedure to obtain the CESNET access is quick and trouble-free.

(source [https://du.cesnet.cz/](https://du.cesnet.cz/wiki/doku.php/en/start "CESNET Data Storage"))

## CESNET Storage Access

### Understanding CESNET Storage

!!! note
    It is very important to understand the CESNET storage before uploading data. [Please read](https://du.cesnet.cz/en/navody/home-migrace-plzen/start) first.

Once registered for the CESNET Storage, you may [access the storage](https://du.cesnet.cz/en/navody/faq/start) in a number of ways. We recommend the SSHFS and Rsync methods.

### SSHFS Access

!!! note
    SSHFS: The storage will be mounted like a local hard drive.

SSHFS provides a very convenient way to access the CESNET Storage. The storage will be mounted onto a local directory, exposing the vast CESNET Storage as if it were a local removable hard drive. Files can then be copied in and out in the usual fashion.

First, create the mount point:

```console
$ mkdir cesnet
```

Mount the storage. Note that you can choose among ssh.du1.cesnet.cz (Plzen), ssh.du2.cesnet.cz (Jihlava), and ssh.du3.cesnet.cz (Brno). Mount tier1_home **(only 5120 MB!)**:

```console
$ sshfs username@ssh.du1.cesnet.cz:. cesnet/
```

For easy future access from Anselm, install your public key:

```console
$ cp .ssh/id_rsa.pub cesnet/.ssh/authorized_keys
```

Mount tier1_cache_tape for the Storage VO:

```console
$ sshfs username@ssh.du1.cesnet.cz:/cache_tape/VO_storage/home/username cesnet/
```

View the archive, and copy files and directories in and out:

```console
$ ls cesnet/
$ cp -a mydir cesnet/.
$ cp cesnet/myfile .
```

Once done, please remember to unmount the storage:

```console
$ fusermount -u cesnet
```

### Rsync Access

!!! note
    Rsync provides delta transfer for best performance and can resume interrupted transfers.

Rsync is a fast and extraordinarily versatile file copying tool. It is famous for its delta-transfer algorithm, which reduces the amount of data sent over the network by sending only the differences between the source files and the existing files in the destination. Rsync is widely used for backups and mirroring and as an improved copy command for everyday use.

Rsync finds files that need to be transferred using a "quick check" algorithm (by default) that looks for files that have changed in size or in last-modified time. Any changes in the other preserved attributes (as requested by options) are made on the destination file directly when the quick check indicates that the file's data does not need to be updated.

[More about Rsync](https://du.cesnet.cz/en/navody/rsync/start#pro_bezne_uzivatele)

Transfer large files to/from the CESNET storage, assuming membership in the Storage VO:

```console
$ rsync --progress datafile username@ssh.du1.cesnet.cz:VO_storage-cache_tape/.
$ rsync --progress username@ssh.du1.cesnet.cz:VO_storage-cache_tape/datafile .
```

Transfer large directories to/from the CESNET storage, assuming membership in the Storage VO:

```console
$ rsync --progress -av datafolder username@ssh.du1.cesnet.cz:VO_storage-cache_tape/.
$ rsync --progress -av username@ssh.du1.cesnet.cz:VO_storage-cache_tape/datafolder .
```

Transfer rates of about 28 MB/s can be expected.