Cluster integration is in progress. The resulting settings may vary. The documentation will be updated.
There are three main shared file systems on the Barbora cluster: [HOME][1], [SCRATCH][2], and [PROJECT][5]. All login and compute nodes may access the same data on the shared file systems. Compute nodes are also equipped with local (non-shared) scratch, RAM disk, and tmp file systems.
## Archiving
Do not use shared filesystems as a backup for large amounts of data or as a means of long-term archiving. The academic staff and students of research institutions in the Czech Republic can use the [CESNET storage service][3], which is available via SSHFS.
## Shared Filesystems
The Barbora cluster provides three main shared filesystems: the [HOME filesystem][1], the [SCRATCH filesystem][2], and the [PROJECT filesystems][5].
*Both the HOME and SCRATCH filesystems are realized as parallel Lustre filesystems. Both shared filesystems are accessible via the InfiniBand network. Extended ACLs are provided on both Lustre filesystems for sharing data with other users using fine-grained control.*
### Understanding the Lustre Filesystems
...
If multiple clients try to read and write the same part of a file at the same time, the Lustre distributed lock manager enforces coherency so that all clients see consistent results.
There is a default stripe configuration for Barbora Lustre filesystems. However, users can set the following stripe parameters for their own directories or files to get optimum I/O performance:

1. stripe_size: the size of the chunk in bytes; specify with k, m, or g to use units of KB, MB, or GB, respectively; the size must be an even multiple of 65,536 bytes; default is 1MB for all Barbora Lustre filesystems.
1. stripe_count: the number of OSTs to stripe across; default is 1 for Barbora Lustre filesystems; one can specify -1 to use all OSTs in the filesystem.
1. stripe_offset: the index of the OST where the first stripe is to be placed; default is -1, which results in random selection; using a non-default value is NOT recommended.
!!! note
    Setting stripe size and stripe count correctly for your needs may significantly affect the I/O performance.
Use the lfs getstripe command to get the stripe parameters. Use the lfs setstripe command to set the stripe parameters for optimal I/O performance. The correct stripe setting depends on your needs and file access patterns.
In this example, we view the current stripe setting of the /scratch/username/ directory. The stripe count is changed to all OSTs and verified. All files written to this directory will be striped over 10 OSTs.
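For instance, assuming the illustrative /scratch/username/ directory above, the described steps map onto the standard lfs commands roughly as follows; -c -1 requests striping over all available OSTs:

```console
$ lfs getstripe /scratch/username/          # view the current stripe setting
$ lfs setstripe -c -1 /scratch/username/    # stripe new files over all OSTs
$ lfs getstripe /scratch/username/          # verify the new setting
```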
Use the lfs check osts command to see the number and status of active OSTs for each filesystem on Barbora. Learn more by reading the man page:
```console
$ lfs check osts
$ man lfs
```
!!! note
    Increase the stripe_count for parallel I/O to the same file.
When multiple processes are writing blocks of data to the same file in parallel, the I/O performance for large files will improve when the stripe_count is set to a larger value. The stripe count sets the number of OSTs to which the file will be written. By default, the stripe count is set to 1. While this default setting provides for efficient access of metadata (for example to support the ls -l command), large files should use stripe counts greater than 1. This will increase the aggregate I/O bandwidth by using multiple OSTs in parallel instead of just one. A rule of thumb is to use a stripe count approximately equal to the number of gigabytes in the file.
Another good practice is to make the stripe count an integral factor of the number of processes performing the write in parallel, so that you achieve load balance among the OSTs. For example, set the stripe count to 16 instead of 15 when you have 64 processes performing the writes.
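As a sketch of these two rules combined: with 64 writer processes on Barbora's SCRATCH (10 OSTs), a stripe count of 8 is an integral factor of 64 that also fits within the available OSTs. The hypothetical output file is pre-created with that layout before the writers open it:

```console
$ lfs setstripe -c 8 /scratch/username/shared_output.dat   # create the file striped over 8 OSTs
```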
...
### Lustre on Barbora
The architecture of Lustre on Barbora is composed of two metadata servers (MDS) and four data/object storage servers (OSS). Two object storage servers are used for the HOME filesystem and another two object storage servers are used for the SCRATCH filesystem.
Configuration of the storages
...
The HOME filesystem should not be used to archive data of past Projects or other unrelated data.
The files on the HOME filesystem will not be deleted until the end of the [user's lifecycle][4].
The filesystem is backed up, so that it can be restored in case of a catastrophic failure resulting in significant data loss. However, this backup is not intended to restore old versions of user data or to restore (accidentally) deleted files.
The HOME filesystem is realized as a Lustre parallel filesystem and is available on all login and computational nodes.
Default stripe size is 1MB, stripe count is 1. There are 22 OSTs dedicated to the HOME filesystem.
!!! note
    Setting stripe size and stripe count correctly for your needs may significantly affect the I/O performance.
| HOME filesystem | |
| -------------------- | ------ |
...
The SCRATCH filesystem is realized as a Lustre parallel filesystem and is available from all login and computational nodes. Default stripe size is 1MB, stripe count is 1. There are 10 OSTs dedicated to the SCRATCH filesystem.
!!! note
    Setting stripe size and stripe count correctly for your needs may significantly affect the I/O performance.
| SCRATCH filesystem | |
| -------------------- | -------- |
...
### Disk Usage and Quota Commands
Disk usage and user quotas can be checked and reviewed using the following commands:
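For instance, a minimal sketch using standard Lustre and GNU tools (the exact invocation and paths are illustrative); lfs quota reports current usage and limits, and the du pipeline produces the per-directory listing described below:

```console
$ lfs quota -u $USER /home                  # usage and limits for your user on /home
$ cd /home/$USER
$ du -hs * .[a-zA-Z0-9]* 2>/dev/null | grep -E "[0-9]*G|[0-9]*M" | sort -hr
```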
This will list all directories that consume MegaBytes or GigaBytes of space in your current directory (in this example, HOME). The list is sorted in descending order from largest to smallest files/directories.
To have a better understanding of the previous commands, you can read the man pages:
```console
$ man lfs
$ man du
```
### Extended ACLs
Extended ACLs provide another security mechanism besides the standard POSIX ACLs, which are defined by three entries (for owner/group/others). Extended ACLs have more than the three basic entries. In addition, they also contain a mask entry and may contain any number of named user and named group entries.
ACLs on a Lustre file system work exactly like ACLs on any Linux file system. They are manipulated with the standard tools in the standard manner. Below, we create a directory and allow a specific user access.
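A minimal sketch of such a workflow, with an illustrative directory tmpdir and a hypothetical username johnsm:

```console
$ mkdir tmpdir
$ setfacl -m user:johnsm:rwx tmpdir   # grant the named user full access
$ getfacl tmpdir                      # review the resulting ACL entries
```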
...
Do not use shared filesystems at IT4Innovations as a backup for large amounts of data or for long-term archiving purposes.
!!! note
    IT4Innovations does not provide storage capacity for data archiving. Academic staff and students of research institutions in the Czech Republic can use [CESNET Storage service][f].
The CESNET Storage service can be used for research purposes, mainly by academic staff and students of research institutions in the Czech Republic.
A user of the CESNET data storage (DU) can be an organization or an individual who is in a current employment relationship (employee) or study relationship (student) with a legal entity (organization) that meets the “Principles for access to CESNET Large infrastructure (Access Policy)”.
Users may only use the CESNET data storage for data transfer and storage associated with activities in science, research, development, the spread of education, culture, and prosperity. For details, see the “Acceptable Use Policy CESNET Large Infrastructure (Acceptable Use Policy, AUP)”.
The service is documented [here][g]. For special requirements, contact the CESNET Storage Department directly via e-mail at [du-support(at)cesnet.cz][h].
The procedure to obtain CESNET access is quick and simple.
## CESNET Storage Access
...

First, create the mount point:

```console
$ mkdir cesnet
```
Mount the storage. Note that you can choose among ssh.du1.cesnet.cz (Plzen), ssh.du2.cesnet.cz (Jihlava), and ssh.du3.cesnet.cz (Brno).

Mount tier1_home **(only 5120 MB!)**:
```console
$ sshfs username@ssh.du1.cesnet.cz:. cesnet/
```
For easy future access from Barbora, install your public key:
```console
$ cp .ssh/id_rsa.pub cesnet/.ssh/authorized_keys
```

Mount tier1_cache_tape for the Storage VO:
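A sketch assuming the same ssh.du1.cesnet.cz server as above; the remote cache_tape path is illustrative and depends on your Storage VO assignment:

```console
$ sshfs username@ssh.du1.cesnet.cz:/cache_tape/VO_storage-cache_tape/ cesnet/
```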