Please note that the automated workflow for processing SPIM data on a cluster is based on a publication. If you use it successfully for your research, please be so kind as to cite the following work:
* C. Schmied, P. Steinbach, T. Pietzsch, S. Preibisch, P. Tomancak (2015) An automated workflow for parallel processing of large multiview SPIM recordings. Bioinformatics, Dec 1; doi: 10.1093/bioinformatics/btv706 http://bioinformatics.oxfordjournals.org/content/early/2015/12/30/bioinformatics.btv706.long
The automated workflow is based on the Fiji plugins Multiview Reconstruction and BigDataViewer. Please refer to and cite the following publications:
* S. Preibisch, S. Saalfeld, J. Schindelin and P. Tomancak (2010) Software for bead-based registration of selective plane illumination microscopy data, Nature Methods, 7(6):418-419. http://www.nature.com/nmeth/journal/v7/n6/full/nmeth0610-418.html
* S. Preibisch, F. Amat, E. Stamataki, M. Sarov, R.H. Singer, E. Myers and P. Tomancak (2014) Efficient Bayesian-based Multiview Deconvolution, Nature Methods, 11(6):645-648. http://www.nature.com/nmeth/journal/v11/n6/full/nmeth.2929.html
* T. Pietzsch, S. Saalfeld, S. Preibisch, P. Tomancak (2015) BigDataViewer: visualization and processing for large image data sets. Nature Methods, 12(6):481–483. http://www.nature.com/nmeth/journal/v12/n6/full/nmeth.3392.html
Datasets
========================
The scripts now support multiple angles, multiple channels and multiple illumination directions without any adjustment of the Snakefile or the .bsh scripts.
...
...
Clone the repository:
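A minimal clone might look like this (a sketch; the repository URL and the target path are assumptions, adjust them to your setup):

```bash
# clone the snakemake-workflows repository (URL assumed; adjust as needed)
git clone https://github.com/mpicbg-scicomp/snakemake-workflows.git /path/to/repository
cd /path/to/repository/spim_registration/timelapse/
```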
The repository contains the example configuration scripts for single and dual channel datasets, the Snakefile that defines the workflow, the BeanShell scripts that drive the processing via Fiji, and a cluster.json file that contains information for the cluster queuing system.
```bash
/path/to/repository/spim_registration/timelapse/
├── README.md
├── Snakefile
├── cluster.json
├── config.yaml
├── deconvolution.bsh
├── define_czi.bsh
├── define_output.bsh
├── define_tif_zip.bsh
├── duplicate_transformations.bsh
├── export.bsh
├── export_output.bsh
├── fusion.bsh
├── registration.bsh
├── timelapse_registration.bsh
├── timelapse_utils.py
├── transform.bsh
└── xml_merge.bsh
```
...
...
A data directory contains the raw data together with the .yaml file for the specific dataset. You can either copy the .yaml file into the data directory, if you want to keep it together with the dataset, or symlink it from the processing repository. An example data directory looks like this:
```bash
/path/to/data
├── dataset.czi
├── dataset(1).czi
├── dataset(2).czi
├── dataset(3).czi
├── dataset(4).czi
└── dataset.yaml # copied/symlinked from this repo
/path/to/data/
├── exampleSingleChannel.czi
├── exampleSingleChannel(1).czi
├── exampleSingleChannel(2).czi
├── exampleSingleChannel(3).czi
├── exampleSingleChannel(4).czi
└── config.yaml # copied/symlinked from this repo
```
* `config.yaml`, found in the data directory, contains the parameters that configure the BeanShell scripts
* the `Snakefile` from this directory
* the `cluster.json` file that resides in the same directory as the `Snakefile`
* a cluster running LSF
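For example, the config.yaml can be placed next to the data either by copying it or by symlinking it from the processing repository (a sketch; all paths are placeholders):

```bash
# copy the configuration next to the raw data ...
cp /path/to/repository/spim_registration/timelapse/config.yaml /path/to/data/
# ... or symlink it so that edits in the repository are picked up automatically
ln -s /path/to/repository/spim_registration/timelapse/config.yaml /path/to/data/config.yaml
```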
...
...
The tools directory contains scripts for common file format pre-processing.
Some datasets are currently only usable after resaving them as .tif:
* discontinuous .czi datasets
* .czi datasets with multiple groups
* .ome.tiff files
The master_preprocessing.sh file is the configuration script that contains the information about the dataset that needs to be resaved. In the czi_resave directory you will find the create-resaving-jobs.sh script, which creates a job for each time point. The submit-jobs script sends these jobs to the cluster, where they call the resaving.bsh script. The BeanShell script then executes the Fiji macro and resaves the files. Resaving of .czi files uses LOCI Bio-Formats and preserves the metadata.
```bash
/path/to/repository/spim_registration/tools/
├── czi_resave/
│   ├── create-resaving-jobs.sh
│   ├── resaving.bsh
│   └── submit-jobs
├── ometiff_resave/
│   ├── create-ometiff_resave.sh
│   ├── ometiff_resave.bsh
│   └── submit-jobs
└── master_preprocessing.sh
```
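A typical resaving run might then look roughly like this (a sketch only; the exact behaviour and arguments of these helper scripts may differ, and master_preprocessing.sh has to be edited beforehand to describe the dataset):

```bash
# edit the dataset description first
nano /path/to/repository/spim_registration/tools/master_preprocessing.sh
# create one resaving job per time point and send the jobs to the cluster
cd /path/to/repository/spim_registration/tools/czi_resave/
./create-resaving-jobs.sh    # writes one job script per time point (assumed behaviour)
./submit-jobs                # dispatches the jobs, which in turn call resaving.bsh via Fiji
```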
cluster_tools directory
--------------
The cluster_tools directory contains the libraries for the GPU deconvolution and the virtual frame buffer (xvfb) for running Fiji headless.
```bash
libFourierConvolutionCUDALib.so
xvfb-run
```
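As an illustration, the workflow wraps its Fiji calls in the virtual frame buffer roughly like this (a sketch; the parameter name, script, and paths are placeholders, and the real invocations are assembled by the Snakefile):

```bash
# run a BeanShell script in Fiji inside a virtual X server, so GUI-dependent plugins work without a display
xvfb-run -a /path/to/Fiji.app/ImageJ-linux64 \
    -Dimage_file_directory=/path/to/data/ \
    -- --no-splash /path/to/repository/spim_registration/timelapse/define_czi.bsh
```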
sysconfcpus
--------------
We use libsysconfcpus (http://www.kev.pulo.com.au/libsysconfcpus/) to restrict how many cores Fiji uses on the cluster.
Compile with:
```bash
CFLAGS=-ansi ./configure --prefix=$PREFIX
make
make install
```
where PREFIX is the installation directory.
ANSI mode is necessary when compiling with our default GCC version, 4.9.2.
It may or may not be necessary with older versions.
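Once installed, Fiji can be started through the wrapper so that it only sees a limited number of CPUs, for example (a sketch, assuming the sysconfcpus wrapper accepts -n; the core count and paths are placeholders):

```bash
# pretend only 4 CPUs are available to the wrapped Fiji process
$PREFIX/bin/sysconfcpus -n 4 /path/to/Fiji.app/ImageJ-linux64
```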
Command line
--------------
It is very likely that the cluster computer does not run any graphical user interface and relies exclusively on the command line. Steering a cluster from the command line is fairly easy; about ten different commands cover everything you need to do. Since the Linux command line may be unfamiliar to most biologists, we maintain a separate tutorial (http://imagej.net/Linux_command_line_tutorial) and recommend the Software Carpentry shell lesson (http://swcarpentry.github.io/shell-novice/), which explain the bare essentials.
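For an LSF cluster, as assumed by the cluster.json, an illustrative selection of the commands you will use most often looks like this:

```bash
bsub < job.sh     # submit a job script to the queue
bjobs             # list your pending and running jobs
bpeek <job_id>    # look at the output of a running job
bkill <job_id>    # kill a job
bqueues           # show the available queues
```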
workflow
--------------
The entire processing is controlled via the config.yaml file. The first part (common) contains the key parameters of the processing, which are usually dataset and user dependent. The second part contains the advanced settings and manual overrides for each processing step; these steps correspond to the rules in the Snakefile.

The current workflow consists of the following steps and covers the principal processing for timelapse multiview SPIM recordings (a dry-run sketch follows the list):

1. define czi or tif dataset
2. resave into hdf5
3. detect and register interest points
4. merge xml (creates the XML for the registered dataset)
5. timelapse registration
6. optional for dual channel datasets: duplicate transformations
7. optional for deconvolution: external transformation
8. weighted-average fusion or multiview deconvolution
9. define output
10. resave output into hdf5 (creates the XML for the fused dataset)
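Because each step corresponds to a rule in the Snakefile, a Snakemake dry run is a quick way to see which jobs would be executed (an illustrative sketch; paths are placeholders, see the command reference below for the full invocation):

```bash
# dry run: show the jobs Snakemake would schedule, without executing them
snakemake -n -s /path/to/repository/spim_registration/timelapse/Snakefile \
    --directory /path/to/data/
```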
Initial setup of the workflow
--------------
After you have cloned the snakemake-workflows repository, you need to configure the config.yaml for your setup.
This means specifying the directory of your Fiji installation, the location of xvfb-run, and the location of the GPU deconvolution libraries.
Go into the timelapse directory of the snakemake-workflows repository, open the config.yaml with your preferred editor (for example nano), and change the settings in section 7 (Software directories):
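For example (paths are placeholders):

```bash
cd /path/to/repository/spim_registration/timelapse/
nano config.yaml   # adjust the Fiji, xvfb-run and GPU library locations
```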
We recommend executing Snakemake within screen (https://www.gnu.org/software/screen/manual/screen.html).
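A typical screen session could be set up like this (illustrative; the session name is arbitrary):

```bash
screen -S snakemake     # start a named session and launch Snakemake inside it
# detach with Ctrl-a d; the workflow keeps running on the head node
screen -r snakemake     # reattach later to check on progress
```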
To execute Snakemake you need to call Snakemake, specify the number of jobs and the location of the data, and, to dispatch jobs to a cluster, pass the information for the queuing system. Here is a list of commands and flags that are used for the Snakemake workflow: