Datasets
========================

The scripts now support multiple angles, multiple channels and multiple illumination directions without adjusting the Snakefile or .bsh scripts.

Using spimdata version: 0.9-revision
Using SPIM registration version: 3.3.9

Supported datasets are in the following formats:

Using Zeiss Lightsheet Z.1 Dataset (LOCI)

    Multiple timepoints:  YES (one file per timepoint) or (all time-points in one file)
    Multiple channels:  YES (one file per channel) or (all channels in one file)
    Multiple illumination directions: YES (one file per illumination direction)
    Multiple angles: YES (one file per angle)
    
Using LOCI Bioformats opener (.tif)

    Multiple timepoints: YES (one file per timepoint) or (all time-points in one file)
    Multiple channels: YES (one file per channel) or (all channels in one file)
    Multiple illumination directions: YES (one file per illumination direction) => not tested yet
    Multiple angles: YES (one file per angle)
    
Using ImageJ Opener (resave to .tif):

    Multiple timepoints: YES (one file per timepoint)
    Multiple channels: YES (one file per channel)
    Multiple illumination directions: YES (one file per illumination direction) => not tested yet
    Multiple angles: YES (one file per angle)

Timelapse based workflow
========================

Expected setup
--------------
Clone the repository:

The repository contains the example configuration scripts for single and dual channel datasets, the Snakefile which defines the workflow, the beanshell scripts which drive the processing via Fiji and a cluster.json file which contains information for the cluster queuing system. 
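
For example (a minimal sketch; the repository URL is a placeholder, substitute the actual location of this repository):

```bash
# Clone the processing repository and change into the timelapse workflow directory
git clone <repository-url> /path/to/repo
cd /path/to/repo/timelapse
```

The `timelapse` directory has the following layout:
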
```bash
/path/to/repo/timelapse
├── single_test.yaml
├── dual_OneChannel.yaml
├── Snakefile
├── cluster.json
├── define_tif_zip.bsh
├── define_czi.bsh
├── registration.bsh
├── deconvolution.bsh
├── transform.bsh
└── xml_merge.bsh
```

A data directory contains the .yaml file for the specific dataset. You can either copy it, if you want to keep it together with the dataset, or symlink it from the processing repository (see the example below the listing). Such a directory looks, for example, like this:

```bash
/path/to/data
├── dataset.czi
├── dataset(1).czi
├── dataset(2).czi
├── dataset(3).czi
├── dataset(4).czi
└── dataset.yaml	 		# copied/symlinked from this repo
```

For processing you need:

* `tomancak.yaml`, which contains the parameters that configure the beanshell scripts and resides in the data directory
* the `Snakefile` from this directory
* the `cluster.json` that resides in the same directory as the `Snakefile`
* a cluster that runs LSF
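
For example, using `single_test.yaml` from the repository as the configuration (a sketch; adjust the file names to your dataset):

```bash
# Copy the example configuration into the data directory ...
cp /path/to/repo/timelapse/single_test.yaml /path/to/data/dataset.yaml

# ... or symlink it, so that edits in the repository are picked up automatically
ln -s /path/to/repo/timelapse/single_test.yaml /path/to/data/dataset.yaml
```
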
Tools
--------------

The tools directory contains scripts for common file format pre-processing.
Some datasets are currently only usable after resaving them into .tif:
* discontinuous .czi datasets
* .czi datasets with multiple groups

The master_preprocessing.sh file is the configuration script that contains the information about the dataset that needs to be resaved. In the czi_resave directory you will find the create-resaving-jobs.sh script, which creates a job for each timepoint. The submit-jobs script sends these jobs to the cluster, where they call the resaving.bsh script. The beanshell script then executes the Fiji macro and resaves the files. The resaving of .czi files uses LOCI Bioformats and preserves the metadata.

```bash
/path/to/repo/tools
├── master_preprocessing.sh
└── czi_resave
    ├── create-resaving-jobs.sh
    ├── resaving.bsh
    └── submit-jobs
```
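
A possible invocation, sketched from the description above; the scripts' exact interfaces are not documented here, so check the script headers before running:

```bash
cd /path/to/repo/tools

# 1. Edit master_preprocessing.sh so it describes the dataset that needs resaving
# 2. Create one resaving job per timepoint
bash czi_resave/create-resaving-jobs.sh

# 3. Send the generated jobs to the cluster, where they call resaving.bsh via Fiji
bash czi_resave/submit-jobs
```
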
Processing
--------------

The current workflow consists of the following steps. It covers the principal steps of timelapse multiview SPIM processing:

* define czi or tif dataset
* resave into hdf5
* detect and register interest points
* merge xml
* timelapse registration
* optional for dual channel datasets: duplicate transformations
* optional for deconvolution: external transformation
* average-weight fusion/deconvolution
* define output
* resave output into hdf5

The entire processing is controlled via the yaml file. 
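
Before submitting jobs, a Snakemake dry run can be used to check which steps will be executed for a given data directory (plain Snakemake usage, not a step of the workflow itself):

```bash
# Show the jobs that would be executed, without running anything
/path/to/snakemake/snakemake -n -d /path/to/data/
```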

Preparations for processing
--------------


Submitting Jobs
---------------

If DRMAA is supported on your cluster:

```bash
/path/to/snakemake/snakemake -j2 -d /path/to/data/ --cluster-config ./cluster.json --drmaa " -q {cluster.lsf_q} {cluster.lsf_extra}"
```

If DRMAA is not available, the jobs can be submitted directly via LSF:

```bash
/path/to/snakemake/snakemake -j2 -d /path/to/data/ --cluster-config ./cluster.json --cluster "bsub -q {cluster.lsf_q} {cluster.lsf_extra}"
```

To capture the error and output of the cluster jobs, add `-o test.out -e test.err`, e.g.:

DRMAA
```bash
/path/to/snakemake/snakemake -j2 -d /path/to/data/ --cluster-config ./cluster.json --drmaa " -q {cluster.lsf_q} {cluster.lsf_extra} -o test.out -e test.err"
```

LSF
```bash
/path/to/snakemake/snakemake -j2 -d /path/to/data/ --cluster-config ./cluster.json --cluster "bsub -q {cluster.lsf_q} {cluster.lsf_extra} -o test.out -e test.err"
```

Note: the error and output of all cluster jobs are written into these files.

Log files and supervision of the pipeline
---------------

The log files are written into a new directory called "logs" inside the data directory. They are ordered according to their position in the workflow; multiple or alternative steps in the pipeline are indicated by numbers.

To force certain rules, use the `-R` flag to rerun a particular rule and everything downstream of it: `-R <name of rule>`.
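
For example, assuming the Snakefile defines a rule named `registration` (the actual rule names are defined in the Snakefile; check there for the exact names):

```bash
# Rerun the hypothetical "registration" rule and everything downstream of it
/path/to/snakemake/snakemake -j2 -d /path/to/data/ -R registration
```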