Conditionals
Nextflow uses conditionals as a ‘flow control’ mechanism to skip process blocks or operations. The most obvious example is re-running an alignment workflow when we have already created or downloaded the index file: we don’t want to keep re-running the labor-intensive indexing process.
There are some strict rules regarding the use of conditionals:
Boolean parameters should be set to true/false in nextflow.config.
File paths, strings, integers, floats, lists and maps should be set to null in nextflow.config.
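To see why these defaults matter, here is a minimal sketch in plain Groovy (the language Nextflow scripts are built on). The map demo_params is a hypothetical stand-in for Nextflow's params object, and its keys mirror the ones used in this section. Under Groovy's truth rules null evaluates to false, so an unset parameter can be tested directly in a condition:

```groovy
// demo_params is a hypothetical stand-in for Nextflow's `params` object.
def demo_params = [
    run_qc       : true,   // Boolean parameter: true/false
    transcriptome: null,   // file path not supplied: null
    fasta        : '/data/test/test-datasets/reference/chrI.fa'
]

// null is falsy, so negating it gives true.
assert !demo_params.transcriptome

// A non-empty string is truthy.
assert demo_params.fasta

// Same shape as a `when:` condition: build the transcriptome only if
// none was supplied and a FASTA file is available.
def build_transcriptome = !demo_params.transcriptome && demo_params.fasta
assert build_transcriptome
```

Empty strings and 0 are also falsy in Groovy, which is one reason to prefer null as the "not set" value: it makes the intent explicit.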
Nextflow also makes heavy use of ternary operators. The expression A ? B : C reads: if A is true, choose B, else choose C.
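As a small illustration of the ternary operator, the sketch below is plain Groovy rather than a full Nextflow script. The variable names and the path /data/tx.fa are made up for the example:

```groovy
// Hypothetical values: one parameter left unset (null), one supplied.
def unset_path    = null
def supplied_path = '/data/tx.fa'

// A ? B : C  ->  if A is truthy choose B, else choose C.
def choice = unset_path ? 'reuse ' + unset_path : 'build with TX'
def other  = supplied_path ? 'reuse ' + supplied_path : 'build with TX'

assert choice == 'build with TX'       // null is falsy, so C is chosen
assert other  == 'reuse /data/tx.fa'   // non-empty string is truthy, so B is chosen
```

The same pattern is what lets a workflow pick between a user-supplied file and the output of a process in a single line.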
Note
We will continue with our RNA-Seq workflow example in this section.
Let’s flesh out our nextflow.config:
process{
    container = "barryd237/test:dev"
    containerOptions = ' -B /data/'
}

params{
    input = "/data/test/test-datasets/fastq/*_{1,2}.fastq.gz"
    fasta = "/data/test/test-datasets/reference/chrI.fa"
    gtf = "/data/test/test-datasets/reference/chrI.gtf"
    transcriptome = null
    outdir = "/data/test/"
    save_qc_intermediates = true
    save_transcriptome = true
    run_qc = true
}

singularity.enabled = true
singularity.autoMounts = true
singularity.cacheDir = "/data/containers"
Update .gitignore
Update your .gitignore file so you don’t upload the directories output by our script. At the time of writing, this is what mine looks like:
*.img
test-datasets/
work/
.nextflow.*
.nextflow/
dummy_files/
fastqc/
multiqc/
Update Script
Overwrite the contents of main.nf with the following, and push to GitHub:
#!/usr/bin/env nextflow

// Paired-end reads, split into one channel per consumer
Channel.fromFilePairs("${params.input}", checkIfExists: true)
       .into{ ch_qc_reads; ch_alignment_reads }

ch_fasta = Channel.value(file(params.fasta))
ch_gtf = Channel.value(file(params.gtf))

process FASTQC{
    tag "${base}"
    // Returning null from saveAs means the file is not published
    publishDir params.outdir, mode: 'copy',
        saveAs: { params.save_qc_intermediates ? "fastqc/${it}" : null }

    // Skip the process entirely unless QC is requested
    when:
    params.run_qc

    input:
    tuple val(base), file(reads) from ch_qc_reads

    output:
    file("*.{html,zip}") into ch_multiqc

    script:
    """
    fastqc -q $reads
    """
}

process MULTIQC{
    publishDir "${params.outdir}/multiqc", mode: 'copy'

    when:
    params.run_qc

    input:
    file(htmls) from ch_multiqc.collect()

    output:
    file("multiqc_report.html") into multiqc_out

    script:
    """
    multiqc .
    """
}

process TX{
    publishDir params.outdir, mode: 'copy',
        saveAs: { params.save_transcriptome ? "reference/transcriptome/${it}" : null }

    // Only build a transcriptome if none was supplied and a FASTA file is available
    when:
    !params.transcriptome && params.fasta

    input:
    file(fasta) from ch_fasta
    file(gtf) from ch_gtf

    output:
    file("${fasta.baseName}.tx.fa") into transcriptome_created

    script:
    """
    gffread -F -w "${fasta.baseName}.tx.fa" -g $fasta $gtf
    """
}

// Use the supplied transcriptome if there is one, otherwise the output of TX
ch_transcriptome = params.transcriptome ? Channel.value(file(params.transcriptome)) : transcriptome_created
Push the changes to GitHub and run the workflow:
git add .
git commit -m "Update repo"
git push
nextflow pull <username>/rtp_workshop
nextflow run -r dev <username>/rtp_workshop
nextflow run main.nf -profile docker -c nextflow.config
Note
For those curious, workflows are staged under ~/.nextflow/assets/<github-username>/
cool.
Go to Assignment II Part 3 :)