bwa

Summary

Software Description

BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW, and BWA-MEM. The first is designed for Illumina sequence reads up to 100bp, while the other two are designed for longer sequences ranging from 70bp to 1Mbp. BWA-MEM and BWA-SW share features such as long-read support and split alignment, but BWA-MEM, the most recent of the three, is generally recommended for high-quality queries because it is faster and more accurate. BWA-MEM also performs better than BWA-backtrack on 70-100bp Illumina reads.
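
As a rough illustration of the read-length guidance above, the following sketch measures the mean read length of a FASTQ file and suggests a BWA subcommand. The function names mean_read_length and suggest_bwa_algorithm are hypothetical helpers, not part of BWA. The 70bp cutoff is a rule of thumb taken from the description above, and the input is assumed to be an uncompressed FASTQ file with four lines per record.

```shell
#!/bin/bash
# Hypothetical helpers (not part of BWA); the 70bp cutoff is a rule of thumb.

# Print the mean read length of an uncompressed FASTQ file,
# assuming exactly four lines per record (sequence on line 2 of each record).
mean_read_length() {
    awk 'NR % 4 == 2 { n++; total += length($0) }
         END { if (n) printf "%d\n", total / n; else print 0 }' "$1"
}

# Suggest a BWA subcommand based on the mean read length.
suggest_bwa_algorithm() {
    local len
    len=$(mean_read_length "$1")
    if [ "$len" -lt 70 ]; then
        echo "aln"   # BWA-backtrack, for short Illumina reads
    else
        echo "mem"   # BWA-MEM, for reads of 70bp and longer
    fi
}

# Example (file name is a placeholder):
#   suggest_bwa_algorithm input.fq
```

Reads between 70bp and 100bp fall in the range that both BWA-backtrack and BWA-MEM support; the sketch picks BWA-MEM there because the description above reports better performance for it.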

General Linux

To run this software interactively in a Linux environment, run the following commands:

module load bwa
bwa

For each of the BWA algorithms, you must first index the genome with the following command:

bwa index [options] input.fasta
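
Indexing a large genome can take a long time, so it may be worth skipping the step when an index already exists. The sketch below assumes the index was built with the default prefix, in which case bwa index writes five files next to the reference with the extensions .amb, .ann, .bwt, .pac and .sa. The function names bwa_index_present and index_if_needed are hypothetical helpers, not BWA commands.

```shell
#!/bin/bash
# Hypothetical helpers (not part of BWA); assumes the default index prefix.

# Succeed only if all five BWA index files exist and are non-empty.
bwa_index_present() {
    local ref="$1" ext
    for ext in amb ann bwt pac sa; do
        [ -s "${ref}.${ext}" ] || return 1
    done
    return 0
}

# Build the index only when it is missing.
index_if_needed() {
    local ref="$1"
    if bwa_index_present "$ref"; then
        echo "index found for $ref"
    else
        echo "indexing $ref"
        bwa index "$ref"
    fi
}

# Example (file name is a placeholder):
#   index_if_needed input.fasta
```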

BWA programs may also be submitted to a queue using a Slurm script such as the one below:

#!/bin/bash
#SBATCH --nodes=1 --ntasks-per-node=16
#SBATCH --mem-per-cpu=1000M
#SBATCH --time=8:00:00
#SBATCH --mail-type=ALL
#SBATCH --mail-user=sample_email@umn.edu

module load bwa
bwa index input.fasta
bwa aln -t $SLURM_NTASKS input.fasta input.fq > output.sai

Note the use of -t $SLURM_NTASKS; this option is used to specify the number of threads used by BWA.
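
A script that may run both inside and outside a Slurm job can derive the thread count from the environment rather than hard-coding it. The sketch below is one possible approach: it prefers SLURM_CPUS_PER_TASK (set by Slurm only when --cpus-per-task is requested), falls back to SLURM_NTASKS, and otherwise uses a single thread. The function name bwa_threads is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper (not part of BWA or Slurm).
# Print a thread count for bwa's -t option:
#   1. SLURM_CPUS_PER_TASK, if the job requested --cpus-per-task
#   2. SLURM_NTASKS, as used in the script above
#   3. 1, when not running under Slurm
bwa_threads() {
    echo "${SLURM_CPUS_PER_TASK:-${SLURM_NTASKS:-1}}"
}

# Example (file names are placeholders):
#   bwa aln -t "$(bwa_threads)" input.fasta input.fq > output.sai
```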

A full list of options is available on the BWA manual page.

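The BWA-backtrack algorithm runs in two steps: bwa aln first finds the suffix array coordinates of the reads and writes them to a .sai file, then bwa samse (single-end reads) or bwa sampe (paired-end reads) converts the result to SAM format. The relevant commands are: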

bwa aln
bwa samse
bwa sampe
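
The following sketch shows how these commands are typically chained for single-end and paired-end reads. The function names backtrack_single and backtrack_paired, and all file names, are placeholders chosen for illustration; the reference is assumed to be indexed already.

```shell
#!/bin/bash
# Hypothetical wrappers (not part of BWA); assumes the reference is indexed.

# Single-end: bwa aln writes a .sai file, bwa samse converts it to SAM.
# Usage: backtrack_single REF READS OUT_PREFIX [THREADS]
backtrack_single() {
    local ref="$1" reads="$2" out="$3" threads="${4:-1}"
    bwa aln -t "$threads" "$ref" "$reads" > "${out}.sai" || return 1
    bwa samse "$ref" "${out}.sai" "$reads" > "${out}.sam"
}

# Paired-end: one bwa aln run per mate file, then bwa sampe.
# Usage: backtrack_paired REF READS1 READS2 OUT_PREFIX [THREADS]
backtrack_paired() {
    local ref="$1" r1="$2" r2="$3" out="$4" threads="${5:-1}"
    bwa aln -t "$threads" "$ref" "$r1" > "${out}_1.sai" || return 1
    bwa aln -t "$threads" "$ref" "$r2" > "${out}_2.sai" || return 1
    bwa sampe "$ref" "${out}_1.sai" "${out}_2.sai" "$r1" "$r2" > "${out}.sam"
}

# Example (file names are placeholders):
#   backtrack_single input.fasta input.fq output 16
```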

The BWA-SW algorithm can be run using:

bwa bwasw

The BWA-MEM algorithm can be run using:

bwa mem
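
Unlike BWA-backtrack, bwa mem writes SAM output directly, so no intermediate .sai file is needed. A minimal sketch of a wrapper follows; the function name mem_align and the file names are placeholders, and the reference is assumed to be indexed already. Passing one read file gives a single-end alignment, and passing two gives a paired-end alignment.

```shell
#!/bin/bash
# Hypothetical wrapper (not part of BWA); assumes the reference is indexed.
# Usage: mem_align REF OUT_SAM THREADS READS1 [READS2]
mem_align() {
    local ref="$1" out="$2" threads="$3"
    shift 3
    # Remaining arguments are the read files (one or two FASTQ files).
    bwa mem -t "$threads" "$ref" "$@" > "$out"
}

# Examples (file names are placeholders):
#   mem_align input.fasta output.sam 16 input.fq
#   mem_align input.fasta output.sam 16 input_1.fq input_2.fq
```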

Slurm Example

#!/bin/bash
#SBATCH --job-name="rfm_RunBWATest_job"
#SBATCH --ntasks=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=4
#SBATCH --output=rfm_RunBWATest_job.out
#SBATCH --error=rfm_RunBWATest_job.err
#SBATCH --time=0:10:0
#SBATCH -p msismall
module load bwa/0.7.17
wget https://public.s3.msi.umn.edu/reframe/sw/bwa/sample.fq
wget https://public.s3.msi.umn.edu/reframe/sw/bwa/ref.fasta
bwa index ref.fasta
bwa aln -t 4 ref.fasta sample.fq > output.sai