#!/bin/bash
usage(){
echo "
Written by Brian Bushnell
Last modified February 10, 2020

Description:  Generates the consensus sequence of a reference
using aligned sequences.  This can be used for polishing assemblies,
making representative ribosomal subunits, correcting PacBio reads, etc.

If unaligned sequences are used as input, they should be in fasta or fastq
format, and they will be aligned to the first reference sequence.

Usage:  consensus.sh in=mapped.sam ref=ref.fa out=consensus.fa

Recommended settings for assembly polishing via Illumina reads:  mafsub=0.5

Standard parameters:
in=<file>       Reads mapped to the reference; should be sam or bam.
ref=<file>      Reference; may be fasta or fastq.
out=<file>      Modified reference; may be fasta or fastq.
outm=<file>     Optional output for binary model file.
                Preferred extension is .alm.
inm=<file>      Optional input model file for statistics.
hist=<file>     Optional score histogram output.
overwrite=f     (ow) Set to false to force the program to abort rather than
                overwrite an existing file.

Processing parameters:
mindepth=2      Do not change to alleles present at depth below this.
mafsub=0.25     Do not incorporate substitutions below this allele fraction.
mafdel=0.50     Do not incorporate deletions below this allele fraction.
mafins=0.50     Do not incorporate insertions below this allele fraction.
mafn=0.40       Do not change Ns (noref) to calls below this allele fraction.
usemapq=f       Include mapq as a positive factor in edge weight.
nonly=f         Only change Ns to different bases.
noindels=f      Don't allow indels.
ceiling=        If set, alignments will be weighted by their inverse identity.
                For example, at ceiling=105, a read with 96% identity will get
                bonus weight of 105-96=9 while a read with 70% identity will
                get 105-70=35.  This favors low-identity reads.
name=           Set the output sequence name (for a single output sequence).

Java Parameters:
-Xmx            This will set Java's memory usage, overriding autodetection.
                -Xmx20g will specify 20 gigs of RAM, and -Xmx200m will
                specify 200 megs.  The max is typically 85% of physical memory.
-eoom           This flag will cause the process to exit if an out-of-memory
                exception occurs.  Requires Java 8u92+.
-da             Disable assertions.

Please contact Brian Bushnell at bbushnell@lbl.gov if you encounter any problems.
For documentation and the latest version, visit: https://bbmap.org
"
}

# Print the help text and exit when run with no arguments or a help flag.
if [ -z "$1" ] || [[ $1 == -h ]] || [[ $1 == --help ]]; then
    usage
    exit
fi
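
# Follow symlinks so that DIR is the real directory of this script, then
# pick the classpath: bbtools.jar if it exists there, otherwise current/.
resolveSymlinks(){
    SCRIPT="$(cd "$(dirname "$0")" && pwd)/$(basename "$0")"
    while [ -h "$SCRIPT" ]; do
        DIR="$(dirname "$SCRIPT")"
        SCRIPT="$(readlink "$SCRIPT")"
        # A relative link target is resolved against the link's directory.
        [ "${SCRIPT#/}" = "$SCRIPT" ] && SCRIPT="$DIR/$SCRIPT"
    done
    DIR="$(cd "$(dirname "$SCRIPT")" && pwd)"
    if [ -f "$DIR/bbtools.jar" ]; then
        CP="$DIR/bbtools.jar"
    else
        CP="$DIR/current/"
    fi
}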
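
# Source the shared helper scripts and derive the JVM settings.
# parseJavaArgs and setEnvironment are not defined in this file; they are
# presumably provided by the sourced helpers.
setEnv(){
    . "$DIR/javasetup.sh"
    . "$DIR/memdetect.sh"
    parseJavaArgs "--xmx=4g" "--xms=4g" "--percent=84" "--mode=auto" "$@"
    setEnvironment
}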

# Build the Java command line, echo it to stderr, and run it.
launch() {
    CMD="java $EA $EOOM $SIMD $XMX $XMS -Xss8m -cp $CP consensus.ConsensusMaker $@"
    echo "$CMD" >&2
    eval $CMD
}

resolveSymlinks
setEnv "$@"
launch "$@"