CALF (Compact ALignment Format)
Phred
The phred software reads DNA sequencing trace files, calls bases, and assigns a quality value to each called base.
The quality value is a log-transformed error probability, specifically
Q = -10 log10( Pe )
where Q and Pe are respectively the quality value and error probability of a particular base call.
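As a small illustration of the formula above, the conversion between quality values and error probabilities can be sketched in Python. The function names are hypothetical and are not part of phred.

```python
import math


def phred_quality(error_prob: float) -> float:
    """Return Q = -10 * log10(Pe) for an error probability Pe in (0, 1]."""
    if not 0.0 < error_prob <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(error_prob)


def error_probability(quality: float) -> float:
    """Invert the formula: Pe = 10 ** (-Q / 10)."""
    return 10.0 ** (-quality / 10.0)
```

For example, an error probability of 1 in 1000 corresponds to Q = 30, and Q = 20 corresponds to an error probability of 1 in 100.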
The phred quality values have been thoroughly tested for both accuracy and power to discriminate between correct and incorrect base-calls.
Phred can use the quality values to perform sequence trimming.
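The text does not describe how the trimming is done, so the following is only one plausible sketch of quality-based trimming, not necessarily phred's own procedure. It keeps the contiguous stretch of bases that maximizes the sum of (cutoff - Pe); the function name and the 0.05 cutoff are assumptions made for illustration.

```python
def trim_by_quality(qualities, cutoff=0.05):
    """Return (start, end), a half-open range of the bases to keep.

    Illustrative only: finds the contiguous segment with the largest
    sum of (cutoff - Pe), where Pe = 10 ** (-Q / 10). Returns (0, 0)
    when no base has an error probability below the cutoff.
    """
    best_sum = 0.0
    best_range = (0, 0)
    run_sum = 0.0
    run_start = 0
    for i, q in enumerate(qualities):
        pe = 10.0 ** (-q / 10.0)
        run_sum += cutoff - pe
        if run_sum <= 0.0:
            # The current run is no longer worth extending; restart after i.
            run_sum = 0.0
            run_start = i + 1
        elif run_sum > best_sum:
            best_sum = run_sum
            best_range = (run_start, i + 1)
    return best_range
```

With qualities `[5, 5, 30, 40, 40, 30, 5, 5]` this sketch keeps positions 2 through 5, dropping the low-quality bases at both ends.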
Phred works well with trace files from the following manufacturers' sequencing machines: Amersham Biosciences, Applied Biosystems, Beckman Instruments, and LI-COR Life Sciences. See the phred documentation for specific compatibility information.
Phred runs on most computers and operating systems including Apple Mac OS X, *BSD, Hewlett-Packard HP-UX, HP-Compaq Tru64, IBM AIX, Linux, Microsoft Windows, Silicon Graphics IRIX, and SUN Solaris.
We distribute phred as C source code, so you need a C compiler to build it before you can run it.
See the phred documentation for additional information.
Phred Quality Values and ABI 3700 Data
References:
Ewing B, Green P: Basecalling of automated sequencer traces using phred. II. Error probabilities. Genome Research 8:186-194 (1998).
Ewing B, Hillier L, Wendl M, Green P: Basecalling of automated sequencer traces using phred. I. Accuracy assessment. Genome Research 8:175-185 (1998).
Phrap/Cross_match/Swat
phrap is a program for assembling shotgun DNA sequence data. Among other features, it allows use of the entire read and not just the trimmed high quality part; it uses a combination of user-supplied and internally computed data quality information to improve assembly accuracy in the presence of repeats; it constructs the contig sequence as a mosaic of the highest quality read segments rather than a consensus; it provides extensive assembly information to assist in troubleshooting assembly problems; and it handles large datasets. See the phrap/cross_match/swat documentation and phrap documentation for additional information.
cross_match is a general purpose utility for comparing any two DNA sequence sets using a 'banded' version of swat. For example, it can be used to compare a set of reads to a set of vector sequences and produce vector-masked versions of the reads, a set of cDNA sequences to a set of cosmids, contig sequences found by two alternative assembly procedures (for example, phrap and xbap) to each other, or phrap contigs to the final edited cosmid sequence. It is slower but more sensitive than BLAST. See the phrap/cross_match/swat documentation and phrap documentation for additional information.
swat is a program for searching one or more DNA or protein query sequences, or a query profile, against a sequence database, using an efficient implementation of the Smith-Waterman or Needleman-Wunsch algorithms with linear (affine) gap penalties. For each match an empirical measure of statistical significance derived from the observed score distribution is computed. See the phrap/cross_match/swat documentation and swat documentation for additional information.
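To make the idea of Smith-Waterman scoring with affine gap penalties concrete, here is a minimal Python sketch that computes only the best local alignment score. It is not swat's implementation: the function name, the scoring values, and the convention that a gap of length k costs gap_open + (k - 1) * gap_extend are all assumptions chosen for illustration.

```python
NEG_INF = float("-inf")


def smith_waterman_affine(a, b, match=1, mismatch=-2, gap_open=-4, gap_extend=-3):
    """Best local alignment score of strings a and b with affine gaps.

    gap_open is the score of the first gap character and gap_extend the
    score of each additional one (both negative). Uses two rows of memory.
    """
    m = len(b)
    h_prev = [0] * (m + 1)        # best score ending at (i-1, j)
    f_prev = [NEG_INF] * (m + 1)  # best score ending at (i-1, j) with a gap in b
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * (m + 1)
        f_cur = [NEG_INF] * (m + 1)
        e = NEG_INF               # best score ending at (i, j) with a gap in a
        for j in range(1, m + 1):
            e = max(h_cur[j - 1] + gap_open, e + gap_extend)
            f_cur[j] = max(h_prev[j] + gap_open, f_prev[j] + gap_extend)
            diag = h_prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a score never drops below zero.
            h_cur[j] = max(0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best
```

Calling `smith_waterman_affine("ACGT", "ACGT")` scores four matches, while two sequences with no characters in common score 0. A production tool would also need traceback, substitution matrices, and the significance estimate described above, none of which this sketch attempts.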
phrap/cross_match/swat runs on most computers and operating systems including Apple Mac OS X, *BSD, Hewlett-Packard HP-UX, HP-Compaq Tru64, IBM AIX, Linux, Microsoft Windows, Silicon Graphics IRIX, and SUN Solaris.
We distribute phrap/cross_match/swat as C source code, so you need a C compiler to build them before you can run them.
Consed/Autofinish
Consed/Autofinish is a tool for viewing, editing, and finishing sequence assemblies created with phrap. Finishing capabilities include allowing the user to pick primers and templates, suggesting additional sequencing reactions to perform, and facilitating checking the accuracy of the assembly using digest and forward/reverse pair information.
See the consed page for additional information.
References:
Gordon, David. "Viewing and Editing Assembled Sequences Using Consed", in Current Protocols in Bioinformatics, A. D. Baxevanis and D. B. Davison, eds, New York: John Wiley & Co., 2004, 11.2.1-11.2.43.
Gordon D, Desmarais C, Green P: Automated finishing with Autofinish. Genome Research 11:614-625 (2001).
Gordon D, Abajian C, Green P: Consed: a graphical tool for sequence finishing. Genome Research 8:195-202 (1998).