MAVIANT
Multi-purpose Alignment VIewing and ANnotation Tool
v1.0
GNU General Public License
------------------
1. What is Maviant
------------------
Maviant is a multi-purpose alignment viewing and annotation tool. It reads
alignments from assembly files such as ACE and generates a web-based view using html,
javascript and png files, which makes the view independent of the data source. The
output generated by Maviant is referred to as a view and is opened in a
browser. The web-based solution supports a cross-platform environment and is
independent of the operating system.
Maviant supports phred and chromatogram files, which are used to generate chromatogram
images aligned under each sequence in a view. Features like SNPs, repeats and
comments are added to a view using a feature list file. These can later be
annotated by users internally and externally.
Maviant can be used as a standalone program for small and simple tasks, or at
large scale with a web server using a database for annotation purposes.
-------------
1.1. Features
-------------
Maviant has the following features implemented:
- Maviant is written in Perl using StengaardBio and the GD graphics library. Maviant
generates views built from html, png image and javascript files to support a
cross-platform environment.
- All generated webpages are valid XHTML Strict and CSS according to the W3C
validator (http://validator.w3.org/) to increase cross-browser support.
- Verified to work in the following browsers: Internet Explorer, Mozilla
Firefox, Netscape, Opera, Konqueror (Linux), Safari (Mac).
- Maviant output is pregenerated, which gives fast loading in browsers and makes
it completely independent of the data source.
- Alignment overview of sequences and features in 3 different modes: data
(consensus and sequence overview), filter (consensus and sequences filter
overview) and quality (consensus and sequence quality overview).
- Alignment fragments generated for every 50 bases in 3 different modes: data
(sequence bases colored with transparent background), background (sequence
base color as background) and quality (sequence bases colored and background
indicating sequence quality). All 3 modes are also generated in a filtered
version, which indicates base differences between consensus and sequences.
- Reads alignments from Cap3, Phrap and PolyBayes ace files.
- Generation of chromatogram images using Phred .phd files and Applied
Biosystems .abi, .abd or .ab1 chromatogram files.
- Indication of features like SNPs, Repeats and custom annotation using a
feature file containing sequence name, annotation type (SNP, Repeat), start
and end position and color for each individual feature.
- Annotation of features defined in the feature file: clicking a feature in the
Maviant output opens a dynamic website that stores and retrieves annotation
directly from flat files or databases.
- Navigation panel to easily navigate through the alignment and to collapse or expand
chromatogram and sequence feature images in alignment fragments, for individual
or all sequences, e.g. to give a better overview of interesting features for
annotation.
------------
1.2. License
------------
Maviant - Multi-purpose Alignment VIewing and ANnotation Tool
Copyright (C) 2007 Henrik Stengaard
This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
-----------
1.3. Author
-----------
Maviant is developed, written and maintained by Henrik Stengaard. Please send any
questions about Maviant to:
Henrik Stengaard
Computer System Developer
Henrik.Stengaard@agrsci.dk
Molecular Genetics and Systems Biology Group
Department of Genetics and Biotechnology
Faculty of Agricultural Science
University of Aarhus - Denmark
Research Centre Foulum
Blichers Allé
Postbox 50
DK-8830 Tjele
-----------------
1.4. Requirements
-----------------
Maviant requires the following programs and libraries to be installed:
- Perl v5.8
- GD graphics library.
- Perl-GD.
-----------------
1.5. Installation
-----------------
Install Maviant and Perl using the following instructions, depending on your operating system.
Maviant:
1. Download and uncompress the Maviant package from http://snp.agrsci.dk/maviant to a
directory of your choice.
Windows:
1. Download ActiveState Perl from http://www.activestate.com/products/activeperl/.
2. Install ActiveState Perl with PPM, which is used to install Perl-GD
automatically.
3. Run "install.bat" in the Installation/Windows folder of the Maviant package to
install Perl-GD using PPM. An internet connection is required.
Linux:
1. A standard installation of Linux should have Perl installed. If Perl is not
installed, follow the installation guidelines for your distribution. On
Fedora this is commonly done using yum install. Other distributions may require
RPM packages from the installation CD.
2. Install the GD graphics library from http://www.libgd.org and follow its readme files
for installation instructions.
3. Install Perl-GD using "yum install perl-GD" or download and compile it from CPAN
at http://search.cpan.org/~lds/GD-2.35/GD.pm.
--------------
2. Input files
--------------
Using Maviant requires knowledge of the different input files. Most of them are
plain text files, with the exception of chromatogram files, which are binary.
--------
2.1. Ace
--------
ACE files contain an alignment for each contig. Maviant reads alignments from ACE
files and uses them to generate views. Maviant supports ACE files from Cap3, Phrap
and PolyBayes.
----------
2.2. Phred
----------
Phred PHD files can be read by Maviant and are used to indicate quality scores for
each sequence in views using different colors. Phred files are also required for
chromatogram files.
-----------------
2.3. Chromatogram
-----------------
Chromatogram files can be read by Maviant to generate chromatogram images aligned
under each sequence in a view. Maviant supports the following chromatogram files:
- ABI files. Applied Biosystems (ABI) chromatograms. Support includes ABI, ABD
and AB1 files.
- SCF files. Standard Chromatogram File. Support includes the SCF files v2.0
and v3.0.
Maviant only reads chromatograms if phred files exist for each sequence in a
contig.
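Because chromatograms are skipped when phred files are missing, it can help to
check the PHD directory before a run. The following is a minimal Python sketch, not
part of Maviant. It assumes the ".phd.1" naming that the -phd option in section 3.1
describes; the function name missing_phd_files is hypothetical, and the check does
not apply when a PHD list file is used instead of a directory.

```python
import os


def missing_phd_files(sequence_names, phd_directory):
    """Return the sequence names that have no <name>.phd.1 file in phd_directory.

    Assumption: PHD files are named after the sequence with a ".phd.1"
    extension, as described for the -phd option.
    """
    missing = []
    for name in sequence_names:
        path = os.path.join(phd_directory, name + ".phd.1")
        if not os.path.isfile(path):
            missing.append(name)
    return missing
```

An empty result means every listed sequence has a PHD file in that directory.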
--------------------------
2.4. PHD, ABI and SCF list
--------------------------
The PHD, ABI and SCF list files can be used when the phred or chromatogram files for a
contig are located in multiple directories. A list file is tab delimited with 2
columns: target name and file. Maviant uses list files to read phred or
chromatogram files for each sequence in a contig. Here's an example of an SCF list
file:
seq1 /usr/local/data/sequences1/seq1.scf
seq2 /usr/local/data/sequences1/seq2.scf
seq3 /usr/local/data/sequences2/seq3.scf
seq4 /usr/local/data/sequences2/seq4.scf
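A list file like the one above can be generated rather than typed by hand. The
Python sketch below is not part of Maviant. It assumes that the target name is
the file name with its extension removed, which matches the example (seq1 and
seq1.scf) but is not otherwise stated here. The function names target_name and
list_file_lines are hypothetical.

```python
import os


def target_name(path, suffix):
    """Derive a target name from a file path by dropping the given suffix.

    Assumption: the target name equals the file name without its extension.
    """
    base = os.path.basename(path)
    if suffix and base.lower().endswith(suffix.lower()):
        return base[: -len(suffix)]
    return base


def list_file_lines(paths, suffix):
    """Return tab-delimited "target name<TAB>file" lines for the given paths."""
    return [f"{target_name(path, suffix)}\t{path}" for path in paths]
```

The paths could be collected with glob.glob over each data directory, and the
returned lines written to a file that is then passed to -scflist, -abilist or
-phdlist.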
-----------------
2.5. Feature list
-----------------
The feature list file is used to define features such as SNPs, repeats,
masked sequence or other custom annotations in a view. The feature list is tab
delimited with 6 columns: target name, feature id, feature type, feature start
position, feature end position and parameters.
The target name column defines which sequence the feature is related to, and the
feature id defines a unique id for each feature in a view.
The feature type column defines which type a feature is. Each feature type is
defined in the feature type list.
The feature start and end positions define where a feature starts and ends in the
sequence.
The parameter column defines parameters for a feature. Parameters are used
together with the feature and click urls in the feature type list to show more detailed
information or annotation. Multiple parameters are separated with semicolons.
Here's an example of a feature list:
contig1 contig1.45 snp 45 45 snpid=5
contig1 contig1.89 snp 89 89 snpid=6
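To check a feature list before passing it to Maviant, each line can be split
into its 6 columns. The Python sketch below is not Maviant's own parser. It
assumes parameters are written as name=value pairs, as in the example above, and
that the parameter column may be empty. The names Feature and parse_feature_line
are hypothetical.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    target: str        # sequence the feature belongs to
    feature_id: str    # unique id within the view
    feature_type: str  # must match a type name in the feature type list
    start: int
    end: int
    parameters: dict   # parameter name -> value


def parse_feature_line(line):
    """Parse one tab-delimited feature list line into a Feature."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 tab-delimited columns, got {len(fields)}")
    target, feature_id, feature_type, start, end, raw_parameters = fields
    parameters = {}
    # Multiple parameters are separated with semicolons.
    for item in raw_parameters.split(";"):
        if item:
            key, _, value = item.partition("=")
            parameters[key] = value
    return Feature(target, feature_id, feature_type, int(start), int(end), parameters)
```

A line with the wrong number of columns or a non-numeric position raises
ValueError, which makes formatting mistakes visible before a view is generated.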
----------------------
2.6. Feature type list
----------------------
The feature type list file describes each feature type used in the
feature list. The feature type list is tab delimited with 6 columns: type name,
color red (0-255), color green (0-255), color blue (0-255), feature url and click
url.
The type name column defines the feature type name and has to match the feature
list column 3 in order to set the color and urls for each feature in the feature
list.
The 3 color columns red, green and blue define the feature type color.
Feature urls are shown in a tooltip window when a user clicks on a feature in a
view, and click urls are opened in a new browser window. The feature and click urls
can contain parameter names, which are replaced with the parameter values from
the feature list's parameter column. This makes it possible for Maviant to show a
webpage with more detailed information or annotation of a feature. A parameter name
has the format [:parameter_name:]. For example, suppose a feature has the parameter
snpid=5 and the feature url is snp[:SNPID:].html. When a user clicks on the feature
in a view, Maviant replaces the parameter name with its value, so Maviant shows
a webpage using the feature url snp5.html.
Here's an example of a feature type list:
snp 255 0 0 snp[:SNPID:].html snp[:SNPID:].html
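The parameter replacement described above can be illustrated with a short
Python sketch. This is not Maviant's implementation. It assumes that parameter
names are matched without regard to case, since the example pairs snpid=5 with
[:SNPID:], and that placeholders with no matching parameter are left unchanged.
The function name expand_url is hypothetical.

```python
import re

# Matches placeholders of the form [:parameter_name:].
PLACEHOLDER = re.compile(r"\[:([^:\]]+):\]")


def expand_url(template, parameters):
    """Replace [:name:] placeholders in a url with values from parameters.

    Assumption: names are compared case-insensitively; unknown placeholders
    are kept as written.
    """
    lookup = {key.lower(): value for key, value in parameters.items()}

    def replace(match):
        return lookup.get(match.group(1).lower(), match.group(0))

    return PLACEHOLDER.sub(replace, template)
```

With the example values, expand_url("snp[:SNPID:].html", {"snpid": "5"}) gives
snp5.html.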
--------
3. Usage
--------
Maviant runs in a console environment: the DOS prompt on Windows, or Bash or
another shell on Linux. Before running Maviant on your own data or the example data,
review the options below.
The example directory in the Maviant package contains examples ready for
evaluation. Each example is based on chromatograms downloaded from the NCBI trace archive at
http://www.ncbi.nlm.nih.gov/Traces/trace.cgi? and has batch and script files to run
Maviant. The pipeline used to prepare each example consists of phred, phd2fasta,
cap3 and polybayes. The pipeline script is also included for further study of parameter
settings.
------------
3.1. Options
------------
Maviant has a set of options where some are required and others are optional.
Required:
-ace=[file]
ACE file to read contig alignment from.
-contig=[text]
Name of the contig in the ACE file to generate Maviant view from.
-output=[directory]
Output directory to write Maviant view.
Optional:
-phd=[directory]
Directory to read PHD files from. Maviant tries to read PHD files named after
each sequence with a ".phd.1" extension in the specified directory.
-abi=[directory]
Directory to read ABI files from. Maviant tries to read ABI files named after
each sequence in the specified directory.
-scf=[directory]
Directory to read SCF files from. Maviant tries to read SCF files named after
each sequence in the specified directory.
-phdlist=[file]
PHD list file specifying where to read PHD files located in multiple
directories.
-abilist=[file]
ABI list file specifying where to read ABI files located in multiple
directories.
-scflist=[file]
SCF list file specifying where to read SCF files located in multiple
directories.
-featurelist=[file]
File to read feature list from.
-featuretype=[file]
File to read feature type list from.
-description=[file]
File to read the description from. Enables the description box on the overview
page.
-title=[text]
Title for webpages generated in the Maviant view.
-nostat
Disables statistics box on the overview page.
-noinput
Disables input box on the overview page.
--------------------
3.2. Running Maviant
--------------------
Maviant runs in the DOS prompt on Windows and in Bash or another shell on Linux. Maviant operates
on a single contig at a time. If an ACE file contains 3 contigs, Maviant has to be run 3
times with different contig names and output directories. For example:
maviant.pl -ace=example1.ace -contig=Contig1 -output=Contig1
maviant.pl -ace=example1.ace -contig=Contig2 -output=Contig2
maviant.pl -ace=example1.ace -contig=Contig3 -output=Contig3
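For ACE files with many contigs, the command lines above can be generated. The
Python sketch below is not part of Maviant. It relies on the ACE convention that
each contig starts with a header line of the form "CO <name> <bases> <reads>
<segments> <U|C>", and it assumes Maviant is invoked as "perl maviant.pl" with
the output directory named after the contig, as in the example. The function
names contig_names and maviant_commands are hypothetical.

```python
def contig_names(ace_lines):
    """Return contig names found in the lines of an ACE file."""
    names = []
    for line in ace_lines:
        fields = line.split()
        # Contig header: CO <name> <bases> <reads> <segments> <U|C>
        if len(fields) == 6 and fields[0] == "CO":
            names.append(fields[1])
    return names


def maviant_commands(ace_file, names):
    """Build one argument list per contig, using only the required options."""
    return [
        ["perl", "maviant.pl", f"-ace={ace_file}", f"-contig={name}", f"-output={name}"]
        for name in names
    ]
```

Each argument list can be passed to subprocess.run, or joined with spaces and
written to a batch or shell script. Optional options such as -phd or
-featurelist would be appended to each list in the same way.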
To add features in a Maviant view, a feature list file is required. A Perl script
"build_example_features.pl" is included for the examples to illustrate how this can be
done:
build_example_features.pl Example1
build_example_features.pl Example2
build_example_features.pl Example3
-----------------
3.3. Maviant view
-----------------
To open a view generated by Maviant, open "index.html" in the Maviant output directory.
Opening any other page directly causes the javascript in the webpages to fail, since
"index.html" creates hidden frames used by the javascript.
The first page in a Maviant view is an overview page with description, statistics and
input information, depending on the options used when the view was generated. The overview
page gives a complete overview of the entire alignment in 3 different modes: a data
mode indicating each sequence's nucleotides, a filter mode to spot differences between
sequences and the consensus, and a quality mode to indicate quality scores.
Navigation to a specific fragment is done by clicking on a sequence of interest. This
opens a fragment page. The alignment is currently divided into fragments with a
length of 50 base pairs. Each fragment page has a navigation panel and a fragment view.
The navigation panel lets the user navigate between the fragments and the overview,
switch to a different mode as on the overview page, and expand or collapse chromatograms
or features. The fragment view lists the sequences present in the given fragment,
aligned under each other. Each sequence also has information about length and direction
indicated with a small arrow next to the sequence name. Chromatogram images and
features are present depending on the options used. Each feature is clickable: a single
click shows a tooltip window and a double click opens a new window. The
webpage shown can contain more detailed information or be used to annotate features.
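Since the alignment is split into fragments of 50 bases, the fragment that
contains a given position follows from integer division. The Python sketch below
only illustrates the arithmetic. It assumes positions are alignment positions
starting at 1 and that fragments are numbered from 1; neither convention is
stated in this document, and the function names are hypothetical.

```python
FRAGMENT_LENGTH = 50  # fragment size used by Maviant v1.0


def fragment_number(position, fragment_length=FRAGMENT_LENGTH):
    """Return the 1-based fragment number containing a 1-based position."""
    if position < 1:
        raise ValueError("position must be 1 or greater")
    return (position - 1) // fragment_length + 1


def fragment_range(number, fragment_length=FRAGMENT_LENGTH):
    """Return the (first, last) positions covered by a 1-based fragment number."""
    first = (number - 1) * fragment_length + 1
    return first, first + fragment_length - 1
```

Under these assumptions, position 45 falls in fragment 1 (positions 1 to 50) and
position 89 falls in fragment 2 (positions 51 to 100).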
--------------------
3.4. Troubleshooting
--------------------
For any problems running and using Maviant, please contact the author Henrik Stengaard.
--------------
4. Development
--------------
Maviant v1.0 is far from complete. Based on a lot of requests for new features and
improvements, Maviant v2.0 is under development. Here is a list of the new features
planned:
- Rewritten in C#, running on platforms with the Mono or Microsoft .NET runtime. This
will improve generation speed and reduce memory usage drastically.
- Minimal use of javascript to increase load speed in browsers.
- XML files used for structuring all input data, making it easier to rerun or extend
existing views with additional data without having to regenerate the entire view.
- Add support for BLAST, SSAHA, DiAlign, ClustalW.
- Add support for amino acid sequences.
- Features are shown dynamically in views, with both local and remote solutions. This
becomes quite useful for feature annotation. For example, an unannotated SNP in
a view can be indicated using a yellow color. When the SNP is later annotated
as accepted or rejected, this can be indicated using a green or red color. This
does not require regeneration of a view.
- New selection and annotation of multiple features at the same time. For example
this makes it easier to annotate multiple SNPs as accepted or rejected.
- With dynamic features, users can manually add new features to the view. For
example, a user who has found a valid SNP can add it using the selection and
annotation feature.
- Navigation between features using feature buttons in navigation panel to avoid
having to go back to overview page to find features in another fragment.
- New feature overview page showing a list of features in a view with their
current annotation; clicking a feature jumps to the fragment containing it.
- External linking making it possible to jump into a view in a browser window and
go directly to a specific fragment or feature of interest.
- Simple text output without chromatogram support for use as a quick view and in
console based environments.