MACSE Programs – GE²pop

trimNonHomologousFragments

Trimming non-homologous fragments before alignment

The trimNonHomologousFragments subprogram was developed specifically to filter out long insertions that often result from annotation errors (introns included in CDSs) or from alternative splicing. Positioning long insertions in one or several sequences can drastically slow down the alignment process. Long indels also often prove useless in the end, since …

trimAlignment

Alignment trimming

Protein-coding sequence extremities may contain UTR fragments, may be less reliable because of sequencing errors, or may start at different positions because different PCR primers were used. Alignment extremities are therefore often gappy and not a very reliable part of alignments.

Warning: This program only uses the number/fraction of …

translateNT2AA

Translate nucleotide sequences into amino acid sequences

The MACSE subprogram translateNT2AA translates nucleotide sequences into amino acid sequences using the specified genetic codes.

1. Basic usage

The only mandatory option of this program is the seq option, which indicates the name of the FASTA file containing the sequences to be translated. If your input file …
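As an illustration of what translating nucleotide sequences into amino acids involves, here is a minimal Python sketch. It is not MACSE's implementation: it assumes the standard genetic code only (NCBI translation table 1), maps unknown codons to "X", and ignores frameshift handling. The function name `translate` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt_seq: str) -> str:
    """Translate a nucleotide sequence codon by codon (illustrative helper)."""
    seq = nt_seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing incomplete codon
    protein = []
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        if codon == "---":
            protein.append("-")  # a full gap codon stays a gap
        else:
            protein.append(CODON_TABLE.get(codon, "X"))  # unknown codon -> X
    return "".join(protein)


print(translate("ATGGCCTAA"))  # MA*
```

Supporting another genetic code in this sketch would only require swapping the `AMINO_ACIDS` string for the corresponding table.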

splitAlignment

Splitting an alignment or extracting a sub-alignment (subset of species and/or sites)

Given a large alignment, one may be interested in only a subset of the aligned sequences or in a subset of the sites (i.e. some specific regions/domains of the CDS). The MACSE subprogram splitAlignment is designed to extract the sub-alignments you are really interested in …

reportMaskAA2NT

Report amino acid masks on nucleotide sequences

If your analyses are sensitive to alignment errors (e.g. dN/dS estimation), we strongly advise post-filtering your alignment at the amino acid level (e.g. using HMMCleaner, BMGE or trimAl) and reporting this AA masking/filtering at the nucleotide level using reportMaskAA2NT. This subprogram is dedicated …

reportGapsAA2NT

Deriving a nucleotide alignment from an amino acid alignment

The alignSequences subprogram can be time consuming for large datasets. A possible, still efficient, strategy for large datasets is the following:

This pipeline has been successfully used to produce OrthoMaM alignments and is available through a dedicated web service at http://mbb.univ-montp2.fr/MBB/.

1. Reporting gaps from aligned …
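The general idea of deriving a nucleotide alignment from an amino acid alignment can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not MACSE's code: each residue is assumed to correspond to exactly one complete codon, there are no frameshifts, and the function name `report_gaps` is hypothetical.

```python
def report_gaps(aligned_aa: str, unaligned_nt: str) -> str:
    """Insert '---' in the nucleotide sequence wherever the aligned protein has '-'."""
    codons = []
    pos = 0  # position of the next unused nucleotide
    for residue in aligned_aa:
        if residue == "-":
            codons.append("---")  # one amino acid gap becomes three nucleotide gaps
        else:
            codons.append(unaligned_nt[pos:pos + 3])
            pos += 3
    if pos != len(unaligned_nt):
        # The protein and nucleotide sequences do not describe the same codons.
        raise ValueError("nucleotide length does not match the number of residues")
    return "".join(codons)


print(report_gaps("M-A", "ATGGCC"))  # ATG---GCC
```

The length check makes the sketch fail loudly when the two inputs are out of sync rather than silently producing a shifted alignment.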

refineAlignment

Refining alignments

The refineAlignment subprogram tries to further improve an existing nucleotide alignment. It aligns sequences at the nucleotide level while scoring the candidate nucleotide alignments based on their amino acid translation. It thus favors nucleotide gap stretches whose lengths are multiples of three, but it also considers those inducing frameshifts, when they make it possible to recover …
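To make the "multiples of three" criterion concrete, the following Python sketch lists the gap stretches of one aligned nucleotide sequence and flags whether each one preserves the reading frame. It only illustrates the criterion and does not reproduce how refineAlignment scores alignments; the function name `gap_stretches` is hypothetical.

```python
import re


def gap_stretches(aligned_nt: str):
    """Return (start, length, preserves_frame) for each run of '-' in the sequence."""
    return [
        (m.start(), len(m.group()), len(m.group()) % 3 == 0)
        for m in re.finditer(r"-+", aligned_nt)
    ]


# The first stretch has length 3 (frame preserved), the second has length 1 (frameshift).
print(gap_stretches("ATG---GC-A"))  # [(3, 3, True), (8, 1, False)]
```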

multiPrograms

Run multiple programs

URL: samples/multiPrograms/

This subprogram sequentially executes multiple MACSE commands contained in a text file (one per line). This allows basic scripting for non-bioinformaticians. The main option of this subprogram is a file containing a list of MACSE commands. Each line of this file must contain a single MACSE …

exportAlignment

Alignment export

MACSE pinpoints frameshifts using the "!" character. However, this is not standard usage, and alignments containing such characters will be rejected by most software that takes a multiple sequence alignment as input. This MACSE subprogram replaces "!" characters in nucleotide and amino acid alignments. It also computes some basic statistics …
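The replacement step described above amounts to a simple character substitution, sketched here in Python. The function name `replace_frameshift_marks`, the dictionary representation of the alignment, and the default replacement character "-" are assumptions made for illustration; they are not taken from MACSE.

```python
from typing import Dict


def replace_frameshift_marks(alignment: Dict[str, str], replacement: str = "-") -> Dict[str, str]:
    """Return a copy of the alignment with every '!' replaced by `replacement`."""
    return {name: seq.replace("!", replacement) for name, seq in alignment.items()}


aligned = {"seq1": "ATG!!CCA", "seq2": "ATGGACCA"}
print(replace_frameshift_marks(aligned))  # {'seq1': 'ATG--CCA', 'seq2': 'ATGGACCA'}
```

Because a plain substitution keeps every sequence the same length, the columns of the alignment remain unchanged.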

enrichAlignment

Adding sequences to a previously computed alignment

Folder: samples/enrichAlignment/

If you have previously computed a protein-coding nucleotide alignment that respects the reading frame, you can use the enrichAlignment subprogram to (conditionally) add new sequences to this alignment. The options are the same as those of alignSequences. You can (1) specify a subset of less reliable …