McRate (VERSION 1.0)
McRate is a program for detecting conserved amino-acid sites by computing the relative evolutionary rate for each site in a multiple sequence alignment (MSA). In contrast to other existing programs, McRate considers all possible phylogenetic scenarios among the sequences under study rather than relying on a single best tree.

Download the program (for WINDOWS):
The current version is dated 15.8.2004.
For support and questions, please email me: itaymay@post.tau.ac.il
McRate.exe (version 1.0)
seq.aln (Clustal file format).
tree.txt (Newick tree file format).

You can try the program by typing McRate.exe -s seq.aln -t tree.txt


Source code and copyrights:
Source code (C++) for UNIX and LINUX is also available for download here: [McRate.1.0.source.zip].
The included makefile can be used to compile the executable (by typing the make command). Alternatively, type: g++ -o rate4site.exe -O3 *.cpp

If compilation fails (this occasionally happens with old versions of g++), please email me and I'll try to help. Permission must be requested before modifying the code or using parts of it for other purposes. Please contact Tal Pupko at talp@post.tau.ac.il


Overview:
The degree to which an amino-acid site is free to vary depends strongly on its structural and functional importance. An amino acid that plays an essential role, such as one within the active site of the protein, is unlikely to change over evolutionary time. Hence, the evolutionary rate at an amino-acid site indicates how conserved the site is, which in turn allows evaluating the importance of the site in maintaining the structure or function of the protein. McRate calculates the relative evolutionary rate at each site using a probabilistic evolutionary model. This takes into account the stochastic process underlying sequence evolution within protein families. Most importantly, McRate uses Markov chain Monte Carlo (MCMC) methodology to integrate over the space of all possible trees. Hence, McRate does not assume a pre-existing phylogenetic tree relating the sequences. McRate was found to be superior to naïve methods that rely on a single tree only.

Methodology:
The sole obligatory input to McRate is an MSA file. McRate uses MCMC to sample trees (and evolutionary model parameters) according to their posterior probabilities. For each sampled point, it calculates site-specific evolutionary rates using an empirical Bayesian method (Mayrose et al. 2004. MBE). The resulting estimate for each site is the average rate over all sampled points.
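
The final averaging step can be illustrated with a minimal Python sketch. It is not taken from the McRate source code: the function name average_site_rates and the in-memory layout (one list of per-site rates per sampled point) are assumptions made for illustration only, and the numbers are made up.

```python
def average_site_rates(sampled_rates):
    """Average per-site rates over all sampled points.

    sampled_rates is a hypothetical in-memory layout: one inner list per
    sampled point (tree plus model parameters), holding one rate per site.
    """
    if not sampled_rates:
        raise ValueError("need at least one sampled point")
    n_sites = len(sampled_rates[0])
    if any(len(sample) != n_sites for sample in sampled_rates):
        raise ValueError("every sampled point must cover the same sites")
    # zip(*sampled_rates) regroups the values site by site.
    return [sum(values) / len(values) for values in zip(*sampled_rates)]


# Made-up rates for three sampled points and three alignment sites.
samples = [
    [0.25, 1.0, 2.5],
    [0.75, 1.0, 1.5],
    [0.50, 1.0, 2.0],
]
print(average_site_rates(samples))  # prints [0.5, 1.0, 2.0]
```

A site whose rate is low under every sampled tree keeps a low average, whereas a site whose rate depends strongly on the tree is pulled toward the middle, which is the sense in which phylogenetic uncertainty enters the estimate.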


When citing the McRate program, please refer to:

Mayrose I, Mitchell A, Pupko T. 2005. Site-specific evolutionary rate inference: taking phylogenetic uncertainty into account. J Mol Evol. 60(3): 345-353.

Usage:

Flag: -s [MSA file]
Description: The input sequence file name. The following formats are supported: Mase, Molphy, Phylip, Clustal, Fasta.
Default: Obligatory

Flag: -o [output directory]
Description: The results output directory. Result files include:
    mcRate4Site.txt: the final rate estimates over all chains.
    For each chain, numbered XX below, 4 result files are created:
    chainXXr4s.txt: the rate estimates for this chain only.
    chainXX_log.txt: the acceptance rate for each proposal mechanism.
    chainXX_trees.txt: a list of all trees sampled.
    chainXX_res.txt: the likelihood and alpha parameter for each sampled point.
Default: McRateRes/

Flag: -a [sequence name]
Description: Reference sequence name in the MSA. The conservation scores are printed based on the amino acids in this sequence.
Default: First sequence in the MSA

Flag: -n [chains number]
Description: Number of independent MCMC chains to run.
Default: 1

Flag: -k [categories number]
Description: The number of discrete Gamma categories.
Default: 16

Flag: -m [evolutionary model]
Description: The following amino-acid models are supported: DAY (-md), JTT (-mj), REV (-mr), aaJC (-ma). For nucleotides, currently only the JC model is supported (-mn).
Default: -mj

Flag: -e [thinning rounds]
Description: -eXX; the chain will be sampled every XXth step.
Default: 10

Flag: -b [burn-in time]
Description: -bXX; the chain will not sample during the first XX steps.
Default: 10,000

Flag: -i [inference time]
Description: -iXX; the chain will run for at most XX steps.
Default: 100,000

Flag: -g
Description: Remove positions with gaps.
Default: Off

Flag: -f
Description: Homogeneous rate model. Default is Gamma.
Default: Off

Flag: -h
Description: Help.
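
How the -b, -i and -e options interact can be sketched in a few lines of Python. This is an illustration only, not McRate code. The function name sampled_steps is hypothetical, and the sketch assumes that steps are counted from 1, that the -i limit includes the burn-in steps, and that thinning is counted from the start of the chain. None of these details is stated above, so the exact number of retained samples in McRate may differ.

```python
def sampled_steps(burn_in=10_000, max_steps=100_000, thinning=10):
    """Return the step numbers at which a chain would be sampled.

    Assumptions (not confirmed by the McRate documentation): steps are
    counted from 1, max_steps (-i) includes the burn-in steps (-b), and
    every thinning-th step (-e) is counted from the start of the chain.
    """
    if thinning < 1:
        raise ValueError("thinning must be at least 1")
    return [
        step
        for step in range(1, max_steps + 1)
        if step > burn_in and step % thinning == 0
    ]


steps = sampled_steps()  # the documented defaults: -b10000 -i100000 -e10
print(len(steps), steps[0], steps[-1])  # prints 9000 10010 100000
```

Under these assumptions the default settings retain 9,000 sampled points per chain. Raising the thinning value lowers that number proportionally, while raising the burn-in discards more of the early chain.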